It is clichéd to say that two heads are better than one and, therefore, that two cores must be better than one. (And yes, in this article we are referring only to desktop usage, not servers.) Now, why would a typical desktop need two cores? Consider what runs on the same system today. Apart from the primary software a user needs for his role (word processor, spreadsheet, accounting software, etc), there are e-mail and instant messaging clients, antivirus and firewall programs, and corporate monitoring and update clients. All of this software has grown in resource usage over the last few years; it needs faster and more efficient processors, to say the least. And as a user, you too have grown intolerant of waiting while something is being processed.
Server vendors solved the problem by letting you add more processors to the box. But this is far too high-end (or is it?) for typical desktop workloads. The easier thing to do, then, was to add more logical cores to one physical processor and thus increase both the amount and the speed of what could be accomplished. The Pentium D, the Athlon 64 FX and the Athlon 64 X2 are products that aimed to deliver this: each is a single physical die containing two connected processing cores. While Intel's line-up built around its 32-bit technology, AMD took the 64-bit route. The latest is the Intel Core 2, which we have reviewed in this issue (refer Page 136). Each evolution has successively raised the bar of what can be delivered, and the expectations from the next iteration.
Has it gotten better?
So, has the inclusion of a second logical core in the processor made things any better? Speeds certainly have not doubled, since each core of a dual-core chip is clocked lower than it would be running as a single-core processor. But performance is not a direct function of processor clock speed. It is a combination of many factors, such as the length of the instruction pipeline, cache memory, bus speeds and the efficiency of the various algorithms used within the processor, such as instruction optimization, pre-fetching and branch prediction.
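The payoff of a second core is easiest to see when two CPU-bound jobs run side by side. Here is a minimal sketch (our illustration, not from the article) using Python's standard multiprocessing module; the OS is free to schedule each worker process on its own core, so the parallel run can finish in roughly half the serial time even if each core is clocked a little lower.

```python
# A minimal sketch: two CPU-bound tasks handed to two worker processes.
# On a dual-core machine the OS can run them on separate cores at once.
from multiprocessing import Pool

def busy_sum(n):
    """A deliberately CPU-bound task: sum the integers 0..n-1."""
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":
    work = [5_000_000, 5_000_000]

    # Serial: one core does both tasks back to back.
    serial = [busy_sum(n) for n in work]

    # Parallel: two workers, potentially one per core.
    with Pool(processes=2) as pool:
        parallel = pool.map(busy_sum, work)

    assert serial == parallel  # same answers, potentially in half the time
```

The speed-up only appears for work that is genuinely CPU-bound and splittable; a single sequential task gains nothing from the second core.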
A multi-core CPU contains multiple logical processor 'cores'. In this dual-core version, the second core is laid out as a mirror image of the first
Let's take the latest CPU (the Core 2 from Intel) and see what makes it go so fast. There are several improvements, but two big ones stand out: the instruction pipeline has been shortened, and it has also been made wider. The pipeline holds the various commands being processed by the CPU; the longer it is, the more cycles are wasted when work has to be flushed and redone, for instance after a mispredicted branch. The pipeline was 31 stages deep in the P4s and has now come down to just 14, about 45% of its predecessor's length, which is a huge theoretical gain.
But this, in itself, is not enough to improve things. Most processors, including the P4s and the Athlons, can issue only three instructions in one clock cycle. The Core 2 takes this a notch higher to four per cycle, which is why the Core 2 is said to have a '4-wide' architecture. This adds another theoretical 33% over the P4 and Athlon CPUs. Now, several factors can still make things go slower than optimal. For instance, there is an optimization (Intel calls it macro-fusion) that combines certain pairs of instructions into one to make things go faster. If a prediction goes wrong, performance can take a hit while the commands and data are re-loaded and re-executed.
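The "theoretical" percentages above are simple arithmetic on the figures quoted in the text, which a quick back-of-the-envelope check makes explicit (these are quoted design figures, not measurements):

```python
# Back-of-the-envelope figures quoted in the text, not measurements.
p4_stages, core2_stages = 31, 14   # pipeline depth
p4_wide, core2_wide = 3, 4         # instructions issued per clock cycle

# The Core 2 pipeline is about 45% the length of the P4's,
# i.e. a cut of more than half:
relative_depth = core2_stages / p4_stages
print(f"Core 2 pipeline is {relative_depth:.0%} of the P4's depth")

# Going from 3-wide to 4-wide raises peak instruction throughput
# by a third:
width_gain = core2_wide / p4_wide - 1
print(f"peak issue rate up by {width_gain:.0%}")
```

Note that both numbers are upper bounds: stalls, cache misses and mispredictions keep real instruction throughput well below the 4-per-cycle peak.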
CPUs have, for a long time, carried an on-die memory cache (called L1, L2 and L3). This is a block of memory within the processor (running at or near processor speed) that can hold some instructions and data, saving the time the CPU would otherwise spend waiting to load this information from the system RAM (which generally runs at a much lower speed).
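The idea is easy to mimic in software (a loose analogy of ours, not how the silicon actually works): keep a small, fast store in front of a slow one, and repeated accesses to the same addresses are served from the fast copy.

```python
# A software analogy only: a small fast cache in front of a slow store,
# much as on-die L1/L2 SRAM sits in front of system RAM.
SLOW_RAM = {addr: addr * 2 for addr in range(1000)}  # stand-in for main memory

cache = {}          # stand-in for the on-die cache
hits = misses = 0

def load(addr):
    """Return the value at addr, filling the cache on a miss."""
    global hits, misses
    if addr in cache:
        hits += 1       # served at 'processor speed'
    else:
        misses += 1     # had to go all the way out to 'RAM'
        cache[addr] = SLOW_RAM[addr]
    return cache[addr]

# Repeated accesses show locality paying off: 2 misses, then 3 hits.
for addr in [1, 2, 1, 2, 1]:
    load(addr)

assert (hits, misses) == (3, 2)
```

Real caches add eviction policies and fixed-size lines, but the principle is the same: the more often recently used data is re-used, the less the CPU waits on RAM.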
The Intel Core 2 has two cores and a common L2 cache nicknamed the 'Advanced Smart Cache'
Now, when you think of a dual-core CPU as having two logical processors inside, it is natural to picture each core with its own cache. The problem with that picture is this: if a piece of data (not instructions) is loaded into a cache block in one of the cores and an instruction in the other core needs access to it, the data has to be fetched across and the two caches kept in sync, which costs time. AMD's dual-core processors use this split-cache model, along with an on-die memory controller that tries to reduce the delays incurred.
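A loose software analogy of the split-cache problem (an illustration of ours, not how cache coherency is implemented in silicon): a worker holding a private copy of some data must explicitly ship its changes back, just as data cached privately by one core must be transferred when the other core needs it.

```python
# Analogy only: a child process gets a private *copy* of the data
# (like a per-core cache); its updates must be sent back explicitly.
from multiprocessing import Process, Queue

def private_copy_worker(data, results):
    # 'data' arrived as a copy; changes here are invisible to the
    # parent until shipped back over the queue.
    data.append(42)
    results.put(data)

if __name__ == "__main__":
    original = [1, 2, 3]
    results = Queue()

    p = Process(target=private_copy_worker, args=(original, results))
    p.start()
    updated = results.get()   # explicit transfer, like a cache-to-cache copy
    p.join()

    assert original == [1, 2, 3]     # parent's copy is unchanged
    assert updated == [1, 2, 3, 42]  # the child's change had to be shipped over
```

A shared cache avoids this shipping step entirely, which is the design choice the next paragraph describes.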
The FSB (front-side bus) can also become a bottleneck and make the CPU waste cycles waiting for data to come its way. In our tests, we found that overclocking the FSB to 1333 MHz gave a performance gain in the range of 5 to 15%, depending on the application being run. Intel's Core 2 makes the L2 cache common to both cores, and this reduces wait times: instructions and data are pre-fetched into the shared L2 cache, where they are pre-parsed and optimized, and then sent to one of the two cores. This speeds things up quite a bit. The current Intel architecture, however, does not use an on-die memory controller.
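To put the FSB numbers in perspective, peak bandwidth is just the effective transfer rate times the bus width. A quick sketch, assuming the standard 64-bit-wide front-side bus of this era (real-world throughput is lower because of protocol overheads):

```python
# Theoretical peak FSB bandwidth: effective transfers/sec x bytes
# per transfer. Assumes a 64-bit-wide bus; real throughput is lower.
def fsb_peak_bandwidth_gbps(effective_mhz, bus_width_bits=64):
    """Peak bandwidth in GB/s (decimal, 10^9 bytes)."""
    return effective_mhz * 1e6 * (bus_width_bits // 8) / 1e9

print(fsb_peak_bandwidth_gbps(1066))  # stock 1066 MHz FSB
print(fsb_peak_bandwidth_gbps(1333))  # FSB overclocked to 1333 MHz
```

The jump from 1066 to 1333 MHz is a 25% increase in peak bandwidth, which is consistent with the 5-15% application-level gains we measured: the bus is only one of several things the CPU waits on.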
More cores
If this trend continues, somewhere down the line two cores will be insufficient and we will need more. For that, we will have to wait and watch what the software ecosystem running on these future systems will look like, and indeed what the future computer itself is going to be like. For instance, virtualization, once the preserve of the server world, is now big on the desktop too. The Core 2 Duo supports Intel's VT technology, and processors are also including, at the hardware level, much more control and support for software virtualization layers (like VMware and MS Virtual Server). This will increase both the workload and its complexity at the desktop level, and it implies that some five years from now, this article would be talking to you about the pros and cons of six- or eight-core processors for your next desktop purchase.
Should you upgrade?
The technology is still new and applications are still few. Only the newest OSs and versions of application software support all the new things the multi-cores can do. Therefore, we would say there is no rush at the moment to replace your existing computer fleet with shiny dual cores. But you will have to consider the 32-bit dual cores, if not the 64-bit ones, every time you sit at the negotiating table to buy from now on. This technology can only get better and cheaper. Your IT can do more, and we do not need to spell out what that can mean for the rest of your organization. That is our bottom line for you.
Source: PC Quest