Reducing the amount of L2 cache has been a popular way to cut manufacturing costs, and how much the size of the L2 cache affects performance has been debated for a long time. In this article we dig deeper with a long series of benchmarks that show when, and by how much, the L2 cache of the Core architecture plays a significant role.

The amount of second-level cache, also known as L2 cache, has varied quite a lot over the past few years. A larger L2 cache improves performance because the processor doesn’t have to fetch data from RAM, which is many times slower, as often. In the stone age of computers this memory was located on the motherboard; as the demand for lower latency and higher speed grew, it moved closer and closer to the processor. Nowadays the L2 cache isn’t just part of the silicon, it makes up the majority of the transistors. This is one of the reasons it is not economically feasible to put as much memory as possible on a processor: the yield of defect-free processors goes down as the number of transistors goes up. At the same time, manufacturing processes are refined, which improves the yield of fully functional processors. A welcome result is that manufacturers can launch lower-range processors with more cache, alongside the higher-end models.
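
The reasoning above can be sketched with the standard average-access-time relation: every access pays the cache hit time, and the fraction that misses also pays the trip to RAM. The latencies and miss rates below are hypothetical values chosen only for illustration; they are not measurements from this article, and the function name is our own.

```python
def average_access_time_ns(l2_hit_ns: float, l2_miss_rate: float, ram_penalty_ns: float) -> float:
    """Average cost of a memory access that reaches the L2 cache.

    Every access pays the L2 hit time; the fraction that misses
    additionally pays the penalty of going out to RAM.
    """
    return l2_hit_ns + l2_miss_rate * ram_penalty_ns


# Hypothetical figures for illustration only (not measured in this article).
L2_HIT_NS = 5.0
RAM_PENALTY_NS = 60.0

for label, miss_rate in [("smaller cache", 0.10), ("larger cache", 0.05)]:
    t = average_access_time_ns(L2_HIT_NS, miss_rate, RAM_PENALTY_NS)
    print(f"{label}: miss rate {miss_rate:.0%}, average access {t:.1f} ns")
```

How much a larger cache actually lowers the miss rate depends entirely on the application, which is what the benchmarks in this article set out to measure.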






Taken to the extreme, you could remove all of the cache and make a really cheap processor, but it would also be extremely slow. So how much does the second-level cache really affect the performance of today’s Core 2 Duo processors? We at NordicHardware have gathered three processors, with 4MB, 2MB and 1MB of L2 cache respectively, to investigate this more thoroughly.

We start with the test system.

Test system

Hardware
Motherboard: Abit IP35 Pro
Processors: Intel Core 2 Duo E6320 (4MB), Intel Core 2 Duo E6300 (2MB), Intel Pentium E2140 (1MB)
Memory: Corsair Dominator 8500C5DF (2048MB)
Graphics card: NVIDIA GeForce 8800GTX
Power supply: Silverstone Zeus 850W

Software
Operating system: Windows XP (SP2)
Drivers: Intel Chipset Driver 8.3.0.1013, NVIDIA Forceware 158.22
Benchmarks: EVEREST Ultimate Edition 4.00.976, SuperPi 1.5, wPrime 1.52, Cinebench 9.5, Lame 3.97, WinRAR 3.70, 3DMark2001 3.3.0, 3DMark03 3.6.0, 3DMark05 1.2.0, 3DMark06 1.0.2, PCMark05 1.1.0, FarCry 1.33, Doom 3, Quake 4


The motherboard we’ve used throughout the tests is the Abit IP35 Pro, based on the Intel P35 chipset. To make things as fair as possible, we set the multiplier and FSB to the same values for all three processors, regardless of their stock speeds. All tests were performed at 7×333MHz, which gives a clock frequency of 2.33GHz, the same as an Intel Core 2 Duo E6550. This means that the only thing that separates the three processors is the amount of L2 cache.
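
The clock frequency follows directly from the multiplier and the bus clock. As a minimal sketch of the arithmetic (the function name is our own):

```python
def core_clock_mhz(multiplier: float, fsb_mhz: float) -> float:
    """Core clock = CPU multiplier x front-side bus base clock."""
    return multiplier * fsb_mhz


# Settings used for all three processors in this test.
clock = core_clock_mhz(7, 333)
print(f"{clock:.0f} MHz = {clock / 1000:.2f} GHz")  # prints "2331 MHz = 2.33 GHz"
```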









:: Intel Core 2 Duo E6320 Validation ::


:: Intel Core 2 Duo E6300 Validation ::


:: Intel Pentium E2140 Validation ::

First we have a couple of synthetic benchmarks.













Everest’s CPU benchmarks don’t reveal any shocking differences. Queen and Zlib are not influenced at all by the amount of L2 cache, while PhotoWorxx shows an improvement of about 8% from quadrupling the cache.


We move on to the well-known SuperPi.













SuperPi scales very well with more cache, and we can see a clear difference already in the 1M test. The 2MB processor performs well, while the 1MB processor lags considerably.


We move on to wPrime and Cinebench.
















The idea behind wPrime is similar to that of SuperPi, the difference being that the application is multithreaded. Since both cores share the L2 cache, we were hoping to see some quite large differences here, but as the results above show, the differences are only marginal and well within the margin of error. The performance differences in Cinebench are also very small, but still show a slight advantage for processors with larger caches.


We move on to some more practical tests.










In previous reviews we’ve concluded that it’s very hard to squeeze any extra performance out of Lame without raising the clock frequency, and this benchmark confirms it. WinRAR, on the other hand, really thirsts for bandwidth and can show rather extreme differences from minimal changes in settings. Moving from 1MB to 4MB of L2 cache increases performance by about 14%.


We move on to the 3DMark series.
















3DMark2001 flies off with an astonishing 20% lead, while the rest of the benchmarks show a more modest increase of 5 to 8%. A pattern is starting to emerge: the wider the range of benchmarks we run, the more the amount of cache seems to matter. We still have some benchmarks left to run to check whether our hypothesis is correct.


3DMark06 CPU and PCMark05 are next.













Here we can clearly see that the amount of cache doesn’t just help in the more CPU-intensive benchmarks, such as the 3DMark06 CPU test, but also in the other sub-tests: the CPU test differs by only about 2%, while the total 3DMark06 score went up by 5%. The processor benchmark of PCMark05 shows minimal differences, while the memory benchmarks reveal an advantage of 3% for the 4MB model.


We finish with some game tests.













Believing that a measly megabyte here or there makes only a marginal difference outside synthetic tests turns out to be completely wrong. Here we see performance differences of 15% to 22%. The 2MB model sits fairly consistently between the other two, but the margin between the top and bottom model varies from game to game.

We sum up our impressions below.


Performance

The effect of L2 cache on overall system performance depends on the application. You can’t say that one kind of application benefits more from cache than another; it depends strictly on how the application is designed and how much data its calculations require. What we can say for certain is that the performance improvements achieved when moving from 1MB to 4MB aren’t just noticeable in the most extreme cases, but also in everyday applications and in both old and new games.



Product value

So far we haven’t discussed the product value of the processors. A direct comparison isn’t possible, as the product range has no common clock speed. Perhaps the best comparison is between the E6320, E4300 and E2160, which today cost about 1430kr, 1026kr and 783kr respectively. If we’re talking performance per buck, the E2160 wins hands down, since it costs little more than half as much as the E6320. On the other hand, the E6320 adds performance in the form of a higher stock bus frequency, for those who don’t overclock. If we add overclocking to the discussion, we once again have to recommend the cheaper models, as they both have higher multipliers.
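
The prices quoted above can be put in relation to each other with a few lines of arithmetic. This is a rough sketch that only uses the three prices from the text; the break-even figure is our own illustration of how much faster the E6320 would have to be to match a cheaper model in performance per krona.

```python
# Approximate prices in Swedish kronor, as quoted in the text.
prices_kr = {"E6320": 1430, "E4300": 1026, "E2160": 783}
baseline = prices_kr["E6320"]

# Price of each model as a fraction of the E6320 price.
relative_price = {model: price / baseline for model, price in prices_kr.items()}

# Speed advantage the E6320 would need to match each model in performance per krona.
break_even = {model: baseline / price - 1 for model, price in prices_kr.items()}

for model in prices_kr:
    print(f"{model}: {prices_kr[model]} kr, "
          f"{relative_price[model]:.0%} of the E6320 price, "
          f"E6320 must be {break_even[model]:.0%} faster to break even")
```

With these prices the E2160 costs about 55% of the E6320, so the E6320 would need to be roughly 83% faster to offer the same performance per krona.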



The future

At the moment we have a price war where both parties are waiting for the other to make a move, and both are about to launch a new generation of processors. Of particular relevance to this article, Intel will equip its coming Core 2 Duo series with an additional 2MB of L2 cache, for a total of up to 6MB. Even if we can’t expect performance to scale linearly from today’s 4MB models, we feel pretty confident in saying that a lot of 3DMark records will be blown away. Preliminary test results around the web strengthen our belief.



Conclusion

Intel has been steadily broadening its range of Core 2 Duo processors since the launch about a year ago. The range of models running at around 2GHz has never been broader when it comes to clock frequencies, bus speeds and L2 cache. When we fix all of the variables except the L2 cache, we still see performance differences of more than 20% in multiple benchmarks and practical tests. If stock performance is what matters, the E6000 series is the one to go for. If price matters most, buy the E2000 series. The E4000 series offers a mix of both, and with its high multipliers it’s an attractive purchase for the price-conscious overclocker.









Intel L2 cache comparison


Intel Core 2 Duo E6000 series

+ Best performance

+ High bus frequency



Intel Core 2 Duo E4000 series

+ Good compromise between price and performance

+ High multiplier



Intel Pentium E2000 series

+ Good value

+ High multiplier




We want to thank Intel Sweden and Overclockers.se for lending samples.
