The mystery surrounding NVIDIA Kepler is starting to clear after a very long wait, and the first information regarding the GK104 chip is out. We see an architecture that is very different from Fermi, and potentially brings a lot more performance.
Kepler will be NVIDIA’s first architecture at 28nm, and while it is late to the party, it looks like it will be worth waiting for. The circuit we are talking about is not the flagship of the Kepler family, GK110, but GK104, which will replace GF114 and GeForce GTX 560 Ti.
Big changes in the Kepler architecture
The biggest change with Kepler is that there is no longer a separate shader frequency; there is just one GPU frequency. This is a compromise to make room for more CUDA cores in the Kepler architecture, and the GPU clock will still be higher than before. Each Streaming Multiprocessor will contain 96 CUDA cores, compared to the 32 – 48 that Fermi had.
The change in layout of the CUDA cores and clock frequency is most likely a way for NVIDIA to get more performance from the circuit. GK104 will sport up to 1536 CUDA cores, which is a big boost from GF110 and GTX 580. These cores no longer run at a separate, higher shader frequency, however, which will reduce the efficiency per core. The number of texture units has doubled, but GK104 has only 32 raster units, compared to 48 in GF110. This doesn’t have to be bad though, since the raster units in Kepler can be more efficient than those in Fermi.
Illustration of GK104 CUDA core arrangement
The clock frequency of GK104, which will most likely become GeForce GTX 660 (NVIDIA has not yet decided between GTX 660, 670 and 680), will be somewhere around 950 – 1,000 MHz. NVIDIA is still tuning the BIOS and energy saving functions, so it has yet to be decided how high the GPU can be clocked without the energy consumption getting out of hand. There is talk about NVIDIA wanting to stay below 225W, but this is only a rumor. Clock frequencies are the last thing to be decided though, so these numbers should definitely be taken with a pinch of salt.
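The rumored core count and clock range can be checked against the single-precision figures in the specification table with a few lines of Python. This is only a sketch: the function name `peak_sp_gflops` is made up for illustration, and it assumes each CUDA core retires one fused multiply-add (counted as 2 floating-point operations) per clock, which is the usual convention for theoretical peak numbers rather than something the sources state.

```python
def peak_sp_gflops(cuda_cores: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical single-precision peak in GFLOPS.

    flops_per_clock=2 is an assumption: one fused multiply-add per core per clock.
    """
    return cuda_cores * flops_per_clock * clock_mhz / 1000.0


# GK104 as rumored: 16 Streaming Multiprocessors with 96 CUDA cores each.
gk104_cores = 16 * 96

# GF110 (GTX 580): the cores run at the shader frequency, not the GPU frequency.
gtx580 = peak_sp_gflops(512, 1544)

# GK104: no separate shader frequency, so the cores run at the GPU frequency.
gk104_low = peak_sp_gflops(gk104_cores, 950)
gk104_high = peak_sp_gflops(gk104_cores, 1000)

print(f"GTX 580: {gtx580:.0f} GFLOPS")
print(f"GK104:   {gk104_low:.0f} to {gk104_high:.0f} GFLOPS")
```

Under these assumptions, three times as many cores at roughly two thirds of the old shader frequency works out to about twice the theoretical throughput, which illustrates the trade-off described above.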
The graphics memory will be set to 5 GHz exactly, which means NVIDIA has solved the problems in Fermi where the memory controller couldn’t handle these speeds.
Model | GeForce GTX 580 | GeForce GTX 660(?) |
Architecture | Fermi | Kepler |
Circuit | GF110 | GK104 |
Node | 40nm | 28nm |
Size | 520 mm² | ~340 mm² |
Streaming Multiprocessors | 16 | 16 |
CUDA cores | 512 | 1536 |
Clock frequencies | 772 MHz | 950 – 1000 MHz |
Shader frequency | 1544 MHz | – |
FLOPS (SP) | 1581 GFLOPS | 2918 – 3072 GFLOPS |
FLOPS (DP) | 790 GFLOPS | 486 – 512 GFLOPS* |
Texture units | 64 | 128 |
ROPs | 48 | 32 |
Memory bus | 384-bit | 256-bit |
Memory buffer | 1.5 GB GDDR5 | 2 GB GDDR5 |
Memory frequency | 1,002 MHz (4,008 MHz effectively) | 1,250 MHz (5,000 MHz effectively) |
Memory bandwidth | 192.4 GB/s | 160 GB/s |
TBP | Up to 250W | Up to 225W(?) |
*Applies only to GeForce. Tesla/Quadro models run double precision at 1/2 of the SP GFLOPS.
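The memory bandwidth rows in the table follow directly from the bus width and the effective memory frequency. The sketch below reproduces them. The helper name `memory_bandwidth_gbs` is hypothetical, and the factor of four between base and effective frequency is taken from the table's own figures (1,002 to 4,008 MHz and 1,250 to 5,000 MHz).

```python
def memory_bandwidth_gbs(bus_width_bits: int, effective_mhz: float) -> float:
    """Theoretical memory bandwidth in GB/s.

    Bytes per transfer (bus width / 8) times millions of transfers per second,
    divided by 1000 to go from MB/s to GB/s.
    """
    return bus_width_bits / 8 * effective_mhz / 1000.0


# Effective frequency is four times the base memory frequency in the table.
gtx580_effective = 4 * 1002   # 4,008 MHz
gk104_effective = 4 * 1250    # 5,000 MHz

print(f"GTX 580: {memory_bandwidth_gbs(384, gtx580_effective):.1f} GB/s")
print(f"GK104:   {memory_bandwidth_gbs(256, gk104_effective):.1f} GB/s")
```

The narrower 256-bit bus is why GK104 ends up with less bandwidth than GTX 580 despite the faster memory.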
The result of the new architecture is a mixed blessing compared to the current flagship, but overall it has to be considered an improvement in terms of potential performance. According to many sources it will be enough to compete with Radeon HD 7950, and sometimes even with HD 7970. This is very much like the position NVIDIA has with GTX 560 Ti against competing products.
NVIDIA has already initiated pre-sale activities, meaning it has contacted partners and informed them about what is to come. Besides giving partners time to prepare sales material, we hope this will also give them some time to develop custom boards for GK104. This further strengthens our belief that it will launch in March-April.
Bright Side Of News* seems confident about the above, and says the information comes from another source, 3DCenter, which reported similar specifications the day before. All in all it seems likely, but we still recommend that our readers take it with a pinch of salt.