The mystery surrounding NVIDIA Kepler is starting to clear after a very long wait, and the first information regarding the GK104 chip is out. We see an architecture that is very different from Fermi, and potentially brings a lot more performance.

Kepler will be NVIDIA’s first architecture at 28nm, and while it is late to the party, it looks like it will be worth the wait. The circuit we are talking about is not the flagship of the Kepler family, GK110, but GK104, which will replace GF114 and the GeForce GTX 560 Ti.

Big changes in the Kepler architecture

The biggest change with Kepler is that there is no longer a separate shader frequency; there is just a single GPU frequency. This is a compromise to make room for more CUDA cores in the Kepler architecture, and the GPU frequency itself will still be higher than before. Each Streaming Multiprocessor will contain 96 CUDA cores, compared with the 32 – 48 that Fermi had.

The change in layout of the CUDA cores and clock frequency is most likely a way for NVIDIA to get more performance from the circuit. GK104 will sport up to 1536 CUDA cores, which is a big boost from the 512 in GF110 and GTX 580. The cores no longer run at a separate, higher shader frequency, though, which reduces the throughput per core. The number of texture units has doubled, but there are only 32 raster units (ROPs) in GK104, compared with 48 in GF110. This doesn’t have to be bad though, since the raster units in Kepler can be more efficient than those in Fermi.

Illustration of GK104 CUDA core arrangement

The clock frequency of GK104, which will most likely become GeForce GTX 660 (NVIDIA has not yet decided between GTX 660, 670 and 680), will be somewhere around 950 – 1,000 MHz. NVIDIA is still tuning the BIOS and energy saving functions, so it is yet to be decided how high it can clock the GPU without the energy consumption getting out of hand. There is talk about NVIDIA wanting to stay below 225W, but these are just rumors. Clock frequencies are the last thing to be decided though, so this should definitely be taken with a pinch of salt.
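To see how the core counts and clocks translate into theoretical throughput, here is a minimal Python sketch. It assumes each CUDA core retires one fused multiply-add (two floating-point operations) per clock, which is the usual way peak figures are quoted; the function name is ours, and the GK104 clocks are the rumored 950 – 1,000 MHz, not confirmed values.

```python
def peak_sp_gflops(cuda_cores, clock_mhz, flops_per_clock=2):
    """Theoretical peak single-precision throughput in GFLOPS.

    Assumption: every core performs one fused multiply-add
    (2 FLOPs) per clock cycle.
    """
    return cuda_cores * flops_per_clock * clock_mhz / 1000.0


# GF110 (GTX 580): 512 cores running at the 1544 MHz shader frequency.
gf110 = peak_sp_gflops(512, 1544)

# GK104 (rumored): 1536 cores running at the GPU frequency.
gk104_low = peak_sp_gflops(1536, 950)
gk104_high = peak_sp_gflops(1536, 1000)

# Throughput per core drops without the doubled shader frequency.
gf110_per_core = gf110 / 512
gk104_per_core = gk104_high / 1536

print(f"GF110: {gf110:.0f} GFLOPS, {gf110_per_core:.2f} per core")
print(f"GK104: {gk104_low:.0f} to {gk104_high:.0f} GFLOPS, "
      f"up to {gk104_per_core:.2f} per core")
```

Under these assumptions each GK104 core delivers less than a GF110 core, but three times as many cores more than make up for it on paper.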

The graphics memory will run at exactly 5 GHz (effective), which means NVIDIA has solved the problems in Fermi where the memory controller couldn’t handle these speeds.

| Model | GeForce GTX 580 | GeForce GTX 660(?) |
|---|---|---|
| Architecture | Fermi | Kepler |
| Circuit | GF110 | GK104 |
| Node | 40nm | 28nm |
| Die size | 520 mm² | ~340 mm² |
| Streaming Multiprocessors | 16 | 16 |
| CUDA cores | 512 | 1536 |
| Clock frequency | 772 MHz | 950 – 1,000 MHz |
| Shader frequency | 1544 MHz | n/a |
| FLOPS (SP) | 1581 GFLOPS | 2918 – 3072 GFLOPS |
| FLOPS (DP) | 790 GFLOPS | 486 – 512 GFLOPS* |
| Texture units | 64 | 128 |
| ROPs | 48 | 32 |
| Memory bus | 384-bit | 256-bit |
| Memory buffer | | |
| Memory frequency | 1,002 MHz (4,008 MHz effective) | 1,250 MHz (5,000 MHz effective) |
| Memory bandwidth | 192.4 GB/s | 160 GB/s |
| TBP | Up to 250W | Up to 225W(?) |

*Only applies to GeForce. Tesla/Quadro models run double precision at 1/2 of the SP GFLOPS.
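The memory bandwidth figures in the table follow directly from the bus width and the effective memory frequency. A minimal Python sketch of that arithmetic, using a helper name of our own choosing and the rumored GK104 values:

```python
def memory_bandwidth_gbs(bus_width_bits, effective_mhz):
    """Theoretical memory bandwidth in GB/s.

    bus_width_bits / 8 gives bytes moved per transfer;
    effective_mhz is the effective (data rate) memory frequency.
    """
    return bus_width_bits / 8 * effective_mhz / 1000.0


# GTX 580: 384-bit bus at 4,008 MHz effective.
gtx580_bw = memory_bandwidth_gbs(384, 4008)

# GK104 (rumored): 256-bit bus at 5,000 MHz effective.
gk104_bw = memory_bandwidth_gbs(256, 5000)

print(f"GTX 580: {gtx580_bw:.1f} GB/s")
print(f"GK104:   {gk104_bw:.1f} GB/s")
```

The faster memory does not fully compensate for the narrower bus, which is why GK104 ends up with less bandwidth than GTX 580 despite the higher memory frequency.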

The result of the new architecture is a mixed blessing compared to the current flagship, but in terms of potential performance it has to be considered better overall. According to many sources it will be enough to compete with Radeon HD 7950, and sometimes even with HD 7970. This is very much like the position NVIDIA has with GTX 560 Ti against competing products.

NVIDIA has already initiated pre-sale activities; that is, it has contacted partners and informed them about what is to come. Besides giving partners time to prepare sales material, we hope it will also give them time to develop custom boards for GK104. This further strengthens our belief that it will launch in March-April.

Bright Side Of News* seems confident about the information above, and says it comes from another source, 3DCenter, which reported similar specifications the day before. All in all, it seems likely, but we still recommend that our readers take it with a pinch of salt.
