Next generation Radeon



Early in January the staff of NordicHardware joked about the R350, the "sequel" to the Radeon 9700 Pro, arriving in stores before the GeForce FX. When we later got our invitation to ATi’s NDA briefing in early March, someone joked that we would receive a reference R350 before we got our reference NV30. We’ve been nagging about a reference FX for many months now. We’ve tried nVidia themselves as well as many of the larger manufacturers, and all we get is "it’ll hopefully arrive soon". It’s been like that since December 2002. I don’t say this to bad-mouth the FX or nVidia (or their partners); what I mean is that we’ve tried really hard to get hold of an FX sample, without luck. Then, "out of nowhere", the Radeon 9800 Pro arrives at our doorstep.

R350, i.e. the Radeon 9800, Radeon 9800 Pro and Radeon 9800 Pro 256 MB, is not the only GPU ATi showed us a month ago. We were also briefed on RV350, which is the basis for the two "mainstream" products, Radeon 9600 and Radeon 9600 Pro, and we were handed information about the successors to the Radeon 9000, namely Radeon 9200 and Radeon 9200 Pro. We even got info on what ATi has in store for the months ahead, and we’ll be back when that NDA is lifted.

Since the Radeon 9800 Pro is the card that ATi sent us for this preview, we will mostly concentrate on the R350 and especially the 9800 Pro.

Since this is only a reference sample, we didn’t get a retail box or any accompanying software.

Radeon 9800 Pro

Chip: R350
Manufacturing process: 0.15-micron
Transistors: ~115 million
Core clock speed: 378 MHz
Memory clock speed: 675 MHz / 21.1 GB/s
Pixel Shader: 2.0
Vertex Shader: 2.0
Pixel pipelines / pixel fillrate: 8 / 3024 MP/s
TMUs per pipeline / texel fillrate: 1 / 3024 MT/s
RAMDAC: (2) 400 MHz
Amount of memory: 128 MB
Type of memory and interface: 256-bit, DDR-SDRAM
In- and outputs: VGA, DVI-I, S-Video out
Extra peripherals: -
Software: -
Full version applications: -
Estimated price: 399 USD

Looking only at these specifications, one might conclude that this is just an overclocked Radeon 9700 Pro. The clock frequencies are within reach of what an overclocked R300 would manage; the question is what other tricks ATi has up its sleeve, if any.
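
For readers who want to check the derived numbers in the specification list, here is a small sketch of the arithmetic. It assumes that the quoted memory clock is the effective (DDR) rate and that the bandwidth figure counts 1024 MB per GB, which is the only way we could reproduce 21.1 GB/s; the function names are ours and purely illustrative.

```python
def fillrate_millions(pipelines: int, units_per_pipeline: int, core_mhz: float) -> float:
    """Theoretical fillrate in millions of pixels (or texels) per second."""
    return pipelines * units_per_pipeline * core_mhz


def bandwidth_gb_s(bus_bits: int, effective_mem_mhz: float, mb_per_gb: int = 1024) -> float:
    """Theoretical memory bandwidth.

    Assumption: effective_mem_mhz is the effective DDR data rate, and the
    result uses 1024 MB per GB to match the figure in the spec list.
    """
    bytes_per_transfer = bus_bits // 8
    mb_per_second = bytes_per_transfer * effective_mem_mhz
    return mb_per_second / mb_per_gb


# Radeon 9800 Pro as listed above: 8 pipelines, 1 TMU each, 378/675 MHz, 256-bit bus.
pixel_fillrate = fillrate_millions(8, 1, 378)
texel_fillrate = fillrate_millions(8, 1, 378)
bandwidth = bandwidth_gb_s(256, 675)

print(f"Pixel fillrate: {pixel_fillrate:.0f} MP/s")
print(f"Texel fillrate: {texel_fillrate:.0f} MT/s")
print(f"Bandwidth:      {bandwidth:.1f} GB/s")
```

These are theoretical peaks only; they say nothing about how efficiently the chip actually uses its pipelines or memory.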










The fan

The chip is still built on a 0.15-micron process, since ATi chose to use the RV350 as the "guinea pig" for 0.13-micron instead (mainly because the RV350 isn’t as complex as the R350). Staying on 0.15 doesn’t necessarily mean much in itself, as the R300 proved, though it does mean that the chip can’t reach crazy GPU clock frequencies like those of the FX 5800 Ultra. We were hoping, and speculating, that the R350 would be available at 400/800 MHz, but apparently 378/675 MHz will have to do. Personally I’m mostly disappointed in the "slow" memory; I was expecting at least 700 MHz.
The 9800 Pro we had in our test labs is equipped with 128 MB DDR; ATi did however say that we’ll see a 256 MB version in April.
The fan cooling the new core has a new design. Presumably it cools a bit better, and of course it looks a bit better than the old black standard ATi fan. On the negative side we have the noise: this new fan is simply a bit louder than those we’ve encountered on the 9700 Pro. I wouldn’t call it loud; it’s actually pretty standard for video cards and of course nowhere near the reported 50-70 dB of the GeForce FX 5800 Ultra.
EDIT: After looking at some other previews, it seems that some people got cards clocked at the promised 381/680.










Back

675 MHz DDR-SDRAM on a 256-bit bus is nothing to sneeze at. nVidia, with its 1000 MHz memory on a 128-bit bus, falls short here: the 9800 Pro has roughly 35 % higher bandwidth than the FX 5800 Ultra. One of NV30’s advantages is its texel fillrate, though, and the 9800 Pro doesn’t have the clock speeds to challenge that. ATi’s new high-end product spews out 3000+ million texels per second, while the FX 5800 Ultra takes the cake with its 4000 million.
When we start talking about pixel fillrate, matters turn more controversial. Recent findings indicate that the FX isn’t a "real" 8×1 architecture. It does in fact manage to process 8 pixels per clock cycle in certain situations, but at other times it acts just like a traditional 4×2 architecture. The result? Sometimes the FX has a pixel fillrate of "only" 2000 MP/s, and sometimes it’s 4000 MP/s. We’ll let you decide what conclusions are to be drawn from this.
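
The comparison above boils down to a few multiplications, sketched here. The FX 5800 Ultra core clock of 500 MHz is our assumption (it is what the 2000/4000 MP/s figures imply); the memory clocks are treated as effective DDR rates, and all names are illustrative.

```python
def bandwidth_mb_s(bus_bits: int, effective_mem_mhz: float) -> float:
    """Theoretical memory bandwidth in MB/s (bus width in bytes times data rate)."""
    return (bus_bits // 8) * effective_mem_mhz


def fillrate(units_per_clock: int, core_mhz: float) -> float:
    """Theoretical fillrate in millions of pixels or texels per second."""
    return units_per_clock * core_mhz


FX_CORE_MHZ = 500       # assumption: implied by the 2000/4000 MP/s figures
RADEON_CORE_MHZ = 378   # from the specification list

radeon_bw = bandwidth_mb_s(256, 675)
fx_bw = bandwidth_mb_s(128, 1000)
bandwidth_advantage_pct = round((radeon_bw / fx_bw - 1) * 100, 1)

radeon_texels = fillrate(8 * 1, RADEON_CORE_MHZ)   # 8 pipelines, 1 TMU each
fx_texels = fillrate(4 * 2, FX_CORE_MHZ)           # 4 pipelines, 2 TMUs each
fx_pixels_as_4x2 = fillrate(4, FX_CORE_MHZ)        # behaving like a 4x2 design
fx_pixels_as_8x1 = fillrate(8, FX_CORE_MHZ)        # 8 pixels per clock in some cases

print(f"Bandwidth: {radeon_bw:.0f} vs {fx_bw:.0f} MB/s (+{bandwidth_advantage_pct} %)")
print(f"Texel fillrate: {radeon_texels:.0f} vs {fx_texels:.0f} MT/s")
print(f"FX pixel fillrate: {fx_pixels_as_4x2:.0f} to {fx_pixels_as_8x1:.0f} MP/s")
```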










Output

The reference R350 offers the same connection types as the R300: S-Video, DVI and VGA.
One thing worth mentioning is that the reference board we saw back in March had a "slot" where you could fit a daughter card. This option was to be used for AIW versions. You can still see the marks where it should have been placed (look at the upper picture on this page, just "beneath" the fan). The slot will probably be there on the AIW version, but ATi decided to remove it on the normal versions.
Having an upgrade option would have been cool, though. For example, you could buy a 9800 Pro today and later on, when/if an AIW version is available, purchase the add-on card separately. Well, so much for that idea. (To make it clear: ATi did not indicate such plans themselves, I’m just speculating about what could have been done if they had decided to leave the slot as it was.)

Before we let you have a look at the performance tests, we thought we’d take a closer look at the three new architectures and the 7 new products based on them. R350 is the chip that the Radeon 9800, 9800 Pro and 9800 Pro 256 MB are based on, so let’s take a peek.


First let’s
take a closer look at the three R350-based products:

Unfortunately ATi hasn’t disclosed all the clock frequencies to us yet. We do however have the 9800 Pro, which is clocked at 378/675 MHz. According to FIC, the non-Pro and the 256 MB version are going to be clocked at 325/620 and 400/920 respectively.
During the presentation someone on ATi’s staff mumbled something about 900 MHz DDRII, so this does seem fairly likely. Whether it will actually have 920 MHz memory remains to be seen.






Price and availability

  • Radeon 9800 Pro 256 MB: 499 USD, April
  • Radeon 9800 Pro 128 MB: 399 USD, March
  • Radeon 9800 128 MB: 349 USD, March

Prices seem to be pretty much what we expected. 499 USD seems a bit too much for the 256 MB version, though if it truly has 920 MHz memory it might well be worth it.






Architectural evolution

Before we continue we’d like to note that the R350 is an improved R300; the architectural differences aren’t all that great. That’s not to say that the few changes are unimportant, but the chip itself is very similar to the R300:


  • Optimized memory controller (Smoothvision 2.1)
  • Optimized Z-cache (Hyper Z III+)
  • Improved Pixel Shaders thanks to a new fragment stream FIFO-buffer, unlimited set of PS-instructions (Smartshader 2.1)
  • Redesigned PCB
  • DDRII (256 MB version only)

Judging by the information ATi shared with us, Smoothvision 2.1 and Hyper Z III+ seem like somewhat marginal improvements over their predecessors. As for Hyper Z III+, we learned that it’s now highly optimized for dealing with stencil buffers, which in turn helps with shadow volumes and should have a positive impact on Doom III performance, for example.
Smoothvision 2.1 doesn’t seem to have undergone any improvements in itself. ATi instead referred to their optimized memory controller, which is now tweaked to bring us even more AA and AF performance. Of course the benefits of the upgraded Hyper Z III+ will also help out here.

The F-buffer enables the R350 to handle more complex shaders, in fact with a virtually unlimited set of instructions, without resorting to traditional multipass rendering.
With a rasterization order F-buffer ATi can render much more efficiently than they otherwise would; for example, there’s no need to process geometry more than once.
Smartshader 2.1 is in short, and in my humble opinion, the largest architectural advantage that R350 has over R300. We can illustrate this by rendering the same sequence twice, first in a single pass and then in two passes. The application used is 3DMark03, and we’re able to "simulate" this situation by disabling Pixel Shader 1.4 in the drivers. First we do a run with Pixel Shader 1.4, which makes the card render the sequence in a single pass; then we force Pixel Shader 1.1 level support (with Rage3DTweaker 3.8), which makes it impossible to render the test in one pass. These results from "Battle for Proxycon" should thus give at least some hint of what the F-buffer can do for performance:

  • Pixel Shader 1.4: 30.6 fps
  • Pixel Shader 1.1: 24.6 fps

The card used for these tests is a Radeon 9700 Pro, and an increase of roughly 24 % in performance is surely nice. The performance benefits will of course vary heavily between applications and settings. On top of performance, an F-buffer can also help with some problems that might occur with transparent surfaces when doing multipass rendering.
An F-buffer works by storing temporary results outside the framebuffer, so that this information can be accessed during subsequent passes without having to reprocess things like geometry or otherwise unchanged information from earlier passes.

Sadly (for us, that is) implementing an F-buffer is not a straightforward process, and ATi hasn’t supplied much information on the subject other than the picture you see above (the normal slew of white papers and specifications will of course be published on ATi’s website in the near future). The picture does however tell us that ATi chose to input the buffer data at the end of their pixel pipeline (i.e. instead of inputting the data as texture color values).
Since the F-buffer is accessed in a linear fashion, F-buffer data can be stored in VRAM as well as in on-chip cache (you could even offload the graphics card by storing the information in normal system DRAM). Again, we don’t know the specifics of ATi’s implementation, but we intend to find out.
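
To make the general idea a bit more concrete, here is a toy model in Python. It is emphatically not ATi’s implementation, which we know nothing about; it only illustrates the principle described above. The two-part "shader", the fragment format, the additive blend and all function names are made up for the illustration.

```python
from collections import deque


# Hypothetical shader that is "too long" for one pass, split into two parts.
def shader_part1(value):
    return value * 2.0


def shader_part2(intermediate):
    return intermediate + 1.0


def rasterize(triangles, stats):
    """Stand-in rasterizer: yields (pixel, value) fragments in submission order."""
    for fragments in triangles:
        stats["triangles_processed"] += 1
        for pixel, value in fragments:
            yield pixel, value


def render_multipass(triangles):
    """Traditional multipass: intermediates are kept per pixel, geometry is sent twice."""
    stats = {"triangles_processed": 0}
    intermediate = {}  # one slot per pixel, like a render target
    framebuffer = {}
    for pixel, value in rasterize(triangles, stats):  # pass 1
        intermediate[pixel] = shader_part1(value)     # overlapping fragments overwrite
    for pixel, _ in rasterize(triangles, stats):      # pass 2 re-processes the geometry
        result = shader_part2(intermediate[pixel])
        framebuffer[pixel] = framebuffer.get(pixel, 0.0) + result  # additive blend
    return framebuffer, stats


def render_fbuffer(triangles):
    """F-buffer style: one FIFO entry per fragment, read back in rasterization order."""
    stats = {"triangles_processed": 0}
    fbuffer = deque()
    framebuffer = {}
    for pixel, value in rasterize(triangles, stats):  # geometry is processed once
        fbuffer.append((pixel, shader_part1(value)))
    while fbuffer:                                    # second pass only reads the FIFO
        pixel, intermediate = fbuffer.popleft()
        framebuffer[pixel] = framebuffer.get(pixel, 0.0) + shader_part2(intermediate)
    return framebuffer, stats


# Two "triangles"; both cover pixel (0, 0), as two transparent surfaces would.
scene = [
    [((0, 0), 1.0), ((1, 0), 2.0)],
    [((0, 0), 3.0)],
]
multipass_image, multipass_stats = render_multipass(scene)
fbuffer_image, fbuffer_stats = render_fbuffer(scene)
print("multipass:", multipass_image, multipass_stats)
print("f-buffer: ", fbuffer_image, fbuffer_stats)
```

In this toy scene the F-buffer path processes each triangle once and keeps a separate intermediate for each overlapping fragment, while the per-pixel multipass path processes the geometry twice and lets the second fragment at pixel (0, 0) overwrite the first one’s intermediate result. Because the FIFO is only ever appended to and read from the front, it also shows why such a buffer is accessed linearly.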


Even though we don’t have the RV350 in our hands, we can still disclose some interesting details about this chip and the products built upon it.

Clock frequencies for the 9600 Pro are to be set at 400/600 MHz, while the little brother will only run at 325/400 MHz. 400 MHz memory for the non-Pro version certainly sounds like trouble when it comes to FSAA, since the RV350 only has a 128-bit memory bus.
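
To put the bandwidth worry into numbers, here is a quick sketch. It assumes that the memory clocks quoted for the RV350 cards are effective DDR rates, just like the 675 MHz quoted for the 9800 Pro; the dictionary and function name are only for illustration.

```python
def bandwidth_mb_s(bus_bits: int, effective_mem_mhz: float) -> float:
    """Theoretical memory bandwidth in MB/s (bus width in bytes times data rate)."""
    return (bus_bits // 8) * effective_mem_mhz


# (bus width in bits, effective memory clock in MHz), clocks as quoted in this preview
cards = {
    "Radeon 9800 Pro": (256, 675),
    "Radeon 9600 Pro": (128, 600),
    "Radeon 9600": (128, 400),
}

bandwidths = {name: bandwidth_mb_s(bits, mhz) for name, (bits, mhz) in cards.items()}
reference = bandwidths["Radeon 9800 Pro"]

for name, mb_s in bandwidths.items():
    share = mb_s / reference * 100
    print(f"{name:16s} {mb_s:6.0f} MB/s ({share:.0f} % of the 9800 Pro)")
```

On these assumptions the non-Pro 9600 ends up with less than a third of the 9800 Pro’s theoretical bandwidth, which is why FSAA looks like the weak spot on paper.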






Price and availability

  • Radeon 9600 Pro 128 MB: ~185 USD, April
  • Radeon 9600 128 MB: 169 USD, April
  • Radeon 9600 64 MB: 149 USD, April

RV350 seems to have a pretty neat price range. The Pro model is to be cheaper than the Ti4800/4600 and even cheaper than most Ti4400/4800SE cards, which certainly looks like a sweet deal if the performance is right.






Architectural evolution

Just like the R350, the RV350 is basically an improved R300, in this case a Radeon 9500 (non-Pro, that is). On paper this doesn’t look very impressive considering it’s a 4×1 architecture, but as nVidia has shown, numbers aren’t everything (referring to the 8×1 vs 4×2 "incident"). Judging by the indications we get from ATi, performance will end up roughly where the Radeon 9500 Pro is today.


  • Optimized memory controller (Smoothvision 2.1)
  • Redesigned PCB
  • 0.13-micron

Note that we didn’t put Hyper Z III+ and Smartshader 2.1 on the list, as we simply haven’t got enough info from ATi yet.
The major addition here is that the RV350 is built with 0.13-micron technology. This makes the card need less power and produce less heat, so the clock frequencies can be raised a whole lot compared to the 0.15 siblings. Since it requires less power, ATi was also able to get rid of the pesky molex power connector.
ATi is even thinking about supplying passively cooled RV350 models; our guess is that this only applies to the non-Pro version.






RV280

RV280, the Radeon 9200 that is, isn’t the most exciting news in town. This card is simply an RV250, i.e. a Radeon 9000, with some small improvements.

ATi doesn’t
seem to have fully made up their mind yet when it comes to prices:


  • Radeon 9200 Pro 128 MB: 129-149 USD, April
  • Radeon 9200 128 MB: 79-129 USD, April
  • Radeon 9200 64 MB: 79-129 USD, April

No word on clock
frequencies as of yet.

The only real new addition here is AGP 8x support, which we wouldn’t exactly call noteworthy. It’s rather one of those checklist items that the "average Joe" looks for; in reality AGP 8x doesn’t provide any tangible performance benefits over AGP 4x. We touched on the possibility of passive cooling for RV350; on the RV280 passive cooling will most likely be standard, at least for the non-Pro versions.

Well, I guess it’s time to look at what you’ve all been waiting for: how the Radeon 9800 Pro performs.


Unfortunately we still haven’t got a GeForce FX available here at NordicHardware, so we can’t give you any results from that card. (We’ll keep reminding our contacts that we want one, though.) This is somewhat sad since the FX is the main competitor for the Radeon 9800 Pro. Comparing our results with other reviews might give an indication of where the FX would fit in our graphs, and when we get our hands on a card ourselves we’ll update you accordingly.


Test setup

Hardware
Processor: AMD Athlon XP 2600+ (333 MHz)
Mainboard: Soltek SL-75FRN-L (nForce2)
RAM: 768 MB DDR333 @ 2-5-2-2 timings
  • 2x 256 MB Corsair TWINX512-3200LL DDR-SDRAM
  • 256 MB OCZ PC3200 Rev2 DDR-SDRAM
Videocards:
  • Gainward GeForce4 Ti4600 (128 MB, 300/650)
  • Gigabyte Radeon 9700 Pro (128 MB, 325/620)
  • ATi Radeon 9800 Pro (128 MB, 375/700)
HDD: 80 GB Western Digital Caviar 7200 RPM Special Edition (8 MB cache)
Soundcard: Creative Soundblaster Audigy 2
Ethernet: D-Link DFE-530TX 10/100

Software
Operating system: Windows XP Professional (Service Pack 1)
Video drivers:
  • nVidia: Detonator 40 43.00
  • ATi: Catalyst 3.1 6307
Other drivers: nVidia UDA Chipset Drivers v2.03
Benchmarks:
  • 3DMark2001 SE b330
  • 3DMark03
  • Aquamark 2.3
  • Dacris Benchmark 4.91
  • XPBench 1.03
  • Quake 3 1.32
  • UT2003 v2199
  • Codecreatures Benchmark Pro 1.0
  • Vulpine GL Mark 1.0.0.3
  • GL Excess 1.2
  • Fablemark
  • Cinebench 2003

Since time was of the essence, we had to compromise somewhat on the number of tests. We’ve concentrated on testing different quality-enhancing settings such as FSAA and anisotropic filtering.
We did some testing and didn’t find any differences when it comes to AA/AF image quality. Sample patterns appear to be 100 % identical, so we won’t bore you by posting a lot of pretty pictures that all look the same.


Since this is a reference board, no real installation CD was available. A CD-R containing the latest Catalyst driver was enough for our needs, though. Catalyst 3.1 – 6307 (7.84) is the driver on the CD, and it does not support any cards other than R350. Since we wanted to use the same drivers for both ATi cards, we added the necessary info to the INF of the new drivers in order to support the 9700 Pro too. After testing it and making sure it worked properly on the 9700 Pro compared to the latest official 3.1 drivers, it was "all systems go". After installing the new beauty we immediately started benching the card... or at least that was our intention.
After passing through the first Game Test in 3DMark03, things started to get ugly as we encountered a problem from the past, namely "stuttering". The fps was much higher than when we ran the benchmark on our 9700 Pro, but what we saw on screen was certainly not smooth. Choppy graphics galore; if the fps meter hadn’t been there I would have guessed it was running at 2-5 fps.
After disabling all unnecessary ports in the BIOS, all background applications and all non-vital Windows services, plus getting rid of unneeded hardware, we were back on track again. All signs of stuttering were completely gone.
After spending some time with the problem we narrowed it down to a conflict with some background applications. We’ve had this problem with ATi hardware conflicting with background applications before, but all of those problems were solved with the introduction of Catalyst 3.0a.
In any case, stuttering was back. After testing some more we found two processes that were involved in the problem: "CTSVCCDA.EXE" and "MsPMSPSv.exe", both of which are part of the Audigy 2 program suite. We have never encountered this problem with any older drivers, so it doesn’t seem likely that it will be hard to fix.
Anyway, not a great way to start the day.

When we started benchmarking again we noticed some other abnormalities. Scores in certain applications were simply too good to be true. It didn’t take long to figure out that the OpenGL part of the driver was playing tricks on us. When we activated 6x FSAA, for example, it worked fine in 1024×768 and 1280×1024, but when we ran the test in 1600×1200 the driver somehow reverted to 4x FSAA. The problem was solved simply by running the test again; still, it was evident throughout the whole testing period and happened from time to time. We don’t see this as a serious problem, however, as it shouldn’t be hard to fix.
One probable cause could be that the card/driver thinks it’s out of VRAM and thus reverts to a less memory-demanding FSAA mode. Similar problems have been discussed regarding the 9700 Pro, though we never experienced them ourselves. To our knowledge there was such a problem with earlier ATi drivers, but it was later fixed.
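
A back-of-the-envelope estimate shows why the out-of-VRAM theory is at least plausible. The model below is our own simplification and not how the driver actually allocates memory: it assumes 32-bit colour and 32-bit Z/stencil stored for every sample, plus two resolved 32-bit buffers for display, and it ignores textures, geometry and any savings from compression.

```python
MIB = 1024 * 1024


def fsaa_buffers_mib(width: int, height: int, samples: int,
                     color_bytes: int = 4, depth_bytes: int = 4) -> float:
    """Rough buffer footprint in MiB under the assumptions stated above."""
    pixels = width * height
    sample_buffers = pixels * samples * (color_bytes + depth_bytes)
    display_buffers = 2 * pixels * color_bytes  # assumed front and back buffer
    return (sample_buffers + display_buffers) / MIB


estimates = {
    (width, height, samples): fsaa_buffers_mib(width, height, samples)
    for width, height in [(1024, 768), (1280, 1024), (1600, 1200)]
    for samples in (4, 6)
}

for (width, height, samples), mib in estimates.items():
    print(f"{width}x{height} at {samples}x FSAA: about {mib:.1f} MiB of 128 MB")
```

With this simple model, 6x FSAA at 1600×1200 eats up most of a 128 MB card before a single texture is loaded, while 4x leaves considerably more headroom. That fits the behaviour we saw, but it remains speculation on our part.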

In short, we had two not so pleasant flashbacks right from the beginning. We do want to stress, though, that the drivers used are neither publicly available nor WHQL certified. Both problems were also easy to solve, so there’s no need to panic here, folks.







That’s not all










Well, now you must be wondering what this garbled screen is all about, right? Let me tell you. Since we want to provide our readers with the complete picture, we thought we’d downclock the card to 9800 non-Pro speeds. The result, as you can see, wasn’t very pretty. Whether this underclocking is somehow blocked in the BIOS/drivers or in PowerStrip is beyond our knowledge. In any case, underclocking the card didn’t work as planned, so we can’t provide any exact Radeon 9800 non-Pro scores. Overclocking ordinary DDR memory to 920 MHz, like the memory the 256 MB version will supposedly be equipped with, is of course not a possibility. The result is that we can only present Radeon 9800 Pro 128 MB results today. The problem was the core speed: once we clocked it down to 325 MHz we got the corruption, but any other clock speed was fine. In the benchmarks where we compare the 9700 and the 9800 at similar clock speeds, the 9800 Pro is therefore actually running at 327/620 MHz, unlike the 9700 Pro, which is running at 325/620 MHz.
Actually this was not the only "problem" related to changing the clock frequencies, as you’ll learn later on.




We open the session with some lightweight 3DMark2001 benchmarking, that is, with no FSAA and no anisotropic filtering.