
For AMD’s Radeon Technologies Group, 2018 was a bit of a breather year. After launching the Polaris architecture in 2016 and the Vega architecture in 2017, for 2018 AMD set about enjoying their first full year of Vega. Instead of having to launch a third architecture in three years, the company would instead focus on further expanding the family by bringing Vega's laptop and server variants to market. And while AMD's laptop efforts have gone in an odd direction, their Radeon Instinct server efforts have put some pep back in their figurative step, giving the company claim to the first 7nm GPU.

After releasing a late-generation product refresh in November in the form of the Radeon RX 590, we had expected AMD's consumer side to be done for a while. Instead, AMD made a rather unexpected announcement at CES 2019 last month: the company would be releasing a new high-end consumer card, the Radeon VII (Seven). Based on their aforementioned server GPU and positioned as their latest flagship graphics card for gamers and content creators alike, Radeon VII would once again be AMD’s turn to court enthusiast gamers. Now launching today – on the 7th, appropriately enough – we're taking a look at AMD's latest card, to see how the Radeon VII measures up to the challenge.

On the surface, the Radeon VII would seem to be straightforward. The silicon underpinning the card is AMD's Vega 20 GPU, a derivative of the original Vega 10 that has been enhanced for scientific compute and machine learning, and built on TSMC's cutting-edge 7nm process for improved performance. An important milestone for AMD's server GPU efforts – it's essentially their first high-end server-class GPU since Hawaii all the way back in 2013 – AMD has been eager to show off Vega 20 throughout the later part of its bring-up, as this is the GPU at the heart of AMD’s relatively new Radeon Instinct MI50 and MI60 server accelerators.

First and foremost designed for servers then, Vega 20 is not the class of GPU that could cheaply make its way to consumers. Or at least, that would seem to have been AMD's original thinking. But across the aisle, something unexpected has happened: NVIDIA hasn't moved the meter very much in terms of performance-per-dollar. The new Turing-based GeForce RTX cards are instead all about features, looking to usher in a new paradigm of rendering games with real-time raytracing effects, and in the process allocating large parts of the already-large Turing GPUs to this purpose. The end result has been relatively high prices for the GeForce RTX 20 series cards, all the while their performance gains in conventional games are much less than the usual generational uplift.

Faced with a less hostile pricing environment than many were first expecting, AMD has decided to bring Vega 20 to consumers after all, dueling with NVIDIA at one of these higher price points. Hitting the streets at $699, the Radeon VII squares up with the GeForce RTX 2080 as the new flagship Radeon gaming card.

AMD Radeon Series Specification Comparison

| | AMD Radeon VII | AMD Radeon RX Vega 64 | AMD Radeon RX 590 | AMD Radeon R9 Fury X |
|---|---|---|---|---|
| Stream Processors | 3840 (60 CUs) | 4096 (64 CUs) | 2304 (36 CUs) | 4096 (64 CUs) |
| ROPs | 64 | 64 | 32 | 64 |
| Base Clock | 1400MHz | 1247MHz | 1469MHz | N/A |
| Boost Clock | 1750MHz | 1546MHz | 1545MHz | 1050MHz |
| Memory Clock | 2.0Gbps HBM2 | 1.89Gbps HBM2 | 8Gbps GDDR5 | 1Gbps HBM |
| Memory Bus Width | 4096-bit | 2048-bit | 256-bit | 4096-bit |
| VRAM | 16GB | 8GB | 8GB | 4GB |
| Single Precision | 13.8 TFLOPS | 12.7 TFLOPS | 7.1 TFLOPS | 8.6 TFLOPS |
| Double Precision | 3.5 TFLOPS (1/4 rate) | 794 GFLOPS (1/16 rate) | 445 GFLOPS (1/16 rate) | 538 GFLOPS (1/16 rate) |
| Board Power | 300W | 295W | 225W | 275W |
| Reference Cooling | Open-air triple-fan | Blower | N/A | AIO CLC |
| Manufacturing Process | TSMC 7nm | GloFo 14nm | GloFo/Samsung 12nm | TSMC 28nm |
| GPU | Vega 20 (331 mm²) | Vega 10 (495 mm²) | Polaris 30 (232 mm²) | Fiji (596 mm²) |
| Architecture | Vega (GCN 5) | Vega (GCN 5) | GCN 4 | GCN 3 |
| Transistor Count | 13.2B | 12.5B | 5.7B | 8.9B |
| Launch Date | 02/07/2019 | 08/14/2017 | 11/15/2018 | 06/24/2015 |
| Launch Price | $699 | $499 | $279 | $649 |

Looking at our specification table, Radeon VII ships with a "peak engine clock" of 1800MHz, while the official boost clock is 1750MHz. This compares favorably to RX Vega 64's peak engine clock, which was just 1630MHz, so AMD has another 10% or so in peak clockspeed to play with. And thanks to an open air cooler and a revised SMU, Radeon VII should be able to boost to and sustain its higher clockspeeds a little more often still. So while AMD's latest card doesn't add more ROPs or CUs (it's actually a small drop from the RX Vega 64), it gains throughput across the board.
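Those headline throughput figures fall directly out of the shader count and clocks, since each stream processor can retire one fused multiply-add (two FLOPs) per cycle. A quick sanity check in Python (a sketch; the `tflops` helper is ours, not an AMD tool):

```python
def tflops(stream_processors, clock_mhz, rate=1.0):
    """Peak throughput in TFLOPS: each SP retires one FMA (2 FLOPs) per clock.
    `rate` scales for reduced-rate formats (e.g. 0.25 for 1/4-rate FP64)."""
    return stream_processors * 2 * clock_mhz * 1e6 * rate / 1e12

vii_fp32    = tflops(3840, 1800)        # ~13.8 TFLOPS at the 1800MHz peak engine clock
vega64_fp32 = tflops(4096, 1546)        # ~12.7 TFLOPS at the 1546MHz boost clock
vii_fp64    = tflops(3840, 1800, 0.25)  # ~3.5 TFLOPS at the consumer 1/4 FP64 rate
```

Note that despite shedding 4 CUs versus the RX Vega 64, the clockspeed bump is enough to put the Radeon VII ahead on paper.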

However, if anything, the biggest change compared to the RX Vega 64 is that AMD has doubled their memory capacity and more than doubled their memory bandwidth. This comes courtesy of the 7nm die shrink, which sees AMD's latest GPU come in at a relatively modest die size of 331mm². The extra space has given AMD room on their interposer for two more HBM2 stacks, allowing for more VRAM and a wider memory bus. AMD has also been able to turn the memory clockspeed up a bit as well, from 1.89 Gbps/pin on the RX Vega 64 to a flat 2 Gbps/pin for the Radeon VII.
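The bandwidth jump is simple arithmetic: aggregate bandwidth is bus width (one pin per bus bit) times the per-pin data rate. A quick sketch:

```python
def bandwidth_gbs(bus_width_bits, pin_rate_gbps):
    """Aggregate memory bandwidth in GB/s: bus width x per-pin rate, 8 bits per byte."""
    return bus_width_bits * pin_rate_gbps / 8

radeon_vii = bandwidth_gbs(4096, 2.0)    # 1024 GB/s, i.e. a flat 1 TB/s
rx_vega64  = bandwidth_gbs(2048, 1.89)   # ~484 GB/s
```

Doubling the bus width while nudging up the pin rate is what takes the Radeon VII just past the 2x bandwidth mark.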

Interestingly, going by its base specifications, the Radeon VII is essentially a Radeon Instinct MI50 at heart. So for AMD, there's potential to cannibalize Instinct sales if the Radeon VII's performance is too good for professional compute users. As a result, AMD has cut back on some of the chip's features just a bit to better differentiate the products. We'll go into more detail a bit later, but chief among these is that the card operates at a less-than-native FP64 rate, loses full-chip ECC support, and, naturally for a consumer product, uses the Radeon Software gaming drivers instead of the professional Instinct driver stack.

Of course, any time you're talking about putting a server GPU into a consumer or prosumer card, you're talking about the potential for a powerful card, and this certainly applies to the Radeon VII. Ultimately, the angle that AMD is gunning for with their latest flagship card is its competitive performance, combined with its class-leading 16GB of HBM2 memory. As one of AMD's few clear-cut specification advantages over the NVIDIA competition, VRAM capacity is a big part of AMD's marketing angle; they are going to be heavily emphasizing content creation and VRAM-intensive gaming. Also new to this card, and something AMD will be keen to call out, is its triple-fan cooler, replacing the much-maligned blower on the Radeon RX Vega 64/56 cards.

Furthermore, as a neat change, AMD is throwing their hat into the retail ring as a board vendor and directly selling the new card at the same $699 MSRP. Given that AIBs are also launching their branded reference cards today, it's an option for avoiding inflated launch prices.

Meanwhile, looking at the competitive landscape, there are a few items to tackle today. A big part of the mix is (as has become common lately) a game bundle. The ongoing Raise the Game Fully Loaded pack sees Devil May Cry 5, The Division 2, and Resident Evil 2 included for free with the Radeon VII, RX Vega and RX 590 cards. Meanwhile the RX 580 and RX 570 cards qualify for two out of the three. Normally, a bundle would be a straightforward value-add against a direct competitor – in this case, the RTX 2080 – but NVIDIA has their own dueling Game On bundle with Anthem and Battlefield V. In a scenario where the Radeon VII is expected to trade blows with the RTX 2080 rather than win outright, these value-adds become more and more important.

The launch of the Radeon VII also marks the first product launch since the recent shift in the competitive landscape for variable refresh monitor technologies. Variable refresh rate monitors have turned into a must-have for gamers, and since the launch of variable refresh technology earlier this decade, there's been a clear split between AMD and NVIDIA cards. AMD cards have supported VESA Adaptive Sync – better known under AMD's FreeSync branding – while NVIDIA desktop cards have only supported their proprietary G-Sync. But last month, NVIDIA made the surprise announcement that their cards would support VESA Adaptive Sync on the desktop, under the label of 'G-Sync Compatibility.' Details are sparse on how this program is structured, but at the end of the day, adaptive sync is usable in NVIDIA drivers even if a FreeSync panel isn't 'G-Sync Compatible' certified.

The net result is that while NVIDIA's announcement doesn't hinder AMD as far as features go, it does undermine AMD's FreeSync advantage – all of the cheap VESA Adaptive Sync monitors that used to only be useful on AMD cards are now potentially useful on NVIDIA cards as well. AMD of course has been quite happy to emphasize the "free" part of FreeSync, so as a weapon to use against NVIDIA, it has been significantly blunted. AMD's official line is one of considering this a win for FreeSync, and for freedom of consumer choice, though the reality is often a little more unpredictable.

The launch of the Radeon VII and its competitive positioning against the GeForce RTX 2080 means that AMD also has to crystalize their stance on the current feature gap between their cards and NVIDIA's latest Turing machines. To this end, AMD's position has remained the same on DirectX Raytracing (DXR) and AI-based image quality/performance techniques such as DLSS. In short, AMD argues that the performance hit and price premium for these features aren't worth the overall image quality difference. In the meantime, AMD isn't standing still; along with DXR fallback drivers, they are working on support for WinML and DirectML for their cards. The risk to AMD, of course, is that if DXR or NVIDIA's DLSS efforts end up taking off quickly, the feature gap will become more than a theoretical annoyance.

All told, pushing out a large 7nm gaming GPU for consumers this early in the process' lifecycle is a very aggressive move, especially as on a cyclical basis Q1 is typically flat-to-down and Q2 is down. But in context, AMD doesn't have much time to wait and see. The only major obstacle would be pricing it acceptably for consumers.

That brings us to today's launch. At $699, NVIDIA has already done the price-bracket shifting, on the basis of dedicated hardware for accelerating raytracing and machine learning workloads. For the Radeon VII, the terms revolve around 16GB of HBM2 and prosumer/content creator value. All that remains is gaming performance.

2/2019 GPU Pricing Comparison

| AMD | Price | NVIDIA |
|---|---|---|
| | $1299 | GeForce RTX 2080 Ti (Game On Bundle) |
| Radeon VII (Raise the Game Bundle) | $699/$719 | GeForce RTX 2080 (Game On Bundle) |
| | $599 | |
| Radeon RX Vega 64 (Raise the Game Bundle) | $499 | GeForce RTX 2070 (Game On Bundle, 1 game) |
| Radeon RX Vega 56 (Raise the Game Bundle) | $429 | |
| | $349 | GeForce RTX 2060 (Game On Bundle, 1 game) |

Though we’ve known of Vega 20 for some time with the launch of the server-class Radeon Instinct MI50 and MI60, its arrival in the consumer space does mark the first 7nm gaming card. Rapidly moving down to a lower node – this time from GlobalFoundries 14nm LPP to TSMC 7nm (CLN7FF) – used to be an AMD/ATi hallmark, and once again AMD is pushing the envelope by bringing it so early to consumers. That being the case, all the attributes of Vega 20 look squarely aimed at the professional/server crowd, though they are not devoid of benefits for gaming.

Of the many threads in the Radeon VII story, the fate of Vega as a whole has been nothing short of enigmatic. To start things off, the 7nm Vega 20 GPU making its way to consumers was a surprise to most, at least arriving this early. And while AMD only made mention of Vega 20 in reference to Radeon Instinct products – referring to the forthcoming Navi when speaking about 7nm GPUs for gamers – AMD now maintains that the plan was always to bring Radeon VII to market. Perhaps the same might have been said regarding Vega on 14nm+/12LP and the Vega 11 GPU (not to be confused with Ryzen 2400G’s 11 Vega compute units), though to be clear this is hardly unusual given the nature of semiconductor development.

To be fair, AMD has been coy at best about Vega since the RX Vega launch, which didn’t quite land where AMD wanted it to. But even as a bit of Schrodinger’s silicon, the existence of Radeon VII does raise some interesting questions about 7nm. For one, AMD had already moved up their Vega 20 sampling and launch windows, so Radeon VII’s launch timing is realistically the earliest it could be for a consumer derivative of the Radeon Instinct. More to the point, at a die size of 331mm², Vega 20 is no small mobile SoC or comparable ‘pipecleaner’ silicon of the sort we’ve seen so far on TSMC 7nm. Designed with compute/ML-oriented enhancements, equipped with 4 HBM2 stacks, and fabbed on a maturing yet cutting-edge 7nm node, Vega 20 has nothing on paper pushing for its viability at consumer prices. And yet thanks to a fortunate confluence of factors, here we are.

At a high level, Vega 20 combines an updated GCN 5 architecture with a 7nm process, coming out to 13.2B transistors on 331mm² (versus 12.5B transistors on 495mm² for Vega 10). Typically with die shrinks, the space savings are reinvested into more transistors – for a gaming card, that can mean anything from more CUs and functional blocks to layout redesigns and hardened data paths for improved frequency tolerance. The latter, of course, enables higher clockspeeds, and this design choice was a big part of Vega 10, where a significant number of transistors were invested in meeting the requisite timing targets. In conjunction with the power savings of a smaller node, a chip can then reach those higher clocks without additional power.

For Vega 20 however, much of the saved space was left simply as that: more space. There are several reasons for this, some obvious and some less so. To start things off, as a relatively large high-performance GPU on a leading-edge 7nm node early in its life, development and production are already costly and likely lower-yielding, where going any larger would cost substantially more and yield less. And though TSMC’s 7nm process has only publicly been seen in mobile SoCs thus far, Vega 20 presumably makes good use of the HPC-oriented 7.5T libraries as needed, as opposed to the 6T libraries intended for mobile SoCs.

But more importantly, the saved space allows room for two more stacks of HBM2 on a similarly-sized interposer. For current-generation HBM2 densities and capabilities, the limit for a two stack chip is 16GB of memory, using a pair of "8-Hi" stacks. But for a server-grade GPU – especially one targeting machine learning – a four stack configuration needed to happen in order to allow for 32GB of memory and a wider 4096-bit bus. For Vega 20, AMD has delivered on just this, and furthermore is producing both 32GB (8-Hi) and 16GB (4-Hi) versions of the chip.
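The capacity and bus-width math works out as follows, assuming the 8Gb (1GB) HBM2 dies and 1024-bit-per-stack interface of this generation (a sketch; the helper name is ours):

```python
def hbm2_config(stacks, stack_height, gb_per_die=1, bits_per_stack=1024):
    """Total capacity (GB) and bus width (bits) for an HBM2 configuration:
    each stack is `stack_height` DRAM dies tall on a 1024-bit interface."""
    return stacks * stack_height * gb_per_die, stacks * bits_per_stack

assert hbm2_config(2, 8) == (16, 2048)   # Vega 10's two-stack ceiling
assert hbm2_config(4, 4) == (16, 4096)   # Radeon VII / 16GB Instinct MI50
assert hbm2_config(4, 8) == (32, 4096)   # 32GB server configuration
```

The four-stack layout is what buys both the 32GB option and the full 4096-bit bus at once.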

Radeon VII, in turn, is tapping one of these 16GB chips for its core design. It should be noted that this isn't AMD's first 16GB Vega card – they also produced one early on with their early-adopter focused Vega Frontier Edition card – but since the Frontier Edition's retirement, this is the first (re)introduction of a 16GB card to AMD's Vega lineup.

Going with 16GB for a consumer(ish) card is a bit of a gamble for AMD. And, I suspect, this is part of the reason we're also seeing AMD chase part of the professional visualization market with the Radeon VII. When it comes to workstation use and content creation tasks, more VRAM is an easy sell, as there are already datasets that can use all of that VRAM and more. But for gaming this is a harder sell, as games have more fixed VRAM requirements, and absent such a large card until now, developers haven't yet started targeting 16GB cards. On the flip side, however, the statement "this card has more than enough VRAM" has proven to be famous last words, and in 2019 a flagship enthusiast-grade gaming card ought to have that much anyway.

Getting back to Vega 20's design then, the other step AMD has taken to reduce the complications and cost of 7nm is sticking to a die shrink of a known architecture. Here AMD has added optimizations over Vega 10, but they did not risk a large redesign. Essentially, it's the logic behind the ‘tick’ of Intel’s old ‘tick-tock’ strategy.

In fact, Vega 20 is such a straightforward shrink of Vega 10 in this manner that, outside of the number of memory controllers, all of the other functional unit counts are the same. The GPU packs 64 CUs and 256 texture units segmented into 4 Shader Engines, which in turn are paired with 64 ROPs and AMD's multi-queue command processor.

(I should add that by going this route, AMD also neatly sidesteps the question of shader engine scaling. The nature of that 4 SE limitation has been left vague in recent years, but with Vega there were hints of a path beyond with improved load-balancing via the intelligent workgroup distributors (IWD). Regardless, it would be a complex task in itself to tweak and redesign a balanced 4+ SE configuration, which might be unnecessary effort if AMD has fundamental changes to GCN in the pipeline.)

So on an architectural level, Vega 20 is very much an evolutionary design. But with that said, there is a bit more evolution to it than just the die-shrink, the combination of which means that Vega 20 should in practice be a bit faster than Vega 10 on a clock-for-clock basis.

The big improvement here is all of that extra memory bandwidth; there's now over twice as much bandwidth per ROP, texture unit, and ALU as there was on Vega 10. This bodes particularly well for the ROPs, which have traditionally been big bandwidth consumers. Not stopping there, AMD has also made some improvements to the Core Fabric, which is what connects the memory to the ROPs (among other things). Unfortunately AMD isn't willing to divulge just what these improvements are, but they have confirmed that there aren't any cache changes among them.

Another piece of the puzzle is that AMD has added some new instructions and data types that will speed up machine learning in certain cases. AMD hasn't given us the complete details here, but at a high level we know that they've added support for INT8 and INT4 data types, which are useful for some low-precision inference scenarios. AMD has also added a new FP16 dot product that accumulates as an FP32 result, a rather specific scenario that is helpful for some machine learning algorithms, as it produces a higher precision result than a FP16-in/FP16-out dot product.
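The value of accumulating at FP32 is easy to demonstrate numerically. The toy below illustrates the precision issue only – it is not AMD's hardware path – and leans on Python's `struct` 'e' format, which rounds to IEEE half precision:

```python
import struct

def fp16(x):
    """Round a Python float to the nearest IEEE half-precision (binary16) value."""
    return struct.unpack('e', struct.pack('e', x))[0]

def dot(a, b, fp32_accumulate):
    acc = 0.0
    for x, y in zip(a, b):
        acc += fp16(fp16(x) * fp16(y))   # FP16 multiply in both modes
        if not fp32_accumulate:
            acc = fp16(acc)              # FP16-out: the running sum is rounded too
    return acc

# One big element followed by many tiny ones: each tiny addend falls below
# the FP16 accumulator's precision at 1.0, so an FP16 running sum loses them all.
a = [1.0] + [2**-12] * 4096
b = [1.0] * 4097

wide   = dot(a, b, True)    # FP32-style accumulate: 2.0, exact
narrow = dot(a, b, False)   # FP16-in/FP16-out: stuck at 1.0
```

This is exactly the class of rounding loss that an FP16-multiply/FP32-accumulate dot product avoids.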

Speaking of data types, AMD has also significantly ramped up their FP64 performance for Vega 20. As a core architecture, GCN allows a GPU to be built with an FP64 rate ranging from 1/2 to 1/16 of the FP32 rate. For pure consumer GPUs this has always been 1/16; however, for GPUs that have pulled double-duty as server-focused chips, AMD has enabled 1/4 and 1/2 rates in the past. Vega 20, in turn, is the first 1/2 rate FP64 GPU from AMD since Hawaii in 2013. This means that while its general FP32 performance gains over Vega 10 cards are somewhat limited, its FP64 gains are nothing short of massive – better than 8x over the RX Vega 64, on paper. Of course, as a consumer card the Radeon VII doesn't quite get to enjoy these benefits – it's limited to 1/4 rate – but more on that later.
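On paper the 8x claim checks out with simple peak-rate arithmetic, using the clocks quoted in this article's spec tables (a sketch, not an official AMD calculation):

```python
def fp64_tflops(sps, clock_mhz, rate):
    # Peak FP64 throughput: 2 FLOPs (one FMA) per SP per clock, scaled by the FP64:FP32 rate
    return sps * 2 * clock_mhz * 1e6 * rate / 1e12

vega20_half = fp64_tflops(3840, 1746, 1/2)   # ~6.7 TFLOPS: MI50 at the full 1/2 rate
vega64      = fp64_tflops(4096, 1546, 1/16)  # ~0.79 TFLOPS: RX Vega 64 at 1/16 rate
radeon_vii  = fp64_tflops(3840, 1800, 1/4)   # ~3.5 TFLOPS: consumer 1/4 rate, 1800MHz peak
```

At the full 1/2 rate, Vega 20 lands at better than 8x the RX Vega 64; even the Radeon VII's trimmed 1/4 rate is still over 4x.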

Meanwhile, for AMD's video and display controller blocks, there have only been minor, incremental updates. Officially, the display controller (DCE) is up to version 12.1, while we're on unified video decoder (UVD) 7.2, and video coding engine (VCE) 4.1. No additional encoding or decoding feature support have been added compared to Vega 10. For what it’s worth, we’ve already seen the successor blocks with Raven Ridge’s Display Core Next and Video Core Next 1.0, so this may be the last architecture that those ASICs grace us with their presence.

Wrapping up the functional blocks is a new style of SMU, discussed in recent Linux kernel patches. Closely related is improved thermal monitoring, where the number of temperature diodes has been doubled to 64 sensors. As a consequence, AMD is now fully utilizing junction temperature monitoring instead of edge temperature monitoring. Junction temperature measurements were used in Vega 10 (showing up as the ‘hotspot’ temperature), but Vega 20 has made the full jump to junction temperature for the full suite of fanspeeds, clockspeeds, and so on. The result is more accurate reporting, as well as the minor 1-2% performance gains (via reduced throttling) that AMD cites from internal testing.

The updated SMU also brings with it a knock-on effect: temperature, clockspeed, and related GPU metrics are no longer read through registers, but instead grabbed straight from the SMU. Naturally, this breaks compatibility with third-party utilities (i.e. GPU-Z), and while AMD has already notified some developers of these changes, applications will still need to be updated to use AMD's new API calls.

Finally, as this is AMD's first new high-end Vega silicon since the original Vega 10 in 2017, there have been a few questions swirling around Vega’s forward-looking hardware features. AMD's communication hasn't always been clear here, and as a result these features have become sort of a perpetual source of consumer confusion.

To settle matters for Vega 20 then, AMD doesn’t seem to be changing the situation here. Which is to say that there have been no further developments as far as AMD's primitive shaders are concerned. Primitive shaders still require explicit developer support – something AMD has not enabled – and so the full capabilities of Vega 20's Next-Gen Geometry path aren't being utilized (though we should note that the Intelligent Workgroup Distributor portion has always been enabled).

Meanwhile, AMD's Draw Stream Binning Rasterizer (DSBR) was already up and working in Vega 10, so this hasn't changed; the feature is enabled for an unspecified list of games. And checking in quickly on Rapid Packed Math (fast FP16), this is in use for two known games: Far Cry 5 and Wolfenstein II.

Another item of note is enhanced compute performance, with new ML operations and a higher double precision (FP64) rate. The latter has seen some mixed communication, as it was originally disclosed to us as 1/8 rate and disclosed elsewhere as 1/16. It’s safe to say that there was some internal debate within AMD as to what it should be set at, leaving our earlier statements out in the cold. Looking to clear things up, AMD put out a statement:

The Radeon VII graphics card was created for gamers and creators, enthusiasts and early adopters. Given the broader market Radeon VII is targeting, we were considering different levels of FP64 performance. We previously communicated that Radeon VII provides 0.88 TFLOPS (DP=1/16 SP). However based on customer interest and feedback we wanted to let you know that we have decided to increase double precision compute performance to 3.52 TFLOPS (DP=1/4SP).

If you looked at FP64 performance in your testing, you may have seen this performance increase as the VBIOS and press drivers we shared with reviewers were pre-release test drivers that had these values already set. In addition, we have updated other numbers to reflect the achievable peak frequency in calculating Radeon VII performance as noted in the [charts].

The logic, of course, is to avoid cannibalizing professional products – the MI50 and MI60 in this case – while still enticing prosumers, who have long appreciated high FP64 throughput from past consumer AMD cards, like those based on Hawaii. Or looking at it another way, it's been some time since AMD has brought out a high FP64 throughput GPU.

In the same vein, ML operations are being throttled to some extent, though no more details are available at this time. The specific operations are there in the developer documentation, but at a high level they are deep-learning oriented intrinsics with INT8 and INT4 dot products, mixed precision FMAs, and similar. While there are ongoing efforts in WinML and DirectML, this functionality is currently not of use to consumers.

AMD Server Accelerator Specification Comparison

| | Radeon VII | Radeon Instinct MI50 | Radeon Instinct MI25 | FirePro S9170 |
|---|---|---|---|---|
| Stream Processors | 3840 (60 CUs) | 3840 (60 CUs) | 4096 (64 CUs) | 2816 (44 CUs) |
| ROPs | 64 | 64 | 64 | 64 |
| Base Clock | 1450MHz | 1450MHz | 1400MHz | N/A |
| Boost Clock | 1750MHz | 1746MHz | 1500MHz | 930MHz |
| Memory Clock | 2.0Gbps HBM2 | 2.0Gbps HBM2 | 1.89Gbps HBM2 | 5Gbps GDDR5 |
| Memory Bus Width | 4096-bit | 4096-bit | 2048-bit | 512-bit |
| Half Precision | 27.6 TFLOPS | 26.8 TFLOPS | 24.6 TFLOPS | 5.2 TFLOPS |
| Single Precision | 13.8 TFLOPS | 13.4 TFLOPS | 12.3 TFLOPS | 5.2 TFLOPS |
| Double Precision | 3.5 TFLOPS (1/4 rate) | 6.7 TFLOPS (1/2 rate) | 768 GFLOPS (1/16 rate) | 2.6 TFLOPS (1/2 rate) |
| DL Performance | ? | 53.6 TFLOPS | 12.3 TFLOPS | 5.2 TFLOPS |
| VRAM | 16GB | 16GB | 16GB | 32GB |
| ECC | No | Yes (full-chip) | Yes (DRAM) | Yes (DRAM) |
| Bus Interface | PCIe Gen 3 | PCIe Gen 4 | PCIe Gen 3 | PCIe Gen 3 |
| TDP | 300W | 300W | 300W | 275W |
| GPU | Vega 20 | Vega 20 | Vega 10 | Hawaii |
| Architecture | Vega (GCN 5) | Vega (GCN 5) | Vega (GCN 5) | GCN 2 |
| Manufacturing Process | TSMC 7nm | TSMC 7nm | GloFo 14nm | TSMC 28nm |
| Launch Date | 02/07/2019 | 09/2018 | 06/2017 | 07/2015 |
| Launch Price (MSRP) | $699 | ? | $999 | $3999 |

First things first is the design and build, and for the AMD Radeon VII, we've already noticed the biggest change: an open-air cooler. Keeping the sleek brushed-metal look of the previous RX Vega 64 Limited Edition and Liquid variants, AMD has forgone the blower for a triple axial-fan setup, the standard custom AIB configuration for high-end cards.

While NVIDIA's GeForce RTX series went this way with open-air dual-fan coolers, AMD is no stranger to changing things up themselves. Aside from the RX Vega 64 Liquid, the R9 Fury X's AIO CLC was also quite impressive for a reference design. But as we mentioned with the Founders Edition cards, moving away from blowers for open-air means adopting a cooling configuration that can no longer guarantee complete self-cooling. That is, cooling effectiveness won't be independent of chassis airflow, or lack thereof. This is usually an issue for large OEMs that configure machines assuming blower-style cards, but this is less the case for the highest-end cards, which for pre-builts tend to come from boutique system integrators.

The move to open-air also accommodates a higher TDP, and at 300W TBP the Radeon VII is indeed one for higher power consumption. While only 5W more than the RX Vega 64, there's presumably more localized heat with two more HBM2 stacks, plus the fact that roughly the same amount of power is being consumed on a smaller die area. And at 300W TBP, this would mean that all power savings from the smaller process were re-invested into performance. If higher clockspeeds are where the Radeon VII is bringing the majority of its speedup over the RX Vega 64, then there would be little alternative to abandoning the blower.
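To put numbers on the density point: board power barely moved while the die shrank by roughly a third, so power per unit area rises sharply. A rough sketch using board-level figures (actual GPU-only power is lower, so treat this as a proxy):

```python
def w_per_mm2(board_power_w, die_mm2):
    # Rough power-density proxy: board power over GPU die area
    return board_power_w / die_mm2

vega10 = w_per_mm2(295, 495)  # ~0.60 W/mm^2 (RX Vega 64)
vega20 = w_per_mm2(300, 331)  # ~0.91 W/mm^2 (Radeon VII), roughly 1.5x denser
```

That concentration of heat is a big part of why a higher-performance cooler was effectively mandatory.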

Returning to the Radeon VII build, then, the card naturally has dual 8-pin PCIe connectors, but lacks the BIOS switch of the RX Vega cards that toggled a lower-power BIOS. And with the customary LEDs, the 'Radeon' on the side lights up, as does the 'R' cube in the corner.

In terms of display outputs, there are no surprises here with 3x DisplayPort and 1x HDMI.

A few teardowns of the card elsewhere revealed a vapor chamber configuration with a thermal pad for the TIM, rather than the usual paste. While lower-performing in terms of heat transfer, we know that the RX Vega cards ended up having molded and unmolded package variants, requiring specific instructions to manufacturers on the matter. So this might be a way to head off potential ASIC height difference issues.

To preface: because of the SMU changes mentioned earlier, no third-party utilities can currently read Radeon VII data, though patches are expected shortly. AIB partner tools such as MSI Afterburner should presumably launch with support. Otherwise, Radeon Wattman was the only monitoring tool available, though we observed that its performance metric logging and overlay sometimes caused issues with games.

On that note, a large factor in this review was the instability of the press drivers. Known issues include being unable to downclock HBM2 on the Radeon VII, which AMD clarified was a bug introduced in Adrenalin 2019 19.2.1, and system crashes when the Wattman voltage curve is set to a single min/max point. There are also DX11 game crashes, which we ran into early on, and which AMD is looking at.

For these reasons, we won't have Radeon VII clockspeed or overclocking data for this review. Put simply, these types of issues are mildly concerning; while Vega 20 is new to gamers, it is not new to drivers, and if Radeon VII was indeed always in the plan, then game stability should have been a priority. Despite being a bit of a prosumer card, the Radeon VII is still the new flagship gaming card. There's no indication that these are more than teething issues, but it does lend a little credence to the idea that Radeon VII was launched as soon as feasibly possible.

Test Setup

| Component | Specification |
|---|---|
| CPU | Intel Core i7-7820X @ 4.3GHz |
| Motherboard | Gigabyte X299 AORUS Gaming 7 (F9g) |
| PSU | Corsair AX860i |
| Storage | OCZ Toshiba RD400 (1TB) |
| Memory | G.Skill TridentZ DDR4-3200, 4 x 8GB (16-18-18-38) |
| Case | NZXT Phantom 630 Windowed Edition |
| Monitor | LG 27UD68P-B |
| Video Cards | AMD Radeon VII, AMD Radeon RX Vega 64 (Air), AMD Radeon R9 Fury X, NVIDIA GeForce RTX 2080, NVIDIA GeForce RTX 2070, NVIDIA GeForce GTX 1080 Ti |
| Video Drivers | NVIDIA Release 417.71, AMD Radeon Software 18.50 Press |
| OS | Windows 10 x64 Pro (1803), Spectre and Meltdown patched |

Thanks to Corsair, we were able to get a replacement for our AX860i. While the plan was to utilize Corsair Link as an additional datapoint for power consumption, for the reasons mentioned above it was not feasible this time.

Battlefield 1 returns from the 2017 benchmark suite with a bang, as DICE brought gamers the long-awaited AAA World War 1 shooter a little over a year ago. With detailed maps, environmental effects, and pacy combat, Battlefield 1 provides a generally well-optimized yet demanding graphics workload. The next Battlefield game from DICE, Battlefield V, completes the nostalgia circuit with a return to World War 2, but more importantly for us, is one of the flagship titles for GeForce RTX real time ray tracing.

The Ultra preset is used with no alterations. As these benchmarks are from single player mode, our rule of thumb for multiplayer performance still applies: multiplayer framerates generally dip to half our single player framerates. Battlefield 1 also supports HDR (HDR10, Dolby Vision).

[Charts: Battlefield 1 - Ultra Quality - average framerates at 3840x2160, 2560x1440, and 1920x1080]

[Charts: Battlefield 1 - Ultra Quality - 99th percentile framerates at 3840x2160, 2560x1440, and 1920x1080]

The latest title in Ubisoft's Far Cry series lands us right into the unwelcoming arms of an armed militant cult in Montana, one of the many middles-of-nowhere in the United States. With a charismatic and enigmatic adversary, gorgeous landscapes of the northwestern American flavor, and lots of violence, it is classic Far Cry fare. Graphically intensive in an open-world environment, the game mixes in action and exploration.

Far Cry 5 does support Vega-centric features with Rapid Packed Math and Shader Intrinsics. Far Cry 5 also supports HDR (HDR10, scRGB, and FreeSync 2). This testing was done without HD Textures enabled, a new option that was recently patched in.

[Charts: Far Cry 5 - Ultra Quality - average framerates at 3840x2160, 2560x1440, and 1920x1080]

A veteran from both our 2016 and 2017 game lists, Ashes of the Singularity: Escalation remains the DirectX 12 trailblazer, with developer Oxide Games tailoring and designing the Nitrous Engine around such low-level APIs. The game makes the most of DX12's key features, from asynchronous compute to multi-threaded work submission and high batch counts. And with full Vulkan support, Ashes provides a good common ground between the forward-looking APIs of today. Its built-in benchmark tool is still one of the most versatile ways of measuring in-game workloads in terms of output data, automation, and analysis; by offering such a tool publicly and as part-and-parcel of the game, it's an example that other developers should take note of.

Settings and methodology remain identical to its usage in the 2016 GPU suite. Note that we are utilizing the original Ashes Extreme graphical preset, which corresponds to the current one with MSAA dialed down from 4x to 2x, as well as an adjusted Texture Rank (MipsToRemove in settings.ini).

Ashes of the Singularity: Escalation - 3840x2160 - Extreme Quality
Ashes of the Singularity: Escalation - 2560x1440 - Extreme Quality
Ashes of the Singularity: Escalation - 1920x1080 - Extreme Quality

 

Ashes: Escalation - 99th Percentile - 3840x2160 - Extreme Quality
Ashes: Escalation - 99th Percentile - 2560x1440 - Extreme Quality
Ashes: Escalation - 99th Percentile - 1920x1080 - Extreme Quality

id Software is popularly known for a few games involving shooting stuff until it dies, just with different 'stuff' for each one: Nazis, demons, or other players while scorning the laws of physics. Wolfenstein II is the latest of the first category, the sequel in a modern reboot series developed by MachineGames and built on id Tech 6. While the tone is significantly less pulpy nowadays, the game is still a frenetic FPS at heart, succeeding DOOM as a modern Vulkan flagship title and arriving as a pure Vulkan implementation, rather than the originally OpenGL-based DOOM.

Featuring a Nazi-occupied America of 1961, Wolfenstein II is lushly designed yet not oppressively intensive on the hardware, something that goes well with its pace of action, which emerges suddenly from a level design flush with alternate historical details.

The highest quality preset, "Mein leben!", was used. Wolfenstein II also features Vega-centric GPU Culling and Rapid Packed Math, as well as Radeon-centric Deferred Rendering; in accordance with the preset, neither GPU Culling nor Deferred Rendering was enabled.

Wolfenstein II - 3840x2160
Wolfenstein II - 2560x1440
Wolfenstein II - 1920x1080

 

Wolfenstein II - 99th Percentile - 3840x2160
Wolfenstein II - 99th Percentile - 2560x1440
Wolfenstein II - 99th Percentile - 1920x1080

Upon arriving on PC earlier this year, Final Fantasy XV: Windows Edition was given a graphical overhaul as it was ported over from console, the fruits of Square Enix's successful partnership with NVIDIA, with hardly any hint of the troubles during Final Fantasy XV's original production and development.

In preparation for the launch, Square Enix opted to release a standalone benchmark that they have since updated. Using the Final Fantasy XV standalone benchmark gives us a lengthy standardized sequence for capturing frametimes with OCAT. Upon release, the standalone benchmark received criticism for performance issues and general bugginess, as well as for its confusing graphical presets and its measurement of performance by 'score'. In its original iteration, the graphical settings could not be adjusted, leaving the user with presets that were tied to resolution and hidden settings such as GameWorks features.

Since then, Square Enix has patched the benchmark with custom graphics settings and bugfixes to be more accurate in profiling in-game performance and graphical options, though leaving the 'score' measurement. For our testing, we enable or adjust settings to the highest except for NVIDIA-specific features and 'Model LOD', the latter of which is left at standard. Final Fantasy XV also supports HDR, and it will support DLSS at some later date.

Final Fantasy XV - 3840x2160 - Ultra Quality
Final Fantasy XV - 2560x1440 - Ultra Quality
Final Fantasy XV - 1920x1080 - Ultra Quality

 

Final Fantasy XV - 99th Percentile - 3840x2160 - Ultra Quality
Final Fantasy XV - 99th Percentile - 2560x1440 - Ultra Quality
Final Fantasy XV - 99th Percentile - 1920x1080 - Ultra Quality

Now a truly venerable title, GTA V is a veteran of past game suites that is still as graphically demanding as they come. As an older DX11 title, it provides a glimpse into the graphically intensive games of yesteryear that don't incorporate the latest features. Originally released for consoles in 2013, the PC port came with a slew of graphical enhancements and options. Just as importantly, GTA V includes a rather intensive and informative built-in benchmark, something still somewhat uncommon in open-world games.

The settings are identical to its previous appearances, and are custom since GTA V does not have presets. To recap, a "Very High" quality is used, with all primary graphics settings turned up to their highest setting, except grass, which is at its own Very High setting. Meanwhile, 4x MSAA is enabled for direct views and reflections. This configuration also involves turning on some of the advanced rendering features – the game's long shadows, high resolution shadows, and high definition flight streaming – but not increasing the view distance any further.

Grand Theft Auto V - 3840x2160 - Very High Quality
Grand Theft Auto V - 2560x1440 - Very High Quality
Grand Theft Auto V - 1920x1080 - Very High Quality

 

Grand Theft Auto V - 99th Percentile - 3840x2160 - Very High Quality
Grand Theft Auto V - 99th Percentile - 2560x1440 - Very High Quality
Grand Theft Auto V - 99th Percentile - 1920x1080 - Very High Quality

Next up is Middle-earth: Shadow of War, the sequel to Shadow of Mordor. Developed by Monolith, whose last hit was arguably F.E.A.R., Shadow of Mordor returned the studio to the spotlight with the Nemesis System, an innovative NPC rival generation and interaction system, along with a storyline based on J.R.R. Tolkien's legendarium, all running on a heavily modified version of the engine that originally powered F.E.A.R. in 2005.

Using the new LithTech Firebird engine, Shadow of War improves on the detail and complexity, and with free add-on high resolution texture packs, offers itself as a good example of getting the most graphics out of an engine that may not be bleeding edge. Shadow of War also supports HDR (HDR10).

Shadow of War - 3840x2160 - Ultra Quality
Shadow of War - 2560x1440 - Ultra Quality
Shadow of War - 1920x1080 - Ultra Quality

Succeeding F1 2016 is F1 2018, Codemasters' latest iteration of their official Formula One racing games. It features a slimmed-down version of Codemasters' traditional built-in benchmarking tools and scripts, something that is surprisingly absent in DiRT 4.

Aside from keeping up-to-date on the Formula One world, F1 2017 added HDR support, which F1 2018 has maintained; otherwise, we should expect newer versions of Codemasters' EGO engine to find their way into F1 first. Graphically demanding in its own right, F1 2018 keeps a useful racing-type graphics workload in our benchmarks.

F1 2018 - 3840x2160 - Ultra Quality
F1 2018 - 2560x1440 - Ultra Quality
F1 2018 - 1920x1080 - Ultra Quality

 

F1 2018 - 99th Percentile - 3840x2160 - Ultra Quality
F1 2018 - 99th Percentile - 2560x1440 - Ultra Quality
F1 2018 - 99th Percentile - 1920x1080 - Ultra Quality

Last in our 2018 game suite is Total War: Warhammer II, built on the same engine as Total War: Warhammer. While there is a more recent Total War title, Total War Saga: Thrones of Britannia, that game was built on the 32-bit version of the engine. The first TW: Warhammer was a DX11 game that was to some extent developed with DX12 in mind, with preview builds showcasing DX12 performance. In Warhammer II, however, the matter appears to have been dropped, with DX12 mode still marked as beta and featuring performance regressions for both vendors.

It's unfortunate because Creative Assembly themselves have acknowledged the CPU-bound nature of their games, and with re-use of game engines as spin-offs, DX12 optimization would have continued to provide benefits, especially if the future of graphics in RTS-type games will lean towards low-level APIs.

There are now three benchmarks with varying graphics and processor loads; we've opted for the Battle benchmark, which appears to be the most graphics-bound.

Total War: Warhammer II - 3840x2160 - Ultra Quality
Total War: Warhammer II - 2560x1440 - Ultra Quality
Total War: Warhammer II - 1920x1080 - Ultra Quality

Shifting gears, we'll look at the compute aspects of the Radeon VII. Though it is fundamentally similar to first generation Vega, there has been an emphasis on improved compute for Vega 20, and we may see it here.

We begin with CompuBench 2.0, the latest iteration of Kishonti's GPU compute benchmark suite, which offers a wide array of practical compute workloads; we've opted to focus on level set segmentation, optical flow modeling, and N-Body physics simulations.

Compute: CompuBench 2.0 - Level Set Segmentation 256
Compute: CompuBench 2.0 - N-Body Simulation 1024K
Compute: CompuBench 2.0 - Optical Flow

Moving on, we'll also look at single precision floating point performance with FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance.
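To put the single- versus double-precision gap in context, the theoretical peaks can be worked out from the card's published specs. A minimal sketch follows; the shader count and boost clock are the Radeon VII's published figures, and the 1:4 FP64 rate is AMD's advertised ratio for this card, not something FAHBench itself reports.

```python
# Rough theoretical peak throughput for the Radeon VII, as context for
# the FP32 vs. FP64 gap that FAHBench exposes. Figures are published
# specs; the FP64 ratio is AMD's advertised rate for this card.

STREAM_PROCESSORS = 3840   # 60 CUs x 64 shaders per CU
BOOST_CLOCK_GHZ = 1.8      # approximate peak boost clock
FP64_RATE = 1 / 4          # advertised FP64:FP32 ratio for Radeon VII

# Each shader can issue one FMA (2 FLOPs) per clock at FP32
fp32_tflops = STREAM_PROCESSORS * 2 * BOOST_CLOCK_GHZ / 1000
fp64_tflops = fp32_tflops * FP64_RATE

print(f"FP32 peak: {fp32_tflops:.1f} TFLOPS")  # ~13.8
print(f"FP64 peak: {fp64_tflops:.1f} TFLOPS")  # ~3.5
```

Consumer GeForce cards typically run at a 1:32 FP64 rate, which is why single precision remains the more useful metric for most of the field here.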

Compute: Folding @ Home (Single and Double Precision)

Next is Geekbench 4's GPU compute suite. A multi-faceted test suite, Geekbench 4 runs seven different GPU sub-tests, ranging from face detection to FFTs, and then averages out their scores via their geometric mean. As a result Geekbench 4 isn't testing any one workload, but rather is an average of many different basic workloads.
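To illustrate the averaging scheme described above, here is a minimal sketch of a geometric mean over sub-test scores; the scores themselves are invented for illustration, and this is our reading of the scoring approach rather than Geekbench's actual implementation.

```python
import math

def geometric_mean(scores):
    """Combine sub-test scores via the n-th root of their product,
    so no single fast sub-test dominates the overall result."""
    return math.prod(scores) ** (1 / len(scores))

# Hypothetical sub-test scores, one per workload (face detection,
# FFT, and so on) -- for illustration only
subtests = [120_000, 95_000, 210_000, 88_000, 150_000, 101_000, 76_000]
print(round(geometric_mean(subtests)))
```

The practical upshot of a geometric mean is that doubling any one sub-test score raises the total by the same factor, regardless of that sub-test's absolute magnitude.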

Compute: Geekbench 4 - GPU Compute - Total Score

 

Compute: SiSoftware Sandra 2018 - GP Processing (OpenCL)

Compute: SiSoftware Sandra 2018 - Pixel Shader Compute (DX11)

 

Next up are synthetic tests.

Synthetic: TessMark, Image Set 4, 64x Tessellation

 

Synthetic: Beyond3D Suite - Pixel Fillrate

 

Synthetic: Beyond3D Suite - Integer Texture Fillrate (INT8)

Compute/ProViz: SPECviewperf 13 - 3dsmax-06
Compute/ProViz: SPECviewperf 13 - catia-05
Compute/ProViz: SPECviewperf 13 - creo-02
Compute/ProViz: SPECviewperf 13 - energy-02
Compute/ProViz: SPECviewperf 13 - maya-05
Compute/ProViz: SPECviewperf 13 - medical-02
Compute/ProViz: SPECviewperf 13 - showcase-02
Compute/ProViz: SPECviewperf 13 - snx-03 (Siemens NX)
Compute/ProViz: SPECviewperf 13 - sw-04 (Solidworks)

Compute/ProViz: LuxMark 3.1 - LuxBall and Hotel

Compute/ProViz: Cycles - Blender Benchmark 1.0b2
Compute/ProViz: V-Ray Benchmark 1.0.8
Compute/ProViz: Indigo Renderer 4 - IndigoBench 4.0.64

With the variety of changes from the Vega 10 powered RX Vega 64 to the new Radeon VII and its Vega 20 GPU, we wanted to take a look at gaming and compute performance while controlling for clockspeed. In this way, we can peek at any substantial improvements or differences in pseudo-IPC. There are a couple of caveats here: obviously, because the RX Vega 64 has 64 CUs while the Radeon VII has only 60 CUs, the comparison is already not exact. And "IPC" is not the exact metric being measured, but rather how much graphics/compute work is done per clock cycle and how that might translate to performance. Isoclock GPU comparisons tend to be less useful when comparing across generations and architectures, as designers often add pipeline stages to enable higher clockspeeds (as with Vega), at the cost of reducing the work done per cycle and usually increasing latency as well.

For our purposes, the incremental nature of 2nd generation Vega allays some of those concerns. Unfortunately, Wattman was unable to downclock the memory at this time, so we couldn't get a set of datapoints with both cards configured for comparable memory bandwidth. And while Vega's boost mechanics mean there is no static pinned clockspeed, both cards were set to 1500MHz, and both fluctuated between 1490 and 1500MHz depending on workload. Altogether, this means that these results should be taken as approximations lacking granularity, but they are useful for spotting significant increases or decreases. It also means that interpreting the results is trickier; at a high level, though, if the Radeon VII outperforms the RX Vega 64 in a given non-memory-bound workload, then we can assume meaningful 'work per cycle' enhancements relatively decoupled from CU count.
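The per-CU normalization behind this comparison can be sketched in a few lines. The framerates below are hypothetical placeholders, not our measured results; the point is only the arithmetic of factoring out CU count at a fixed clock.

```python
# Sketch of the pseudo-IPC normalization: at a fixed ~1500MHz on both
# cards, divide throughput by CU count so that any remaining gap
# reflects per-CU, per-clock work. Framerates here are hypothetical.
# Note this does not control for the Radeon VII's doubled memory
# bandwidth, which inflates the apparent per-CU gain in bound cases.

def work_per_cu(fps, cu_count):
    """Throughput per CU; clockspeed cancels out when it is equal."""
    return fps / cu_count

vega64 = work_per_cu(60.0, 64)   # RX Vega 64: 64 CUs at 1500MHz
radeon7 = work_per_cu(62.0, 60)  # Radeon VII: 60 CUs at 1500MHz

uplift = radeon7 / vega64 - 1
print(f"Per-CU work uplift at isoclock: {uplift:+.1%}")
```

With these invented numbers, a small raw framerate lead on 6% fewer CUs translates into a roughly 10% per-CU gain, which is the kind of signal this test is designed to surface.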

Ashes of the Singularity: Escalation - 3840x2160 - Extreme Quality
Ashes of the Singularity: Escalation - 2560x1440 - Extreme Quality
Grand Theft Auto V - 3840x2160 - Very High Quality
Grand Theft Auto V - 2560x1440 - Very High Quality
F1 2018 - 3840x2160 - Ultra Quality
F1 2018 - 2560x1440 - Ultra Quality
Shadow of War - 4K and 1440p - Ultra Quality
Wolfenstein II - 3840x2160
Wolfenstein II - 2560x1440

As mentioned above, we were not able to control for the doubled memory bandwidth.

Compute/ProViz: SPECviewperf 13 - 3dsmax-06
Compute/ProViz: SPECviewperf 13 - catia-05
Compute/ProViz: SPECviewperf 13 - creo-02
Compute/ProViz: SPECviewperf 13 - energy-02
Compute/ProViz: SPECviewperf 13 - maya-05
Compute/ProViz: SPECviewperf 13 - medical-02
Compute/ProViz: SPECviewperf 13 - showcase-02
Compute/ProViz: SPECviewperf 13 - snx-03 (Siemens NX)

SPECviewperf is a slightly different story, though.

Compute/ProViz: Cycles - Blender Benchmark 1.0b2
Compute/ProViz: V-Ray Benchmark 1.0.8
Compute/ProViz: Indigo Renderer 4 - IndigoBench 4.0.64

As always, we'll take a look at power, temperature, and noise of the Radeon VII.

Idle Power Consumption
Load Power Consumption - Battlefield 1
Load Power Consumption - FurMark

 

Idle GPU Temperature
Load GPU Temperature - Battlefield 1
Load GPU Temperature - FurMark

 

Idle Noise Levels
Load Noise Levels - Battlefield 1
Load Noise Levels - FurMark

While there are definitely more areas to investigate, the Radeon VII is still the first 7nm gaming GPU, and that is no small feat. Beyond that, bringing it to consumers provides a mid-generation option for buyers; and the more enthusiast-grade choices, the merrier. The Radeon VII may be a dual-use prosumer/gaming product at heart, but it still has to measure up to its billing as the fastest gaming card in the Radeon stack.

At the risk of being redundant, I can't help but emphasize how surprised both Ryan and I are that this card is even here at this time. We're still very early into the 7nm generation, and prior to last month, AMD seemed content to limit the Vega 20 GPU to their server-grade Radeon Instinct cards. Instead, a confluence of factors has come together to allow AMD to bring a chip that, by their own admission, was originally built for servers to the consumer market as a mid-generation kicker. There isn't really a good precedent for the Radeon VII and its launch, which makes things quite interesting from a tech enthusiast's point of view.

Kicking off our wrap-up, then, let's talk about the performance numbers. Against its primary competition, the GeForce RTX 2080, the Radeon VII ends up 5-6% behind in our benchmark suite. Unfortunately, the only games in which it takes the lead are Far Cry 5 and Battlefield 1, so the Radeon VII doesn't get to 'trade blows' as much as I'm sure AMD would have liked. Meanwhile, not unlike the RTX 2080 it competes with, AMD isn't looking to push the envelope on price-to-performance ratios here, so the Radeon VII isn't undercutting the pricing of the 2080 in any way. This is a perfectly reasonable choice for AMD to make given the state of the current market, but it does mean that when the card underperforms, there's no pricing advantage to help pick it back up.

Comparing the uplift over the original RX Vega 64 puts Radeon VII in a better light, being about 24% faster at 1440p and 32% faster at 4K. Reference-to-reference, this might even be grounds for an upgrade rather than a side-grade. But the fact of the matter is that its predecessor was competing against the second-tier GTX 1080, and now with the Radeon VII, Vega is still looking to match the performance of last generation’s flagship, the GTX 1080 Ti. The positioning is still set on pure gaming terms too, and power efficiency isn’t one of the allures of the Radeon VII.

Where it does have some interesting potential is on the compute and professional visualization side, though given our limited tests there's little conclusive to say. So where does this leave AMD? Their situation is improved, but the overall competitive landscape hasn't significantly changed. A renewed AMD option in high-performance graphics is still important in terms of maintaining FreeSync possibilities. As an upgrade choice, the Radeon VII makes a better case as a high-VRAM prosumer card for gaming content creators, and of course that is the intended justification for its $699 price, a point initially carved out by its competitor. For pure gamers, then, the price is really the point of contention.
