
  • Cellar Door - Tuesday, June 27, 2017 - link

    Ryan - is this all we get? No review sample for you guys?

    Feels like a lot of smoke and mirrors going on here from AMD.
  • Ryan Smith - Tuesday, June 27, 2017 - link

    Correct, we were not sampled. To the best of my knowledge, no one was.

    However, I expect these cards to completely sell out, so from AMD's perspective I doubt they saw a need to sample them.
  • Cellar Door - Tuesday, June 27, 2017 - link

    Thanks for letting us know. Looking forward to your review in the future.
  • Alexvrb - Tuesday, June 27, 2017 - link

    Like you said in the article, "all the more reason for AMD to get cards out now so that developers can get started". The consumer version will be much more interesting; hopefully mining demand will have died down a lot by the time it comes out.
  • Lbolaji3 - Tuesday, June 27, 2017 - link

    Ryan,
    Thanks for the article; it was very well covered. What leads you to think the card will sell out? Pricing? 13.1 TF?
  • Ryan Smith - Wednesday, June 28, 2017 - link

    "What leads you to think the card will sell out?"

    A small supply combined with developers who will need to get started on the card to optimize their programs.
  • Santoval - Wednesday, June 28, 2017 - link

    The 13.1 TF number is at the boost clock, not at the base clock. The question is how long the boost clock can be sustained. If the effective power draw at 1,600 MHz is ~350 W, probably only for a few tens of seconds (on the air-cooled version).
  • ash9 - Thursday, June 29, 2017 - link

    Value of the year - Frontier Edition

    FE: $999
    FE: beat the $13,000 P100 by 33% in DeepBench, a deep learning benchmark
    FE: comparable to the $4,000/$2,000 P6000/P5000
    FE: 4K/8K for visual professionals
    FE: for gamers too - early drivers on par with the GTX 1080
    FE: $999.
  • shabby - Tuesday, June 27, 2017 - link

    Less memory bandwidth than the Fury X? Is AMD going forwards or backwards?
  • Despoiler - Tuesday, June 27, 2017 - link

    That's just the way HBM and HBM2 differ. HBM2 is denser, so it requires fewer stacks than HBM for a given capacity. Fewer stacks = narrower bus = less bandwidth.
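
    A rough back-of-the-envelope sketch of that arithmetic (a minimal Python example, assuming the publicly listed figures: Fury X with 4 HBM1 stacks at 1 Gbps/pin, Vega FE with 2 HBM2 stacks at ~1.89 Gbps/pin, and a 1024-bit interface per stack in both generations):

        # Peak bandwidth (GB/s) = stacks * bus width per stack (bits) * per-pin rate (Gbps) / 8
        def peak_bw_gbs(stacks, gbps_per_pin, bus_bits_per_stack=1024):
            return stacks * bus_bits_per_stack * gbps_per_pin / 8

        print(peak_bw_gbs(4, 1.0))   # Fury X  (HBM1):  512.0 GB/s
        print(peak_bw_gbs(2, 1.89))  # Vega FE (HBM2): ~483.8 GB/s

    Halving the stack count halves the bus width, and HBM2's per-pin rate here is not quite double HBM1's, hence the slightly lower total.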
  • Yojimbo - Tuesday, June 27, 2017 - link

    NVIDIA has been shipping 16 GB of HBM2 with 720 GB/s of bandwidth for over a year, and they are only now releasing a card with 900 GB/s. The bandwidth limitation has more to do with cost control than with the stacking topology of HBM versus HBM2.

    AMD chose to keep costs down, and subsequently chose more, slower VRAM rather than less, faster VRAM. With HBM1, more than 4 GB of VRAM was not possible, so that choice wasn't available to AMD then. NVIDIA uses HBM2 on higher-priced cards, and so can afford both large capacity and high bandwidth on them.
  • Stuka87 - Tuesday, June 27, 2017 - link

    Yojimbo, you are entirely wrong here. The GP100 shipped in March, not over a year ago. And at almost $9,000, it is in NO WAY comparable to what AMD is offering here.
  • nevcairiel - Tuesday, June 27, 2017 - link

    Actually, the Tesla P100 has been available since around August 2016, so that's almost a year. Certainly it's much more expensive, but that only underlines the argument that AMD chose the cheaper route for their HBM setup, going for capacity/price instead of bandwidth.
  • CiccioB - Tuesday, June 27, 2017 - link

    GP100 has been in production since October 2015, with the first samples used for internal servers at Nvidia. The point is that HBM2 with that density and bus width has been available for 18 months now.
    The use of just 2 stacks instead of 4 is cost containment. The available bandwidth is not a technical constraint, just an economic one.
    The fact that Nvidia can sell a similar card at 9 times the price of this one clearly indicates their different aims, despite the cards being similar in resources and capabilities.
    One has to wonder why AMD constantly has to discount their products to make them interesting to anyone wanting to use them.
    Maybe HW, transistors, bandwidth and ALUs are not the only things that determine prices (though they do determine costs, and in this case, very high costs).
    I'm really interested in knowing the margins (or losses) on each Vega sold as a gaming card. That's the reason I see for AMD making so few of them: just enough to get benchmarks published and some extra positive comments, but with no real availability.
  • eek2121 - Tuesday, June 27, 2017 - link

    You guys are confused. The first NVIDIA chip to use HBM2 was the P100, and it shipped in Q4 2016 in VERY limited quantities. There wasn't an HBM2 part prior to that, as HBM2 WAS NOT AVAILABLE. Need sources? Try this very website for starters. Also, the Tesla P100 and beyond are NOT competitors to the Vega FE. They cost more than 10X as much, for starters.
  • CiccioB - Wednesday, June 28, 2017 - link

    GP100 production started in October 2015. There are samples dated week 43 of 2015, and they are not engineering samples. From January 2016, production Tesla cards were used to assemble a server at Nvidia (one that appears on the Top 500 list of the fastest HPC systems), and in March 2016 Nvidia made pre-assembled boards with P100 cards available. HBM2 was probably not available to anyone else, but Nvidia had its products done.

    As for the comparison not being suitable because of the price: sorry, that's not how it works. Vega has been built with the same computing capabilities as GP100 (FP16, INT8, INT32, and fast with all of these, though slower at FP64). The GP102 you want to compare it to has none of those capabilities. It's just that Vega arrived late, slower, more power hungry, and above all without SW support.
    That's why Vega can't be sold at Nvidia's premium prices and margins.

    Like Fiji, Vega arrived too late and with too little performance. Good for patching the market segment for a while at a discounted price, until Volta is released, after which it will be discounted even more.
    Being a generation behind does not help you sell at premium prices, even if you build your GPU to do the same work as the competition's.
  • Yojimbo - Wednesday, June 28, 2017 - link

    The P100 is the card we are talking about. You're wrong about when it first existed. The big internet companies were buying them up for internal use. Yes, the card's volume grew as time went on, but it existed over a year ago.

    For the third time, the difference in price between the Tesla P100 and the Vega FE was my point. Despoiler said, "That's just the way HBM and HBM2 differ. HBM2 is denser, so it requires fewer stacks than HBM for a given capacity. Fewer stacks = narrower bus = less bandwidth." But the P100 shows that it doesn't have to be that way. It is entirely technically possible for a card using HBM2 to have both 16 GB of VRAM and bandwidth much higher than 483 GB/s.

    AMD chose to use two 8-high stacks of HBM2 with 8 GB per stack, for a 2048-bit bus and a total of 16 GB. They could have chosen to use four 4-high stacks with 4 GB per stack, for a 4096-bit bus and the same 16 GB total. They made the choice they did precisely because it's cheaper, not because the other choice is technically infeasible, contrary to what Despoiler implied.

    One more time for good measure: making the bandwidth of the Vega FE lower than the bandwidth of the Fury X was a choice AMD made. It's not that it has to be that way because of the density of HBM2. Why is that so hard to understand?
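
    To put numbers on that choice, a small sketch of the two 16 GB HBM2 configurations at the same ~1.89 Gbps per-pin rate (the Vega FE figure); the 4-stack option is the alternative described above, cost aside:

        # Two ways to reach 16 GB with 1024-bit HBM2 stacks
        for stacks, gb_per_stack in [(2, 8), (4, 4)]:
            bus_bits = stacks * 1024
            bw_gbs = bus_bits * 1.89 / 8
            print(f"{stacks} stacks x {gb_per_stack} GB = {stacks * gb_per_stack} GB, "
                  f"{bus_bits}-bit bus, ~{bw_gbs:.0f} GB/s")
        # 2 stacks x 8 GB = 16 GB, 2048-bit bus, ~484 GB/s
        # 4 stacks x 4 GB = 16 GB, 4096-bit bus, ~968 GB/s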
  • CiccioB - Wednesday, June 28, 2017 - link

    "Why is that so hard to understand?"

    Because fanboys need to attribute poor performance to undefined causes that must not be related to AMD's choices, skills, or strategy. Those are always perfect; it is the environment outside that is hostile to their success.
    Think about it: whatever AMD's problem, the cause always lies with others.
  • T1beriu - Tuesday, June 27, 2017 - link

    Flawed logic.
  • DanNeely - Tuesday, June 27, 2017 - link

    Half the stacks because it's twice as dense; unfortunately not quite twice the clock speed, so slightly less bandwidth. Fewer stacks make manufacturing easier because they make the interposer somewhat simpler. NVIDIA's HBM2 in Tesla cards is only running at 1.4/1.75 GHz, so I'm inclined to blame the HBM makers for not getting clocks as high as expected, not AMD for having their memory controller fall short of design targets.
  • CiccioB - Tuesday, June 27, 2017 - link

    Taller stacks, with more TSVs, don't allow clocks to be raised much. That was expected; it's not as if AMD didn't know.
    That's why Nvidia developed a 4-stack board and not a simpler 2-stack one. They paid more for the stacks and needed a bigger interposer, but they could use the less dense memory stacks that were already available, running at lower clocks.
    In short: an 18-month advantage over a competitor that had to save as much as possible on production costs, since it cannot sell the board at premium prices and wanted to cover the high-end consumer market too with an HBM2-enabled board (like shooting flies with cannons, but they will still have long colored bars in benchmarks).
  • bill.rookard - Tuesday, June 27, 2017 - link

    Wonder how this would do at F@H. Or maybe Ethereum mining.
  • beginner99 - Tuesday, June 27, 2017 - link

    Honestly, I hope it sucks at mining. Otherwise, good luck buying one for gaming...
  • bronan - Wednesday, June 28, 2017 - link

    Nope, it will be very good at mining, but be glad that the mining market is dropping like a rock.
    Miners will not buy these expensive cards, because they won't earn back the price fast enough.
    The latest top pro Nvidia cards are priced so high primarily because of CUDA, which makes some more complex calculations possible.
    AMD's raw compute power is higher, but AMD has no comparable solution. That's why some companies need Nvidia while others can do more with AMD.
    You can clearly see in the crypto mining world that in most situations where less accurate results and slightly less complex calculations suffice, AMD cards rule in performance at slightly lower power.
    But some cryptocurrencies need CUDA to get results, and there we see Nvidia being the preferred one. Overall, AMD is the winner in most situations, which is the problem for consumers who wanted the smaller RX 470 at the lower prices they expected.
    One other important thing: miners do not care about noise ;)
    Anyway, I want to repeat that the two have completely different designs, and I hope AMD will make enough profit to fund the R&D to build a better programmable GPGPU in the near future.
  • cfenton - Tuesday, June 27, 2017 - link

    It will probably do very well from an absolute MH/s perspective, but not so well from an MH/$ perspective. If RX 480s were still selling at sane prices, this thing would cost 4x-5x as much, and it's certainly not going to be 4x-5x faster. Even at the insane prices of the moment, it would still have to be 2x-3x faster, which I doubt.
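
    A quick sketch of that value argument, with loudly hypothetical numbers (assuming ~24 MH/s for an RX 480 at a sane ~$240, and generously guessing 2x that hash rate for the $999 FE):

        # (MH/s, USD) - both figures are assumptions for illustration only
        cards = {
            "RX 480 at a sane price": (24, 240),
            "Vega FE, guessing 2x":   (48, 999),
        }
        for name, (mhs, usd) in cards.items():
            print(f"{name}: {1000 * mhs / usd:.0f} MH/s per $1000")
        # RX 480 at a sane price: 100 MH/s per $1000
        # Vega FE, guessing 2x:    48 MH/s per $1000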
  • PixyMisa - Wednesday, June 28, 2017 - link

    Probably 2x. I'd be very surprised if it was even close to 3x.
  • CiccioB - Wednesday, June 28, 2017 - link

    It may be a bit more than 2x, since it has about 2x the compute, but HBM's low latency may help it get more MH/s, given that the Ethereum algorithm is quite sensitive to memory latency (see the performance of the 1080 and 1080 Ti).
  • Pork@III - Tuesday, June 27, 2017 - link

    A $500 difference just for the type of cooling? WTF?
  • T1beriu - Tuesday, June 27, 2017 - link

    Bling is expensive.
  • DanNeely - Tuesday, June 27, 2017 - link

    AMD felt jealous of how much money NVIDIA was printing with Titan cards, so they launched their halo card even higher, with a larger fanboi tax for the marginally faster absolute top-end card. AMD needs the money badly; hopefully their hardest-core fans will be willing to pay up.
  • FourEyedGeek - Tuesday, June 27, 2017 - link

    Whoa so much bait thrown out.
  • bronan - Wednesday, June 28, 2017 - link

    Typical Nvidia fanboy comment.
  • bronan - Wednesday, June 28, 2017 - link

    Yes, if you have seen what cooling does on high-performance cards...
    AMD is the first to make this move and release this card with water cooling.
    Better cooling means higher performance for a longer period of time.
    Nvidia has never sold any factory-made water-cooled solutions.
    I think it's a smart move, because companies will never go to a shop, buy aftermarket coolers, and install them in workstations.
    A complete solution that will probably run more stably will likely sell, even at a premium price, to those who constantly push their cards to the limit.
  • masouth - Wednesday, June 28, 2017 - link

    "AMD is the first to make this move and release this card with water cooling."
    "nividia never has sold any water cooled solutions factory made."

    That is either ignorance or disingenuous. nVidia has had reference designed AiO boards since at least the 9 series so this isn't a first and nVidia certainly has.
  • extide - Wednesday, June 28, 2017 - link

    Those have always been made by partners -- never nVidia-built reference boards. AMD has sold many cards directly with water cooling: most of the recent dual-GPU cards, the Fury X, plus this.
  • masouth - Wednesday, June 28, 2017 - link

    No, nVidia has built actual AIO reference board designs for partners to use. Again, you do not have to go any further back than the 9 series to see it. Partner AIO cards are also still considered factory cards; you are not installing the water cooling yourself.

    Speaking of nVidia making cards, they aren't really in the business of actually producing/distributing cards in the first place, whether air or water cooled.
  • Santoval - Wednesday, June 28, 2017 - link

    You pay $500 more not just for water cooling but also for longer boost clocks, thus getting longer sustained 13.1 TFLOPS (SP) - at the moment the highest in the market (higher even than the Titan Xp's 12 SP TFLOPS and the Tesla P100's 9.3 to 10.6 SP TFLOPS). And that matters a lot for professional creators, because it means shorter rendering times (to give one key example). So, at least until Volta is released, Vega 10 is the king of single and half precision - in boost-clock TFLOPS anyway. But if you want double precision - and later, when Volta is released, INT8 for matrix multiplication in AI - then you need Nvidia.
  • crashtech - Tuesday, June 27, 2017 - link

    Didn't know this had posted, as it's not on the front page. Perhaps the dearth of good info is keeping it on the inside pages?
  • DanNeely - Tuesday, June 27, 2017 - link

    It's a one-page blurb, not an in-depth article, so the Pipeline is the appropriate place; the main page is for in-depth articles. If AnandTech has a card on pre-order (AMD isn't sampling these to reviewers), we should see a performance article a few days after it ships.
  • testbug00 - Tuesday, June 27, 2017 - link

    They also come with professional drivers, which makes them way more valuable than the Titan cards.
  • webdoctors - Tuesday, June 27, 2017 - link

    Can AnandTech just buy one and review it?

    Is it too expensive, or is my ad-blocking software causing a lack of funds?
  • nikon133 - Tuesday, June 27, 2017 - link

    All your fault. ;)
  • lefty2 - Tuesday, June 27, 2017 - link

    Apparently, they are impossible to buy. No units are available anywhere.
  • Pork@III - Tuesday, June 27, 2017 - link

    When virtual coins are banned, you'll see how many video cards there are on the market :D
  • nevcairiel - Tuesday, June 27, 2017 - link

    No sane miner would buy such an expensive card for mining. The mid-range is where the cost-effective cards for miners are.
  • PixyMisa - Wednesday, June 28, 2017 - link

    Saw one Eth miner post that they had 150 on pre-order.
  • bronan - Wednesday, June 28, 2017 - link

    Lol, he'll probably have to cancel his order quickly, because ETH dropped to 10 cents in a few days.
  • cybertec69 - Tuesday, June 27, 2017 - link

    Here is a review of the card, if you want to call it a review:
    https://youtu.be/D5GcpYA7_wY
  • jjj - Tuesday, June 27, 2017 - link

    In terms of W per TFLOP, it doesn't seem that AMD has made huge progress, at least at shipping clocks - they seem to be somewhere between 25 and 30 W per TFLOP at the board level, depending on clocks.
    Maybe there is more progress when it comes to perf per W in gaming, but the fact that they are pushing clocks, plus the huge delays, are not good signs.
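
    That range is easy to reproduce from the spec sheet (a sketch assuming the announced 300 W air / 375 W liquid board power, 13.1 TFLOPS at the 1,600 MHz peak clock, and 4096 SPs at the listed ~1,382 MHz typical clock):

        sps = 4096
        tflops_peak = 13.1
        tflops_typical = sps * 2 * 1382 / 1e6   # ~11.3 TFLOPS at the typical clock
        for watts in (300, 375):
            print(f"{watts} W board: {watts / tflops_peak:.1f} W/TFLOP at peak, "
                  f"{watts / tflops_typical:.1f} W/TFLOP at typical clock")
        # 300 W: ~22.9 / ~26.5    375 W: ~28.6 / ~33.1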
  • FourEyedGeek - Tuesday, June 27, 2017 - link

    I wonder if a slight underclock could allow for a significant undervolt.
  • jjj - Wednesday, June 28, 2017 - link

    Most likely.
    It would be nice if the lowest-end Vega RX SKU were clock-limited instead of having cores disabled, but that doesn't work from a yield perspective.
    Maybe they should make a Green SKU that keeps clocks lower, LOL. Not sure the world is ready for that, but paired with Radeon Chill it could be marketable.
  • extide - Wednesday, June 28, 2017 - link

    You mean, like the Nano? They already did that, and it was reasonably successful, at least after they dropped the price.
  • jjj - Thursday, June 29, 2017 - link

    The marketing there was about form factor.
  • edzieba - Wednesday, June 28, 2017 - link

    Not only is it not huge progress, it's zero progress: for FP32, 8.6/13.1 TFLOPS = 1050/1600 MHz. It's direct scaling with frequency due to the same number of SPs. FP16 doubles from that number due to the use of packed math, as seen in later Polaris chips. This very much appears to be a similar case to the RX 4xx > RX 5xx series clock bumps, though Vega also has a memory controller update (half the bus width, at a bit less than double the per-pin data rate). It would not be unexpected to see real-world performance scale similarly from the Fury (around 1.52x), though the mention of 'Typical Clock' rather than 'Base Clock' may mean poorer scaling if power/temperature limiting is a factor.
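
    The scaling claim checks out with the standard peak-FLOPS formula (SPs x 2 FLOPs per FMA x clock), assuming the 4096 SPs the two chips share:

        sps = 4096
        for mhz, label in [(1600, "Vega FE at peak clock"), (1050, "Fury X")]:
            print(f"{label}: {sps * 2 * mhz / 1e6:.1f} TFLOPS FP32")
        # Vega FE at peak clock: 13.1 TFLOPS
        # Fury X:                 8.6 TFLOPS
        print(round(8.6 / 13.1, 3), round(1050 / 1600, 3))  # 0.656 0.656 - pure clock scaling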
  • jjj - Wednesday, June 28, 2017 - link

    We only have TDP numbers so we'll see.

    Typical Clock might indicate a variable base clock based on sensor input.
  • CiccioB - Wednesday, June 28, 2017 - link

    You can't evaluate architectural improvements using maximum TFLOPS values, as these are calculated synthetically by multiplying the number of ALUs * 2 * frequency to represent the maximum throughput of FMA instructions (which execute 2 operations in a single cycle).
    Architectural improvements are those that allow general algorithms - which are a mix of more complex instruction flows, not all FMA, and so cannot reach the peak TFLOPS - to get as close as possible to that peak.
    From numbers on paper you cannot know the improvements; that's why you need different benchmarks with different scenarios and algorithms (compute tasks are different from graphics tasks, and both are different from gaming tasks).
  • psychobriggsy - Wednesday, June 28, 2017 - link

    That's how peak TFLOPS calculations work.

    Vega 10 will be smaller than Fiji due to the process shrink, despite the FP16 support that Polaris does not have (unless you count the PS4 Pro variant).

    What we do not know is if the IPC (efficiency) has improved with Vega. It is one of AMD's claims, so it should have.

    Whilst the HBM2 bus is half as wide, it runs faster. And whilst the on-paper bandwidth is a little lower than Fiji's on-paper bandwidth, in reality Fiji could not make use of it all; real-world estimates are between 350-400 GB/s. In addition, Vega has advanced memory-bandwidth-saving techniques, which will help a lot as well.

    The mention of a typical clock is a concern; I would hope to see the SKU hit full clocks most of the time, at least in gaming. I could understand different behaviour under the professional profile.
  • bronan - Wednesday, June 28, 2017 - link

    Agreed, I am concerned about the change they made from base to typical clock, which probably makes the water-cooled card a much better choice.
  • msroadkill612 - Wednesday, June 28, 2017 - link

    A key number, you would think, is idle power. Who cares about power when you need the grunt? But it's nice to have an affordable, always-on PC.
  • CiccioB - Wednesday, June 28, 2017 - link

    If you do nothing all day with your PC, it is important to have low idle power; but if you actually use your GPU, then how much it consumes under load matters as well, especially since high power consumption = high noise.
    BTW, the difference in idle power is measured in single-digit watts; under load it can be a few dozen.
  • msroadkill612 - Sunday, July 2, 2017 - link

    Some folks work with a jackhammer; you gotta make a living.

    Sure, TDP matters. I'm just saying I would like to know what it uses when I'm doing email or asleep, which for many may be most of the time. Yet it seems rarely mentioned.
  • commissioneranthony - Wednesday, June 28, 2017 - link

    Does the Radeon Vega Frontier Edition have SolidWorks-certified drivers for RealView? I tried asking on Twitter and got nothing. Thanks!
  • jjj - Thursday, June 29, 2017 - link

    This is gonna get so messy for AMD.

    With GTX 1080 performance in gaming, folks are going to assume that's all it can do, and at the end of the day that's fair, as AMD has not shown that it can do more.
    Of course, the GTX 1080 is priced as a high-end card, but it should really be $300-350; after all, it's just 314 mm2 with a 256-bit bus and 8 GB of GDDR. And then there is power: the 1080 is not that big and has no reason to use all that much power, while Vega....

    For an entire month Vega will be considered the new Bulldozer, unless AMD can show that it is not.
