NVIDIA Releases Quadro RTX, Quadro T, and Quadro P620/P520 GPUs for Notebooks
by Ryan Smith on May 27, 2019 11:00 AM EST
Alongside today’s NVIDIA Studio branding announcement, NVIDIA is also using Computex to update their lineup of Quadro GPUs for notebooks and mobile workstations. In addition to bringing some of the existing Quadro RTX desktop parts to the mobile space, the company is launching a new sub-series of parts under the Quadro T name, and finally a pair of new Quadro P series graphics adapters for the low end.
Starting things off, we have the mobile Quadro RTX parts, which are all new for the mobile space. Like NVIDIA’s GeForce mobile counterparts, these Quadro RTX mobile parts are essentially the same chip configurations as their desktop siblings, but put into a mobile form factor and with their TDPs and clockspeeds turned down accordingly. As a result the mobile Quadro RTX parts pack all the features and VRAM of the desktop parts that NVIDIA has previously launched, while retaining a good deal of their performance and all of the Turing architecture's functionality.
NVIDIA Mobile Quadro RTX Spec Comparison
Specification | RTX 5000 | RTX 4000 | RTX 3000 | P5200
CUDA Cores | 3072 | 2560 | 2304 | 2560
Boost Clock | ~1.53GHz | ~1.56GHz | ~1.39GHz | ~1.74GHz
Memory Clock | 14Gbps GDDR6 | 14Gbps GDDR6 | 14Gbps GDDR6 | 8Gbps GDDR5
Memory Bus Width | 256-bit | 256-bit | 192-bit | 256-bit
VRAM | 16GB | 8GB | 6GB | 16GB
Single Precision Perf. | 9.4 TFLOPs | 8 TFLOPs | 6.4 TFLOPs | 8.9 TFLOPs
Tensor Perf. (FP16) | 75.2 TOPs | 63.9 TOPs | 51.4 TOPs | N/A
TGP Max Power | 80-110W | 80-110W | 60-80W | 150W
GPU | TU104 | TU104 | TU106 | GP104
Transistor Count | 13.6B | 13.6B | 10.8B | 7.2B
Architecture | Turing | Turing | Turing | Pascal
Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 12nm "FFN" | TSMC 16nm
Launch Date | 05/27/2019 | 05/27/2019 | 05/27/2019 | N/A
Owing to the tighter TDPs of mobile, NVIDIA’s mobile Quadro RTX stack doesn’t go quite as high as it does on the desktop. For mobile the fastest part is the Quadro RTX 5000, which is based on the same TU104 GPU as the desktop version. This part replaces the Quadro P5200 as NVIDIA’s flagship mobile Quadro part. Meanwhile below that we have the Quadro RTX 4000 and RTX 3000, which appear to be based on a cut-down TU104 and full-fledged TU106 GPU respectively.
In terms of performance, the RTX 5000 will top out at 9.4 TFLOPs, followed by 8 TFLOPs for the RTX 4000 and 6.4 TFLOPs for the RTX 3000. NVIDIA’s peak clockspeeds seem to vary a bit depending on the processor – we’re estimating anywhere from 1.39GHz to 1.56GHz – though these are still fairly aggressive for a mobile part. Sustained performance will be lower, of course, with that varying with the cooling capabilities of the host laptop.
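For readers who want to check the arithmetic, peak FP32 figures like these are conventionally derived by counting one fused multiply-add (two floating-point operations) per CUDA core per clock. The following is a minimal sketch under that assumption, using the core counts and estimated boost clocks from the table above; the function name is our own and purely illustrative.

```python
def fp32_tflops(cuda_cores: int, boost_clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPs, assuming one FMA (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * boost_clock_ghz / 1000.0


# (CUDA cores, estimated boost clock in GHz), taken from the spec table.
parts = {
    "RTX 5000": (3072, 1.53),
    "RTX 4000": (2560, 1.56),
    "RTX 3000": (2304, 1.39),
    "P5200": (2560, 1.74),
}

for name, (cores, clock_ghz) in parts.items():
    print(f"{name}: {fp32_tflops(cores, clock_ghz):.1f} TFLOPs")
```

Rounded to one decimal place, the results line up with the table's single precision figures. Because the boost clocks are estimates, the same formula can be run in reverse to back a clockspeed out of a published TFLOPs number.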
Meanwhile in terms of memory, the situation is again a mirror of the desktop. The RTX 5000 gets 16GB of GDDR6 – a full complement of memory for a mobile TU104 part – while RTX 4000 and RTX 3000 drop down to 8GB and 6GB respectively. NVIDIA continues to treat memory capacity as a feature differentiator between the Quadro and GeForce families and even among Quadro cards, so the 16GB RTX 5000 is a halo part in this respect. The flip side, however, is that RTX 5000 doesn’t improve on its predecessor here as far as capacity goes, as both the old and new cards are 16GB.
It is interesting to note that while performance has gone up and memory capacities have at least held even, power consumption is actually down generation-over-generation. Starting with the mobile Quadro RTX series, NVIDIA is providing a range of max power values instead of a single value, but even at the top of this range, none of these cards passes 110W, well below the 150W that the older P5200 peaked at. The RTX 4000 and RTX 3000 parts don’t see quite the same savings relative to their own predecessors, but they too are specified with a power range. NVIDIA seems increasingly focused on getting high-end GPUs into ever thinner and lighter notebooks, so bringing down their TDPs is a huge component of how they’re going to get there.
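To put a rough number on that efficiency change, we can divide peak FP32 throughput by the top of the power range. This is only a sketch on paper specifications: it assumes each part reaches its peak clockspeed at its maximum TGP, which real laptops may not sustain, and the helper name is hypothetical.

```python
def gflops_per_watt(peak_tflops: float, max_power_w: float) -> float:
    """Peak FP32 GFLOPs per watt at the top of the part's power range."""
    return peak_tflops * 1000.0 / max_power_w


# Peak TFLOPs and maximum TGP from the spec table.
rtx5000 = gflops_per_watt(9.4, 110)
p5200 = gflops_per_watt(8.9, 150)

print(f"RTX 5000: {rtx5000:.1f} GFLOPs/W")
print(f"P5200:    {p5200:.1f} GFLOPs/W")
print(f"Ratio:    {rtx5000 / p5200:.2f}x")
```

On these paper figures the RTX 5000 comes out at roughly 85 GFLOPs per watt against about 59 for the P5200, or around 1.4x. Measured efficiency will depend on the cooling and power limits of the host laptop.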
Overall, the Quadro RTX series is the flagship series in terms of features. Of particular note here, all of these parts include NVIDIA’s ray tracing hardware acceleration – hence the RTX moniker – so they benefit the most from all of NVIDIA’s efforts to get ray tracing incorporated into various content creation applications. They also have a full tensor core complement for their size, which along with helping RT performance also means they can hold their own in neural network simulations and other tensor-related tasks.
Quadro T Series – T2000 and T1000
Also new to the mobile Quadro family are the Quadro T series parts, the Quadro T2000 and Quadro T1000. These parts slot in below the Quadro RTX parts in terms of performance, power consumption, and features, providing a clear progression downward in terms of price versus functionality.
NVIDIA Mobile Quadro T Spec Comparison
Specification | T2000 | T1000 | P2000 | P1000
CUDA Cores | 1024 | 768 | 768 | 512
Boost Clock | ~1.71GHz | ~1.69GHz | ~1.56GHz | ~1.56GHz
Memory Clock | 8Gbps GDDR5 | 8Gbps GDDR5 | 6Gbps GDDR5 | 6Gbps GDDR5
Memory Bus Width | 128-bit | 128-bit | 128-bit | 128-bit
VRAM | 4GB | 4GB | 4GB | 4GB
Single Precision Perf. | 3.5 TFLOPs | 2.6 TFLOPs | 2.4 TFLOPs | 1.6 TFLOPs
TGP Max Power | 40-60W | 40-50W | 50W | 40W
GPU | TU117 | TU117 | GP107 | GP107
Transistor Count | 4.7B | 4.7B | 3.3B | 3.3B
Architecture | Turing | Turing | Pascal | Pascal
Manufacturing Process | TSMC 12nm "FFN" | TSMC 12nm "FFN" | Samsung 14nm | Samsung 14nm
Launch Date | 05/27/2019 | 05/27/2019 | N/A | N/A
Both of the new Quadro T series parts are based on the same TU117 GPU, which is NVIDIA’s smallest Turing architecture GPU. As a result there’s a pretty significant gap in performance between the T2000 and RTX 3000, with peak FP32 throughput dropping by around 45%. At peak clockspeeds, the T2000 and T1000 offer around 3.5 TFLOPs and 2.6 TFLOPs of FP32 performance respectively.
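The size of that gap comes straight from the peak TFLOPs figures in the two tables. As a quick sketch (the function name is ours, and these are paper numbers rather than benchmark results):

```python
def relative_drop(higher_tflops: float, lower_tflops: float) -> float:
    """Fractional drop in peak throughput going from the faster part to the slower one."""
    return (higher_tflops - lower_tflops) / higher_tflops


# RTX 3000 (6.4 TFLOPs) down to T2000 (3.5 TFLOPs).
rtx3000_to_t2000 = relative_drop(6.4, 3.5)
# T2000 (3.5 TFLOPs) down to T1000 (2.6 TFLOPs).
t2000_to_t1000 = relative_drop(3.5, 2.6)

print(f"RTX 3000 -> T2000: {rtx3000_to_t2000:.0%}")
print(f"T2000 -> T1000:    {t2000_to_t1000:.0%}")
```

By the same measure, the step from the T2000 down to the T1000 is a further drop of roughly a quarter.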
In terms of memory, both cards come with 4GB of GDDR5, which is clocked at 8Gbps and attached to a 128-bit memory bus. These are very much low-end cards, so it looks like NVIDIA is aiming to be cost-efficient rather than offer more memory, which would start undercutting the RTX 3000 and its 6GB of VRAM. Meanwhile TDPs are down to a max of 60W for the T2000, and a max of 50W for the T1000. These are again ranges, depending on what the laptop OEM designs for, and performance will scale accordingly.
Overall these are Turing parts, but they are based on what we’ve been calling NVIDIA’s “Turing Minor” GPUs. Turing Minor parts have the same core architecture as Turing, and with it all of the performance optimizations and new rasterization/shading features that come with the Turing architecture, however they forgo the ray tracing hardware acceleration and tensor cores. As a result these parts are leaner and meaner, however they are hardly the part of choice if ray tracing acceleration is needed. This is a level of feature differentiation that past generations of the Quadro family have lacked, since they’ve typically been based on a single, unified GPU architecture (outside of the very cheapest parts).
Quadro P Series Expanded – P620 and P520
Finally, bringing up the rear of the new mobile Quadro product stack are the Quadro P620 and Quadro P520. As hinted at by the name, these parts aren’t Turing based at all. Instead, they are minor refreshes of the existing Pascal-based P600/P500 parts. Since NVIDIA’s Turing GPU stack doesn’t go below the TU117 used in the T2000/T1000, for these smallest and cheapest of parts, NVIDIA instead relies on their bottom-tier Pascal GPUs.
NVIDIA Mobile Quadro P Spec Comparison
Specification | P620 | P520 | P600 | P500
CUDA Cores | 512 | 384 | 384 | 256
Boost Clock | ~1.46GHz | ~1.43GHz | ~1.56GHz | ~1.46GHz
Memory Clock | 6Gbps GDDR5 | 6Gbps GDDR5 | 5Gbps GDDR5 | 5Gbps GDDR5
Memory Bus Width | 128-bit | 64-bit | 128-bit | 64-bit
VRAM | 4GB | 2GB | 4GB | 2GB
Single Precision Perf. | 1.5 TFLOPs | 1.1 TFLOPs | 1.2 TFLOPs | 0.75 TFLOPs
TGP Max Power | 25W | 18W | 25W | 18W
GPU | GP107 | GP108 | GP107 | GP108
Transistor Count | 3.3B | 1.8B | 3.3B | 1.8B
Architecture | Pascal | Pascal | Pascal | Pascal
Manufacturing Process | Samsung 14nm | Samsung 14nm | Samsung 14nm | Samsung 14nm
Launch Date | 05/27/2019 | 05/27/2019 | N/A | N/A
Relative to their immediate predecessors, both the P620 and P520 do see some fairly decent performance bumps, thanks to NVIDIA enabling more CUDA cores this time around. Still, with the fastest part topping out at 1.5 TFLOPs, there’s a clear jump in performance between the P series parts and the new T series parts.
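It is worth separating the two factors behind those bumps, since the refreshed parts actually carry lower estimated boost clocks than their predecessors. A small sketch, assuming peak throughput scales with core count times clockspeed and using the table's estimated clocks (the helper is hypothetical):

```python
def throughput_ratio(new_cores: int, new_clock_ghz: float,
                     old_cores: int, old_clock_ghz: float) -> float:
    """Ratio of peak throughput, assuming it scales with cores x clockspeed."""
    return (new_cores * new_clock_ghz) / (old_cores * old_clock_ghz)


# P620 (512 cores, ~1.46GHz) vs. P600 (384 cores, ~1.56GHz).
p620_vs_p600 = throughput_ratio(512, 1.46, 384, 1.56)
# P520 (384 cores, ~1.43GHz) vs. P500 (256 cores, ~1.46GHz).
p520_vs_p500 = throughput_ratio(384, 1.43, 256, 1.46)

print(f"P620 vs. P600: {p620_vs_p600 - 1:+.0%}")
print(f"P520 vs. P500: {p520_vs_p500 - 1:+.0%}")
```

On these numbers the P620 gains about 25% and the P520 about 47%, with the extra CUDA cores more than offsetting the slightly lower clocks.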
In terms of memory speeds, both the P620 and P520 are receiving GDDR5 clocked at 6Gbps, up from 5Gbps the generation prior. However the P5xx parts all retain their 64-bit memory bus, so unlike their faster 4GB siblings, the cheapest P520 gets just 2GB of VRAM and all of 48GB/sec of memory bandwidth. Meanwhile power consumption is being held constant from the last generation, at 25W and 18W respectively for the P620 and P520.
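Memory bandwidth follows from the bus width multiplied by the per-pin data rate, and dividing by 8 converts gigabits into gigabytes. That conversion is easy to trip over, so here is a minimal sketch using the bus widths and memory clocks from the table (the function name is illustrative):

```python
def bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: pins x Gbps per pin, divided by 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps), taken from the spec table.
parts = {
    "P620": (128, 6),
    "P520": (64, 6),
    "P600": (128, 5),
    "P500": (64, 5),
}

for name, (bus_bits, rate_gbps) in parts.items():
    print(f"{name}: {bandwidth_gb_per_s(bus_bits, rate_gbps):.0f} GB/s")
```

For the P520 this works out to 64 x 6 = 384Gbps in aggregate, which is 48GB/sec. The P620's 128-bit bus doubles that to 96GB/sec.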
On the whole, the refreshed Quadro P series parts are meant to serve as the entry-level parts in NVIDIA’s mobile Quadro product stack, and it shows. They will be cheap and will become a common feature in low-end productivity laptops, but they bring equally limited performance, and they don’t come with any of Turing’s new features.
Source: NVIDIA
5 Comments
Alsw - Monday, May 27, 2019
Thanks for the breakdown, been waiting to find out what was going on with these. Looks like they should perform very well for our needs (CAD + GPU compute for iRay) only disappointment is larger memory tends to be limited to the very highest would have been nice to see RTX 3000 at 8GB, 4000 at 12GB and 5000 at 16GB
Was there any mention on the desktop side as currently the lowest Turing gen quadro is the RTX 4000, Pascal is used on everything below that.
P.s. typo for the RTX 3000 in the first table that only has 6GB memory
jabbadap - Monday, May 27, 2019
...And a little more typos on the tables, p2000 and p600 does not have gddr6 memory.
12GB vram would mean 192bit bus, but yeah added memory buffer would have been a decent trade of over memory BW.
Ryan Smith - Monday, May 27, 2019
Thanks!
Bulat Ziganshin - Tuesday, May 28, 2019
48Gbps -> 48GBps or rather 48GB/s
MadManMark - Tuesday, May 28, 2019
I presume the charts showing GP107 and GP108 are to be produced on GloFo 14nm process is another "typo?"