AMD's Future in Servers: New 7000-Series CPUs Launched and EPYC Analysis
by Ian Cutress on June 20, 2017 4:00 PM EST
Power
As with the Ryzen parts, EPYC will support 0.25x multipliers for P-state jumps of 25 MHz. With sufficient cooling, different workloads will be able to move between the base frequency and the maximum boost frequency in these jumps. AMD states that the smaller jumps allow for smoother transitions, rather than locking PLLs to move straight up and down, and so give more predictable performance. This links into AMD’s new strategy of performance determinism vs power determinism.
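To make the step size concrete, here is a minimal Python sketch. It assumes a 100 MHz reference clock (implied by a 0.25x multiplier equalling 25 MHz, but not something AMD states here); the function name and the 2.2 GHz / 3.2 GHz endpoints are illustrative only.

```python
# Assumption: a 100 MHz reference clock, so one 0.25x multiplier step is 25 MHz.
REF_CLOCK_MHZ = 100
MULTIPLIER_STEP = 0.25


def pstate_frequencies(base_mhz, boost_mhz):
    """Return every frequency from base to boost in 0.25x multiplier steps."""
    step_mhz = int(REF_CLOCK_MHZ * MULTIPLIER_STEP)  # 25 MHz per step
    return list(range(base_mhz, boost_mhz + 1, step_mhz))


# Illustrative endpoints only: 2.2 GHz base and 3.2 GHz maximum boost.
freqs = pstate_frequencies(2200, 3200)
print(len(freqs), freqs[:4], freqs[-1])
```

With these endpoints there are 41 distinct operating points, where a 1x (100 MHz) multiplier step would give only 11.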
Each EPYC CPU includes two new modes, one based on power and one based on performance. When a system is configured at boot time for a specific maximum power, performance may vary with the environment, but the power is ultimately limited at the high end. In the performance mode, the frequency is guaranteed, but not the power. This lets AMD customers either plan in advance without worrying about how different processors vary in voltage, frequency, and leakage, or get deterministic performance in all environments. The mode is set at the system level at boot time, so all VMs and containers on a system will be affected by it.
This extends into selectable power limits. For EPYC, AMD is offering the ability to run processors at a lower or higher TDP than out of the box. Most users are likely familiar with Intel’s cTDP Up and cTDP Down modes on the mobile processors, and this feature by AMD is somewhat similar. As a result, the TDP limits given at the start of this piece can go down by 15W to 25W or up by 20W, depending on the part:
EPYC TDP Modes
Low TDP | Regular TDP | High TDP
155W | 180W | 200W
140W | 155W | 175W
105W | 120W | -
The sole 120W processor at this point is the 8-core EPYC 7251, which is geared towards memory-limited workloads that are licensed per core, which is why it does not get a higher power band to work towards.
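The TDP bands are easier to compare as differences from the regular TDP. The short Python sketch below simply encodes the table values; the dictionary layout and function name are our own illustration, not an AMD interface.

```python
# Values copied from the EPYC TDP Modes table; None marks a mode that is not offered.
TDP_MODES_W = {
    180: {"low": 155, "high": 200},
    155: {"low": 140, "high": 175},
    120: {"low": 105, "high": None},
}


def tdp_headroom(regular_w):
    """Return (watts saved in low mode, watts added in high mode) for a regular TDP."""
    modes = TDP_MODES_W[regular_w]
    down = regular_w - modes["low"]
    up = None if modes["high"] is None else modes["high"] - regular_w
    return down, up


for regular in TDP_MODES_W:
    print(regular, tdp_headroom(regular))
```

This shows that the 180W parts can drop by 25W, the 155W and 120W parts by 15W, and that the high mode, where offered, adds 20W.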
Workload-Aware Power Management
One of AMD’s points about the sort of workloads that might be run on EPYC is that sporadic tasks are sometimes hard to judge, or are not latency sensitive. In a non-latency-sensitive environment, the CPU could conserve power by spreading the workload out across more cores at a lower frequency. We’ve seen this sort of policy before on Intel’s Skylake and newer processors, which go as far as duty cycling at the efficiency point to conserve power, and in the mobile space. AMD is bringing this to the EPYC line as well.
Rather than staying at a high frequency and continually powering up and down, the cores run at a reduced frequency such that they are active for longer, trading latency for power efficiency. AMD is claiming up to a 10% perf-per-Watt improvement with this feature.
Frequency and voltage can be adjusted for each core independently, helping drive this feature. The silicon implements per-core linear regulators that work with the onboard sensor control to adjust the adaptive voltage and frequency scaling (AVFS) for the workload and the environment. We are told that this helps reduce the variability from core to core and chip to chip, with regulation accurate to 2mV. We’ve seen some of this in Carrizo and Bristol Ridge already, although we are told that per-core LDO regulation was always intended for EPYC.
This can happen not only on the cores, but also on the Infinity Fabric links between the CPU dies or between the sockets. By modulating the link width and analyzing traffic patterns, AMD claims another 8% perf-per-Watt for socket-to-socket communications.
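The direction of that trade can be shown with a toy model. The Python sketch below assumes the textbook dynamic power relation P = C * V^2 * f and two hypothetical operating points; the capacitance, voltages, frequencies, and cycle count are made up for illustration and are not AMD figures. It also ignores leakage and idle-state power, so it overstates the savings a real chip would see.

```python
def dynamic_energy(cycles, volts, switched_capacitance=1e-9):
    """Energy in joules for `cycles` clock cycles under the P = C * V^2 * f model."""
    # time = cycles / f, so energy = C * V^2 * f * (cycles / f) = C * V^2 * cycles
    return switched_capacitance * volts ** 2 * cycles


def completion_time(cycles, freq_hz):
    """Seconds needed to retire `cycles` clock cycles at `freq_hz`."""
    return cycles / freq_hz


CYCLES = 3e9  # hypothetical burst of work
fast = {"freq_hz": 3.0e9, "volts": 1.10}  # hypothetical high-frequency point
slow = {"freq_hz": 2.0e9, "volts": 0.90}  # hypothetical low-frequency point

for name, point in (("fast", fast), ("slow", slow)):
    seconds = completion_time(CYCLES, point["freq_hz"])
    joules = dynamic_energy(CYCLES, point["volts"])
    print(f"{name}: {seconds:.2f} s, {joules:.2f} J")
```

In this model the slower point takes longer to finish but spends less energy on the same work, because the lower frequency permits a lower voltage and energy scales with the square of the voltage.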
Performance-Per-Watt Claims
For the EPYC system, AMD is claiming power efficiency results in terms of SPEC, compiled with GCC 6.2:
AMD Claims: 2P EPYC 7601 vs 2P E5-2699A v4
Metric | SPECint | SPECfp
Performance | 1.47x | 1.75x
Average Power | 0.96x | 0.99x
Total System Level Energy | 0.88x | 0.78x
Overall Perf/Watt | 1.54x | 1.76x
Comparing a 2P high-end EPYC 7601 server against Intel’s current best 2P E5-2699A v4 arrangement, AMD is claiming 1.54x the perf/watt for integer performance and 1.76x the perf/watt for floating point performance: more performance at a lower average power, resulting in an overall efficiency gain. Again, we cannot confirm these numbers, so we look forward to testing.
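As a quick sanity check, perf/watt should be roughly the performance ratio divided by the average power ratio. The Python sketch below applies that to the rounded figures in AMD's table; the division is our assumption about how the metric is formed, and AMD's own calculation presumably used unrounded inputs.

```python
# Ratios copied from AMD's claims table (EPYC 7601 relative to E5-2699A v4).
claims = {
    "SPECint": {"performance": 1.47, "average_power": 0.96, "claimed": 1.54},
    "SPECfp": {"performance": 1.75, "average_power": 0.99, "claimed": 1.76},
}


def perf_per_watt(performance, average_power):
    """Relative perf/watt from a relative performance and a relative average power."""
    return performance / average_power


for suite, row in claims.items():
    derived = perf_per_watt(row["performance"], row["average_power"])
    print(f"{suite}: derived {derived:.2f}x vs claimed {row['claimed']:.2f}x")
```

The derived values come out at 1.53x and 1.77x, within 0.01 of AMD's 1.54x and 1.76x, a gap that is consistent with rounding in the published inputs.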
131 Comments
yuhong - Tuesday, June 20, 2017
"Linus kernel"
IanHagen - Tuesday, June 20, 2017
Linux Tech Tips?
Krysto - Tuesday, June 20, 2017
That Linux guy makes some great videos.
thomasg - Tuesday, June 27, 2017
Believe it or not - for the older guys here the more well known Linus is Linus Torvalds, the creater of Linux (after whom it is named), not a YouTuber.
For those that freudian typo isn't quite that hilarious.
Guspaz - Tuesday, June 20, 2017
sudo modprobe socks_sandals
HollyJordan - Wednesday, June 21, 2017
Do you have a pay_pal ? because you can generate an extra 1000 /week in your earnings only working at home for five hours a day... check.
philehidiot - Tuesday, June 27, 2017
Ooooh, I have Pay Pal and I'd love to earn 1000 Tasmanian Pesos a week for 5 hours a day from the comfort of my own home.
I can imagine that would work out as all of $4/month.
OddFriendship8989 - Wednesday, June 28, 2017
How about 1000 Bitcoins?
satai - Tuesday, June 20, 2017
1 socket 16 / 32 @ 2.9GHz max for $700+... it looks like 16 core Threadripper with reasonable frequencies for less then $999 looks reasonable.
spikebike - Tuesday, June 20, 2017
Typically desktop/workstation chips have higher clocks and less cores than the pure servers. But I'd be really surprised if the threadripper isn't significantly cheaper than the Epyc. Keep in mind a thread ripper has half the silicon dies, half the memory channels, and half the pci-e of the Epyc.