Last week I published an article looking at the power efficiency of 5th Gen AMD EPYC “Turin” using the modern AMD P-State driver and the Power Profile options. The AMD P-State driver, now used by default for EPYC 9005 series processors on Linux 6.12+, can deliver a nice boost to server energy efficiency with only a modest impact on performance when paired with Power Profile tuning. Today’s article offers some complementary numbers from recent testing on a Supermicro server, looking at more of the Power Profile Selection options.
Today’s testing is a look at the Power Profile Selection tunables available on a retail Supermicro EPYC 9004/9005 series motherboard: the Supermicro H13SSL-N. Unfortunately, with its current BIOS this motherboard lacks ACPI CPPC support and thus cannot run the AMD P-State driver even on the newest versions of the Linux kernel. Instead, the generic ACPI CPUFreq driver that has long been the default for EPYC processors is still used. So keep in mind that this setup is less optimal for EPYC 9005, but it is nevertheless an interesting set of data points for those running servers without ACPI CPPC support and/or relying on older enterprise Linux kernels where amd_pstate may not be available, or is in a less ideal form if it is forced on for the EPYC 9005 processors.
These benchmarks were run on Linux 6.12, the newest kernel at the time I started this testing on the AMD EPYC 9655 with the Supermicro H13SSL-N. The tested Power Profile Selection configurations paired with different ACPI CPUFreq governor options included:
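For those wanting to check whether their own server is in the same situation, here is a minimal Python sketch that looks at sysfs for CPU 0. It assumes the usual Linux sysfs layout, where the `acpi_cppc` directory appears only when the firmware exposes CPPC and `cpufreq/scaling_driver` names the active driver. The function name `cpufreq_status` is just an illustrative helper, not part of any tool used in this article.

```python
from pathlib import Path


def cpufreq_status(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Report whether ACPI CPPC is exposed and which cpufreq driver is bound.

    Assumes the standard Linux sysfs layout; paths may be absent on other
    systems, in which case the fields come back as False / None.
    """
    cpu = Path(cpu_dir)
    driver_file = cpu / "cpufreq" / "scaling_driver"
    return {
        # Directory is expected only when the platform firmware exposes CPPC.
        "acpi_cppc": (cpu / "acpi_cppc").is_dir(),
        # Typically "acpi-cpufreq" or an amd-pstate variant; None if missing.
        "driver": driver_file.read_text().strip() if driver_file.is_file() else None,
    }


if __name__ == "__main__":
    print(cpufreq_status())
```

On a board like the one tested here, I would expect this to report no `acpi_cppc` directory and the `acpi-cpufreq` driver, though I have not run this exact script on it.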
- Balanced Core Memory Perf - acpi-cpufreq performance
- Balanced Core Memory Perf - acpi-cpufreq schedutil
- Balanced Core Perf - acpi-cpufreq performance
- Balanced Core Perf - acpi-cpufreq schedutil
- Balanced Memory Perf - acpi-cpufreq performance
- Balanced Memory Perf - acpi-cpufreq schedutil
- Default - High Perf - acpi-cpufreq performance
- Default - High Perf - acpi-cpufreq schedutil
- Efficiency Mode - acpi-cpufreq performance
- Efficiency Mode - acpi-cpufreq schedutil
- Maximum IO Perf - acpi-cpufreq performance
- Maximum IO Perf - acpi-cpufreq schedutil
The default Power Profile Selection is the “High Performance Mode”. For pre-6.12 kernels or platforms without ACPI CPPC, ACPI CPUFreq is the default CPU frequency scaling driver. Depending upon the Linux distribution or kernel modifications, the governor will either be the “performance” governor or the “schedutil” governor that makes use of scheduler utilization data. ACPI CPUFreq Schedutil is the default behavior on Ubuntu Linux.
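Since the governor default varies by distribution, it can be worth confirming what a given system is actually using. Below is a rough Python sketch that tallies the active governor across all CPUs by reading the standard `scaling_governor` sysfs files. The helper name `governor_summary` is my own for illustration.

```python
from collections import Counter
from pathlib import Path


def governor_summary(cpu_root="/sys/devices/system/cpu"):
    """Count how many CPUs are currently using each cpufreq governor."""
    counts = Counter()
    # cpu[0-9]* matches cpu0, cpu1, ... but skips entries like cpufreq/ and cpuidle/.
    for gov_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        counts[gov_file.read_text().strip()] += 1
    return dict(counts)


if __name__ == "__main__":
    for governor, cpus in governor_summary().items():
        print(f"{governor}: {cpus} CPUs")
```

A mixed result (some CPUs on "performance", others on "schedutil") would usually point to a partially applied tuning script rather than an intended setup.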
Beyond looking at the performance impact of each of these modes, the AC system power consumption (total “wall power”), CPU power consumption, CPU thermals, and CPU peak frequency were also recorded.
Again, these numbers are mainly being put out for reference for those wondering about the impact of the Power Profile Selection, whether it is adjusted from the system BIOS or with the AMD HSMP utility. This round was with the acpi-cpufreq driver.
When it comes to compiling the Linux kernel in a default x86_64 build, the results were basically split right down the middle. The default high performance mode, maximum I/O performance, and balanced memory performance all delivered similar build speeds. The balanced core memory performance, balanced core performance, and efficiency mode were meanwhile all grouped together and much slower than the other Power Profile configurations.
On a performance-per-Watt basis the different configurations leveled out more, but at least for the Linux kernel build test the default behavior was tied with others in offering the best efficiency on this Supermicro EPYC server.
When it came to compiling the LLVM compiler stack with the Ninja build system, the efficiency mode yielded the slowest build speed but had a narrow lead in delivering the best power efficiency of the tested options.
It was a similar situation with the Node.js compilation, with the efficiency mode again yielding the best efficiency for each build.
With some simple testing of the Nginx HTTPS web server, the Maximum I/O Performance power profile yielded slightly better performance than the default High Performance Mode. The Efficiency Mode had a small lead in delivering the best performance-per-Watt.
The PostgreSQL database server is one of the workloads where with ACPI CPUFreq there is a big difference between the performance and schedutil CPU frequency scaling governors.
For Memcached workloads, here are the numbers showing the Efficiency Mode trading some performance for the best performance-per-Watt.
For the MariaDB database server, the performance governor with ACPI CPUFreq is very important. Here the balanced memory, maximum I/O performance, and default modes all yielded similar performance.
For those that may have been curious about the impact of the Power Profile Selection tunable on AMD EPYC 9005 series processors, hopefully this testing proved insightful.
Those wanting to see even more data, from the more than 170 benchmarks run for this comparison, can see this result file page for all of the raw data I collected during this round of benchmarking on the EPYC 9655 + Supermicro H13SSL-N server.
When taking the geometric mean of all the raw performance results, the default “high performance” power profile yielded the best performance with the ACPI CPUFreq performance governor. That shouldn’t be much of a surprise. Running in the Efficiency Mode with the ACPI CPUFreq schedutil governor led to 75% of the overall performance of the defaults.
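To illustrate how a relative geometric mean like this can be computed across a mix of higher-is-better and lower-is-better tests, here is a small Python sketch. The test names and values are made-up placeholders, not results from this article, and this is not necessarily how the benchmarking software computes its own geometric mean.

```python
from statistics import geometric_mean


def relative_geomean(results, baseline, higher_is_better):
    """Geometric mean of per-test performance ratios versus a baseline.

    Lower-is-better tests (e.g. build times) are inverted so that a ratio
    above 1.0 always means "faster than baseline".
    """
    ratios = []
    for test, value in results.items():
        base = baseline[test]
        ratios.append(value / base if higher_is_better[test] else base / value)
    return geometric_mean(ratios)


if __name__ == "__main__":
    # Hypothetical numbers purely for illustration.
    baseline = {"kernel_build_s": 30.0, "nginx_rps": 200000.0}
    tuned = {"kernel_build_s": 40.0, "nginx_rps": 150000.0}
    direction = {"kernel_build_s": False, "nginx_rps": True}
    print(f"{relative_geomean(tuned, baseline, direction):.2f}")
```

The geometric mean is used rather than the arithmetic mean so that no single test with large absolute numbers dominates the overall figure.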
Here is a look at the AC “wall” system power consumption of this EPYC Turin 1P server across all of the benchmarks conducted. With the Efficiency Mode, the total server power consumption was 70% that of the Balanced Memory Performance mode, which had the highest average.
The CPU power consumption in the Efficiency Mode was 64% of the highest average, which is a significant savings while retaining ~75% of the performance.
For those curious, here is what the peak CPU frequency looked like across the duration of all the benchmarks conducted.
In step with the power savings, the CPU thermals were also lower in the Efficiency Mode and other tuned Power Profile states.
That’s the breakdown of data for those curious about AMD EPYC Turin with the Power Profile Selection options and using the ACPI CPUFreq driver. Again, if you want to see even more workloads covered, there are 170+ benchmarks via this result file with all of the raw data collected. See last week’s article for numbers with AMD P-State using completely different processors and a different server platform.
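As a back-of-the-envelope check on what those two figures imply, the sketch below divides the performance ratio by the CPU power ratio. Note the assumption: the ~75% performance figure is relative to the default configuration while the 64% power figure is relative to the highest-average configuration, so treating them as sharing one baseline is only an approximation.

```python
def relative_perf_per_watt(perf_ratio, power_ratio):
    """Relative efficiency when performance and power are both given as
    fractions of a baseline (assumed here to be the same baseline)."""
    return perf_ratio / power_ratio


if __name__ == "__main__":
    # ~75% of the performance at 64% of the CPU power.
    gain = relative_perf_per_watt(0.75, 0.64)
    print(f"{gain:.2f}x")  # prints 1.17x
```

Under that assumption, the Efficiency Mode works out to roughly 17% more performance per CPU Watt, with the caveat that this considers CPU power only and not total wall power.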