BrendanLong commented on %CPU utilization is a lie   brendanlong.com/cpu-utili... · Posted by u/BrendanLong
BrendanLong · 6 days ago
Fun data point though: I just ran the Phoronix nginx benchmark in three configurations and got these results:

- Pinned to 6 cores: 28k QPS

- Pinned to 12 cores: 56k QPS

- All 24 cores: 62k QPS

I'm not sure how this applies to realistic workloads where you're using all of the cores but not maxing them out, but it looks like hyperthreading only adds ~10% performance in this case.
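As a rough check of the ~10% figure, the arithmetic can be sketched from the three QPS numbers quoted above. This assumes the 12-core run used one thread per physical core, so that the step from 12 to 24 isolates the hyperthreads; the variable and function names are illustrative only.

```python
# QPS figures quoted in the comment above (Phoronix nginx benchmark).
results = {6: 28_000, 12: 56_000, 24: 62_000}


def relative_gain(before_qps, after_qps):
    """Fractional throughput gain when moving from one configuration to another."""
    return after_qps / before_qps - 1.0


# Assumption: 6 -> 12 adds physical cores, 12 -> 24 adds only hyperthreads.
core_scaling = relative_gain(results[6], results[12])
smt_gain = relative_gain(results[12], results[24])

# Share of the 24-thread maximum already reached with 12 pinned cores.
share_of_max_at_12 = results[12] / results[24]

print(f"6 -> 12 cores:  {core_scaling:+.1%}")
print(f"12 -> 24 cores: {smt_gain:+.1%}")
print(f"12 cores reach {share_of_max_at_12:.0%} of the 24-core QPS")
```

Under that assumption, half of the logical CPUs already deliver roughly nine tenths of the peak throughput.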

BrendanLong · 4 days ago
Here are the results of the nginx benchmark pinned to 1-24 cores: https://docs.google.com/spreadsheets/d/1d_OK_ckLT1zTA_fG4vkq...

At 51% reported CPU utilization, it's doing about 80% of the maximum requests per second, and it can't get above 80% utilization.

I also added a section: https://www.brendanlong.com/cpu-utilization-is-a-lie.html#bo...

4gotunameagain · 6 days ago
What's up with Brendans and CPU utilisation concerns? Can any Brendan shed some light?
BrendanLong · 6 days ago
I'd love to explain, but you'd need to change your name to Brendan first.
pama · 6 days ago
Wait until you encounter GPU utilization. You could have two codes both listing 100% utilization with well over a 100x performance difference between them. The names of these metrics create natural assumptions that are just wrong. Luckily, it is relatively easy to estimate the FLOP/s throughput for most GPU codes and then simply compare it to the theoretical peak performance of the hardware.
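A minimal sketch of that estimate, using a dense matrix multiply as the workload. The matrix sizes, timings, and the 10 TFLOP/s peak are hypothetical placeholders, not measurements of any real GPU; in practice the peak comes from the vendor's spec sheet and the time from your own benchmark.

```python
def matmul_flops(m, n, k):
    """FLOPs for an (m x k) @ (k x n) multiply: one multiply and one add per term."""
    return 2 * m * n * k


def fraction_of_peak(flops, seconds, peak_flops_per_s):
    """Achieved FLOP/s divided by the hardware's theoretical peak FLOP/s."""
    return (flops / seconds) / peak_flops_per_s


# Hypothetical numbers for illustration only.
PEAK_FLOPS_PER_S = 10e12  # assumed 10 TFLOP/s theoretical peak
flops = matmul_flops(4096, 4096, 4096)

fast = fraction_of_peak(flops, 0.020, PEAK_FLOPS_PER_S)  # kernel taking 20 ms
slow = fraction_of_peak(flops, 2.0, PEAK_FLOPS_PER_S)    # kernel taking 2 s

print(f"fast kernel: {fast:.1%} of peak")
print(f"slow kernel: {slow:.2%} of peak")
```

Both hypothetical kernels could report 100% GPU utilization while running, even though one does the same work 100x faster.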
BrendanLong · 6 days ago
Yeah, the obvious thing with processors is to do something similar:

(1) Measure MIPS with perf

(2) Compare that to max MIPS for your processor

Unfortunately, MIPS is too vague since the amount of work done depends on the instruction, and there's no good way to measure max MIPS for most processors. (╯°□°)╯︵ ┻━┻
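For illustration, the two steps reduce to the arithmetic below. The instruction count and elapsed time would come from something like `perf stat -e instructions -- <command>`; the values here, and especially the "max MIPS" figure, are hypothetical, which is exactly the part that is hard to pin down for a real processor.

```python
def mips(instructions, elapsed_seconds):
    """Millions of instructions retired per second."""
    return instructions / elapsed_seconds / 1e6


def mips_utilization(measured_mips, max_mips):
    """Measured MIPS as a fraction of an assumed maximum."""
    return measured_mips / max_mips


# Hypothetical counter values, standing in for numbers read from perf.
measured = mips(instructions=3_000_000_000, elapsed_seconds=1.5)

# Assumed maximum; no reliable way to obtain this is given in the thread.
ASSUMED_MAX_MIPS = 8_000.0

print(f"measured: {measured:.0f} MIPS")
print(f"utilization vs assumed max: {mips_utilization(measured, ASSUMED_MAX_MIPS):.0%}")
```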

CCs · 6 days ago
The article uses stress-ng for benchmarking, even though the stress-ng documentation says it is not suitable for benchmarking. It was written to max out one component until it burns. Using a real app, like Memcached or Postgres, would show more realistic numbers, closer to what people use in production. The difference is not major: 50% utilization is still closer to 80% under real load, but a real app breaks down faster. Stress-ng is nicely linear until 100%, while memcached will have a hockey-stick curve at the end.
BrendanLong · 6 days ago
The advantage of stress-ng is that it's easy to make it run with specific CPU utilization numbers. The tests where I run some number of workers at 100% utilization are interesting since they give such perfect graphs, but I think the version where I have 24 workers and increase their utilization slowly is more realistic for showing how production CPU utilization changes.
SirMaster · 6 days ago
What about 2 workloads that both register 100% CPU usage, but one workload draws significantly more power and heats the CPU up way more? Seems like that workload is utilizing more of the CPU, more of the transistors or something.
BrendanLong · 6 days ago
Some esoteric methods of measuring CPU utilization are to calculate either the current power usage over the max available power, or the current temperature over the max operating temperature. Unfortunately, these are typically even more non-linear than the standard metrics (but they can be useful sometimes).
tgma · 6 days ago
The way they refer to cores in their system is confusing and non-standard. The author talks about a 5900X as a 24-core machine and discusses it as if there are 24 cores, 12 of which are piggybacking on the other 12. In reality, there are 24 hyperthreads that are pretty much pairwise symmetric and execute on top of 12 cores, with two sets of instruction pipelines sharing the same underlying functional units.
BrendanLong · 6 days ago
Thanks for the feedback. I think you're right, so I changed a bunch of references and updated the description of the processor to 12 core / 24 thread. In some cases, I still think "cores" is the right terminology though, since my OS (confusingly) reports utilization as if I had 24 cores.
hinkley · 6 days ago
You also want to hit him with queueing theory.

Up to a hair over 60% utilization, the queueing delays on any work queue remain essentially negligible. At 70% they become noticeable, and at 80% they've doubled. And then it just turns into a shitshow from there on.

The rule of thumb is 60% is zero, and 80% is the inflection point where delays go exponential.

The biggest cluster I ran, we hit about 65% CPU at our target P95 time, which is pretty much right on the theoretical mark.
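The shape of that curve can be sketched with the simplest textbook model, an M/M/1 queue (one server, random arrivals and service times). That model is an assumption here, not something the comment specifies, and its exact numbers differ from the rule of thumb above: the mean wait grows as rho / (1 - rho), and a pool of many cores behaves more like a multi-server queue, where delays stay low until higher utilization.

```python
def mm1_queue_delay(utilization, service_time=1.0):
    """Mean time a job waits in an M/M/1 queue before service starts.

    Wq = rho / (1 - rho) * service_time, valid for 0 <= rho < 1.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization) * service_time


# Wait expressed in multiples of the mean service time.
for rho in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
    print(f"{rho:.0%} utilization -> wait = {mm1_queue_delay(rho):.2f} service times")
```

Even in this simple model the wait at 90% utilization is more than double the wait at 80%, which is the "falls off a cliff" behaviour the comment describes.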

BrendanLong · 6 days ago
A big part of this is that CPU utilization metrics are frequently averaged over a long period of time (like a minute), but if your SLO is 100 ms, what you care about is whether there's any ~100 ms period where CPU utilization is at 100%. Measuring p99 (or even p100) CPU utilization can make this a lot more visible.
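A small sketch of why the averaging window matters, using made-up samples: one minute of 100 ms windows, mostly at 30% utilization with a dozen fully saturated windows. The data and the nearest-rank percentile helper are illustrative assumptions, not taken from any monitoring tool.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (pct in 1..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical minute of 100 ms samples: 588 windows at 30%, 12 at 100%.
samples = [30.0] * 588 + [100.0] * 12

mean_util = sum(samples) / len(samples)
p99 = percentile(samples, 99)
p100 = percentile(samples, 100)

print(f"1-minute average: {mean_util:.1f}%")
print(f"p99 of 100 ms windows: {p99:.0f}%")
print(f"p100 of 100 ms windows: {p100:.0f}%")
```

The one-minute average looks comfortable, while p99 and p100 show that some 100 ms windows were completely saturated.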
hinkley · 6 days ago
How many times has hyperthreading been an actual performance benefit in processors? I cannot count how many times an article has come out saying you'll get better performance out of your <insert processor here> by turning off hyperthreading in the BIOS.

It's gotta be at least 2 out of every 3 chip generations going back to the original implementation, where you're better off without it than with.

BrendanLong · 6 days ago
To be fair, in most of these tests hyperthreading did provide a significant benefit (in the general CPU stress test, the hyperthreads increased performance by ~66%). It's just confusing that utilization metrics treat hyperthread usage the same as full physical cores.
