On the one hand, having the optimizer save you from your own bad code is a huge draw, this is my desperate hope with SQL, I can write garbage queries and the optimizer will save me from myself.
But... Someone put that code there, spent time and effort to get that machinery into place with the expectation that it is doing something. and when the optimizer takes that away with no hint. That does not feel right either. Especially when the program now behaves differently when "optimized" vs unoptimized.
int factorial(int x) {
if (x < 0) throw invalid_input();
// compute factorial ...
}
This doesn't have any dead code in a static examination: at compilation-time, however, this function may be compiled multiple times, e.g., as factorial(5) or factorial(x) where x is known to be non-negative by range analysis. In this case, the `if (x < 0)` is simply pruned away as "dead code", and you definitely want this! It's not a minor thing, it's a core component of an optimizing compiler.This same pruning is also responsible for the objectionable pruning away of dead code in the examples of compilers working at cross-purposes to programmers, but it's not easy to have the former behavior without the latter, and that's also why something like -Wdead-code is hard to implement in a way which wouldn't give constant false-positives.
The most insane part here is that the AMD EPYC 4565p can beat the turin's used on the cloud providers, by as much as 2x in the single core.
Our tests took 2 minutes on GCP, 1 minute flat on the 4565p with its boost to 5.1ghz holding steady vs only 4.1ghz on the gcp ones.
GCP charges $130 a month for 8vcpus. ALSO this is for SPOT that can be killed at any moment.
My 4565p is a $500 cpu... 32 vcpus... racked in a datacenter. The machine cost under 2k.
i am trying hard to convince more people to rack themselves especially for CI actions. The cloud provider charging $130 / mo for 3x less vcpus you break even in a couple months, it doesn't matter if it dies a few months later. On top of that you're getting full dedicated and 2x the perf. Anyways... glad to see I chose the right cpu type for gcloud even though nothing comes close to the cost / perf of self racking
That is ... hard to believe for a CPU-bound task. Do you have any open benchmark which can reproduce that?