This is not the language problem, but its simply the nature of the problem no? It is like saying, full adders are complex, can't we design something simpler? No, full adders are the way they are because addition in binary is complicated.
What you are saying is that this problem is not your kind of problem, which is fine. Not everyone needs to face the complexity of optimizing full adders. And so we created abstractions. The question is, how good is that abstraction?
C++ is like using FP math to do binary addition.
If we use a random number generator then we will converge to 100% correct answers under pass@n in the limit.
A random number generator will eventually outperform or match all models (for large n) whenever top-p is less than 1 because the other models will most likely have some level of bias that makes correct CoTs mathematically impossible due to the tokens being too improbable and being filtered out by top-p, meaning that other models will asymptote to below 100% while the RNG will reach 100% in an almost surely sense.
Under this paper's logic doesn't that mean that the random number generator is a superior reasoner?