Yes, and the regulation should NOT limit passing or require the slower truck to brake.
It should allow a "Push To Pass" button that grants a 10mph boost for enough seconds to complete a pass within a reasonable distance, so as not to create problems for other traffic.
Current technology could easily limit these boosts to X uses per hour or day, and even geofence the usage to safe zones (use could even be restricted to passing lanes, so the truck being passed cannot start a drag race to stay ahead). Regulators could even require connectivity and disable the feature in poor road conditions.
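To be concrete, the rate limit plus geofence plus conditions check is a few lines of logic. This is a hypothetical sketch: the zone names, the three-uses-per-hour limit, and the 10mph boost value are illustration parameters I made up, not anything from an actual regulation or truck ECU.

```python
from datetime import datetime, timedelta

MAX_USES_PER_HOUR = 3            # assumed rate limit
BOOST_MPH = 10                   # boost amount from the comment above
SAFE_ZONES = {"passing_lane_A", "passing_lane_B"}  # hypothetical geofenced zones

class PushToPass:
    def __init__(self):
        self.uses = []  # timestamps of recent activations

    def request_boost(self, now, current_zone, road_conditions_ok):
        """Return the granted boost in mph (0 if the request is denied)."""
        # Geofence check: only allowed inside designated passing zones.
        if current_zone not in SAFE_ZONES:
            return 0
        # Connectivity/conditions check: disabled in poor road conditions.
        if not road_conditions_ok:
            return 0
        # Rate limit: keep only activations from the last hour, then count.
        self.uses = [t for t in self.uses if now - t < timedelta(hours=1)]
        if len(self.uses) >= MAX_USES_PER_HOUR:
            return 0
        self.uses.append(now)
        return BOOST_MPH
```

The point isn't this exact design, just that every restriction mentioned (per-hour cap, geofence, conditions kill-switch) is a cheap software check once the button exists.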
The people really being inconsiderate are not so much the truckers (though the slower trucker failing to yield and let the other one pass within a reasonable distance shares some blame) as the regulators who created this mess.
The problem is that most LLMs answer it correctly (see the many other comments in this thread reporting this). OP cherry-picked the few that answered incorrectly, without mentioning any that got it right, implying that 100% of them got it wrong.
Absolutely! But there is some nuance here. The failure mode involves an ambiguous question, which is an open research topic. There is no objectively correct answer to "Should I walk or drive?" given the stated constraints.
Because handling ambiguity is a problem that researchers are actively working on, I'm confident that models will improve in these situations. The improvements may asymptotically approach zero, leading to increasingly absurd examples of the failure mode. But that's okay, too: it means the models will increase in accuracy without becoming perfect. (I think I agree with Stephen Wolfram's take on computational irreducibility [1]: that handling ambiguity is a computationally irreducible problem.)
EWD was right, of course, and you are right to point out rigorous languages. But interacting with an LLM is different. A programming language cannot ask clarifying questions; it can only produce broken code or throw a compiler error. We prefer the compiler errors because broken code does not work, by definition. (Ignoring the "feature not a bug" gag.)
Most current models are fine-tuned to "produce broken code" rather than "throw a compiler error" in these situations. They are capable of asking clarifying questions; they just tend not to, because the RL schedule doesn't reward it.
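A toy way to see the incentive (this is my own illustration, not any lab's actual training setup): if the reward only scores the final answer on the current turn, a clarifying question earns nothing, while a guess on an ambiguous prompt pays off whenever it happens to match the user's intent.

```python
# Hypothetical reward shape: score direct answers, give nothing for a
# clarifying question (no scored answer exists yet on that turn).
def reward(response, answered_correctly):
    if response == "clarifying_question":
        return 0.0
    return 1.0 if answered_correctly else -0.5  # assumed penalty for a wrong answer

# Assumed: on an ambiguous prompt, a guess matches user intent 60% of the time.
p_correct_guess = 0.6
expected_guess = (p_correct_guess * reward("answer", True)
                  + (1 - p_correct_guess) * reward("answer", False))
expected_ask = reward("clarifying_question", False)
# Guessing has higher expected reward than asking, so the policy learns to guess.
```

Under these made-up numbers, guessing nets an expected 0.4 versus 0.0 for asking, so unless the reward explicitly credits well-placed clarifying questions, the tuned model will prefer to just answer.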
[1]: https://writings.stephenwolfram.com/2017/05/a-new-kind-of-sc...