The specific events that follow when you ask a taxi driver where to go may not be exactly repeatable, but reality enforces a physical determinism that probabilistic token predictors do not explicitly understand. If you drive into a wall you will obey deterministic laws of momentum. If you drive off a cliff you will obey deterministic laws of gravity. These are certainties, not high probabilities. A physical taxi cannot undergo a catastrophic, instant change of implementation and have its wheels or engine disappear when it stops to pick you up. A human taxi driver cannot instantly swap their taxi for a submarine, cannot swap New York for Paris, cannot pass through buildings… the real world has a physically determined option space that symbolic token predictors don’t understand yet.
And the reason humans are good at interpreting human intent correctly is not just that we’ve had billions of years of training with direct access to physical reality, but that we all share the same basic structure of inbuilt assumptions and “training history”. When interacting with a machine, many of those basic unstated shared assumptions are absent, which is why it takes more effort to articulate explicitly what it is you actually want.
We’re getting much better at getting machines to infer intent from plain English, but even if we created a machine that could perfectly interpret our intentions, for most tasks that still wouldn’t remove the need to explain what you want in enough detail to actually get it. Moving from point A to point B is a pretty simple task to describe. Many tasks aren’t like that, and the complexity comes as much from explaining what it is you want as it does from the implementation.
In biology or chemistry it's absurd to say that -- look at metal-organic frameworks, all kinds of new synthetic chemistry, ionic liquids, metagenomics, RNA structure prediction, and the unraveling of how gene regulation works in the "dark genome".
Progress in the 'symbolic AI' field that includes proof assistants is a really interesting story. When I was a kid I saw an ad for Feigenbaum's 3-volume "Handbook of AI" and got a used copy years later. You would have thought production rules (e.g. "expert systems" or "business rules") were on track to become a dominant paradigm, but my understanding is that people were losing interest even before RETE engines became mainstream. Even the expert system shells of the early 1980s didn't use the kind of indexing structures that are mainstream today, so whereas people were saying 10,000-rule rule bases were unruly in the 1980s, 10,000,000 well-structured rules are no problem now. Some of it is hardware, but a lot of it is improvements in software.
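To make "production rules" and "indexing structures" concrete, here is a toy forward-chaining sketch in Python -- my own illustration, not a real Rete implementation, and the RuleEngine names are made up. The point is only that indexing rules by the attributes they test means a changed fact re-triggers a handful of rules instead of the whole rule base, which is the kind of structure that makes multi-million-rule bases manageable.

```python
# Toy production-rule engine (hypothetical sketch, not real Rete):
# rules are indexed by the attributes their conditions read, so asserting
# a fact only re-evaluates the rules that mention that attribute.
from collections import defaultdict

class RuleEngine:
    def __init__(self):
        self.facts = {}                          # attribute -> value
        self.rules_by_attr = defaultdict(list)   # attribute -> rules testing it

    def add_rule(self, attrs, condition, action):
        """attrs: attributes the condition reads; condition/action: callables on the fact dict."""
        rule = (condition, action)
        for a in attrs:
            self.rules_by_attr[a].append(rule)

    def assert_fact(self, attr, value):
        self.facts[attr] = value
        # Only rules that mention `attr` are re-checked -- with millions of
        # rules, this lookup is what keeps matching tractable.
        for condition, action in self.rules_by_attr[attr]:
            if condition(self.facts):
                action(self.facts)

engine = RuleEngine()
engine.add_rule(
    ["temperature"],
    lambda f: f.get("temperature", 0) > 100,
    lambda f: print("alarm: overheating"),
)
engine.assert_fact("temperature", 120)   # fires the rule
engine.assert_fact("pressure", 5)        # re-checks no rules at all
```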
SAT/SMT solvers (e.g. as components of proof assistants) have shown steady progress over the last 50 years, though not as much as neural networks because they are less parallelizable. There is dramatically more industrial use of provers now, though business rules engines, complex event processing, and related technologies are still marginal in industry for reasons I don't completely understand.
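For a sense of what industrial use of an SMT solver looks like at the API level, here is a minimal sketch using the z3-solver Python bindings (the constraint itself is a toy of my own choosing):

```python
# Minimal SMT example with Z3's Python bindings (pip install z3-solver):
# declare integer variables, add constraints, and ask the solver for a model.
from z3 import Ints, Solver, sat

x, y = Ints("x y")
s = Solver()
s.add(x > 0, y > 0, x + 2 * y == 20, x % 3 == 1)

if s.check() == sat:
    print(s.model())        # e.g. [x = 4, y = 8]
else:
    print("no solution")
```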
Translating between the complex, implicit intentions of colloquial language and the formal languages used in software and proof assistants is usually very time-consuming and difficult.
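To make that gap concrete, here is a tiny hypothetical example in Lean 4: even the one-sentence requirement "the clamp function never returns a negative number" forces you to fix the number type, define the function precisely, and supply a machine-checkable proof instead of an appeal to common sense.

```lean
-- Colloquial intent: "clamp never returns a negative number."
-- Formalizing even this forces choices: which integer type, what "negative"
-- means, and a proof the machine will accept.
def clamp (x : Int) : Int :=
  if x < 0 then 0 else x

theorem clamp_nonneg (x : Int) : 0 ≤ clamp x := by
  unfold clamp
  split <;> omega
```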
By the time you’ve formalized the rules, the context in which the rules made sense will have changed and much of the work will be outdated. Plus, time and money spent on formalizing rules is time and money not spent on core business needs.