A uniquely efficient hardware stack, for either training or inference, would be a great moat in an industry that seems to offer few moats.
I keep waiting to hear of more adoption of Cerebras Systems' wafer-scale chips. They may be held back by not offering the full hardware stack, i.e. their own data centers optimized around wafer-scale compute units. (They do partner with AWS as a third-party provider, in competition with AWS's own silicon.)
In that sense, estimation should theoretically become a more reasonable endeavor. Or maybe not — perhaps we just end up back where we are now, because the LLM has produced unusable code or an impossible-to-find bug that delays shipment, etc.