Probably yes, in much the same way that we rent housing, telecom plans, and cloud compute as the economy becomes more advanced.
For those with serious AI needs, it's worth maintaining migration agility. That can include a small on-premises deployment, even though, as usual, it realistically can't compete with socialized production in every respect.
The nature of the economy is to involve more people and more organizations over time. I could see a future where somewhat smaller models are operated by a few different organizations (universities, corporations, maybe even municipalities), tuned to specific tasks and fed confidential or restricted materials. Still smaller models could, for some tasks, be intelligently loaded onto the device from a web server. This seems to be the way things are going, given the trendy relevance of "context engineering" and RL over Huge Models.
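To make that concrete, here's a minimal, entirely hypothetical sketch of how a client might route a task to the smallest adequate model: small ones get fetched to the device, larger ones stay on some organization's remote endpoint. The registry, model names, and size threshold are all invented for illustration.

```python
# Hypothetical registry mapping a task to (model name, approximate size in MB).
# In a real system this would be served by the model provider, not hardcoded.
MODEL_REGISTRY = {
    "spellcheck": ("tiny-spell-v1", 40),
    "summarize": ("small-sum-v2", 400),
    "general-chat": ("org-tuned-llm", 8000),
}

# How much model we're willing to download and run on the device (assumed).
ON_DEVICE_BUDGET_MB = 500


def pick_model(task: str) -> tuple[str, str]:
    """Return (model_name, placement) for a task.

    Models under the on-device budget would be fetched from a web server
    and run locally; anything larger is routed to a remote endpoint.
    """
    name, size_mb = MODEL_REGISTRY[task]
    placement = "on-device" if size_mb <= ON_DEVICE_BUDGET_MB else "remote"
    return name, placement


if __name__ == "__main__":
    for task in MODEL_REGISTRY:
        print(task, *pick_model(task))
```

The interesting design question is who maintains the registry and the tuned weights; the sketch just shows the routing decision itself.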
The book at http://ostep.org goes into some of the details.
Another illuminating resource: https://gwern.net/computers
As much as possible is generally done using simulators before testing on any real hardware.