The direction we were trying to move in with https://ayvri.com back in 2018 was to model the world from low-res imagery and render higher-resolution views. At the time, the AI tech wasn't ready, but I think you can do this now. You don't need a perfect replica of the world; you want a convincing one, so that when you're drawing Tokyo, the architecture and elements in the scene match what would be expected.
This was a confusing landing for me on a mobile device.
I was working on world models / generative environments, but without access to the training data as an independent researcher, I ended up focusing on building with existing geospatial data.
The same architecture as the dynamics model in the '24 Genie paper is instead trained on historical data for risk analysis, producing a heatmap on the 2D map. I'll try to adapt this into a more generalizable urban mobility model as well.