That is impressive. The controls are essentially unresponsive, but the fact that it starts with an image and goes from there bodes well for generative game building.
For those wanting to see it in action: the displayed wait times are wildly inaccurate. Wait five or six minutes and you'll probably get through.
- Infra/systems: I was able to connect to a server within a minute or two. Once connected, the displayed RTT (round-trip time?) was around 70ms, but actual control-to-action latency was still ~600-700ms, vs the ~30ms I'd expect from an on-device model or a game streaming service (a rough way to measure this is sketched after this list).
- Image-conditioning & rendering: The system did a reasonable job animating the initial (landscape photo) image I provided and extending it past the edges. However, the rendering style drifted back to "contrast-boosted video game" within ~10s. This style drift shows up in their official examples as well (https://x.com/DynamicsLab_AI/status/1958592749378445319).
- Controls: Apart from the latency, control-following was relatively faithful once I started holding down Shift. I didn't notice any camera/character drift or spurious control issues, so I guess they are probably using fairly high-quality control labels.
- Memory: I did a bit of memory testing (basically, swinging the view side to side and seeing which details got regenerated), and it looks like the model retains maybe ~3-5s of visual memory plus the prompt (but not the initial image).
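For anyone who wants to put a number on the control-to-action latency rather than eyeball it, here is a minimal browser-console sketch of the kind of measurement I mean: timestamp your own keydown, then poll a small patch of the game canvas until the pixels change. None of this is from the demo itself; the canvas selector, the 64px patch, and the change threshold are all assumptions, and a WebGL canvas created without preserveDrawingBuffer may read back blank.

```ts
// Hedged sketch only -- not the demo's code. Estimates keydown -> first visible
// change by diffing a small patch of the game canvas on each animation frame.
const game = document.querySelector("canvas") as HTMLCanvasElement; // assumed: demo renders to a <canvas>
const probe = document.createElement("canvas");
probe.width = probe.height = 64; // a small centre patch is enough to detect motion
const ctx = probe.getContext("2d", { willReadFrequently: true })!;

const grab = (): Uint8ClampedArray => {
  // Copy the centre of the game view into the probe canvas and read its pixels.
  ctx.drawImage(game, game.width / 2 - 32, game.height / 2 - 32, 64, 64, 0, 0, 64, 64);
  return ctx.getImageData(0, 0, 64, 64).data;
};

const meanAbsDiff = (a: Uint8ClampedArray, b: Uint8ClampedArray): number => {
  let d = 0;
  for (let i = 0; i < a.length; i += 4) d += Math.abs(a[i] - b[i]); // red channel only
  return d / (a.length / 4);
};

window.addEventListener(
  "keydown",
  (e) => {
    if (e.repeat) return; // only time the initial press
    const before = grab();
    const t0 = performance.now();
    const poll = () => {
      // Threshold of 8 is a guess; it needs to sit above the ambient
      // frame-to-frame change of the generated video for a static viewpoint.
      if (meanAbsDiff(before, grab()) > 8) {
        console.log(`${e.key}: keydown -> first visible change ~${(performance.now() - t0).toFixed(0)} ms`);
      } else if (performance.now() - t0 < 3000) { // keep polling for up to 3s
        requestAnimationFrame(poll);
      }
    };
    requestAnimationFrame(poll);
  },
  { capture: true },
);
```

In practice you would calibrate the threshold against the frame-to-frame change of an idle scene first, since this kind of generated video keeps moving even with no input.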
I was mildly amused (but not especially surprised) to see that the "Hunter's Vale" initial image includes what's pretty clearly a partial Skyrim HUD compass at the top.
The styles of Cyberpunk 2077 and Red Dead Redemption 2 are also dead giveaways about their training data. There might also be a whiff of the Witcher 4 demo in one sequence.
The interesting possibility is that all you may need for the setting of a future AAA game is a small slice of the environment to nail down the art direction. Then you can dispense with the army of workers placing 3D models on the map in just the right arrangement to create a level; the AI model can extrapolate it all for you.
Clearly the days of fiddly level creation with a million inscrutable options and checkboxes in something like the Unreal, Unity, or Godot editors are numbered. You just say what you want and how you want to tweak it, and all those checkboxes and menus become disposable. As a bonus, that tears down a huge barrier to entry for amateur game makers.
Super fun to try a playable world model for the first time! I picked a random picture and got ChatGPT to write a game description, then could move within that world. Very laggy and buggy, but very fun to try!
I picked a Morrowind screenshot of Vivec city, and after a few (laggy) frames it teleported me to a LotR-looking forest and then quickly to a Fallout landscape.
Might have potential, but I wasn't terribly impressed by the lack of consistency.
Starlight Village just has Scar the lion from Lion King right there at the bottom lol
Hackers screenshot + file system context = Ideal navigation
https://i.imgur.com/dBXdcd9.png
The tech alone of being able to take prefabs and just prompt your way to a world is amazing. Now to get that in Blender...