I'm totally with you that it's better this took so long, since now we have things like PyTorch abstracting most of this away, but I'm looking forward to playing with this (in my non-existent free time :/ ).
There's a paper from Norway that tried end-to-end, but their results were not spectacular. That's the aim of many though, including ECMWF. Note that ECMWF already has their AIFS in production, so AI weather prediction is pretty mainstream nowadays.
Google has a local nowcast model that uses raw observations, in production, but that's a different genre of forecasting than the medium-range models of Aardvark.
Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers. John Guibas, Morteza Mardani, Zongyi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro.
Interesting take - I agree here somewhat.
But also, wouldn't you think a framework that has been designed from the ground up around a specific, mature compiler stack would be better able to integrate compilers in a stable fashion than one that shoehorns static compilers into a very dynamic framework? ;)
PyTorch has better adoption / network effects. JAX has stronger underlying abstractions.
I use both. I like both :)
For example:
> MultiDiffusion remains confined to bounded domains: all windows must lie within a fixed finite canvas, limiting its applicability to unbounded worlds or continuously streamed environments.
> We introduce InfiniteDiffusion, an extension of MultiDiffusion that lifts this constraint. By reformulating the sampling process to operate over an effectively infinite domain, InfiniteDiffusion supports seamless, consistent generation at scale.
…but:
> The hierarchy begins with a coarse planetary model, which generates the basic structure of the world from a rough, procedural or user-provided layout. The next stage is the core latent diffusion model, which transforms that structure into realistic 46km tiles in latent space. Finally, a consistency decoder expands these latents into a high-fidelity elevation map.
So, the novel thing here is slightly better seamless diffusion image gen.
…but we generate using a hierarchy based on a procedural layout.
So basically, tl;dr: take Perlin noise, resize it, and then use it as an image-to-image seed to generate detailed tiles?
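To make the tl;dr concrete, here's a minimal NumPy sketch of that pipeline. Everything here is a stand-in: smoothed random noise plays the role of Perlin noise, and `img2img_tile` is a hypothetical placeholder for an image-to-image diffusion call (a real pipeline would noise the seed to some intermediate timestep and denoise it, with `strength` controlling how far).

```python
import numpy as np

rng = np.random.default_rng(0)

def coarse_layout(n=8):
    """Stand-in for a Perlin/procedural layout: a low-res random heightfield.
    (Real Perlin noise interpolates gradients; this is just a sketch.)"""
    return rng.random((n, n))

def upsample(layout, factor=16):
    """Nearest-neighbour resize of the coarse layout up to tile resolution."""
    return np.kron(layout, np.ones((factor, factor)))

def img2img_tile(seed_tile, strength=0.3):
    """Hypothetical stand-in for an img2img diffusion step: keep
    (1 - strength) of the seed's structure, replace the rest with 'detail'."""
    detail = rng.random(seed_tile.shape)
    return (1 - strength) * seed_tile + strength * detail

layout = coarse_layout()        # rough procedural/user layout
seed = upsample(layout)         # resized to tile resolution
tile = img2img_tile(seed)       # detailed tile, still following the layout
```

The point of the sketch: the detailed tile is dominated by the coarse layout, which is why people have been getting this effect with off-the-shelf img2img for a while.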
People have already been doing this.
It's not novel.
The novel part here is making the detailed tiles slightly nicer.
Eh. :shrug:
The paper obfuscates this, quite annoyingly.
It's unclear to me why you can't just use MultiDiffusion for this, given your top-level input is already bounded (e.g. user input) and not infinite.
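For reference, the core MultiDiffusion trick on a bounded canvas is just window fusion per denoising step: run the model on each overlapping crop, then average the overlapping predictions back onto the canvas. A minimal sketch (the `predict` callable is a toy stand-in for the per-window denoising model):

```python
import numpy as np

def fuse_windows(canvas_shape, windows, predict, window_size=16):
    """One MultiDiffusion-style fusion step (sketch): accumulate each
    window's prediction on the canvas, then average where windows overlap."""
    acc = np.zeros(canvas_shape)
    cnt = np.zeros(canvas_shape)
    for (y, x) in windows:
        crop_pred = predict(y, x, window_size)
        acc[y:y + window_size, x:x + window_size] += crop_pred
        cnt[y:y + window_size, x:x + window_size] += 1
    return acc / np.maximum(cnt, 1)  # per-pixel average of overlapping windows

# toy model: each window "predicts" a constant, so overlaps get blended
predict = lambda y, x, size: np.full((size, size), float(y + x))

H = W = 32
# stride 8 with 16-px windows -> 50% overlap, bounded canvas
windows = [(y, x) for y in range(0, H - 16 + 1, 8)
                  for x in range(0, W - 16 + 1, 8)]
fused = fuse_windows((H, W), windows, predict)
```

Nothing here requires an unbounded domain: as long as the top-level layout is a fixed finite canvas, the window set is finite and this plain fusion already applies, which is the question above.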