But knowing a little bit about Gaussian splatting, I can't think of what manual steps requiring human assistance would even be necessary.
Want PBS to stick around? Make it so anybody who's on ChatGPT gets great answers from PBS, and every time ChatGPT scrapes it, PBS gets money.
Is it extremely difficult? Obviously. Will it work? Probably not, very few things do. Is it a great thing that some folks are doing it and trying to make it work so that we can have a functional media ecosystem in a post-social-media age? Absolutely.
On the flipside, far from figuring out GPU efficiency, most people with huge jobs are network bottlenecked. And that’s where the problem arises: solutions for collective comms optimization tend to explode in complexity because, among other reasons, you now have to package entire orchestrators in your library somehow, which may fight with the orchestrators that actually launch the job.
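To make the network-bottleneck point concrete, here's a rough torch.distributed sketch (assuming a torchrun-style launcher that sets LOCAL_RANK, one process per GPU); the all_reduce at the end is the network-bound collective that all those comms-optimization libraries are fighting over:

    import os
    import torch
    import torch.distributed as dist

    # Rough sketch: one process per GPU, launched by torchrun (which sets LOCAL_RANK, RANK, etc.)
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # The gradient all-reduce is the network-bound collective everyone is trying to optimize.
    grad = torch.randn(1024, device="cuda")
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)

    dist.destroy_process_group()

The moment you want something smarter than a plain NCCL all-reduce, the library starts wanting to own process launch and placement itself, and you get exactly that orchestrator-on-orchestrator fight.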
Doing my best to keep it concise, but Hopper is a good case study. I want to use Megatron! Suddenly you need FP8, which means the CXX11 ABI, which means recompiling Torch along with all those nifty toys like flash attention, flashinfer, vllm, whatever. Ray, jsonschema, Kafka, and a dozen other things also need to match the same glibc and libstdc++ versions. So using that as an example, suddenly my company needs C++ CI/CD pipelines, dependency management, etc., when we didn't before. And I just spent three commas on these GPUs. And most likely, I haven't made a dime on my LLMs, or autonomous vehicles, or weird cyborg slavebots.
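For anyone who hasn't hit the ABI wall yet, this is roughly the first check you end up scripting (nothing Megatron-specific, just stock PyTorch and standard-library introspection):

    import platform
    import torch

    # Was this torch wheel built with the new C++11 ABI? Extensions like flash attention
    # must be compiled the same way, or you get unresolved-symbol errors at import time.
    print("CXX11 ABI:", torch.compiled_with_cxx11_abi())
    print("Torch version:", torch.__version__, "CUDA:", torch.version.cuda)

    # The glibc the interpreter is running against; prebuilt wheels must not require a newer one.
    print("glibc:", platform.libc_ver())

If those don't agree across torch, the extension wheels, and the base image, you're back to rebuilding the whole stack from source.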
So what all that boils down to is just that there’s a ton of inertia against moving to something new and better. And in this field in particular, it’s a very ugly, half-assed, messy inertia. It’s one thing to replace well-designed, well-maintained Java infra with Golang or something, but it’s quite another to try to replace some pile of shit deep learning library that your customers had to build a pile of shit on top of just to make it work, and all the while fifty college kids are working 16 hours a day to add even more in the next dev release, which will of course be wholly backwards and forwards incompatible.
But I really hope I’m wrong :)
I don't think it's gonna happen instantly, but it will happen, and Mojo/Modular are really the only language platform I see taking a coherent approach to it right now.
I wonder if we'll see a macOS port soon - currently it very much needs an NVIDIA GPU as far as I can tell.
Even with JAX, PyTorch, HF Transformers, whatever you want to throw at it--the DX for cross-platform GPU programming that's compatible with large-language-model requirements specifically is extremely bad.
I think this may end up being the most important thing that Lattner has worked on in his life (and yes, I am aware of his other projects!).