In particular, it’s important to get past the whole need-to-self-host thing. I used to hold out for when this stuff would plateau, but that keeps not happening, and the things we’re starting to be able to build in 2025, now that we have fairly capable models like Claude 4, are super exciting.
If you just want locally runnable, commodity, “boring technology that just works” stuff, sure, cool, keep waiting. If you’re interested in hacking on interesting new technology (glances at the title of the site), now is an excellent time to do so.
I can understand it maybe if you’re spending hours setting it up, but to me these are download-and-go.
Maybe something like a collective that buys the GPUs together and then uses them without leaking data could work.
Maybe 128 GB of VRAM becomes the new mid-tier, and most LLMs can fit into that nicely and do everything one wants from an LLM.
Given how fast LLMs are progressing, it wouldn’t surprise me if we reach this point by 2030.
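For a rough sense of whether 128 GB is plausible, here's a back-of-envelope sketch of weight memory as a function of parameter count and quantization level. The model sizes and bit widths below are illustrative assumptions, not benchmarks, and this ignores KV cache and runtime overhead:

```python
def weights_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate memory needed for model weights alone, in GB."""
    return params_billion * 1e9 * (bits_per_param / 8) / 1e9

# A hypothetical 70B-parameter model quantized to 4 bits per weight:
print(weights_gb(70, 4))   # 35.0 GB -- fits in 128 GB with headroom

# The same model at full 16-bit precision:
print(weights_gb(70, 16))  # 140.0 GB -- already over the 128 GB budget
```

So the claim hinges heavily on quantization: at 4-bit, even fairly large models fit comfortably; at 16-bit, they don't.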