The 5070 Ti Super will also have 24GB.
Admittedly it's a little tempting to wait and see how the 5070 Ti Super shakes out!
24GB is the lowest I would go. Buy a used 3090. I picked one up for $700 a few months back, though I think prices were already on the rise then.
The 3000 series can't do FP8 fast, but meh. It's the OOMs (out-of-memory errors) that are tough, not the speed so much.
For self-hosting, it's smart that they targeted a 16GB VRAM config, since that's the size of the most cost-effective server GPUs, but I suspect the "native MXFP4 quantization" has quality caveats.
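The back-of-envelope math does work out, at least: at 4 bits per weight, the parameter count divided by two gives the weight footprint in bytes. A minimal sketch - the 2GB overhead allowance for KV cache and activations is my own guess, not a measured figure:

    # Rough VRAM estimate for a 4-bit-quantized model; numbers are illustrative.
    def vram_estimate_gb(params_billion: float, bits_per_weight: float,
                         overhead_gb: float = 2.0) -> float:
        weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits = ~1 GB
        return weight_gb + overhead_gb

    print(vram_estimate_gb(20, 4))  # ~12.0 GB, leaving headroom on a 16GB card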
I'd go for an xx80 card, but I can't find any that fit in a mini-ITX case :(
The problem is that a lot of the time, brute-force search can beat the snot out of any human player, but the real game design objective isn't "beat the player", it's "give the player enough of a challenge to make beating the AI fun".
In Civ's case, it might be theoretically optimal play for the computer to capitalize on rushing players with warriors before they have a chance to establish defenses, but it's also a great recipe for players angrily requesting refunds after their tenth consecutive game of being crushed by Gandhi on turn 5. A lot of game AI development time goes into tweaking action probabilities or giving the player advantages to counteract the AI's edge - the reluctance to build military units you saw could well have been the result of such a tweak (toy sketch below).
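Concretely, such a tweak can be as simple as a designer-editable weight table for action selection. A toy sketch - the action names, weights, and turn threshold are all invented for illustration:

    import random

    # Invented action names and weights -- not from any real Civ AI.
    ACTION_WEIGHTS = {
        "expand": 0.40,
        "build_economy": 0.35,
        "build_military": 0.20,
        "early_rush": 0.05,  # kept rare so new players aren't steamrolled
    }

    def pick_action(turn: int) -> str:
        weights = dict(ACTION_WEIGHTS)
        if turn < 30:  # suppress rushes entirely in the early game
            weights["early_rush"] = 0.0
        actions = list(weights)
        return random.choices(actions, weights=[weights[a] for a in actions])[0]

    print(pick_action(turn=5))  # never "early_rush" this early

Dialing "early_rush" up or down is exactly the kind of knob a difficulty setting can expose.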
As for why LLMs typically aren't applied as game opponents:
* They are quite compute-intensive, which is tricky when players expect at most ~16 ms of latency per frame (for 60 FPS) and get ornery if they have to wait more than a few seconds, but also don't like always-online requirements imposed by cloud compute (or subscription costs to fund running LLMs for every player)
* The bridge between tokens and actions also means it's hard to tweak probabilities directly - while A* [1] can let you specify that a certain path should be taken approx. 20% of the time, implementing this in an agent-LLM approach means you have to actively select and weight token probabilities during the beam search (see the sketch after this list), which is a bit of a hassle, to put it mildly
* The issues with long-term coherence in LLMs, famously demonstrated by Vending-Bench [4], make reliability and debugging harder
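To make the second bullet concrete: "weighting token probabilities" looks roughly like the following, simplified to sampling a single action token from biased logits rather than a full beam search. The action names, the one-token-per-action mapping, and all logit/bias values are made-up assumptions:

    import math
    import random

    def sample_action(logits: dict[str, float], bias: dict[str, float]) -> str:
        # Apply the designer's bias to the model's raw logits, then softmax-sample.
        adjusted = {a: z + bias.get(a, 0.0) for a, z in logits.items()}
        zmax = max(adjusted.values())  # subtract the max for numeric stability
        weights = {a: math.exp(z - zmax) for a, z in adjusted.items()}
        r = random.random() * sum(weights.values())
        for action, w in weights.items():
            r -= w
            if r <= 0:
                return action
        return action  # fallthrough for float rounding

    # The model strongly prefers "attack"; a -3.0 bias drags it down to ~20%.
    model_logits = {"attack": 3.0, "fortify": 1.0, "explore": 0.5}
    print(sample_action(model_logits, bias={"attack": -3.0}))

Doable, but it's a lot more plumbing than editing one number in a behavior table.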
[1] https://en.wikipedia.org/wiki/A*_search_algorithm
[2] https://www.gamedeveloper.com/design/building-the-ai-of-f-e-...