I recently (as an experiment) exclusively vibe-coded an Asteroids clone (with a couple of nifty additions), all in a single HTML file, including a unit test suite, which also works on either desktop or mobile: https://github.com/pmarreck/vibesteroids/blob/yolo/docs/inde...
Playable version (deployed via GitHub Pages from the docs folder) is here: https://pmarreck.github.io/vibesteroids/ Hit Esc to pause and see instructions (pause should be automatic on mobile; you can re-pause by tapping the top center area).
Type Shift-B (or shake your mobile device; on iOS you'd also have had to approve its ability to sense motion) to activate the "secret" ability (one per life).
No enemy UFOs to shoot (yet), but the pace quickens on each level, which feels fun.
It doesn't update itself, however... (and I just noticed it has a failing test and a test-rendering bug... LOL, well, at least the tests are valid!)
For me, happiness is a terrible life goal. Sure, it's nice to be happy, but it's such a vapid, meaningless emotion. If I were to optimize for "happiness" I would just cash out, abandon my family, move to Vietnam, play video games, and eat Hot Pockets all day. It wouldn't take much to ride out the rest of my years.
But the life I choose is hard because doing hard things is good and fulfilling. I often willfully forgo happiness because, you know, I'm an adult. Maybe I'm just stupid?
My issue is that crawlers aren't respecting robots.txt; they're capable of operating captchas and human-verification checkboxes, and can extract all your content and information as a tree in a matter of minutes.
Throttling doesn't help when your page has to load a bunch of assets. IP-range blocking doesn't work because they're essentially lambdas. Their user-agent info looks like someone browsing your site in Chrome.
We can’t even render everything to a canvas to stop it.
The only remaining tactic is verification through authorization. Sad.
In my mind, I’m comparing the model architecture they describe to what the leading open-weights models (Deepseek, Qwen, GLM, Kimi) have been doing. Honestly, it just seems “ok” at a technical level:
- both models use standard Grouped-Query Attention (64 query heads, 8 KV heads). The card talks about how they've used an older optimization from GPT-3: alternating between banded-window (sparse, 128 tokens) and fully dense attention patterns. It uses RoPE extended with YaRN (for a 131K context window). So they haven't taken advantage of Deepseek's special-sauce Multi-head Latent Attention, or any of the other similar improvements over GQA.
- both models are standard MoE transformers. The 120B model (116.8B total, 5.1B active) uses 128 experts with Top-4 routing. They're using some kind of Gated SwiGLU activation, which the card describes as "unconventional" because of the clamping and the residual connections it implies. Again, not using any of Deepseek's "shared experts" (for general patterns) + "routed experts" (for specialization) architectural improvements, Qwen's load-balancing strategies, etc.
- the most interesting thing IMO is probably their quantization solution. They quantized >90% of the model's parameters to the MXFP4 format (4.25 bits/parameter), letting the 120B model fit on a single 80GB GPU, which is pretty cool. But we've also got Unsloth with their famous 1.58-bit quants :)
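A quick back-of-the-envelope check on the two numbers above (a sketch; the figures come from the comment, and it ignores the <10% of parameters kept at higher precision, so the real footprint is somewhat larger):

```python
# Grouped-Query Attention: 64 query heads share 8 KV heads,
# so each KV head serves a group of 64 / 8 = 8 query heads.
n_query_heads = 64
n_kv_heads = 8
group_size = n_query_heads // n_kv_heads  # 8 query heads per KV head

# MXFP4 averages 4.25 bits/parameter (4-bit values plus a shared
# per-block scale). Treating all 116.8B parameters as MXFP4 gives a
# rough lower bound on weight memory -- comfortably under 80 GB.
total_params = 116.8e9
bits_per_param = 4.25
weight_gb = total_params * bits_per_param / 8 / 1e9  # ~62 GB

print(f"query heads per KV head: {group_size}")
print(f"approx. MXFP4 weight memory: {weight_gb:.1f} GB")
```

Which is why the 120B model squeezes onto a single 80GB GPU with room left for the unquantized layers, activations, and KV cache.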
All this to say: even though the training they did for agentic behavior and reasoning is undoubtedly very good, it seems like they're keeping their actual technical advancements "in their pocket".
Amen! I attend this same church.
My favorite professor in engineering school always gave open book tests.
In the real world of work, everyone has full access to all the available data and information.
Very few jobs involve paying someone simply to look up data in a book or on the internet. What they will pay for is someone who can analyze, understand, reason about, and apply data and information in the unique ways needed to solve problems.
Doing this is called "engineering". And this is what this professor taught.