For very modest definitions of playing. Perhaps it'd be more impressive if they recorded a demo file and let that play back without the realtime overhead? Even so, it can only move forward, move back, turn, and fire, and it only knows to face away from the wall it has collided with. This is so far below even basic Doom bots that I'd be afraid to call it playing.
The ASCII intermediate interpretation also seems unnecessary and very limiting. But perhaps that's to keep it near realtime, looks like 1 FPS?
And why run on a Mac? Why not a beefy PC with a GPU that can do the calculations faster?
Still, it does seem like a fun challenge. Maybe with further tuning or training it can level up.
It’s not. The fine tuning taught the LLM how to give single-character responses (move/fire keyboard controls) in response to a sequence of ASCII-art-ized frames of the game being played.
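For anyone wondering what that loop might look like, here is a minimal sketch. It is not the project's code: the character ramp, the key bindings in `ACTIONS`, and both function names are assumptions made up for illustration. It turns a frame of brightness values into ASCII art and maps a single-character reply back to a game action.

```python
# Hypothetical sketch of the frame -> ASCII -> single-character-action loop.
# The ramp and key bindings below are assumptions, not the project's values.
ASCII_RAMP = " .:-=+*#%@"  # darkest to brightest

ACTIONS = {
    "w": "forward",
    "s": "back",
    "a": "turn_left",
    "d": "turn_right",
    "f": "fire",
}


def frame_to_ascii(frame):
    """Convert a 2D list of 0-255 brightness values into ASCII-art text."""
    lines = []
    for row in frame:
        chars = []
        for value in row:
            clamped = max(0, min(int(value), 255))
            # Scale 0..255 onto an index into the character ramp.
            chars.append(ASCII_RAMP[clamped * len(ASCII_RAMP) // 256])
        lines.append("".join(chars))
    return "\n".join(lines)


def parse_action(reply):
    """Map the first character of the model's reply to an action, or None."""
    cleaned = reply.strip().lower()
    return ACTIONS.get(cleaned[:1])
```

The prompt would be a few consecutive `frame_to_ascii` outputs, and anything `parse_action` cannot map would simply be skipped for that tick.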
Funny. I made a captcha challenge of calculus problems for a comment section on my personal blog page. But 5 years after college, I couldn't remember how to even do them myself so I changed it :-/
You don't actually need much. For a form I used to get spam on, I just added a "write 42 here" field, so anyone who actually cares to read it can fill it in. Spam fell to 0.
(for a site with a slightly higher profile this wouldn't be enough, but for a minor corner of the internet with no ill intent actually aimed at it that turned out to be enough to block the fuzzing "fill all the forms" spam)
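Server side, the check can be about this small. This is a sketch only; the function name and the idea of passing the raw field value as a string are assumptions, not the code from the form described above.

```python
def passes_fixed_answer_check(submitted, expected="42"):
    """Accept the form only if the visitor typed the requested value."""
    # Tolerate stray whitespace, but nothing else.
    return submitted.strip() == expected
```

A bot that leaves the field empty or fills it with filler text fails the comparison, which is all this kind of check has to catch.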
As contrasting experience, I did that (a simple math problem) on our contact form and it did NOT drop spam to zero; our spammers were too smart for that. Even an actual reCAPTCHA didn't completely eliminate it (although it mostly did, enough that it's fine for us).
Similarly an empty input field that is css'd to be outside the viewport is often filled by spambots but not humans. But I like the edge case UX of your idea more.
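The honeypot version is about as short. A rough sketch, assuming the submitted form arrives as a dict of strings and the hidden field is called `website` (both are assumptions for illustration):

```python
def looks_like_bot(form, honeypot_field="website"):
    """Flag a submission if the hidden field, which humans never see, is filled."""
    value = form.get(honeypot_field, "")
    return bool(value.strip())
```

Humans leave the off-screen field empty, so any non-blank value is a strong hint that a script filled in every input it found.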
The question I got was surprisingly simple: it asked to find "the least real root of the polynomial p(x) = (x+5)(x-4)(x+1)". A determined attacker can quickly hack together something with Tesseract and feed it into even GPT-3.5 to get the correct answer to questions like these.
I guess that means the captcha is doing its job, since running LLMs isn't very cheap or scalable. But any harder problem means you start filtering a significant chunk of human users. Based on the other replies to your comment, it seems that the questions at their current difficulty already stop a lot of human users, yet allow a determined attacker with the setup I described to pass through easily.
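To show how little work that particular question is once the text has been extracted: each factor (x + c) contributes the root -c, so the answer is just the smallest of those. A small sketch, where the function name and the "list of constants" input format are my own assumptions:

```python
def least_real_root(factor_constants):
    """For p(x) = (x + c1)(x + c2)..., return the smallest root.

    Each factor (x + c) is zero at x = -c.
    """
    return min(-c for c in factor_constants)


# p(x) = (x+5)(x-4)(x+1) has factor constants 5, -4 and 1.
answer = least_real_root([5, -4, 1])  # roots are -5, 4, -1, so the least is -5
```

The OCR step is the only part with any real failure rate; the math itself does not need an LLM at all for questions of this shape.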
Absolute banger.
But the auto-aim on vertical axis is missing. You should be able to have the crosshair under an enemy and still hit them.
But in any case, nicely done!
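For reference, a heavily simplified sketch of what vertical auto-aim amounts to. This is not Doom's actual implementation; the function name, the `(distance, bottom_z, top_z)` target tuples, and the `max_slope` limit are assumptions for illustration. The idea is that the shot is traced horizontally, and the vertical slope is chosen automatically for the nearest target in the line of fire.

```python
def auto_aim_slope(shooter_z, targets, max_slope=1.0):
    """Pick a vertical slope for a shot fired along the player's facing.

    targets: list of (distance, bottom_z, top_z) for things already in the
    horizontal line of fire. Returns the slope toward the center of the
    nearest target reachable within +/- max_slope, or 0.0 (a level shot)
    if there is none.
    """
    for distance, bottom_z, top_z in sorted(targets):  # nearest first
        if distance <= 0:
            continue
        center_z = (bottom_z + top_z) / 2
        slope = (center_z - shooter_z) / distance
        if abs(slope) <= max_slope:
            return slope
    return 0.0
```

With something like this, a crosshair sitting below an enemy on a ledge still produces a hit, because only the horizontal aim decides whether the target is in the line of fire.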
Funny enough, when I've tried to introduce (indoctrinate) friends to DOOM, "how do I aim up" has consistently been the biggest hangup.
This makes sense when I try to indoctrinate my teenager who grew up on Halo and Call of Duty. But I began noticing this hangup in the late 90s with friends my own age.
Doom is still under copyright protection last I knew. The source is GPL, but have the assets ever been liberally licensed? I think they're more abandonware.
I'm sure you could still do it, but personally I try to respect copyright strictly for any projects I'm going to share. It just feels annoying to have copyright nonsense hanging over me otherwise.
https://news.ycombinator.com/item?id=39813174
https://www.youtube.com/watch?v=bEXefdbQDjw
Reminded me of this one: http://random.irb.hr/signup.php
Then I refreshed the page, and was hit with calculus involving trig functions.
-3 * 3 + (-3) = ?
Where's my calculator?
Edit: oh wait. It's "least". I really have no idea then :)
With it being so famously portable, I was expecting this to actually run Doom in the browser and complete a simple map.
Easily achievable[0], thoroughly obnoxious[1]. Just like all captchas.
[0] God help you if you're on a touchscreen. [1] For most people. Especially after the novelty wears off.
Or maybe a remote desktop into an OS with a sandboxed browser that runs a Windows VM that ...