heyitsguay (u/heyitsguay)

heyitsguay commented on Teaching GPT-5 to Use a Computer prava.co/archon/... · Posted by u/Areibman

thrown-0825 · 8 days ago

Imagine getting beat by a bot and it also has the capability to talk trash to you.

heyitsguay · 8 days ago

"Unfortunately, my content guidelines prohibit me from describing my activities with your mother last night"

heyitsguay commented on Tversky Neural Networks gonzoml.substack.com/p/tv... · Posted by u/che_shr_cat

throwawaymaths · 9 days ago

crawl walk run.

no sense spending large amounts of compute on algorithms for new math unless you can prove it can crawl.

heyitsguay · 9 days ago

It's the same amount of effort benchmarking, just a better choice of backbone that enables better choices of benchmark tasks. If the claim is that a Tversky projection layer beats a linear projection layer today, then one can test whether that's true with foundation embedding models today.

It's also a more natural question to ask, since building projections on top of frozen foundation model embeddings is both common in an absolute sense, and much more common, relatively, than building projections off of tiny frozen networks like a ResNet-50.

heyitsguay commented on Tversky Neural Networks gonzoml.substack.com/p/tv... · Posted by u/che_shr_cat

heyitsguay · 9 days ago

Seems cool, but the image classification model benchmark choice is kinda weak given all the fun tools we have now. I wonder how Tversky probes do on top of DINOv3 for building a classifier for some task.

heyitsguay commented on Diffusion language models are super data learners jinjieni.notion.site/Diff... · Posted by u/babelfish

thesz · 15 days ago

> I wonder how much of this is due to Diffusion models having less capacity for memorization than auto regressive models

Diffusion requires more compute than autoregressive models, and the excess is proportional to sequence length. Time dilated RNNs and adaptive computation in image recognition hint that we can compute more with the same weights and achieve better results.

Which, I believe, also hints at at least one flaw of the TS study - I did not see that they matched DLM and AR by compute, they matched them only by weights.

heyitsguay · 15 days ago

Do you have references on adaptive methods for image recognition?

heyitsguay commented on Can Large Language Models Play Text Games Well? (2023) arxiv.org/abs/2304.02868... · Posted by u/willvarfar

willvarfar · 2 months ago

It's been a background thought of mine for a while:

* create a basic text adventure (or MUD) with a very spartan api-like representation

* use an LLM to embellish the description served to the user etc. With recent history in context the LLM might even kinda reference things the user asked previously etc.

* have NPCs implemented as own LLMs that are trying to 'play the game'. These might be using the spartan API directly like they are agents.

It's a fun thought experiment!

(An aside: I found that the graphical text adventure that I made for Ludum Dare 23 is still online! Although it doesn't render quite right in modern browsers.. things shouldn't have broken! But anyway https://williame.github.io/ludum_dare_23_tiny_world/)

heyitsguay · 2 months ago

I've done something along these lines! https://github.com/heyitsguay/trader

The challenge for me was consistency in translating free text from dialogs into classic, deterministic game state changes. But what's satisfying is that the conversations aren't just window dressing, they're part of the game mechanic.

heyitsguay commented on I counted all of the yurts in Mongolia using machine learning monroeclinton.com/countin... · Posted by u/furkansahin

sorokod · 2 months ago

"In total I found 172,689 yurts with a prediction score of greater than 40%."

How should one interpret the "prediction score"?

heyitsguay · 2 months ago

Object detectors output detection bounding boxes along with confidence scores. The higher the score, the more confident the model is that the associated bounding box is a correct detection.

When used in applications (like this one), the user typically sets a confidence threshold. Every detection above that threshold is treated as a positive detection, and the rest are discarded. The choice of threshold can be arbitrary or (sorta) principled.
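
For a concrete picture, here's a minimal sketch of that thresholding step. The `Detection` class, the `filter_detections` helper, and the sample boxes and scores are all made up for illustration; only the "greater than 40%" cutoff comes from the quote above, and none of this is taken from the article's actual code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detector output."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    score: float  # model confidence in [0, 1]


def filter_detections(detections: List[Detection], threshold: float = 0.4) -> List[Detection]:
    """Keep detections whose score is strictly above the threshold."""
    return [d for d in detections if d.score > threshold]


# Made-up example outputs, not real data.
detections = [
    Detection((10.0, 10.0, 50.0, 50.0), 0.92),
    Detection((60.0, 20.0, 90.0, 55.0), 0.41),
    Detection((5.0, 70.0, 30.0, 95.0), 0.40),
    Detection((100.0, 100.0, 140.0, 150.0), 0.12),
]

kept = filter_detections(detections, threshold=0.4)
print(f"{len(kept)} of {len(detections)} detections kept")
```

Raising the threshold trades missed objects for fewer false positives, so the final count depends on where you put it.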

heyitsguay commented on Is there a half-life for the success rates of AI agents? tobyord.com/writing/half-... · Posted by u/EvgeniyZh

peacebeard · 2 months ago

Very common to see in comments some people saying “it can’t do that” and others saying “here is how I make it work.” Maybe there is a knack to it, sure, but I’m inclined to say the difference between the problems people are trying to use it on may explain a lot of the difference as well. People are not usually being too specific about what they were trying to do. The same goes for a lot of programming discussion of course.

heyitsguay · 2 months ago

I've noticed this a lot, too, in HN LLM discourse.

(Context: Working in applied AI R&D for 10 years, daily user of Claude for boilerplate coding stuff and as an HTML coding assistant)

Lots of "with some tweaks I got it to work" or "we're using an agent at my company", rarely details about what's working or why, or what these production-grade agents are doing.

heyitsguay commented on 'Spiderweb' drone attack marks a new threat for top militaries businessinsider.com/opera... · Posted by u/petethomas

moi2388 · 3 months ago

If these drones are so successful, why are their defences in the billions?

Why don’t you have hunter drones targeting any potential drone coming in?

heyitsguay · 3 months ago

Defense needs them everywhere, offense needs one gap in the defense.

heyitsguay commented on 'Spiderweb' drone attack marks a new threat for top militaries businessinsider.com/opera... · Posted by u/petethomas

getcrunk · 3 months ago

It’s only a matter of time before governments with the money and the sense adopt automated anti-drone radar/turrets.

It’s funny that they haven’t already. I mean it’s about “national security.” This threat has been looming for 10 maybe 15 years now

heyitsguay · 3 months ago

At Allen Control Systems, we're working on deploying automated anti-drone turrets (radar or passive EO detection) right now!

Development in the space is happening at a breakneck pace. We're hiring pretty aggressively. If this sort of thing seems interesting, check it out!

https://www.allencontrolsystems.com/company#jobs