Readit News
heyitsguay commented on Scripts I wrote that I use all the time   evanhahn.com/scripts-i-wr... · Posted by u/speckx
chis · 2 months ago
I think it's more accurate to say that this comes from a place of laziness than from some enlightened peak. (I say this as someone who does the same, and is lazy.)

When I watch the work of coworkers or friends who have gone down these rabbit holes of customization, I always learn some interesting new tools to use - lately I've added atuin, fzf, and a few others to my Linux install.

heyitsguay · 2 months ago
I went through a similar cycle. Going back to simplicity wasn't about laziness for me; it was because I started working across a bunch more systems and didn't want to do my whole custom setup on all of them, especially ephemeral stuff like containers allocated on a cluster for a single job. So rather than using my fancy setup sometimes and fumbling through the defaults at other times, I just got used to operating more efficiently with the defaults.
heyitsguay commented on Abundant Intelligence   blog.samaltman.com/abunda... · Posted by u/j4mie
Workaccount2 · 3 months ago
I think people have a lot of rose-tinted fondness for those early days, combined with general usability benchmarks being mostly saturated now. GPT-3.5 would say Dallas was the capital of the USA, but GPT-4 got it every time!

GPT-4 launched with 8k context. It hallucinated regularly. It was slow. One-shotting code was unheard of; you had to iterate and iterate. It fell over even on basic math problems.

GPT-5 thinking, on the other hand, is so capable that the average person wouldn't really be able to test its abilities. It's really only experts operating in their domain who can find its stumbling blocks.

I think because we have seen these constant incremental updates, it reads as a staircase with small steps, but if you really reflect and look back, you'll see the capability gap from 3.5 to 4 is way, way smaller than the gap from 4 to 5. This is echoed in benchmarks too: GPT-5 is solving problems wildly beyond what GPT-4 was capable of.

heyitsguay · 3 months ago
What problems?
heyitsguay commented on Serverless Horrors   serverlesshorrors.com/... · Posted by u/operator-name
sbarre · 3 months ago
I feel that the likely answer here is that instrumenting real-time spending-limit monitoring and cut-off at GCP/AWS scale is complicated and expensive to do, so they choose not to do it.

I suppose you could bake the limits into each service at deploy time, but that's still a lot of code to write to provide a good experience for a customer who is trying not to pay you money.

Not saying this is a good thing, but this feels about right to me.

heyitsguay · 3 months ago
Pass a law requiring cloud compute providers to accept a maximum user budget, with no ability to charge beyond it, and see how quickly the big cloud providers figure it out.
heyitsguay commented on A PM's Guide to AI Agent Architecture   productcurious.com/p/a-pm... · Posted by u/umangsehgal93
CuriouslyC · 4 months ago
I use agents to do so much stuff on my computer. MCPs are easy to roll, so you can give agents whatever powers you want. Being able to just direct agents to do stuff on my computer via voice is amazing. The direct driving still sucks, so they're not a general UI yet, and the models need to be a bit more consistent/smarter in general, but it'll be there very soon.
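For anyone curious what "rolling an MCP" involves, here is a minimal sketch using the official Python SDK (the mcp package); the tool is a toy stand-in for whatever powers you want to grant, and the exact SDK surface may differ by version:

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("local-powers")

    @mcp.tool()
    def word_count(path: str) -> int:
        """Count the words in a local text file."""
        with open(path, encoding="utf-8") as f:
            return len(f.read().split())

    if __name__ == "__main__":
        mcp.run()  # serves over stdio by default

Point your agent client at the script and the tool shows up like any other capability it can call.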
heyitsguay · 4 months ago
What do you do with agents?
heyitsguay commented on Charting Form Ds to roughly see the state of venture capital “fund” raising   tj401.com/blog/formd/inde... · Posted by u/lemonlym
monero-xmr · 4 months ago
Not many successful vibe coded products
heyitsguay · 4 months ago
Are there any? Concretely. Genuinely curious.
heyitsguay commented on Teaching GPT-5 to Use a Computer   prava.co/archon/... · Posted by u/Areibman
thrown-0825 · 4 months ago
Imagine getting beat by a bot and it also has the capability to talk trash to you.
heyitsguay · 4 months ago
"Unfortunately, my content guidelines prohibit me from describing my activities with your mother last night"
heyitsguay commented on Tversky Neural Networks   gonzoml.substack.com/p/tv... · Posted by u/che_shr_cat
throwawaymaths · 4 months ago
Crawl, walk, run.

No sense spending large amounts of compute on algorithms for new math unless you can prove it can crawl.

heyitsguay · 4 months ago
It's the same amount of effort benchmarking, just a better choice of backbone that enables better choices of benchmark tasks. If the claim is that a Tversky projection layer beats a linear projection layer today, then one can test whether that's true with foundation embedding models today.

It's also a more natural question to ask, since building projections on top of frozen foundation model embeddings is both common in an absolute sense, and much more common, relatively, than building projections off of tiny frozen networks like a ResNet-50.
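For concreteness, a rough PyTorch sketch of what that head swap could look like: a Tversky-flavored probe (a loose adaptation of the common-minus-distinctive-features idea, not the paper's exact formulation) that drops in wherever a linear probe over frozen embeddings would go.

    import torch
    import torch.nn as nn

    class TverskyProbe(nn.Module):
        """Tversky-flavored classification head over frozen embeddings.

        Loose adaptation for illustration: score(x, class) =
        theta * |common| - alpha * |x-only| - beta * |class-only|.
        """
        def __init__(self, embed_dim, num_classes, num_features=256):
            super().__init__()
            self.features = nn.Linear(embed_dim, num_features, bias=False)
            self.prototypes = nn.Parameter(torch.randn(num_classes, embed_dim))
            self.theta = nn.Parameter(torch.tensor(1.0))
            self.alpha = nn.Parameter(torch.tensor(0.5))
            self.beta = nn.Parameter(torch.tensor(0.5))

        def forward(self, x):                                  # x: (B, D)
            a = torch.relu(self.features(x)).unsqueeze(1)      # (B, 1, F)
            b = torch.relu(self.features(self.prototypes)).unsqueeze(0)  # (1, C, F)
            common = torch.minimum(a, b).sum(-1)               # shared features
            a_only = torch.relu(a - b).sum(-1)                 # distinctive to x
            b_only = torch.relu(b - a).sum(-1)                 # distinctive to class
            return self.theta * common - self.alpha * a_only - self.beta * b_only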

heyitsguay commented on Tversky Neural Networks   gonzoml.substack.com/p/tv... · Posted by u/che_shr_cat
heyitsguay · 4 months ago
Seems cool, but the image classification model benchmark choice is kinda weak given all the fun tools we have now. I wonder how Tversky probes do on top of DINOv3 for building a classifier for some task.
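The recipe would be short. A hedged sketch, where the torch.hub entry point and model name are assumptions; any frozen encoder returning (B, D) embeddings would do:

    import torch
    import torch.nn as nn

    # Assumed hub entry point; substitute any frozen image encoder.
    encoder = torch.hub.load("facebookresearch/dinov3", "dinov3_vitb16")
    encoder.eval().requires_grad_(False)

    probe = nn.Linear(768, 10)  # linear baseline; swap in a Tversky probe
    with torch.no_grad():
        feats = encoder(torch.randn(4, 3, 224, 224))
    logits = probe(feats)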
heyitsguay commented on Diffusion language models are super data learners   jinjieni.notion.site/Diff... · Posted by u/babelfish
thesz · 4 months ago
> I wonder how much of this is due to Diffusion models having less capacity for memorization than auto regressive models

Diffusion requires more computational resources than autoregressive models; the compute excess is proportional to sequence length. Time-dilated RNNs and adaptive computation in image recognition hint that we can compute more with the same weights and achieve better results.

Which, I believe, also hints at at least one flaw of the TS study - I did not see them match the DLM and AR models by compute; they matched them only by weights.
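Back-of-envelope, with illustrative numbers of my own (not the study's): KV-cached AR decoding does roughly one forward pass per generated token, while a diffusion LM re-processes the whole sequence at every denoising step, so if the step count grows with sequence length, the excess grows with length as well.

    P = 1e9   # parameters (assumed)
    L = 1024  # sequence length
    T = 256   # diffusion denoising steps (assumed)

    ar_flops = 2 * P * L       # ~2 FLOPs/param/token, KV-cached decode
    dlm_flops = 2 * P * L * T  # each step re-processes all L tokens

    print(f"diffusion/AR decode ratio: {dlm_flops / ar_flops:.0f}x")  # == T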

heyitsguay · 4 months ago
Do you have references on adaptive methods for image recognition?
heyitsguay commented on Can Large Language Models Play Text Games Well? (2023)   arxiv.org/abs/2304.02868... · Posted by u/willvarfar
willvarfar · 6 months ago
It's been a background thought of mine for a while:

* create a basic text adventure (or MUD) with a very spartan, API-like representation

* use an LLM to embellish the description served to the user. With recent history in context, the LLM might even reference things the user asked previously, etc.

* have NPCs implemented as their own LLMs that are trying to 'play the game'. These might use the spartan API directly, like agents.

It's a fun thought experiment! (A rough sketch of the embellishment step is below.)

(An aside: I found that the graphical text adventure that I made for Ludum Dare 23 is still online! Although it doesn't render quite right in modern browsers.. things shouldn't have broken! But anyway https://williame.github.io/ludum_dare_23_tiny_world/)
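A minimal sketch of the embellishment step, where call_llm is a hypothetical stand-in for whatever chat API you'd use and the spartan room format is invented for illustration:

    import json

    def call_llm(prompt: str) -> str:  # hypothetical LLM call
        return "You stand in a torchlit hall; the oak door north is shut."

    def describe(room: dict, history: list) -> str:
        """Turn spartan, machine-readable room state into flavored prose."""
        prompt = (
            "Rewrite this room state as two sentences of second-person "
            "prose, consistent with the recent exchanges.\n"
            f"Room: {json.dumps(room)}\nRecent: {json.dumps(history[-3:])}"
        )
        return call_llm(prompt)

    print(describe({"name": "hall", "exits": {"north": "door(closed)"}},
                   ["look", "go north"]))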

heyitsguay · 6 months ago
I've done something along these lines! https://github.com/heyitsguay/trader

The challenge for me was consistency in translating free text from dialogs into classic, deterministic game state changes. But what's satisfying is that the conversations aren't just window dressing; they're part of the game mechanic.
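For flavor, a minimal sketch of one way to pin free text to deterministic changes (hypothetical names, not how the repo actually does it): force the model to emit a tiny JSON state delta and validate it before applying.

    import json

    ALLOWED = {"buy", "sell", "move", "say"}

    def call_llm(prompt: str) -> str:  # hypothetical stand-in
        return '{"action": "buy", "args": {"item": "rope", "qty": 1}}'

    def parse_delta(raw: str) -> dict:
        """Validate a model-produced state delta; reject anything off-schema."""
        delta = json.loads(raw)
        if delta.get("action") not in ALLOWED:
            raise ValueError(f"illegal action {delta.get('action')!r}")
        if not isinstance(delta.get("args"), dict):
            raise ValueError("args must be an object")
        return delta

    raw = call_llm("Player: 'I'll take some rope.' Emit a JSON delta.")
    print(parse_delta(raw))  # a deterministic change the engine can apply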

u/heyitsguay

Karma: 1071 · Cake day: January 5, 2017