joshuamoyers (u/joshuamoyers)

joshuamoyers commented on Teaching GPT-5 to Use a Computer prava.co/archon/... · Posted by u/Areibman

I really like this approach. Nice job!

> We also plan to compile solved steps into micro‑policies. If you're running something like a RPA task or similar workflow as before, you can simply run the execution locally (with archon-mini running locally) and not have to worry about the planning. Over time, the planner is a background teacher, not a crutch.

Conceptually, I really like this - why re-do the work of reasoning about an already solved task? Just do it again. For a plausibly large majority of tasks, this could speed things up considerably.
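
A minimal sketch of that "just do it again" idea, under loud assumptions: `plan_step` is a hypothetical stand-in for the expensive planner call, and the policy store is a plain dictionary keyed by task name and observed state. None of these names come from the article.

```python
from typing import Callable, Dict, List, Tuple

Action = str
# Key: (task name, observed state signature). Both are hypothetical labels.
PolicyKey = Tuple[str, str]


class MicroPolicyCache:
    """Replays previously solved steps; calls the planner only on a cache miss."""

    def __init__(self, planner: Callable[[str, str], List[Action]]) -> None:
        self._planner = planner
        self._policies: Dict[PolicyKey, List[Action]] = {}
        self.planner_calls = 0

    def actions_for(self, task: str, state: str) -> List[Action]:
        key = (task, state)
        if key not in self._policies:
            # Unseen step: pay for planning once and remember the result.
            self.planner_calls += 1
            self._policies[key] = self._planner(task, state)
        # Return a copy so callers cannot mutate the stored policy.
        return list(self._policies[key])


def plan_step(task: str, state: str) -> List[Action]:
    # Hypothetical stand-in for a slow, expensive reasoning call.
    return [f"click:{task}", f"verify:{state}"]


cache = MicroPolicyCache(plan_step)
first = cache.actions_for("export_report", "dashboard_loaded")
second = cache.actions_for("export_report", "dashboard_loaded")
```

The second lookup never touches the planner, which is where the speedup would come from. A real system would also need to invalidate a stored policy when verification fails.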

> In the future we hope to run a streaming capture pipeline similar to Gemma 3. Consuming frames at 20–30 fps, emitting actions at 5–10 Hz, and verifying state on each commit.

I love targets like this. It makes you tune the architecture and abstractions to push the boundary of what's possible with a traditional agent loop.

The salience heat map compression is a great idea. I think you could take this a step further and tune a model so that it compresses an image into a textual hierarchy of semantic/interactive elements. This is effectively what browser-use is doing, just using JavaScript instead of a vision model.

This seems like a task that would benefit from narrow focus. I'm aware of the "Bitter Lesson," but my intuition tells me that chaining together fit-for-purpose classifiers as inputs to an intelligent planning system is the way to go.

joshuamoyers commented on Dating Men in the Bay Area astralcodexten.com/p/your... · Posted by u/lukebechtel

joshuamoyers · 15 days ago

This was pretty wild. Veers deeply into broad generalizations that have the potential to be dangerous in some way I cannot name - but if you stop and consider each archetype as a vector along which you can accidentally trap yourself, it's a thought-provoking read at least. Unfortunately, it's also a list of undesirable damaged characters followed by some model of a "whole man" that is somehow infinitely attractive and stable. That's a lot of malarkey in my opinion. We're on a many-dimensional journey and all of us are some degree of lost. Some good guide markers in here though.

I think it does boil down to "try things a lot," especially creating real connections with other people, even though you will painfully fail many times. Drive yourself to have real conversations. Protect your health and keep yourself strong physically and mentally. That's a powerful base to be standing on. Then go find a blend of interest, purpose and duty - building a sense of dharma helps you wake up in the morning and move through the world feeling a little less "lost."

joshuamoyers commented on GPT-5: Overdue, overhyped and underwhelming. And that's not the worst of it garymarcus.substack.com/p... · Posted by u/kgwgk

joshuamoyers · 21 days ago

> For all that, GPT-5 is not a terrible model. I played with it for about an hour, and it actually got several of my initial queries right (some initial problems with counting “r’s in blueberries had already been corrected, for example). It only fell apart altogether when I experimented with images.

Spatial reasoning and world models are one aspect. Posting bicycle part memes does not a bad model make. The reality is it's cheaper than Sonnet and maybe around as good as Opus at a decent number of tasks.

> And, crucially, the failure to generalize adequately outside distribution tells us why all the dozens of shots on goal at building “GPT-5 level models” keep missing their target. It’s not an accident. That failing is principled.

This keeps happening recently. So many people want to take a biblically black-and-white stance on whether LLMs can get to human-level intelligence. See the recent interview with Yann LeCun (Meta Chief AI Scientist): https://www.youtube.com/watch?v=4__gg83s_Do

Nobody has any fucking idea. It might be a hybrid or a different architecture than current transformers, but with the rate of progress just within this field, there is absolutely no way you can make a prediction that scaling laws won't just let LLMs outpace the negative hot takes.

joshuamoyers commented on GPT-5: Overdue, overhyped and underwhelming. And that's not the worst of it garymarcus.substack.com/p... · Posted by u/kgwgk

Uehreka · 21 days ago

This is a genre of article I find particularly annoying. Instead of writing an essay on why he personally thinks GPT-5 is bad based on his own analysis, the author just gathers up a bunch of social media reactions and tells us about them, characterizing every criticism as “devastating” or a “slam”, and then hopes that the combined weight of these overtorqued summaries will convince us to see things his way.

It’s both too slanted to be journalism and not original enough to be analysis.

joshuamoyers · 21 days ago

100% agree. I feel like this is a symptom of Dead Internet Theory as well - as a negative take starts to spiral out of control, we start to get an absolute deluge of repurposed, directionally negative sound bites and it honestly feels like bot canvassing.

joshuamoyers commented on The current state of LLM-driven development blog.tolki.dev/posts/2025... · Posted by u/Signez

joshuamoyers · 21 days ago

> By being particularly bad at anything outside of the most popular languages and frameworks, LLMs force you to pick a very mainstream stack if you want to be efficient.

Almost like hiring and scaling a team? There are also benchmarks that specifically measure this (the Aider Polyglot benchmark is one such), and it's in theory a very temporary problem.

joshuamoyers commented on An engineer's perspective on hiring jyn.dev/an-engineers-pers... · Posted by u/pabs3

joshuamoyers · 21 days ago

> companies often over-index on crystallized knowledge over fluid intelligence.

another way to say this: focus on aptitude. in my hiring funnel, this is a core tenet. you need to be able to capture polyglots and systems thinkers. it's still pretty hard to design a process that balances this all very well. combine that with an absolute glut of applicants and you have a very challenging problem.

joshuamoyers commented on Did California's fast food minimum wage reduce employment? nber.org/papers/w34033... · Posted by u/lxm

DarkNova6 · 21 days ago

I fail to see how this is caused by minimum wages.

joshuamoyers · 21 days ago

It's not at all, imo. Franchised businesses are not in the habit of employing low-skill workers as a public service. This data is interacting with both COVID effects and infrastructure upgrade/rollover - in other words, it takes a while for companies to adopt affordable touch screen ordering systems, and it's been phased in at a ton of non-fast-food businesses (at least in my area) over the same period of time. Local health grocery store has touch screen ordering at their deli, as well as simultaneously going cashless. Most coffee shops too. Look at most international airports - almost all the kiosks have one or no attendants now.

joshuamoyers commented on Long-term exposure to outdoor air pollution linked to increased risk of dementia cam.ac.uk/research/news/l... · Posted by u/hhs

joshuamoyers · 21 days ago

Meanwhile, the current administration is gutting the Clean Air Act: https://www.npr.org/2025/07/29/nx-s1-5463771/epa-greenhouse-...

joshuamoyers commented on Agents built from alloys xbow.com/blog/alloy-agent... · Posted by u/summarity

joshuamoyers · a month ago

two good points there are very intuitive - a fresh perspective yields better results and once you are stuck (e.g. 80 iterations) it's better to just start fresh. i've seen the same thing anecdotally in coding sessions where context needs to be compacted multiple times. it's usually just better to start a fresh conversation and re-seed the basics in the conversation.
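
a rough sketch of that restart heuristic, not the method from the article: `attempt` and `flaky_attempt` are hypothetical stand-ins for one agent iteration, and the budget of 80 only echoes the example number above.

```python
from typing import Callable, Dict, List, Optional


def solve_with_restarts(
    attempt: Callable[[List[str]], Optional[str]],
    seed_context: List[str],
    max_iterations_per_run: int = 80,
    max_runs: int = 3,
) -> Optional[str]:
    """Retry within a run; once the budget is spent, restart from the seed only."""
    for _ in range(max_runs):
        # Fresh conversation, re-seeded with just the basics.
        context = list(seed_context)
        for _ in range(max_iterations_per_run):
            result = attempt(context)
            if result is not None:
                return result
            # Failed attempts accumulate in the context within a single run.
            context.append("failed attempt")
    return None


calls: Dict[str, int] = {"n": 0}


def flaky_attempt(context: List[str]) -> Optional[str]:
    # Hypothetical agent step: succeeds only on an uncluttered context,
    # and only after the first run's budget has been used up.
    calls["n"] += 1
    if calls["n"] > 3 and len(context) == 1:
        return "solved"
    return None


answer = solve_with_restarts(
    flaky_attempt, ["project basics"], max_iterations_per_run=3, max_runs=2
)
```

the first run burns its three iterations on a growing context, and the fourth call succeeds right after the reset.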

joshuamoyers commented on · Posted by u/joshuamoyers

joshuamoyers · 2 months ago

In light of the recent mild controversy around Andrew Ng's talk, I thought I'd submit this long-form article I wrote over the last few weeks. By contrast, it's my very tactical and specific experience using an agentic approach to high-throughput software development in a >1M LOC codebase. I've recently also been doing roughly half my work with Claude Code vs. Cursor. I do think Andrew Ng's take on product management (and also design in our case) being the new bottleneck at current ratios is correct. I'm not sure what the best solution is to that in our team's case, but we are definitely feeling it.