I think it does boil down to "try things a lot," especially creating real connections with other people, even though you will painfully fail many times. Drive yourself to have real conversations. Protect your health and keep yourself strong physically and mentally. That's a powerful base to be standing on. Then go find a blend of interest, purpose and duty - building a sense of dharma helps you wake up in the morning and move through the world feeling a little less "lost."
> We also plan to compile solved steps into micro‑policies. If you're running something like a RPA task or similar workflow as before, you can simply run the execution locally (with archon-mini running locally) and not have to worry about the planning. Over time, the planner is a background teacher, not a crutch.
Conceptually, I really like this - why re-do the work of reasoning about an already solved task? Just do it again. For some plausibly large majority of things, this could speed things up considerably.
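To make the idea concrete, here is a minimal sketch of what I imagine a micro-policy cache could look like. This is my own guess, not how archon-mini actually does it: `MicroPolicyCache`, `fake_planner`, and the `(verb, target)` action format are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical action format: (verb, target), e.g. ("click", "Export").
Action = tuple[str, str]


@dataclass
class MicroPolicyCache:
    """Replays stored action sequences; calls the planner only on a miss."""
    plan: Callable[[str], list[Action]]  # the expensive planner (assumed interface)
    policies: dict[str, list[Action]] = field(default_factory=dict)
    planner_calls: int = 0

    def actions_for(self, task: str) -> list[Action]:
        # Normalize case and whitespace so trivially different phrasings share a key.
        key = " ".join(task.lower().split())
        if key not in self.policies:
            self.planner_calls += 1
            self.policies[key] = self.plan(task)
        return self.policies[key]


def fake_planner(task: str) -> list[Action]:
    # Stand-in for a real planning model; returns a fixed plan.
    return [("open", "invoices"), ("click", "Export")]


cache = MicroPolicyCache(plan=fake_planner)
first = cache.actions_for("Export invoices")
second = cache.actions_for("export   invoices")  # cache hit, planner not called
```

A real version would need to verify the screen state before replaying each step and fall back to the planner when the UI has drifted, which is presumably where the hard part is.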
> In the future we hope to run a streaming capture pipeline similar to Gemma 3. Consuming frames at 20–30 fps, emitting actions at 5–10 Hz, and verifying state on each commit.
I love targets like this. They make you tune the architecture and abstractions to push the boundary of what's possible with a traditional agent loop.
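As a back-of-the-envelope illustration of what those numbers imply, the sketch below picks which captured frames would trigger an action when capture and action rates are decoupled. The function name and the fixed-stride scheme are my own assumptions, not anything described in the post.

```python
def action_frame_indices(fps: int, action_hz: int, n_frames: int) -> list[int]:
    """Indices of frames on which an action is emitted, using a fixed stride."""
    stride = max(1, round(fps / action_hz))
    return list(range(0, n_frames, stride))


# One second of capture at 30 fps with actions at 10 Hz: act on every 3rd frame.
indices = action_frame_indices(fps=30, action_hz=10, n_frames=30)

# Time budget per action at 10 Hz, in milliseconds.
budget_ms = 1000 / 10
```

At 10 Hz that leaves about 100 ms per action for perception, decision, and state verification combined, which is why the target forces architectural choices.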
The salience heat map compression is a great idea. I think you could take this a step further and tune a model so that it compresses an image into a textual hierarchy of semantic and interactive elements. This is effectively what browser-use is doing, just using JavaScript instead of a vision model.
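Roughly the kind of output I have in mind, as a sketch: a tree of elements rendered to indented text, with interactive elements numbered so a planner can refer to them. The `Element` type and the `[n] role "label"` format are assumptions for illustration, not browser-use's actual representation or what a tuned vision model would necessarily emit.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    role: str                      # e.g. "window", "button", "textbox"
    label: str = ""
    interactive: bool = False
    children: list["Element"] = field(default_factory=list)


def render(root: Element) -> str:
    """Render the tree as indented text, numbering interactive elements."""
    lines: list[str] = []
    next_id = 0

    def walk(el: Element, depth: int) -> None:
        nonlocal next_id
        tag = ""
        if el.interactive:
            tag = f"[{next_id}] "
            next_id += 1
        label = f' "{el.label}"' if el.label else ""
        lines.append("  " * depth + f"{tag}{el.role}{label}")
        for child in el.children:
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines)


page = Element("window", "Login", children=[
    Element("textbox", "Email", interactive=True),
    Element("textbox", "Password", interactive=True),
    Element("button", "Sign in", interactive=True),
])
text = render(page)
```

The planner would then see a few short lines of text instead of a full screenshot and could answer with something like "click 2".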
This seems like a task that would benefit from narrow focus. I'm aware of the "Bitter Lesson," but my intuition tells me that chaining fit-for-purpose classifiers together as inputs to an intelligent planning system is the way to go.