I don't submit things to Hacker News unless it's related to my favorite tool ever, literally, that I happened to have made. I made this out of being super lazy: I wanted my copilot (works in all AI editors) to run my UI while it's coding and validate it at the same time by using the apps. I don't know how to contain how good this is for me to use other than putting it here for people to look at. With Opus 4.5-4.6 it's extremely good. With GPT-5.3 it's still good, but you sometimes have to remind it to use "autonomo help" when it forgets how to use it correctly.
Anyway, please check it out if you're curious and want very fast, efficient, UI-driven validation (multiple apps, web and desktop at the same time, agnostic) while you vibe. I keep using it every day, but I'm still waiting for something to make it obsolete.
web page:
https://sebringj.github.io/autonomo/
github:
https://github.com/sebringj/autonomo
I'm not sure of the usefulness at the moment. Decorating every component with `useFillHandler('Login.Email', setEmail, { hint: 'Email input' });` is the same effort/value as decorating with `data-testid="email"`.
I can see myself using it if I didn't have to decorate, e.g. by having get_state return the React component tree and then letting the AI figure out by itself which one is the "Login.Email" input. For example: `<MyApp><LoginScreen><input name=email/> ...`
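To make the suggestion concrete, here is a rough sketch of what "no decoration" discovery could look like. Everything in it is an assumption for illustration: the `TreeNode` shape, the `findInputs` helper, and the sample tree are hypothetical and are not autonomo's actual `get_state` output or API.

```typescript
// Hypothetical shape of a serialized component tree.
// This is NOT autonomo's real get_state output, just an illustration.
interface TreeNode {
  type: string;                   // component or element name, e.g. "LoginScreen" or "input"
  props: Record<string, string>;  // serializable props only
  children: TreeNode[];
}

interface Candidate {
  path: string;    // derived identifier, e.g. "MyApp.LoginScreen.email"
  node: TreeNode;
}

// Walk the tree depth-first and derive a dotted path for every <input>,
// built from its component ancestors (capitalized names) plus its name prop.
function findInputs(node: TreeNode, ancestors: string[] = []): Candidate[] {
  const isComponent = /^[A-Z]/.test(node.type);
  const nextAncestors = isComponent ? [...ancestors, node.type] : ancestors;
  const found: Candidate[] = [];

  if (node.type === "input" && node.props.name) {
    found.push({ path: [...ancestors, node.props.name].join("."), node });
  }
  for (const child of node.children) {
    found.push(...findInputs(child, nextAncestors));
  }
  return found;
}

// Sample tree mirroring <MyApp><LoginScreen><input name=email/> ...
const tree: TreeNode = {
  type: "MyApp",
  props: {},
  children: [
    {
      type: "LoginScreen",
      props: {},
      children: [
        { type: "input", props: { name: "email" }, children: [] },
        { type: "input", props: { name: "password", type: "password" }, children: [] },
      ],
    },
  ],
};

const inputs = findInputs(tree);
console.log(inputs.map((c) => c.path));
```

The agent would then receive paths like `MyApp.LoginScreen.email` and decide on its own which one corresponds to the login email field, instead of relying on a hand-written handler per component.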
Also, maybe I'm misunderstanding what this library is for, but is this really a good application of AI? It reminds me of Gherkin + Cypress, but without the test actually being embedded in the code. I feel like I'd rather use AI to write the BDD test in Gherkin than prompt the AI to figure out what test behavior I actually want.
I realize I'm coming off a bit pessimistic, but I'm just trying to explain where my thoughts are so you can hopefully clarify things for me, because I feel like I'm missing something about what you're trying to accomplish with this project.
This sounds neat in practice, but what does this mean exactly?
Most devs I know integrate Opencode/Codex/etc through something like Playwright-MCP which handles testing, physically clicking on elements, and validation through a combination of DOM inspection, JS evals, and screenshots.
It's not the fastest thing in the world because it still has to spin up a browser (headless or otherwise) - but I don't see how that's avoidable if you want genuine E2E testing.