Show HN: AI agent that runs real browser workflows

fidorka · 2 days ago

Cool demo. The tricky bit with browser workflow agents is figuring out which workflows to automate in the first place. Most people don't even realize they're doing the same thing over and over - they just do it.

I've been building MemoryLane (https://github.com/deusXmachina-dev/memorylane) which comes at this from the other side - it records screen activity, spots repeated patterns with AI, and then tells you "hey you keep doing this, want to automate it?" Works as an MCP plugin for Claude/Cursor.

Feels like pattern detection (finding what to automate) + browser agents like yours (actually doing the automation) is the right combo. Are you thinking about the discovery side at all, or mostly focused on execution?

heavymemory · 2 days ago

Interesting. Part of why I built this was to avoid screen capture as the control layer. Once you’re taking screenshots, guessing what to click, moving the mouse, and repeating, it gets slow and brittle fast. Here the workflow is just described in text, executed in the browser, and saved for reuse.

abraxas · 2 days ago

I was looking for a similar product/project the other day. Alas, my need is a Linux-native version. You may want to consider it, as Mac seems to be overserved by the agent harness supply while Linux is the opposite.

heavymemory · 2 days ago

Linux and Windows support is on the way. I’ve designed it in a decoupled way, so it should be straightforward.

Just need to see if people find this version useful

june-jule · 2 days ago

Interesting demo. How are you thinking about prompt injection and security with web agents? I've been facing this as well.

heavymemory · 2 days ago

Prompt injection is the same problem all agents face: ChatGPT Atlas, Claude Cowork, OpenClaw, all of them. It's a known unsolved problem across the industry.

I mitigate it by giving the agent a fixed action set (no scripts, no direct API calls) and breaking tasks into focused subtasks so no single agent has broad scope. The LLM prioritises its own instructions over page content, but if someone did manage to hijack it, the agent could interact with your authenticated sessions. Everything's visible in real time though, and all actions are logged, so you can see exactly what it's doing and kill it.
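
For readers who want something concrete, here is a rough sketch of what a fixed action set with logging and a kill switch could look like. This is not the project's actual code. The action names, `GuardedExecutor`, and `ActionRejected` are all hypothetical, and a real executor would drive a browser where this one just returns the validated request.

```python
import threading
from enum import Enum


class Action(Enum):
    """Hypothetical closed set of operations an agent may request."""
    NAVIGATE = "navigate"
    CLICK = "click"
    TYPE_TEXT = "type_text"
    READ_TEXT = "read_text"


class ActionRejected(Exception):
    """Raised when a request is outside the allowed set or the run was stopped."""


class GuardedExecutor:
    """Validates every requested action before anything touches the browser."""

    def __init__(self, allowed):
        self.allowed = frozenset(allowed)  # narrow scope for one subtask
        self.log = []                      # every request, accepted or not
        self.stop = threading.Event()      # kill switch a human can set

    def request(self, name, target):
        if self.stop.is_set():
            self.log.append(("stopped", name, target))
            raise ActionRejected("run was stopped")
        try:
            action = Action(name)          # unknown names are not in the enum
        except ValueError:
            self.log.append(("rejected", name, target))
            raise ActionRejected(f"unknown action: {name!r}") from None
        if action not in self.allowed:
            self.log.append(("rejected", name, target))
            raise ActionRejected(f"{name!r} is not allowed in this subtask")
        self.log.append(("accepted", name, target))
        return action, target              # a real executor would act here
```

The point of the enum is that something like "run_script" simply has no representation, so a hijacked model can ask for it but the request never reaches the browser. Giving each subtask its own `allowed` set is one way to keep any single agent's scope small.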

Practically speaking, I use it much like people use Zapier or n8n: you set up specific workflows and make sure you're only pointing it at sites you trust. If you're sending it to random unknown websites then yeah, there's more risk.

But even then, an attacker would need to know what apps you're authenticated with and what data the agent has access to. The chances of something actually happening are pretty low, but the risk is there. No one's fully solved this yet.
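
The "only point it at sites you trust" rule above could also be enforced in code rather than by habit. A minimal sketch, assuming a hand-maintained allowlist; `TRUSTED_HOSTS` and `is_trusted` are made-up names for illustration, not part of any tool mentioned in this thread.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; in practice this would come from the workflow's config.
TRUSTED_HOSTS = frozenset({"example.com", "app.example.org"})


def is_trusted(url, trusted=TRUSTED_HOSTS):
    """Return True only for HTTPS URLs on a trusted host or one of its subdomains."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Match the exact host or a dot-separated subdomain, so that
    # "example.com.evil.net" and "evilexample.com" do not slip through.
    return any(host == t or host.endswith("." + t) for t in trusted)
```

A check like this would run before every navigation. It narrows where injected content can come from, but does nothing about a trusted site that itself serves hostile text.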