Show HN: We're building a desktop app for browser-based AI agents


What's up HN!

This is Jared and Art. We met on HN and started building together.

Over the last few months we've been thinking a lot about how AI agents are going to impact the future. We want agents to be something that's actually useful for normal people as well as the 10x'ers. This led us to build Meha, our first swing at our vision! We saw OpenAI release Operator, then we said f*k it, let's post.

Meha is a desktop app that uses your Chrome browser to execute tasks in the background. It controls your installed Chrome browser and uses LLMs with Playwright to plan and execute the actions needed to accomplish your task. You get to see each planning step the bot takes and have access to its long-term memory.

Meha also uses its own file system and can export files for download. Another thing we've been focused on is multi-agent workflows: Meha can run many bots at the same time. One of the reasons we can ship this for free in the meantime is how cheap the agents are, but we are planning to have a Pro version for power users. We prefer not to raise since we're against VC funding.

We have been influenced by a lot of concepts in probabilistic robotics and RL to develop a fairly robust 'agentic' framework. As well as an algorithm for efficiently converting/compressing large html pages into a semantic format. If you're interested, let us know and we will open source this ASAP in an SDK (it will work with all OpenAI API spec LLMs and with llama.cpp).

We're currently in beta, still figuring out what this product will become, and super stoked! Let us know what you think. To get access to Meha, we have download links on our Discord (both macOS and Windows are available). Please give us all the feedback/criticism (even if you hate AI).

Link to Meha: https://meha.ai

Looked through their privacy policy, and they state they collect and use basically everything they can, from your browser & system metadata to the content you share and/or create. Not that different from every other attempt in the frothy AI space, but a real turn-off and hard no for me.

jawerty · a year ago

Thank you for the feedback. Personally, besides using our API server, we would like to find another way to deploy for anyone who has an issue with this or wants to run everything locally (not just the client). Also, I think an OSS plug-and-play version where you could enter your API keys locally would help us ship to more devs. Would you be interested in this?

imarkphillips · 10 months ago

I'm so impressed with the concept of this agent but sorry, I can't have you accessing all my corporate data and systems because I access them via browser.

Perhaps you could create both a Public and Corporate version of the extension, like Copilot does. The Corporate version could have access to all browser data but not share it beyond the bounds of the company.

1shooner · a year ago

Some analysis I've been reading on the implications of DeepSeek says that model optionality is probably here to stay. If so, I think incorporating model choice would be a valuable aspect of this kind of product. Conversely, I agree with parent: I'm not installing this software with that privacy policy in place.

idiotsecant · a year ago

OP, any comment on this?

stormfather · a year ago

> As well as an algorithm for efficiently converting/compressing large html pages into a semantic format.

For the love of humanity please open source this. This seems tremendously useful by itself.

pavelfeldman · 10 months ago

There is an open source alternative that might be even better: https://playwright.dev/docs/api/class-locator#locator-aria-s....

Oh damn, I will definitely look into open sourcing it and making it an SDK.

stormfather · 10 months ago

Awesome! I write LLM-powered scrapers and stuff all the time, and one of the biggest pain points is that HTML is full of so much crap that isn't meaningful and overwhelms the context. And being a data science guy, idk how to solve this.
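To make the pain point concrete, here is a minimal sketch of one common workaround: drop `script`/`style` content and keep only visible text plus a few interactive or structural tags. This is not Meha's algorithm (which is not public in this thread); the tag and attribute lists and the names `PageReducer` and `reduce_html` are assumptions chosen for illustration, using only Python's standard library.

```python
from html.parser import HTMLParser

# Assumed lists for illustration; tune them for the pages you scrape.
SKIP_TAGS = {"script", "style", "noscript", "svg", "template"}  # content dropped entirely
KEEP_TAGS = {"a", "button", "input", "select", "textarea", "h1", "h2", "h3", "li"}
KEEP_ATTRS = ("href", "name", "type", "placeholder", "aria-label", "value")


class PageReducer(HTMLParser):
    """Collects visible text and a compact marker for each kept tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in KEEP_TAGS:
            return
        # Keep only the attributes an agent is likely to need.
        kept = [f'{k}="{v}"' for k, v in attrs if k in KEEP_ATTRS and v]
        self.lines.append(" ".join([f"[{tag}"] + kept) + "]")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self._skip_depth:
            self.lines.append(text)


def reduce_html(html: str) -> str:
    """Return a compact, line-oriented view of the page for an LLM prompt."""
    parser = PageReducer()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{color:red}</style><script>var x=1;</script></head>"
        '<body><h1>Flats</h1><p>Hello   world</p><a href="/next" class="btn">Next</a>'
        "</body></html>"
    )
    print(reduce_html(sample))
```

This only trims markup; it does nothing about pages that render their content with JavaScript, where you would need the live DOM from the browser instead of the raw HTML.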

skeeter2020 · a year ago

Kuinox · a year ago

I asked it to go on seloger.com, to find "some flats on paris below 400k". It went to a specific district of Paris and didn't set a price criterion, then responded with how I could do it myself.

I then asked it to create a CSV of the first 100 flats matching my criteria. It created only 3 entries, purely hallucinated.

artabra · a year ago

We'll take a look and see if we can get those prompts working. Thanks for letting us know!

arjunchint · 10 months ago

Hey, I am also building in this space and launched rtrvr.ai, but we went the route of a Chrome Extension so people don't have to worry about installing random software on their devices [also the reason that I am hesitant to try this out].

But let me know your thoughts on rtrvr.ai. Looks like we are targeting the same use cases of automation, scraping, and research?

Hi everyone, this is Art!

Happy to hear all the thoughts from those who try the app out! Even if you just have ideas about how agents might look in their final form, there are so many avenues this tech can take, and we have a ton of wild ideas we'll be building, so stay tuned. :D

iiJDSii · a year ago

Very cool! Any video demos for sample tasks? I didn't come across any on the website (browsing on mobile).

Those are still in the cooker; we'll throw them up ASAP once they're ready.

Some demos we will have are:

- Logging into Twitter and tweeting

- Finding information on Google Maps about any nearby business, whether that's for leads or for local restaurant options.

- Scraping anything from Wikipedia, like current events etc.

- And more!

Those are good ones. I've fiddled with similar systems before. Do you have a rough success rate? I know they can be finicky, especially as you execute through a chain-of-thought action plan, or however you're doing it.

sky2224 · a year ago

Interesting idea. With the web scraping utility, do I need to specify which websites I want the API to scrape from, or do I essentially just say, "hey I want this data, go get it"?

If it's the latter, how do you go about making sure you're not about to download malicious data to my machine?

Great question! Right now you can do both. It does work better if you simply enter the URL for your task.

For the URL generation we do, we have safety checks on the URLs; however, they live only in the prompting. I would love to hear what sort of safety suggestions you have and/or concerns about this sort of experience. Right now we're still figuring out how best to enable people to utilize agents safely.
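As one possible suggestion, a check enforced in code is harder to talk the model out of than one that exists only in the prompt. Below is a minimal sketch of such a gate, assuming the agent passes every generated URL through a function before navigating. The name `is_url_allowed` and the optional `allowed_hosts` parameter are hypothetical, and this is not a description of how Meha works.

```python
import ipaddress
from typing import Optional, Set
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # rejects file:, javascript:, data:, etc.


def is_url_allowed(url: str, allowed_hosts: Optional[Set[str]] = None) -> bool:
    """Return True only if the agent may navigate to this URL.

    allowed_hosts (hypothetical parameter): when given, the host must equal
    one of these domains or be a subdomain of one.
    """
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:  # malformed URL, e.g. an unterminated IPv6 literal
        return False

    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    if parts.username or parts.password:  # credentials embedded in the URL
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False

    # Literal IP addresses: block loopback, private, and link-local ranges.
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # not an IP literal, so it is a hostname
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        return False

    if allowed_hosts is not None:
        return any(host == h or host.endswith("." + h) for h in allowed_hosts)
    return True


if __name__ == "__main__":
    for candidate in (
        "https://en.wikipedia.org/wiki/Paris",
        "file:///etc/passwd",
        "http://127.0.0.1:8080/admin",
    ):
        print(candidate, is_url_allowed(candidate))
```

This only inspects the URL string. It does not resolve DNS, so a public hostname that points at an internal address would still pass, and it says nothing about whether a downloaded file is malicious; those would need separate checks.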