"One major caveat is that Project Mariner only works on a Chrome browser's foremost active tab, which means you can't use your computer for other things while the agent works in the background – you need to watch Gemini slowly click around."
Web / GUI agent implementations will have to be moved off the local device to ever be useful, otherwise they block the user's machine. I imagine eventually apps using web / GUI agents internally may abstract away the "browsing live view" entirely - instead of having users watch an agent work in real-time, the agent would run asynchronously in the cloud and just return the final outcome or report.
I'm working on an API for AI agent virtual desktops, so thinking through this a lot currently! https://www.agentstation.ai/
I just worked as an independent contractor / self-employed - didn't seem worth it for me to set up a company as I didn't have expenses to declare for tax purposes. As an independent contractor you just file Canadian taxes, it's pretty easy.
I think vacation policy is all over the map - most probably won't recognize Canadian holidays, but if it's a smaller startup they may be more flexible. Also worth keeping in mind that US companies offer fewer vacation days on average than Canadian companies.
Overall I would recommend it! Pay is better, complexities are not so bad.
Are you planning to eventually package this up? I'd love to have a Chrome extension to hide submissions with blocklist keywords or submission types (especially the Apple releases and the most egregious startup pornography...).
I am working on an API for devs to build Zoom agents currently, so definitely banking on this being the future! https://www.agentstation.ai/
We are providing the infrastructure for AI agent computers at scale, with an API to drive agent actions. You can use prompted workflows (à la the Claude Computer Use API), or use your own models to drive actions within the workstation.
Is this what you were thinking? Happy to answer any questions on it!
0. Some folks don't care about API Keys, that's okay! But for those of you who did respond and do care, we are updating our design based on your feedback.
1. When we got to work on making our API Keys, we looked for an obvious standard but didn't find one. So we decided on our approach quickly and put together uuidkey in an afternoon. We knew it was not going to be everyone’s preferred design, but we wrote up the article to share our thought process as well as generate some marketing. We are happy to see that the article did well and we got feedback! :)
2. The ability to double-click to copy, which was lost with the addition of dashes, was more important to developer commenters than we thought it'd be (even if only needed once). We heard you, so we've already updated https://github.com/agentstation/uuidkey to support a `WithoutHyphens` option for the `Encode` function so you can generate keys without dashes.
3. Some folks were worried that our key has fewer bits of entropy after encoding than the original UUID. It doesn't: Crockford base32 encoding is a 1:1 mapping, so no entropy is lost.
4. One quality piece of feedback pointed out that the UUID spec warns against using UUIDv7 (only 74 bits of entropy) and even UUIDv4 (standard 122 bits of entropy) alone for API Keys. We plan on still supporting UUIDv7 and UUIDv4, but will add additional entropy bits to follow the official recommendation.
5. Lots of commenters like prefixes, which make it easier to identify & search for keys (particularly to ensure they don’t get accidentally committed to a repo). We plan to add an option for that. Worth mentioning that a few folks pointed us to GitHub's auth token implementation that includes prefixes, which is a pretty great standard - https://github.blog/engineering/platform-security/behind-git...
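To make the entropy, extra-bits, and prefix points above concrete, here is a rough Python sketch. It is not the uuidkey implementation or its key format: the helper names, the `demo` prefix, the `_` separator, and the choice of 16 extra random bytes are all assumptions made up for illustration. It only shows that Crockford base32 round-trips a UUID without losing bits, and how a prefix and extra randomness could be layered on top.

```python
import secrets
import uuid

# Crockford base32 alphabet (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_int(n: int, length: int) -> str:
    """Encode a non-negative integer as exactly `length` base32 characters."""
    chars = []
    for _ in range(length):
        chars.append(CROCKFORD[n & 31])  # lowest 5 bits become one character
        n >>= 5
    if n:
        raise ValueError("value does not fit in the requested length")
    return "".join(reversed(chars))


def decode_int(s: str) -> int:
    """Inverse of encode_int: rebuild the integer 5 bits at a time."""
    n = 0
    for ch in s:
        n = (n << 5) | CROCKFORD.index(ch)
    return n


def encode_uuid(u: uuid.UUID) -> str:
    # 128 bits need 26 characters (26 * 5 = 130 bits, top 2 bits are zero).
    return encode_int(u.int, 26)


def decode_uuid(s: str) -> uuid.UUID:
    return uuid.UUID(int=decode_int(s))


def make_key(prefix: str, extra_bytes: int = 16) -> str:
    """Hypothetical key layout: <prefix>_<encoded UUIDv4><encoded extra entropy>."""
    extra = int.from_bytes(secrets.token_bytes(extra_bytes), "big")
    extra_len = (extra_bytes * 8 + 4) // 5  # characters needed, rounded up
    return f"{prefix}_{encode_uuid(uuid.uuid4())}{encode_int(extra, extra_len)}"


if __name__ == "__main__":
    u = uuid.uuid4()
    assert decode_uuid(encode_uuid(u)) == u  # lossless round trip
    print(make_key("demo"))
```

Since every UUID decodes back to itself, the encoding can't be throwing entropy away. The key from `make_key` has no dashes, so it stays double-click copyable, and the fixed prefix is something a secret scanner can search for.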
Thanks again for reading, debating, and giving us some good advice! We want a product that feels good for developers to use. :D