lawrencechen (u/lawrencechen)

lawrencechen commented on Show HN: Skill that lets Claude Code/Codex spin up VMs and GPUs cloudrouter.dev/... · Posted by u/austinwang115

killbot_2000 · 16 hours ago

Interesting approach. I've been going the opposite direction - building a local orchestration platform where 70+ agents share resources on my own machine. The isolation problem you mention is real. I've found that for many dev tasks, local-first avoids the latency and cost of cloud VMs, though GPU workloads are a different story. Curious how you handle agent state persistence across VM sessions?

lawrencechen · 14 hours ago

I also personally prefer running agents locally instead of in the cloud. For some reason, it feels easier to steer Claude Code when it's running in my terminal than to steer something in the cloud. Maybe part of it is the latency from typing into ssh'd TUIs, which is something a GUI can solve... but I still feel more at home with Claude Code/codex in the CLI than with something like Claude Code Web/Codex Cloud. Part of it is likely reliability and the time until the first AI response token I can actually see. But local has tradeoffs: conflicting ports and other resources, lag (maybe it's time to upgrade my M1 Max 64GB...), and slightly more network latency on LLM calls.

Curious to hear more about your local orchestration platform. How did you solve resource sharing (mainly ports for web stuff tbh)? Or is it more intra-task vs inter-task parallelism?
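
For the port-conflict part specifically, here is a minimal sketch of one common workaround: let the OS pick unused ports by binding to port 0, then hand each agent its own port. The function name `assign_ports` and the agent names are hypothetical, made up for illustration, and this is not how either platform in this thread does it.

```python
import socket


def assign_ports(agent_names, host="127.0.0.1"):
    """Return a {agent_name: port} mapping of currently free TCP ports.

    Hypothetical helper for illustration. Binding to port 0 asks the OS
    for any unused port. All sockets stay open until every agent has a
    port, so the same port cannot be handed out twice in one call.
    """
    sockets = []
    ports = {}
    try:
        for name in agent_names:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind((host, 0))  # port 0 means "OS, choose for me"
            sockets.append(sock)
            ports[name] = sock.getsockname()[1]
        return ports
    finally:
        # Release the ports so the agents' dev servers can bind them.
        # Caveat: another process could grab a port in the gap between
        # this close and the agent's own bind.
        for sock in sockets:
            sock.close()


if __name__ == "__main__":
    print(assign_ports(["agent-a", "agent-b", "agent-c"]))
```

Each agent would then start its web server on its assigned port (for example via a `PORT` environment variable, if the dev server reads one).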

lawrencechen commented on Show HN: Skill that lets Claude Code/Codex spin up VMs and GPUs cloudrouter.dev/... · Posted by u/austinwang115

kovek · 19 hours ago

I think railway deserves a mention here: https://docs.railway.com/ai/mcp-server

lawrencechen · 18 hours ago

Railway is awesome! Pretty different use cases though - Railway's MCP is for deploying and managing persistent services (git-push-to-deploy). CloudRouter is about ephemeral sandboxes: the agent spins up a throwaway VM, does its work, and tears it down.

We are definitely inspired by Railway though!
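
To make the "spin up, do work, tear down" lifecycle concrete, here is a rough sketch using a Python context manager. It is not CloudRouter's API: a temporary directory stands in for the throwaway VM, and `ephemeral_sandbox` and `run_task` are hypothetical names for illustration only.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_sandbox(prefix="sandbox-"):
    """Yield a throwaway workspace and always tear it down afterwards.

    Assumption: a temp directory plays the role of the VM here. A real
    sandbox would provision and destroy a remote machine instead.
    """
    with tempfile.TemporaryDirectory(prefix=prefix) as workdir:
        # Teardown runs when the caller's block exits, even on an exception.
        yield Path(workdir)


def run_task():
    """Do some work inside the sandbox and report whether it was cleaned up."""
    with ephemeral_sandbox() as box:
        result_file = box / "result.txt"
        result_file.write_text("done")
        output = result_file.read_text()
        sandbox_path = box
    # The workspace is gone by the time we get here.
    return output, sandbox_path.exists()


if __name__ == "__main__":
    print(run_task())
```

The point of the pattern is that nothing persists after the block exits, which is the opposite of a long-lived deployed service.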

lawrencechen commented on Show HN: Skill that lets Claude Code/Codex spin up VMs and GPUs cloudrouter.dev/... · Posted by u/austinwang115

killingtime74 · 20 hours ago

Myself, I already pre-provisioned a Kubernetes cluster, and the agent just makes new manifests and deploys there. Less dangerous, fewer things for it to fail at. The networking is already set up. The costs are known/fixed (unless you autoscale in the cloud). It's much faster to deploy.

lawrencechen · 19 hours ago

Fair enough. Is this mainly for running services, or do you use it for the dev loop too?

lawrencechen commented on Show HN: Skill that lets Claude Code/Codex spin up VMs and GPUs cloudrouter.dev/... · Posted by u/austinwang115

hxseven · 20 hours ago

Thanks for sharing this interesting project and approach!

One suggestion for improvement: Add some more info to your website/GitHub about the need for a provider and which providers are compatible. It took me a bit to figure that out because there was no prominent info about it. Additionally, none of the demos showed a login or authentication part. To me, it seemed like the VMs just came out of nowhere. So at first, I thought "Cloudrouter" was a project/company that gave away free VMs/GPUs (e.g. free tier/trial thing). But that seemed too good to be true. Later, I noticed the e2b.app domain and then I also found the little note way down at the bottom of the site that says "Provider selection" and "Use E2B provider (default)". Then I got it. However, I should mention that I don't know much about this whole topic. I hadn't heard of E2B or Modal before. Other people might find it more clear.

For those that are wondering about this too, you will need to use a provider like https://e2b.dev/ or https://modal.com/ to use this skill, and you pay them based on usage time.

lawrencechen · 19 hours ago

Right now, all usage is routed through us, so `cloudrouter login` is required. We plan on adding bring-your-own cloud/key, but making that fast to set up is hard since we need to pre-build a template that includes VNC, browser, VSCode, worker daemon, etc.

lawrencechen commented on Anthropic's original take home assignment open sourced github.com/anthropics/ori... · Posted by u/myahio

languid-photic · 25 days ago

Naively tested a set of agents on this task.

Each ran the same spec headlessly in their native harness (one shot).

Results:

    Agent                        Cycles     Time
    ─────────────────────────────────────────────
    gpt-5-2                      2,124      16m
    claude-opus-4-5-20251101     4,973      1h 2m
    gpt-5-1-codex-max-xhigh      5,402      34m
    gpt-5-codex                  5,486      7m
    gpt-5-1-codex                12,453     8m
    gpt-5-2-codex                12,905     6m
    gpt-5-1-codex-mini           17,480     7m
    claude-sonnet-4-5-20250929   21,054     10m
    claude-haiku-4-5-20251001    147,734    9m
    gemini-3-pro-preview         147,734    3m
    gpt-5-2-codex-xhigh          147,734    25m
    gpt-5-2-xhigh                147,734    34m

Clearly none beat Anthropic's target, but gpt-5-2 did slightly better in much less time than "Claude Opus 4 after many hours in the test-time compute harness".

lawrencechen · 24 days ago

codex cli + gpt-5-2-codex-xhigh got to 1606 cycles with the prompt "beat 1487 cycles. go." in ~53 minutes.

lawrencechen commented on Creators of Tailwind laid off 75% of their engineering team github.com/tailwindlabs/t... · Posted by u/kevlened

lawrencechen · a month ago

They were perfectly positioned to build a Lovable/Bolt/Replit back in the day... might not be too late now either.

They could sell training data too. Though, UIs are relatively solved. But great UIs and criticizing UIs aren't.

Learned a lot from Refactoring UI, and I know (from trying) that it's impossible to make a code review bot based on out-of-the-box SOTA models today. Vision capabilities are lacking here, and I can see demand for more data here. And Adam's taste likely fits well here.

lawrencechen commented on Show HN: Cancer diagnosis makes for an interesting RL environment for LLMs · Posted by u/dchu17

lawrencechen · 3 months ago

I wonder if navigation plays a significant role in performance. If you just randomly select 15 frames (presumably with interesting pixels), will the model perform similarly well?

lawrencechen commented on Show HN: I made a heatmap diff viewer for code reviews 0github.com... · Posted by u/lawrencechen

smcleod · 3 months ago

When you first start the app, it asks you to log in using GitHub before you see anything else.

lawrencechen · 3 months ago

The cmux desktop app currently requires signing in to GitHub. We will build out better support for local repositories and remove the sign-in requirement soon.