I have little doubt where things are going, but the irony of the way they communicate versus the quality of their actual product is palpable.
Claude Code (the product, not the underlying model) has been one of the buggiest, least polished products I have ever used. And it's not exactly rocket science to begin with. Maybe they should try writing slightly less than 100% of their code with AI?
That's a bummer. I was looking forward to testing this, but that seems pretty limiting.
My current solution uses Tailscale with Termius on iOS. It's a pretty robust solution so far, except for the actual difficulty of reading/working on a mobile screen. But for the most part, input controls work.
My one gripe with Termius is that I can't put text directly into stdin using the default iOS voice-to-text feature baked into the keyboard.
I’ve been doing this for a while [1], but ultimately settled on building a thin transport layer for Telegram to accept and return media, with persistent channels, vastly improved messaging UX, etc., and ended up turning this into a ‘claw with a heartbeat and SOUL [2].
I've been using email and the Cloudflare email router. You don't get the direct feedback of a terminal, but it's much easier to read what's happening in HTML-formatted email.
It also feels kind of nice to just fire off an email and let it do its thing.
Exactly my experience. I know they vibe code features and that’s fine, but it looks like they don’t do proper testing, which is surprising to me because all you need is a bunch of cheap interns to do some decent enough testing.
No, there is a wide gap between good and bad testers. Great testers are worth their weight in gold and delight in ruining programmers' days all day long.
IMO not a good place to skimp and a GREAT place to spend for talent.
They brought down production because the version string was changed incorrectly to add an extra date. That would have been picked up in even the most basic testing since the app couldn't even start.
First of all /remote-control in the terminal just printed a long url. Even though they advertise we can control it from the mobile app (apparently it should show a QR code but doesn't). I fire up the mobile app but the session is nowhere to be seen. I try typing the long random URL in the mobile browser, but it simply throws me to the app, but not the session. I read random reddit threads and they say the session will be under "Code", not "Chats", but for that you have to connect github to the Claude app (??, I just want to connect to the terminal Claude on my PC, not github). Ok I do it.
Now even though the session is idle on the pc, the app shows it as working... I try tapping the stop button, nothing happens. I also can't type anything into it. Ok I try starting a prompt on the pc. It starts the work on the PC, but on the mobile app I get a permission dialog... Where I can deny or allow the thing that actually already started on the pc because I already gave permission for that on the PC. And many more. Super buggy.
I wonder if they let Claude write the tests for their new features... That's a huge pitfall. You can think it works and Claude assures you all is fine but when you start it everything falls apart because there are lots of tests but none actually test the actual things.
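To make that pitfall concrete, here is a minimal Python sketch. Everything in it is hypothetical (the `parse_version` function, the version format, the dated suffix); none of it is taken from Claude Code. The first test passes no matter what the real code does because it only exercises a mock; the second one calls the function and would catch a malformed version string.

```python
import re
from unittest import mock

# Hypothetical format for illustration: MAJOR.MINOR.PATCH with no suffix.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(text):
    """Return (major, minor, patch) or raise ValueError on a malformed string."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"malformed version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def test_vacuous():
    # Looks like a test, but it only checks that a mock returns what it was
    # told to return. The real parse_version is never called.
    fake_parse = mock.Mock(return_value=(1, 2, 3))
    assert fake_parse("1.2.3-2026-01-01") == (1, 2, 3)


def test_real():
    # Exercises the real function on a good input and on a bad one.
    assert parse_version("1.2.3") == (1, 2, 3)
    try:
        parse_version("1.2.3-2026-01-01")
    except ValueError:
        pass
    else:
        raise AssertionError("malformed version string was accepted")
```

Both tests are green, so the count of passing tests tells you nothing; only the second one would turn red if `parse_version` broke.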
I'm willing to bet most of their libraries are definitely vibe coded. I'm using the claude-agent-sdk and there are quite a few bugs and some weird design decisions. And looking through the actual python code it's definitely not what I would classify 'best practice'. Bunch of imports in functions, switching on strings instead of enums, etc.
I had to downgrade to an earlier release because an update introduced a regression where they weren't handling all of their own event types.
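For what it's worth, this is roughly what I mean by switching on enums instead of strings. It's a minimal sketch with made-up event names and handlers (none of these are the SDK's real types); the point is that an enum lets you check up front that every event type has a handler, instead of discovering a gap at runtime.

```python
from enum import Enum


class EventType(Enum):
    # Hypothetical event names for illustration only.
    TEXT = "text"
    TOOL_USE = "tool_use"
    RESULT = "result"


def handle_text(payload):
    return f"text: {payload}"


def handle_tool_use(payload):
    return f"tool call: {payload}"


def handle_result(payload):
    return f"done: {payload}"


HANDLERS = {
    EventType.TEXT: handle_text,
    EventType.TOOL_USE: handle_tool_use,
    EventType.RESULT: handle_result,
}


def missing_handlers(handlers):
    """Return the event types that have no handler registered."""
    return [event for event in EventType if event not in handlers]


def dispatch(raw_type, payload, handlers=HANDLERS):
    # EventType(raw_type) raises ValueError for an unknown string, so a typo
    # or a new wire value fails loudly instead of being silently ignored.
    event_type = EventType(raw_type)
    return handlers[event_type](payload)


# Fail at import time if someone adds an EventType without a handler.
if missing_handlers(HANDLERS):
    raise RuntimeError(f"unhandled event types: {missing_handlers(HANDLERS)}")
```

With plain strings, adding a new event type and forgetting a branch is invisible until that event shows up in production.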
> - You can't interrupt Claude (you press stop and he keeps going!)
This is normal behavior on desktop; sometimes it's in the middle of something? I also assume there's some latency.
> - At best it stops but just keeps spinning
Latency issues then?
> - It can get stuck in plan mode
I've had this happen from the desktop, and using Claude Code from mobile before remote control, I assume this has nothing to do with remote control but a partial outage of sorts with Claude Code sometimes?
I don't work for Anthropic, just basing off my anecdotal experience.
On top of that, this is something they should have had much earlier. My biggest pain point is not being able to continue from my phone. I just use a service to pipe Telegram to any cc session on the dev machine. This is the number 1 reason why I got excited by openclaw in the first place, but it's overkill to have it just to control cc.
This is my general experience with the claude app, I don't know what they're smoking over at anthropic but their ability to touch mobile arch inappropriately with AI is reaching critical levels.
We’ve been building in this space for a while, and the issues listed here are exactly the hard parts: session connectivity, reconnection logic, multi-session UX, and keeping state in-sync across devices. Especially when it comes to long running tasks and the edge cases that show up in real use.
I have just today discovered zmx [1], which is like tmux. I always hated the tmux terminal emulation and how it hijacks scrolling, especially on Termius on my phone. zmx does session persistence but, I think, without the terminal emulator side of things, so scrolling works normally.
Been testing it today with Claude Code and it seems to work quite well switching between my laptop and phone.
I also hate how tmux uses alt mode and can never remember all the shortcuts, copy paste is a PITA and just today I had to look up how to dump the scrollback buffer to a file. Named sessions without window management makes a lot more sense these days. Similarly, I'm not a fan of all the ANSI escape codes that CC uses to jump the cursor around and rewrite the display to look like a GUI. I prefer a TUI that doesn't mutate rows after writing them, that's what alt mode is for. CC often clears whatever was in the scrollback buffer before you opened it, it hides bracketed paste, and goes crazy sometimes when content overflows the window and I have to resize the terminal or get blasted with a wall of glitching characters--extra annoying if I'm working from a low bandwidth link. I develop my own agent framework and code agent, and while some features aren't as polished as CC, one of my explicit goals is to preserve the traditional CLI feel, like the python REPL (that's what it's based around). I'll give zmx a try tonight :)
It can help to recognize that tmux combines three kinds of functionality: 1. process persistence; 2. client attachment; 3. view layout. If you don't like how tmux works, there are alternatives. I prefer Zellij [1]. (It also can be informative to take a peek at dtach [2] and abduco [3].)
tmux supports tabs so you can have multiple Claude Code sessions running concurrently. You do need to learn a few tmux keyboard shortcuts to use it effectively (e.g. opening/closing/switching tab).
Thanks for the tip. Other ppl are saying "most of us started out like this", but if you haven't played with tailscale etc. (like me), then this is new and good for learning imo.
Yes. Doing the same. What is the advantage of this new feature? Tmux/Tailscale/Termius give you full control of your terminal.
Or mainly to save the end user the hassle to set it up correctly?
Ease of setup is the biggest reason. I use this setup as well, but there are other UX niceties that would be a lot better with a dedicated mobile app: push notifications when Claude needs your input (I use a hook for this that connects to Pushover, but that's another service and extra setup), voice input, autocorrect that's right for this context, etc.
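In case it helps anyone, the hook side can be very small. This is a rough Python sketch, not my exact script: it assumes the hook receives a JSON object on stdin with a `message` field, and that the Pushover credentials live in environment variables I've named `PUSHOVER_TOKEN` and `PUSHOVER_USER` (both names are my own choice). The endpoint and the `token`/`user`/`message` form fields are Pushover's standard message API.

```python
import json
import os
import sys
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_request(hook_input, app_token, user_key):
    """Build (but do not send) the Pushover POST for one notification."""
    # Assumption: the hook payload carries a human-readable "message" field.
    message = hook_input.get("message", "Claude needs your input")
    body = urllib.parse.urlencode(
        {"token": app_token, "user": user_key, "message": message}
    ).encode()
    return urllib.request.Request(PUSHOVER_URL, data=body, method="POST")


def main():
    # Read the hook payload from stdin and send the push notification.
    hook_input = json.load(sys.stdin)
    request = build_request(
        hook_input,
        os.environ["PUSHOVER_TOKEN"],  # hypothetical variable name
        os.environ["PUSHOVER_USER"],  # hypothetical variable name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()

# When saved as the actual hook script, end the file with a call to main().
```

Keeping `build_request` separate from the network call makes it easy to check the payload without actually pinging your phone.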
Opencode's 'web' command makes your local session run on the browser with same access rights as the cli. It's a pretty slick interface too. I sometimes use it instead of the cli even when I can access both.
You can test it right now if you want with the included free models.
It's changing super fast. I am using it on the desktop mostly and when I tried on my phone there were issues yes. But do try it out again in a few weeks.
(I am actually using zellij on the remote and using various CLIs more than I am using only opencode on the web. I was using wezterm mux until about a week ago but the current state of the terminal is not very good for this scenario. It seems like almost all the CLIs are choking because of nodejs ink library)
I was using this religiously but there’s a bug currently that makes the initialization fail and/or throws an error on the phone client.
Absolutely great piece of software otherwise, free, anonymous, encrypted and so on. Really hope the team can fix this soon - I would hate to switch back to tmux tunneling.
Set it up and never managed to have it work. Only thing it did was renaming my sessions on my main cc instance. Mobile did nothing, not even an error message.
I feel like a lot of folks are saying this kills the Code on your Phone opportunity some start-ups are building for. I don't agree. I feel like coding agents are like streaming services, we will subscribe to multiple and switch between them. So for one there's value in a universal control plane. The other is that mobile as a coding interface should offer more than a remote control to the desktop. I think there's still some space to cook, especially if people are investing 8 hours a day talking to agents, the interface surely matters.
I don't know a single person who is satisfied with the status quo on streaming services where you have to subscribe to multiple ones. Everyone is complaining that the landscape is 1) more fragmented than cable was, 2) costs more, 3) has even more ads than cable
I think people forgot how bad it was. It was much more fragmented before but instead of services it was fragmented by time. Sure you have access to Seinfeld, but you can watch one or two Seinfelds a night at 8pm and 11pm.
I also remember base cable without any movies was around $60 or something, and with some movie channels it was >$100. And that's not inflation adjusted. You can easily get 3 or 4 of the top services for $100 today.
Finally, claiming there are more ads on these services is a joke. There was ~20 minutes of show for every 30 minutes of airtime, meaning 1/3 of the time you were watching commercials. And not just any commercials, the same commercials over and over. There was even a case of shows being sped up on cable to show more commercials.
I get it, everyone wants everything seamlessly and for next to nothing, but claiming that 90s cable was even comparable is absurd.
Not that it is particularly relevant to agentic coding but how can anyone truly argue streaming costs more? Average cable packages were exceeding 125-150 USD a month (in 2000 dollars). Under no circumstances would I be sympathetic to the argument that streaming costs more.
You can get all 7 of the major streaming subs for less without even shopping out deals. That is 100s of times the volume and quality of content that was delivered on cable for far less. It is so much content realistically that no one I have ever met has subscribed to all of them at once.
The argument really is empty. The fragmentized experience is annoying, but it isn't more expensive...And it DEFINITELY has fewer ads.
I agree. I spend a lot of time working from my phone, so I had to make my own workflow that works for me. I've been following all these bans and drama with the subscription keys and custom harnesses etc. I think there's room for a "universal control plane" that lets you leverage the CLI providers (and whatever crappy interfaces / APIs they give you).
Weird all these companies struggle so much to support remote services, ssh has been working for me pretty seamlessly for like the 20 years I've been using it and has allowed me to remote-control any computer I own with relatively reliable authentication (with some hiccups that tend to be patched pretty rapidly when found) throughout that entire period. I hear tell it worked even before I was using computers professionally, too
SSHing into a terminal with your phone is terrible UX. Very low bandwidth. Voice input into a native app is not. We are not talking about fine grained control of your system by running explicit commands. We are talking about programming in plain English.
I can use ssh as the transport and authentication layer over any internet connection through a fairly easily learnable set of ssh flags that can be further simplified through aliasing. As a bonus it's e2ee. The overhead from that affects latency but not bandwidth. Let's set aside ssh for a second. Streaming voice over the internet is a long-solved problem, I've been able to host a mumble server on a toaster since forever ago. So if the local machine can recognize voice commands, talking through a phone shouldn't make a difference. Like, if this works locally, and it works on the phone or whatever, why is having it talk over the internet the hard part? Whatever you think of the application itself, this is a weird failure mode
Well it DOES have less storage than a Nomad (hence lame), but this way you don't need to pay for a public IP address, or for a VPS to run Wireguard on, or for a commercial VPN solution, and then install a terminal emulator on your phone and set up SSH keys.
People tried reinventing terminals, SSH, and tmux for phones. It's a pretty terrible experience using your thumbs. And it takes significant know-how to set up.
And in modern stacks, it almost necessitates a man in the middle - tailscale is common but it's still a central provider. So is it really the most inefficient way possible?
Claude Code is a good product, they should just keep on steadily improving it and improving the model. I am not sure why they are spraying in all directions like this..
Because they're not trying to build a leading coding agent - they're racing to AGI. That means cramming everything and anything, from every angle because general intelligence is happening next year and that'll connect all the dots, leading to hypergrowth and complete industry dominance, across all domains.
Seriously, if you listen to Dario talk, it's non-stop how there's a tsunami coming, people have no idea of what's ahead, how general intelligence or superintelligence is right around the corner. But also, this is super dangerous, and only Anthropic can save us from a doomsday scenario.
I see it on myself too. It feels too irresistible to start adding more features to software you develop with LLM agents. Everything feels like just a few prompts and will be done in half an hour. Why not add this too? Just another sentence in the prompt. Next thing you know you have more features than you remember and the AI starts to have a really hard job keeping it all functional.
Coding with AI requires immense restraint and strong scope limits.
Right now:
- You can't interrupt Claude (you press stop and he keeps going!)
- At best it stops but just keeps spinning
- The UI disconnects intermittently
- It disconnects if you switch to other parts of Claude
- It can get stuck in plan mode
- Introspection is poor
- You see XML in the output instead of things like buttons
- One session at a time
- Sessions at times don't load
- Every time you navigate away from Code you need to wait for your session to reappear
I'm sure I'm missing a few things.
I thought coding was a solved problem Boris?
You get a buggy electron app and they get billions in valuation.
Clearly no one values quality anymore. 1000% yolo
[1] https://elliotbonneville.com/phone-to-mac-persistent-termina...
[2] https://elliotbonneville.com/claude-code-is-all-you-need/
Wrote a daemon + mobile app (similar to Happy, but fixed a lot of the problems) and baked in Tailscale support.
Will open source it soon and should have an official release in the next few weeks: https://getroutie.com/
They brought down production because the version string was changed incorrectly to add an extra date. That would have been picked up in even the most basic testing since the app couldn't even start.
https://news.ycombinator.com/item?id=46532075
The fix (not even a PR or commit message to explain) https://github.com/anthropics/claude-code/commit/63eefe157ac...
No root cause analysis either https://github.com/anthropics/claude-code/issues/16682#issue...
Sounds like a problem AI can easily solve!
Frequently chews through lots of expensive Opus tokens, then it just stops with no communication about why or what's next.
No way to tell what it's done, what's remaining to complete.
Only choice is to re-run everything and eat the cost of the wasted time and tokens.
- - -
get tailscale (free) and join on both devices
install tmux
get an ios/android terminal (echo / termius)
enable "remote login" if on mac (disable on public wifi)
mosh/ssh into computer
now you can do tmux then claude / codex / w/e on either device and reconnect freely via tmux ls and tmux attach -t <id>
- - -
You can name tmux and resume by name via tmux new -s <feature> and tmux attach -t <feature>
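If you'd rather not remember the create-vs-attach dance, the steps above can be wrapped in a tiny helper. A minimal Python sketch, assuming a hypothetical host alias like `devbox`: it relies on tmux's `new-session -A` flag, which attaches to the named session if it already exists and creates it otherwise, and it only builds the command lines rather than running them.

```python
import shlex


def tmux_session_argv(name):
    """Build argv for `tmux new-session -A -s NAME`.

    -A makes new-session attach if NAME already exists, else create it.
    """
    # "." and ":" are separators in tmux target syntax (session:window.pane),
    # so this sketch rejects them in session names.
    if not name or any(ch in name for ch in ".:"):
        raise ValueError(f"unsuitable tmux session name: {name!r}")
    return ["tmux", "new-session", "-A", "-s", name]


def remote_command(host, name, use_mosh=False):
    """Build the local command that opens the named tmux session on `host`."""
    tmux_part = tmux_session_argv(name)
    if use_mosh:
        # mosh takes the remote command as separate words after `--`.
        return ["mosh", host, "--"] + tmux_part
    # ssh needs -t to allocate a terminal; the remote command is one string.
    return ["ssh", "-t", host, shlex.join(tmux_part)]
```

For example, `remote_command("devbox", "feature")` gives you the equivalent of `ssh -t devbox 'tmux new-session -A -s feature'`, which you can hand to `subprocess.run` or turn into a shell alias.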
[1] https://github.com/neurosnap/zmx
[1]: https://zellij.dev and https://github.com/zellij-org/zellij
[2]: https://github.com/crigler/dtach and https://dtach.sourceforge.net
[3]: https://github.com/martanne/abduco/issues/70
Based on my experience many people don't know this is a thing you can do.
How do you deal with multiple concurrent sessions of CC with this setup?
How important is mosh? I wasn't able to get it set up the last time I tried... ran into a bunch of issues.
Depends: how good is your signal? Mosh has a great property: it predicts and echoes your keystrokes locally, so typing doesn't feel laggy even if your connection sucks.
On ssh, every keystroke waits for a round trip before it's echoed.
Could even use cc to check in on and/or "send-keys"
What wasn't working about mosh? Just install mosh and use mosh to connect
It's this.
Don't have a Dropbox moment ;) [1]
[1]: https://news.ycombinator.com/item?id=9224
https://opencode.ai/docs/web/
A fork named Happier looks promising, but is alpha-stage and is also a mystery-meat vibe-coded security roulette.
https://www.digitaltrends.com/home-theater/how-networks-spee...
I literally see no ads on my streaming subscription for close to a tenth of the price of cable.
The results are enough for me and I'm not doing things that allow me to differentiate the output between ChatGPT, Claude, and the others.
The agents are more like the radio in my car, whenever I want music, I switch channel until I find something good enough.
If I'm really in need of something special, I'll use Spotify on my phone.
And sometimes, I just drive with the radio off.
There's a comparison of the approaches as I see them here https://yepanywhere.com/subscription-access-approaches
I was starting to think I've really fallen behind or something. I feel relief AND horror.
The daily “what broke and changed now” with claude code is wearing me out fast.