asabla commented on Running Claude Code dangerously (safely)   blog.emilburzo.com/2026/0... · Posted by u/emilburzo
molson8472 · a month ago
Once approval fatigue and ongoing permission management kick in, the temptation is strong to run `--dangerously-skip-permissions`. I think that's what we all want: to run agents in a locked-down sandbox where the blast radius of mistakes and/or prompt injection attacks is minimal or at least acceptable.

I started running Claude Code in a devcontainer with limited file access (repo only) and limited outbound network access (allowlist only) for that reason.

This weekend, I generalized this to work with docker compose. Next up is support for additional agents (Codex, OpenCode, etc.). After that, I'd like to force all network access through a proxy running on the host for greater control and logging (currently it uses iptables rules).
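
Condensed, the idea amounts to something like this; a simplified sketch as a plain `docker run` rather than the actual devcontainer/compose files, with the image name, network name, and mount layout as placeholders:

    # Simplified sketch only; not the repo's actual setup.
    import subprocess
    from pathlib import Path

    repo = Path.cwd()            # only the current repo is mounted into the container
    network = "agent-allowlist"  # assumed pre-created network whose egress is
                                 # restricted on the host (e.g. via iptables rules)

    subprocess.run(
        [
            "docker", "run", "--rm", "-it",
            "--network", network,
            "--mount", f"type=bind,src={repo},dst=/workspace",
            "--workdir", "/workspace",
            "agent-sandbox:latest",  # assumed image name with the agent preinstalled
            "claude", "--dangerously-skip-permissions",
        ],
        check=True,
    )

Even with permissions skipped, the agent only ever sees the mounted repo and whatever hosts the network allows.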

This workflow has been working well for me so far.

Still fresh, so may be rough around the edges, but check it out: https://github.com/mattolson/agent-sandbox

asabla · a month ago
Very nice!

I've been experimenting with a similar setup, and I'll probably implement some of the things you've been doing.

For the proxy part I've been running https://www.mitmproxy.org/. It's not fully working for all workflows yet, but it's getting close.
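
The addon side of that can stay pretty small. A minimal sketch of the kind of allowlist addon mitmproxy supports (the hostnames here are just example placeholders):

    # allowlist.py: reject any request whose host isn't on the allowlist.
    from mitmproxy import http

    ALLOWED = {"api.anthropic.com", "github.com", "registry.npmjs.org"}

    class Allowlist:
        def request(self, flow: http.HTTPFlow) -> None:
            if flow.request.pretty_host not in ALLOWED:
                flow.response = http.Response.make(
                    403, b"blocked by sandbox proxy", {"Content-Type": "text/plain"}
                )

    addons = [Allowlist()]

Run it with `mitmdump -s allowlist.py` and point the container's proxy environment variables at the host; HTTPS interception also needs the mitmproxy CA trusted inside the container.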

asabla commented on A guide to local coding models   aiforswes.com/p/you-dont-... · Posted by u/mpweiher
andix · 2 months ago
From my personal experience it's around 50:50 between Claude and Codex. Some people strongly prefer one over the other; I haven't been able to figure out why yet.

I just can't accept how slow Codex is, and that you can't really use it interactively because of that. I prefer to just watch Claude Code work and stop it once I don't like the direction it's taking.

asabla · 2 months ago
From my point of view, you're choosing between instruction following and more creative solutions.

Codex models tend to be extremely good at following instructions, to the point that they won't do any additional work unless you ask them to. GPT-5.1 and GPT-5.2, on the other hand, are a little bit more creative.

Models from Anthropic, on the other hand, are a lot more loosey-goosey with instructions, and you need to keep an eye on them much more often.

I use models from both providers interchangeably all the time, depending on the task at hand. No real preference for one over the other; they're just specialized for different things.

asabla commented on Fifty Shades of OOP   lesleylai.info/en/fifty_s... · Posted by u/todsacerdoti
rawgabbit · 3 months ago
Muratori traced the history of OOP back to the original documents. Jump to the 1:18 mark if you want to skip straight to his findings.

https://youtu.be/wo84LFzx5nI

asabla · 3 months ago
This is such a good video. I really like the way he presents it as well.

His rant about CS historians is also a fun subject.

asabla commented on AI can code, but it can't build software   bytesauna.com/post/coding... · Posted by u/nreece
jumploops · 4 months ago
I've been forcing myself to "pure vibe-code" on a few projects, where I don't read a single line of code (even the diffs in codex/claude code).

Candidly, it's awful. There are countless situations where it would be faster for me to edit the file directly (CSS, I'm looking at you!).

With that said, I've been surprised at how far the coding agents are able to go[0], and a lot less surprised about where I need to step in.

Things that seem to help:
1. Always create a plan/debug markdown file
2. Prompt the agent to ask questions/present multiple solutions
3. Use git more than normal (squash ugly commits on merge)

Planning is key to avoiding half-baked solutions, but having "specs" for debugging is almost more important. The LLM will happily dive down a path of editing as few files as possible to fix the bug/error/etc. This, unchecked, can often lead to very messy code.

Prompting the agent to ask questions/present multiple solutions allows me to stay "in control" over how something is built.

I now basically commit every time a plan or debug step is complete. I've tried having the LLM control git, but I feel that it eats into the context a bit too much. Ideally a 3rd party "agent" would handle this.

The last thing I'll mention is that Claude Code (Sonnet 4.5) is still very token-happy, in that it eagerly goes above and beyond when not always necessary. Codex (gpt-5-codex) on the other hand, does exactly what you ask, almost to a fault. For both cases, this is where planning up-front is super useful.

[0] Caveat: the projects are either TypeScript web apps or Rust utilities; can't speak to performance on other languages/domains.

asabla · 4 months ago
> The last thing I'll mention is that Claude Code (Sonnet 4.5) is still very token-happy, in that it eagerly goes above and beyond when not always necessary. Codex (gpt-5-codex) on the other hand, does exactly what you ask, almost to a fault.

I very much share your experience. For the time being I prefer the experience with Codex over Claude, just because I find myself in a position where I know much sooner when to step in and just do it manually.

With Claude I find myself in a typing exercise much more often; I could probably get better at knowing when to stop, of course.

asabla commented on A proposal to add GC-less, unmanaged memory spaces to C#   axx83.substack.com/p/the-... · Posted by u/axx83
asabla · 5 months ago
I can't tell if this is satire or not, and some parts read like they were written by AI.

Either way, more fine-grained control over the GC is probably preferable to something like this.

asabla commented on GPT-OSS Reinforcement Learning   docs.unsloth.ai/new/gpt-o... · Posted by u/vinhnx
vlovich123 · 5 months ago
I have compared the instruction following of stock gpt-oss against stock Qwen, and the 20B outperformed all the others, intelligently following instructions and reasoning about tool calling correctly for tools it hasn't been trained on. Additionally, it performs like a 3B model because it uses 32 experts. I don't know where the claim that it sucks comes from, but my evaluation of similarly competitive models showed it leading the pack by a lot.
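
Back-of-envelope on why a model that size can feel like a much smaller dense one (the active-expert count below is an illustrative assumption, not a quoted spec): per-token compute tracks the parameters that are actually activated, not the total.

    % With E experts per MoE layer, k routed per token, and most weights living
    % in the experts:
    \text{active params} \approx \text{shared params} + \frac{k}{E} \cdot \text{expert params}
    % e.g. E = 32 with k = 4 touches roughly 1/8 of the expert weights per token,
    % which is how a ~20B-parameter MoE can run closer to a few-billion-parameter
    % dense model.
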
asabla · 5 months ago
I'm always so confused by those statements as well, because just like you, I feel that the 20B version is really good at following instructions.

Some of the Qwen models are too, but they seem to need a bit more handholding.

This is of course just anecdotal on my end, and I've been slacking on keeping up with evals while testing at home.

asabla commented on Are OpenAI and Anthropic losing money on inference?   martinalderson.com/posts/... · Posted by u/martinald
ugh123 · 6 months ago
That might be the case, but inference times have only gone up since GPT-3 (GPT-5 is regularly 20+ seconds for me).
asabla · 6 months ago
And by GPT-5, do you mean through their API? Directly through Azure OpenAI services? Or are you talking about ChatGPT set to use GPT-5?

All of these alternatives mean different things when you say it takes 20+ seconds for a full response.

asabla commented on The issue of anti-cheat on Linux (2024)   tulach.cc/the-issue-of-an... · Posted by u/todsacerdoti
mitkebes · 6 months ago
I also always hear a lot of people complain about cheaters in Valorant, so all of that compromised personal security doesn't actually stop cheaters.

Honestly I feel like you should only use kernel anticheat on a dedicated machine that's kept 100% separate from any of your personal data. That's a lot to ask of people, but you really shouldn't have anything you don't consider public data on the same hardware.

asabla · 6 months ago
I fundamentally agree with you.

But anti-cheat hasn't been about blocking every possible way of cheating for some time now. It's been about making cheating as inconvenient as possible, thus reducing the number of cheaters.

Is the current fad of using kernel-level anti-cheats what we want? Hell nah.

The responsibility of keeping a multiplayer session clean of cheaters was previously shared between developers and server owners, while today it has fallen mostly on developers (or rather game studios), since they want to own the whole experience.

asabla commented on From M1 MacBook to Arch Linux: A month-long experiment that became permanent   ssp.sh/blog/macbook-to-ar... · Posted by u/articsputnik
makeitdouble · 6 months ago
> no one can replicate the smoothness of the Apple trackpad

There are newer trackpads for Windows, and the Surface line had pretty good trackpads as well (not Magic Trackpad levels, but perhaps 80% there?).

The more surprising part to me, when I gave up on the Magic Trackpad moving to Windows, is that I was over it in a week. I had only ever used trackpads for a decade, but mice just work that much better on Windows/Linux; the extra buttons and an actual physical click in particular helped a lot. The paradigms are just different enough that the trackpad makes less sense than it does on macOS.

asabla · 6 months ago
I think so far some of the Surface devices and some from Razer (yes, the one making computer mice, keyboards and such) have been the closest.

asabla commented on I run a full Linux desktop in Docker just because I can   howtogeek.com/i-run-a-ful... · Posted by u/redbell
pmontra · 6 months ago
Samsung DEX had a Linux desktop package in 2018. It was an LXD container based on Ubuntu 16.04. They developed it in collaboration with Canonical. Unfortunately they deprecated it shortly after, maybe already in 2018. The next Android update would remove it.

It worked but Android killed it mercilessly if it used too much memory or the rest of the system needed it.

asabla · 6 months ago
I still remember how much I liked the idea. I really tried to use it, but the experience with both browsers and VS Code was... not that great.

Kinda hope they revisit this idea in the near future.

u/asabla

Karma: 242 · Cake day: February 18, 2021