thepoet (u/thepoet) - Readit News

thepoet commented on Show HN: NanoClaw – “Clawdbot” in 500 lines of TS with Apple container isolation github.com/gavrielc/nanoc... · Posted by u/jimminyx

thepoet · a month ago

One of the things that makes Clawdbot great is that it has allow-all permissions to do anything. Not sure how external actions with damaging consequences get sandboxed with this.

Apple containers have been great, especially since each of them maps 1:1 to a dedicated lightweight VM. Except for a bug or two that appeared in the early releases, things seem to be working out well. I believe not many projects are leveraging them yet.

https://github.com/instavm/coderunner is a general code execution sandbox, for AI-generated code or otherwise, that uses Apple containers. It can be hooked up to Claude Code and others.

thepoet commented on Show HN: Amla Sandbox – WASM bash shell sandbox for AI agents github.com/amlalabs/amla-... · Posted by u/souvik1997

thepoet · a month ago

This looks cool, congratulations. We investigated WASM for our use case but then turned to Apple containers, each of which maps 1:1 to a microVM, for local use. The result is being used by a bunch of folks: https://github.com/instavm/coderunner

We are currently also building a solution, InstaVM, which is ideologically the same but for the cloud: https://instavm.io

thepoet commented on I want everything local – Building my offline AI workspace instavm.io/blog/building-... · Posted by u/mkagenius

sneak · 7 months ago

Halfway through he gives up and uses remote models. The basic premise here is false.

Also, the term “remote code execution” in the beginning is misused. Ironically, remote code execution refers to execution of code locally - by a remote attacker. Claude Code does in fact have that, but I’m not sure if that’s what they’re referring to.

thepoet · 7 months ago

The blog is more about keeping the user's data private. The remote models in this context are operating blind. I am not sure why you are nitpicking; almost nobody reading the blog would read "remote code execution" in that sense.

thepoet commented on Ask HN: What are you working on? (July 2025) · Posted by u/david927

thepoet · 7 months ago

I am working on a chess analytics tool, specifically a free and open source replacement for Chessbase in this age of LLMs that can run on all platforms. The idea is to lower the barrier to entry for chess improvement tools, since Chessbase can be intimidating for a casual Chess.com beginner looking to get into serious chess prep. At present, it can handle basic queries like the H2H score of Magnus Carlsen vs Hikaru Nakamura, the top 10 juniors in the US, Magnus Carlsen's games with the London System opening that involve a queen sacrifice, etc. Getting it to work for advanced multi-step tactical patterns, and for finding games with certain imbalances from a natural language query, is proving challenging though. DuckDB has helped a lot, along with modern LLMs for query generation from the schema and some preprocessing of game PGNs and piece hashes. It can also import a user's Chess.com and Lichess games given the usernames and run the same kinds of queries on them as on master-level games.

I also used the tool to generate an adult chess improvers' FIDE rank list for all federations around the world. Here are the July 2025 rankings, though the filtering still needs major improvements: https://chess-ranking.pages.dev
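To make the H2H query mentioned above concrete, here is a minimal sketch of the kind of SQL an LLM might generate from a schema. Everything in it is an assumption for illustration: the `games` table, its columns, and the placeholder player names are hypothetical, and it uses Python's built-in `sqlite3` instead of DuckDB only so that it runs without installing anything. The tool's real schema is not shown in the comment.

```python
import sqlite3

# Hypothetical schema: one row per game. The real tool's schema is not public here.
SCHEMA = """
CREATE TABLE games (
    white  TEXT NOT NULL,
    black  TEXT NOT NULL,
    result TEXT NOT NULL   -- PGN Result tag: '1-0', '0-1' or '1/2-1/2'
)
"""

# Head-to-head score between players :a and :b, counting games with either colour.
H2H_SQL = """
SELECT
    COALESCE(SUM(CASE WHEN (white = :a AND result = '1-0')
                        OR (black = :a AND result = '0-1') THEN 1 ELSE 0 END), 0),
    COALESCE(SUM(CASE WHEN (white = :b AND result = '1-0')
                        OR (black = :b AND result = '0-1') THEN 1 ELSE 0 END), 0),
    COALESCE(SUM(CASE WHEN result = '1/2-1/2' THEN 1 ELSE 0 END), 0)
FROM games
WHERE (white = :a AND black = :b) OR (white = :b AND black = :a)
"""


def head_to_head(conn, a, b):
    """Return wins for a, wins for b, and draws across all games between them."""
    a_wins, b_wins, draws = conn.execute(H2H_SQL, {"a": a, "b": b}).fetchone()
    return {"a_wins": a_wins, "b_wins": b_wins, "draws": draws}


def demo_connection():
    """In-memory database with made-up games between placeholder players."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO games VALUES (?, ?, ?)",
        [
            ("Player A", "Player B", "1-0"),
            ("Player B", "Player A", "1/2-1/2"),
            ("Player B", "Player A", "0-1"),
            ("Player A", "Player B", "0-1"),
            ("Player A", "Player C", "1-0"),  # not part of the A vs B score
        ],
    )
    return conn
```

Calling `head_to_head(demo_connection(), "Player A", "Player B")` counts only the four games between those two players. Queries about tactical patterns or imbalances would need preprocessed columns beyond these three, which is presumably where the PGN preprocessing and piece hashes come in.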

------------------

Another idea that I have been working on for some time is connecting my Gmail, which is the source of truth for all my financial, travel and personal stuff, to an LLM that can do isolated code execution to generate beautiful infographics, charts, etc. on my travels and spending patterns. The idea is to process my emails locally while a powerful remote LLM generates the actual queries blind, given only a schema and a 'fingerprint' file of the emails that gives the LLM a sense of what countries, regions and interests we might be talking about, without actually transmitting personal data. The trade-off between the privacy level of the 'fingerprint' and the quality of the generated queries is something I am still quite unsure about.
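One way the 'fingerprint' could look, as a rough sketch only: reduce locally parsed emails to coarse aggregates and drop any value that appears too rarely. The field names (`country`, `category`, `date`) and the `min_count` threshold are hypothetical choices made for this example, not something the actual project is known to use.

```python
from collections import Counter


def build_fingerprint(emails, min_count=2):
    """Reduce parsed emails to coarse aggregates.

    Only countries, categories and years are emitted. Subjects, senders,
    amounts and any other free text never enter the result.
    """

    def frequent(field):
        counts = Counter(e[field] for e in emails if e.get(field))
        # Drop rare values: something seen only once is more likely to
        # point at one specific trip or purchase.
        return sorted(value for value, n in counts.items() if n >= min_count)

    return {
        "countries": frequent("country"),
        "categories": frequent("category"),
        # Dates are assumed to be ISO strings such as "2024-03-01".
        "years": sorted({e["date"][:4] for e in emails if e.get("date")}),
    }
```

Here `min_count` is the knob for the trade-off described above: raising it hides more of the long tail, but gives the remote LLM less context for writing useful queries.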

thepoet commented on WASM Agents: AI agents running in the browser blog.mozilla.ai/wasm-agen... · Posted by u/selvan

indigodaddy · 8 months ago

For the Gemini-cli integration, is the only difference between CodeRunner with Gemini-cli and Gemini-cli itself that you are just using Gemini-cli in a container?

thepoet · 8 months ago

No, Gemini-cli still runs on your local machine. When it generates some code based on your prompt, CodeRunner runs that code inside a container (which sits inside a new lightweight VM, courtesy of Apple, and provides VM-level isolation): it installs the requested libraries, executes the generated code, and returns the result to Gemini-cli.

This is also not Gemini-cli specific and you could use the sandbox with any of the popular LLMs or even with your local ones.
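The round trip described above (generated code goes in, output comes back to the CLI) can be sketched roughly like this. This is not CodeRunner's actual API; `run_generated_code` is a hypothetical name, and a plain child process stands in for the container purely so the example runs anywhere. A child process provides no real isolation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, timeout: float = 10.0) -> dict:
    """Write code to a scratch directory, run it in a child process, return its output.

    WARNING: a child process is NOT a sandbox. In the setup described in the
    comment, this step would happen inside a container backed by its own
    lightweight VM.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        proc = subprocess.run(
            [sys.executable, "-I", str(script)],  # -I: isolated mode, ignores user site and env vars
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,  # raises subprocess.TimeoutExpired if the code hangs
        )
    # This dictionary is what would be handed back to the calling LLM client.
    return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}
```

The caller only ever sees the exit code and the captured output, which is the same shape of result regardless of which LLM produced the code.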

thepoet commented on WASM Agents: AI agents running in the browser blog.mozilla.ai/wasm-agen... · Posted by u/selvan

thepoet · 8 months ago

We looked at Pyodide and WASM, along with other options like Firecracker, for our need: multi-step tasks that require running LLM-generated code locally via Ollama etc. with some form of isolation rather than running it directly on our dev machines. We figured it would be too much work given the various external libraries we have to install. The idea was to get code generated by a powerful remote LLM for general-purpose stuff, like video editing via ffmpeg or generating beautiful graphs via JS + Chromium, and execute it locally with all dependencies installed before execution.

We recently built CodeRunner (https://github.com/BandarLabs/coderunner) on top of Apple Containers and have been using it for some time. It works fine but still needs some improvement to handle very arbitrary prompts.

thepoet commented on I ask this chess puzzle to every new LLM gist.github.com/abhishek-... · Posted by u/thepoet

andix · a year ago

This test is worthless in a few weeks, it's now going into the training data. Even repeatedly posting it into LLM services (with analytics enabled) could lead to inclusion in the training data.

thepoet · a year ago

Interestingly, this test has been in the public domain for the last seven years, since it is part of the set of all possible chess endgames with 7 or fewer pieces, which is solved and published. That is a huge file, but the five-piece dataset with FENs is less than a GB. I wonder whether it was already included in the training data earlier, or whether it will be.
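As a small illustration of the piece-count point, here is a sketch that checks whether a FEN falls within a given tablebase size. The function names are made up for this example, and it only counts pieces; it does not probe any tablebase or validate the FEN.

```python
def piece_count(fen: str) -> int:
    """Count the pieces (kings included) in the board field of a FEN string."""
    board = fen.split()[0]  # first FEN field is piece placement, ranks separated by '/'
    # Letters are pieces; digits are runs of empty squares.
    return sum(ch.isalpha() for ch in board)


def within_tablebase(fen: str, max_pieces: int = 7) -> bool:
    """True if the position has few enough pieces to fall in a max_pieces tablebase."""
    return piece_count(fen) <= max_pieces
```

For example, `within_tablebase("4k3/8/8/8/8/8/4P3/4K3 w - - 0 1", max_pieces=5)` is true because that arbitrary position (not the puzzle from the gist) has three pieces, while the standard starting position has 32 and is far outside any tablebase.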

thepoet commented on I ask this chess puzzle to every new LLM gist.github.com/abhishek-... · Posted by u/thepoet

danparsonson · a year ago

Is it reasonable to imagine that LLMs should be able to play chess? I feel like we're expending a whole lot of effort trying to distort a screwdriver until it looks like a spanner, and then wondering why it won't grip bolts very well.

Why should a language model be good at chess or similar numerical/analytical tasks?

In what way does language resemble chess?

thepoet · a year ago

It might be a reasonable ask for an LLM to 'remember' the endgame tablebase of solved games, which is less than a GB for all games with five or fewer pieces on the board. This puzzle relies specifically on that knowledge and on knowing how the chess pieces move.