sudb commented on Show HN: Engine – A multi-LLM alternative to Codex   enginelabs.ai/... · Posted by u/sdspurrier
simvirdi · 9 months ago
Looks cool - do you have any benchmarks? How do you compare to other products out there?
sudb · 9 months ago
We last submitted a SWE-bench Verified result in November 2024 - at the time, I believe we were among the top 5 entrants.

We expect Engine to be as good as the other code-writing agents out there at the moment - we understand almost everyone in the space to be using very similar base models and agent scaffolding.

sudb commented on Show HN: The Best Terminal Inspired Portfolio on the Internet™   kuber.studio/... · Posted by u/kuberwastaken
sudb · 9 months ago
The closest I could get to making your LLM identify itself was as LaMDA, which makes me think this is probably a Gemma model - am I close?
sudb commented on Run GitHub Actions locally   github.com/nektos/act... · Posted by u/flashblaze
Aurornis · 9 months ago
Same experience here. Edge cases everywhere, though most can be worked around.

You can specify different runners to use. The default images are a compromise to keep size down. There is a very large image that tries to include everything you might want. I would suggest trying that if you don’t mind the very large (15GB IIRC) image.

sudb · 9 months ago
I definitely remember considering the larger images - I think we ended up not using them since my work's use case for act is running user GitHub workflows on demand on temporary VMs. The hope was that most usage would be covered by the smaller images - and in fairness that has been true so far.
sudb commented on Show HN: Cyberdesk, API for computer agents to control a desktop (open source)   github.com/cyberdesk-hq/c... · Posted by u/sgtwompwomp
sudb · 9 months ago
Looks cool! If you're able to say - where/how do you run these virtual desktop instances?
sudb commented on Show HN: A MCP server to evaluate Python code in WASM VM using RustPython   github.com/tuananh/hyper-... · Posted by u/tuananh
digdugdirk · 9 months ago
Is there a list of these "code sandboxes" floating around somewhere? It seems like it's going to be more and more important with LLMs playing more of a factor in development moving forward.
sudb · 9 months ago
I know of https://modal.com/, which I believe is used by Codegen and Cognition.

Anecdotally, I hear that many companies in the LLM agent space roll their own sandbox solutions - I've heard of both Firecracker- and Kubernetes-based implementations.

sudb commented on Run GitHub Actions locally   github.com/nektos/act... · Posted by u/flashblaze
sudb · 9 months ago
I use this for work - but I keep running into edge cases all over the place (e.g. Yarn being installed on GitHub-hosted runners, but not on self-hosted ones or in act - https://github.com/actions/setup-node/issues/182)

Apart from that it's been quite good!

sudb commented on Show HN: Engine – A multi-LLM alternative to Codex   enginelabs.ai/... · Posted by u/sdspurrier
sudb · 9 months ago
I worked on this! Happy to answer any questions anyone has.
sudb commented on Writing "/etc/hosts" breaks the Substack editor   scalewithlee.substack.com... · Posted by u/scalewithlee
sudb · 10 months ago
I recently had a problem sending LLM-generated text between two web servers under my control, from AWS to Render - the requests were being rejected with 403s for suspected command injection by Render's Cloudflare protection, which is opaque to users and cannot be configured by them.

The hacky workaround, which has been working reliably for a while now, was to encode the offending request body before sending and decode it on the destination server.
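
A minimal sketch of that kind of encode/decode round trip, assuming URL-safe base64 inside a JSON envelope. The comment does not say which encoding was actually used, and the helper names `encode_body` and `decode_body` and the field name `payload_b64` are hypothetical, chosen only for illustration:

```python
import base64
import json


def encode_body(text: str) -> bytes:
    """Sender side: wrap the text in a JSON envelope as URL-safe base64.

    Assumption: base64 is used here only as an example encoding. The
    URL-safe alphabet has no "/" or "+", so path-like strings in the
    text cannot appear verbatim in the request body.
    """
    payload = base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")
    return json.dumps({"payload_b64": payload}).encode("utf-8")


def decode_body(body: bytes) -> str:
    """Destination side: recover the original text from the envelope."""
    envelope = json.loads(body)
    return base64.urlsafe_b64decode(envelope["payload_b64"]).decode("utf-8")


if __name__ == "__main__":
    original = "To reproduce, run `cat /etc/hosts` on the box."
    wire = encode_body(original)
    # The pattern that a filter might match is no longer present as plain text.
    assert b"/etc/hosts" not in wire
    # The destination server gets the exact original text back.
    assert decode_body(wire) == original
```

Encoding like this only hides the text from pattern-based inspection in transit, so any validation the destination needs still has to happen after decoding.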

u/sudb

Karma: 25 · Cake day: February 14, 2023