stillsut commented on Microsoft Amplifier   github.com/microsoft/ampl... · Posted by u/JDEW
thethimble · 2 months ago
There is a related idea called "alloying," where 2-4 candidate solutions are pursued in parallel with different models, yielding better results than any single model. Very interesting ideas.

https://xbow.com/blog/alloy-agents

stillsut · 2 months ago
Exactly what I was looking for, thanks.

I've been doing something similar: aider+gpt-5, claude-code+sonnet, gemini-cli+2.5-pro. I want to try coder-cli next.

The main problem with this approach is summarizing the different candidate solutions before drilling down to review the best one.

Looking at a `git diff --stat` across all the model outputs gives you a good measure of whether there was an existing common pattern for your requested implementation. If only one of the models adds code to a module the others don't touch, that's usually a good jumping-off point for exploring the differing assumptions each of the agents built toward.
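
A minimal sketch of that comparison, assuming one branch per agent's candidate (the branch names here are hypothetical):

```python
import subprocess

# Hypothetical branch names, one per agent's candidate solution
CANDIDATES = ["cand-aider-gpt5", "cand-claude-sonnet", "cand-gemini-pro"]

for branch in CANDIDATES:
    # diffstat of each candidate against the base branch
    result = subprocess.run(
        ["git", "diff", "--stat", f"main...{branch}"],
        capture_output=True, text=True, check=True,
    )
    print(f"=== {branch} ===\n{result.stdout}")
```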

stillsut commented on Microsoft Amplifier   github.com/microsoft/ampl... · Posted by u/JDEW
stillsut · 2 months ago
I've actually written my own homebrew framework like this, which is a) cli-coder agnostic and b) leans heavily on git worktrees [0].

The secret weapon of this approach is asking for 2-4 solutions to your prompt running in parallel. This helps avoid the most time-consuming aspect of ai-coding: reviewing a large commit, only to find that the approach the AI took is hopeless or requires major revision.

By generating multiple solutions, you can cut down on investing fully in the first one, use clever ways to select from the 2-4 candidates, and usually apply a small tweak at the end (a rough sketch of the setup follows the link below). Anyone else doing something like this?

[0]: https://github.com/sutt/agro
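
A rough sketch of the worktree fan-out (the agent command and prompt file are placeholders, not agro's actual interface):

```python
import subprocess

N_CANDIDATES = 3
PROMPT_FILE = "task-spec.md"  # hypothetical prompt file handed to each agent

for i in range(1, N_CANDIDATES + 1):
    branch = f"candidate-{i}"
    tree = f"../wt-{branch}"
    # one isolated worktree per candidate, each on its own branch
    subprocess.run(["git", "worktree", "add", tree, "-b", branch], check=True)
    # "my-coder-cli" is a stand-in for whichever CLI coder you launch
    subprocess.Popen(["my-coder-cli", PROMPT_FILE], cwd=tree)
```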

stillsut commented on Sampling and structured outputs in LLMs   parthsareen.com/blog.html... · Posted by u/SamLeBarbare
dcreater · 3 months ago
I am not following how you encoded a BTC address into a poem. Can you help explain?
stillsut · 3 months ago
I think the easiest explanation is to look at the table here: https://github.com/sutt/innocuous?tab=readme-ov-file#how-it-...

Watch how the "Cumulative encoding" row grows with each iteration (that's where the BTC address gets encoded), then look at the other rows to see how the algorithm arrives at that.
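
As a toy illustration of the core move (a simplified stand-in, not the actual innocuous algorithm): at each generation step the next payload bit picks between the model's top-2 candidate tokens, and decoding re-derives the same candidates to read the bit back out.

```python
def encode_bits(bits, top2_per_step):
    """bits: sequence of 0/1; top2_per_step: one (token_a, token_b) pair per step."""
    return [pair[bit] for bit, pair in zip(bits, top2_per_step)]

def decode_bits(tokens, top2_per_step):
    """Recover each bit by seeing which of the two candidates was emitted."""
    return [pair.index(tok) for tok, pair in zip(tokens, top2_per_step)]

steps = [("The", "A"), ("cat", "dog"), ("sat", "ran")]
encoded = encode_bits([1, 0, 1], steps)   # -> ["A", "cat", "ran"]
assert decode_bits(encoded, steps) == [1, 0, 1]
```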

Thanks for checking it out!

stillsut commented on Claude Sonnet 4.5   anthropic.com/news/claude... · Posted by u/adocomplete
manofmanysmiles · 3 months ago
I haven't shouted into the void for a while. Today is as good a day as any other to do so.

I feel extremely disempowered that these coding sessions are effectively black box, and non-reproducible. It feels like I am coding with nothing but hopes and dreams, and the connection between my will and the patterns of energy is so tenuous I almost don't feel like touching a computer again.

A lack of determinism comes from many places, but primarily: 1) the models change, 2) the models are not deterministic, 3) the history of tool use and chat input is not available as a first-class artifact.

I would love to see a tool that logs the full history of all agents that sculpt a codebase, including the inputs to tools, tool versions, and any other sources of entropy. Logging the seeds fed into the RNGs that drive LLM output would be the final piece that would give me confidence to consider using these tools seriously.
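
A minimal sketch of what one breadcrumb in such a log could look like (all field names hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class AgentStepRecord:
    """Hypothetical breadcrumb: everything needed to replay one agent step."""
    model_id: str                  # exact model name + version snapshot
    rng_seed: int                  # seed fed to the sampler
    prompt: str                    # full chat/tool input at this step
    tool_name: str                 # which tool the agent invoked
    tool_version: str              # pinned version of that tool
    tool_args: dict = field(default_factory=dict)
    output_hash: str = ""          # hash of the resulting diff/output
```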

I write this now after what I am calling "AI disillusionment", a state where I feel so disconnected from my codebase I'd rather just delete it than continue.

Having a set of breadcrumbs would give me at least a modicum of confidence that the work was reproducible and not the product of some modern ghost, completely detached from my will.

Of course this would require actually owning the full LLM.

stillsut · 3 months ago
I've been building something like this: a markdown file that tracks your prompts and the code generated.

https://github.com/sutt/innocuous/blob/master/docs/dev-summa...

Check it out, I'd be curious to hear your feedback.

stillsut commented on Ask HN: What are you working on? (September 2025)    · Posted by u/david927
stillsut · 3 months ago
Encoding / decoding hidden messages in LLM output.

https://github.com/sutt/innocuous

The traditional use case is steganography ("hidden writing"). But I see more potential applications than just spy stuff.

I'm using this project as a case study for writing CS-oriented codebases and keeping track of every prompt and generated code line in a markdown file: https://github.com/sutt/innocuous/blob/master/docs/dev-summa...

My favorite pattern I've found is to write the encode implementation manually; the AI is then pretty easily able to follow that logic and translate it into a decode function.

stillsut commented on Sampling and structured outputs in LLMs   parthsareen.com/blog.html... · Posted by u/SamLeBarbare
FlyingLawnmower · 3 months ago
I spent a couple of years building a high-performance, expressive library for structured outputs in LLMs. Our library is used by OpenAI for structured outputs on the hosted API. Happy to answer questions on how this works:

User friendly library that connects to lots of OSS model serving backends: https://github.com/guidance-ai/guidance/

Core Rust library written for high performance mask computation (written mostly by my collaborator @mmoskal): http://github.com/guidance-ai/llguidance

stillsut · 3 months ago
I'm also working on a library to steer the sampling step of LLMs, but more for steganographic / arbitrary data-encoding purposes.

Should work with any llama.cpp compatible model: https://github.com/sutt/innocuous

stillsut commented on Nostr   nostr.com/... · Posted by u/dtj1123
littlecranky67 · 3 months ago
Glad to see Nostr on top of HN. It is in its infancy, but Nostr allows for "zaps" (basically sending instant micropayments via bitcoin-lightning), so instead of relying on ads and dubious algorithms, you can show your appreciation to content creators through small payments. This is a model for an ad-free, decentralized social media system.
stillsut · 3 months ago
You can also earn zaps for pull requests working on Nostr clients.

We've been hosting some bounties like this one here: https://app.lightningbounties.com/issue/615dc5f7-ed91-4ecd-8...

stillsut commented on Vibe coding has turned senior devs into 'AI babysitters'   techcrunch.com/2025/09/14... · Posted by u/CharlesW
jitl · 3 months ago
The worst is when I have to baby-sit someone else's AI. It's so frustrating to get tagged to review a PR, open it up, and find 400 lines of obviously incorrect slop. Some try to excuse it by marking the PR [vibe], but what the hell, at least review your own goddamn AI code before asking me to look at it. Usually I want to insta-reject just for the disrespect for my time.
stillsut · 3 months ago
I've got some receipts for what I think is good vibe coding...

I save every prompt and associated ai-generated diff in a markdown file for a steganography package I'm working on.

Check out this document: https://github.com/sutt/innocuous/blob/master/docs/dev-summa...

In particular, under v0.1.0 see the `decode-branch.md` prompt and its associated generated diff, which implements memoization for backtracking while performing decoding.

It's a tight PR that fits the existing codebase and works well. You just need a motivating example you can reproduce, which helps you quickly determine whether a proposed solution is working. I usually generate 2-3 solutions initially and then filter them quickly based on a test case (a sketch of that filtering step is below). And as you can see from the prompt, it's far from well-formatted or comprehensive, just a "slap dash" listing of potentially relevant information, similar to what would be discussed at an informal whiteboard session.
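
The filtering step can be as simple as running the motivating test against each candidate worktree (the paths and test file here are placeholders):

```python
import subprocess

# Hypothetical worktrees, one per generated candidate
WORKTREES = ["../wt-candidate-1", "../wt-candidate-2", "../wt-candidate-3"]

passing = []
for tree in WORKTREES:
    # run the motivating test case inside each candidate's worktree
    result = subprocess.run(["pytest", "tests/test_decode.py"], cwd=tree)
    if result.returncode == 0:
        passing.append(tree)

print("candidates worth a full review:", passing)
```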

stillsut commented on Defeating Nondeterminism in LLM Inference   thinkingmachines.ai/blog/... · Posted by u/jxmorris12
dns_snek · 3 months ago
Why do you care about determinism in a probabilistic system? What difference does it make to the end user if the input "How do I X?" always produces the same deterministic output when semantically equivalent inputs "how do i x?", "how do I x", and "how do I X??" are bound to produce different answers that often won't even be semantically equivalent.

What LLMs need is the ability to guarantee semantically-equivalent outputs for all semantically-equivalent inputs, but that's very different from "determinism" as we understand it from other algorithms.

stillsut · 3 months ago
I'm actually working on something similar to this, where you can encode information into the outputs of LLMs via steganography: https://github.com/sutt/innocuous

Since I'm really looking to sample only the top ~10 tokens, and I mostly test on CPU-based inference of 8B models, there's probably not a lot of worry about getting a different ordering of the top tokens based on hardware implementation. But I'm still going to take a look at it eventually, and build in guard conditions against any choice that would be changed by an epsilon of precision loss.
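
Something like this guard is what I have in mind (the epsilon value is illustrative): only trust token choices whose probability is separated from the next-ranked token by more than the tolerance.

```python
EPSILON = 1e-4  # illustrative tolerance for cross-hardware precision drift

def stable_prefix(ranked, epsilon=EPSILON):
    """ranked: (token, prob) pairs sorted descending by prob.
    Keep tokens whose ordering can't be flipped by a perturbation
    smaller than epsilon; stop at the first ambiguous adjacent pair."""
    stable = []
    for (tok, p), (_, p_next) in zip(ranked, ranked[1:]):
        if p - p_next <= epsilon:
            break  # ordering from here down isn't trustworthy
        stable.append(tok)
    return stable

print(stable_prefix([("the", 0.41), ("a", 0.30), ("one", 0.29995), ("it", 0.05)]))
# -> ['the']  (the "a"/"one" gap is within epsilon, so neither is trusted)
```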

stillsut commented on AI might yet follow the path of previous technological revolutions   economist.com/finance-and... · Posted by u/mooreds
1c2adbc4 · 3 months ago
I didn't realize that magic was the goal. I'm just trying to process unstructured data. Who's here looking for magic?
stillsut · 3 months ago
I think the "magic" is that we've found a common toolset of methods - embeddings and layers of neural networks - that seems to reveal useful patterns and relationships across a vast array of unstructured corpora, both from analog sensors (pictures, video, point clouds) and symbolic sources (text, music), and that we can combine these across modalities, as in CLIP.

It turns out we didn't need a specialist technique for each domain: there was a reliable way to architect a model that could learn on its own, and we could already use the datasets we had; they didn't need to be generated through surveys or experiments. This would have seemed like magic to an AI researcher working in the 1990s.
