Readit News
minikomi commented on Ask HN: How are people doing AI evals these days?    · Posted by u/yelmahallawy
minikomi · 3 days ago
The more you can afford to build up your understanding of the problem space and define what inputs & outputs look like, the more flexible you can be with evals. Unfortunately, this is a lot of work and requires thinking and discussion with your team and those involved.

https://poyo.co/note/20260217T130137/

I wrote about general ideas I take towards simple single prompt features, but most of it is applicable to more involved agentic approaches too.
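To make the idea concrete, here is a minimal sketch of what "defining inputs & outputs" buys you: once the cases are pinned down, an eval can be as small as a table of labeled examples plus a scoring function. All names here are illustrative, and the stand-in model is just a lookup:

```python
# Minimal eval sketch: labeled cases plus a scoring function.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    prompt: str
    expected: str

def run_eval(cases: list[Case], model: Callable[[str], str]) -> float:
    """Fraction of cases where the model's output matches the expectation."""
    hits = sum(1 for c in cases if model(c.prompt).strip() == c.expected)
    return hits / len(cases)

# Stand-in "model" for demonstration; a real one would call an LLM
# and likely use a fuzzier checker than exact string match.
cases = [Case("2+2", "4"), Case("capital of France", "Paris")]
answers = {"2+2": "4", "capital of France": "Paris"}
score = run_eval(cases, lambda p: answers[p])
```

The hard part is not this harness; it is the thinking and team discussion needed to decide what goes in `cases` and what "matches" should mean.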

minikomi commented on How I use Claude Code: Separation of planning and execution   boristane.com/blog/how-i-... · Posted by u/vinhnx
dennisjoseph · 20 days ago
The annotation cycle is the key insight for me. Treating the plan as a living doc you iterate on before touching any code makes a huge difference in output quality.

Experimentally, I've been using mfbt.ai [https://mfbt.ai] for roughly the same thing in a team context. It lets you collaboratively nail down the spec with AI before handing off to a coding agent via MCP.

Avoids the "everyone has a slightly different plan.md on their machine" problem. Still early days but it's been a nice fit for this kind of workflow.

minikomi · 20 days ago
I agree, and this is why I tend to use gptel in emacs for planning - the document is the conversation context, and can be edited and annotated as you like.
minikomi commented on Evolving a Modular Dev Experience in Emacs   poyo.co/note/20260202T150... · Posted by u/minikomi
minikomi · 20 days ago
Given the recent surge in popularity of home-grown coding agents à la pi-coding-agent, I thought it would be nice to share my setup for semi-hands-on coding.

It's not a replacement for more sophisticated coding agent harnesses, but rather an alternative interaction mode, with tools to support it.

Although the content is fairly emacs-centric, consider it a nudge toward building your own small, self-owned tools.

minikomi commented on What if writing tests was a joyful experience? (2023)   blog.janestreet.com/the-j... · Posted by u/ryanhn
minikomi · a month ago
When writing Janet, I enjoy the judge[0] style of testing because it's so interactive. I added emacs helpers, one for running existing tests and one for updating the tests. It made for a nice REPL-like experience, especially when writing grammars[1].

[0] https://github.com/ianthehenry/judge

[1] https://github.com/minikomi/advent-of-code/blob/d73e0b622b26...
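For readers unfamiliar with judge, the core idea is an "expect" snapshot test: the test records the value an expression produced, and a runner can rewrite stale recordings in place. A toy sketch of that contract in Python (illustrative only, not judge's actual API):

```python
# Sketch of judge-style "expect"/snapshot testing. A real runner
# (like judge, for Janet) rewrites the `recorded` value in the
# source file when run in update mode.
def expect(actual, recorded=None):
    """Return ("ok", value) on match, or ("update", value) when the
    recording is missing or stale, so tooling can patch the test."""
    if recorded is None or actual != recorded:
        return ("update", actual)
    return ("ok", actual)
```

The interactivity comes from having two entry points, "run" and "update", which is exactly what the two emacs helpers bind.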

minikomi commented on Cowork: Claude Code for the rest of your work   claude.com/blog/cowork-re... · Posted by u/adocomplete
bashtoni · 2 months ago
Hi Felix!

Simple suggestion: logo should be a cow and an orc to match how I originally read the product name.

minikomi · 2 months ago
minikomi commented on Show HN: Stop Claude Code from forgetting everything   github.com/mutable-state-... · Posted by u/austinbaggio
minikomi · 2 months ago
I use gptel[0] with my denote[1] notes, and a tool that can search/retrieve tags/grep/create notes (in a specific sub folder). It's been good enough as a memory for me.

0: https://github.com/karthink/gptel

1: https://protesilaos.com/emacs/denote
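The shape of such a memory tool is simple enough to sketch. The names and layout below are my own illustration (gptel and denote expose their own interfaces): plain-text notes in a dedicated sub-folder, with create and grep operations the LLM can call:

```python
# Illustrative sketch of a minimal note "memory" tool:
# plain-text files in one folder, searched by substring grep.
from pathlib import Path

NOTES_DIR = Path("notes/llm-memory")  # assumed sub-folder for agent notes

def create_note(title: str, body: str) -> Path:
    """Write a note to the memory folder and return its path."""
    NOTES_DIR.mkdir(parents=True, exist_ok=True)
    path = NOTES_DIR / f"{title}.txt"
    path.write_text(body)
    return path

def grep_notes(query: str) -> list[str]:
    """Return titles of notes whose text contains the query."""
    return [p.stem for p in sorted(NOTES_DIR.glob("*.txt"))
            if query.lower() in p.read_text().lower()]
```

Scoping writes to one sub-folder is the important design choice: the model can persist and recall context without being able to touch the rest of your notes.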

minikomi commented on Ask HN: Loneliness at 19, how to cope?    · Posted by u/yresting
yresting · 2 months ago
I like programming, electronics, reading, maths and am getting a bike this week so I can spend more time outside! I love talking about these things with other people, and from what I can gather from their body language and facial expressions they also enjoy hearing what I have to say about my interests. But I also enjoy letting them talk about what they like to do so I can get to know them!
minikomi · 2 months ago
Group rides, coffee outside, and bikepacking are all amazing ways to make friends. Shared adventures, however small, make long-lasting bonds.
minikomi commented on The Illustrated Transformer   jalammar.github.io/illust... · Posted by u/auraham
energy123 · 3 months ago
An example of why a basic understanding is helpful:

A common sentiment on HN is that LLMs generate too many comments in code.

But comment spam is going to help code quality, due to the way causal transformers and positional encoding work. The model has learned to dump locally-specific reasoning tokens where they're needed, in a tightly scoped cluster that can be attended to easily, and forgotten just as easily later on. It's like a disposable scratchpad to reduce the errors in the code it's about to write.

The solution to comment spam is textual/AST post-processing of generated code, rather than prompting the LLM to handicap itself by not generating as many comments.
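That post-processing step can be sketched with Python's stdlib tokenizer, for the case of generated Python code (illustrative only; a real pipeline would also handle docstrings and tidy the blank lines left behind):

```python
# Post-process generated Python: strip # comments, leave code intact.
import io
import tokenize

def strip_comments(source: str) -> str:
    """Remove # comments from Python source. Sketch only: leaves a
    blank line where a full-line comment was, ignores docstrings."""
    lines = source.splitlines(keepends=True)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start  # 1-based row, 0-based column
            line = lines[row - 1]
            ending = "\n" if line.endswith("\n") else ""
            lines[row - 1] = line[:col].rstrip() + ending
    return "".join(lines)
```

Using the tokenizer rather than a regex matters: it will not mangle a `#` inside a string literal.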

minikomi · 3 months ago
An example of why a basic understanding is helpful:

A common sentiment on HN is that LLMs generate too many comments in code.

For good reason -- comment sparsity improves code quality, due to the way causal transformers and positional encoding work. The model has learned that real, in-distribution code carries meaning in structure, naming, and control flow, not dense commentary. Fewer comments keep next-token prediction closer to the statistical shape of the code it was trained on.

Comments aren’t a free scratchpad. They inject natural-language tokens into the context window, compete for attention, and bias generation toward explanation rather than implementation, increasing drift over longer spans.

The solution to comment spam isn’t post-processing. It’s keeping generation in-distribution. Less commentary forces intent into the code itself, producing outputs that better match how code is written in the wild, and forcing the model into more realistic context avenues.

u/minikomi

Karma: 3677 · Cake day: March 12, 2011
About
Peace to every crease on your brain