I'm advocating for JJ to build a proper daemon that runs "checks" per change in the background. You don't run pre-commit checks when committing; they just happen in the background, and by the time you get to sharing your changes, everything has already been verified for each change/commit, effortlessly, without you wasting time or needing to do anything special.
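Roughly the shape I mean, as a throwaway Python sketch (this is not how SelfCI or jj actually do it; ./checks.sh and the polling loop are stand-ins I made up):

    # Background "checks per change": watch for new commits and run the checks
    # for each one off the commit path, in a throwaway worktree.
    import subprocess
    import time

    CHECK_CMD = ["./checks.sh"]   # hypothetical per-change check script
    results = {}                  # commit sha -> True/False

    def new_commits():
        # All commits reachable from HEAD, oldest first; a real tool would
        # restrict this to your own unpublished changes and cache results.
        out = subprocess.run(["git", "rev-list", "--reverse", "HEAD"],
                             capture_output=True, text=True, check=True).stdout
        return [sha for sha in out.split() if sha not in results]

    def check(sha):
        # Materialize the commit in a temporary worktree so the working copy
        # you're editing is never touched, then run the checks there.
        wt = f"/tmp/bgcheck-{sha[:12]}"
        subprocess.run(["git", "worktree", "add", "--detach", wt, sha], check=True)
        try:
            ok = subprocess.run(CHECK_CMD, cwd=wt).returncode == 0
        finally:
            subprocess.run(["git", "worktree", "remove", "--force", wt], check=True)
        results[sha] = ok
        return ok

    while True:   # the "daemon" part: pick up new changes as they appear
        for sha in new_commits():
            print(sha[:12], "PASS" if check(sha) else "FAIL")
        time.sleep(5)

Nothing runs when you commit; by the time you go to share, every change already has a pass/fail next to it.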
I have something a bit like that implemented in SelfCI (a minimalist, local-first, Unix-philosophy-abiding CI) https://app.radicle.xyz/nodes/radicle.dpc.pw/rad%3Az2tDzYbAX... and it has replaced my use of pre-commit hooks entirely. Users have already told me that it does feel like commit hooks done right.
But what I didn't pick up from a quick scan of the README is the best pattern for integrating with git. Do you expect users to run (a script calling) selfci manually, or is it hooked up to git or similar? When do the merge hooks come into play? Do you ask selfci to merge?
There is also the counter-intuitive phenomenon where training a model on a wider variety of content than apparently necessary for the task makes it better somehow. For example, models trained only on English content exhibit measurably worse performance at writing sensible English than those trained on a handful of languages, even when controlling for the size of the training set. It doesn't make sense to me, but it probably does to credentialed AI researchers who know what's going on under the hood.
To do well as an LLM you want to end up with weights that get as far as possible in the direction of "reasoning".
So assume that with just one language there's a possibility of getting stuck in local optima: weights that do well on the English test set but don't reason well.
If you then keep the model size the same but require it to learn several languages, a lot of those local optima are eliminated, because unless the weights end up in a regime where real reasoning/deeper concepts are "understood", it's simply not possible to do well across several languages with that same number of weights.
And "speaking" several languages naturally brings in more abstraction: the concept of "cat" is distinct from the word for "cat" in any given language, and so on.
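A toy way to picture that setup, just to illustrate "same shared weights, several languages" rather than to prove the local-optima claim. The two "languages" here are synthetic token mappings and the network is tiny; nothing about real LLM training is implied:

    # Toy multi-task sketch: one shared encoder, one output head per "language".
    # The shared weights have to serve every language at once, which is the
    # constraint described above.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    VOCAB, DIM = 50, 32

    shared = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, DIM), nn.ReLU())
    heads = {lang: nn.Linear(DIM, VOCAB) for lang in ("en", "de")}

    # Each "language" is the same underlying structure surfaced with different
    # "words": a fixed random permutation of the vocabulary.
    perms = {lang: torch.randperm(VOCAB) for lang in heads}

    params = list(shared.parameters()) + [p for h in heads.values() for p in h.parameters()]
    opt = torch.optim.Adam(params, lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()

    for step in range(500):
        opt.zero_grad()
        loss = 0.0
        for lang, head in heads.items():        # summed loss over languages,
            x = torch.randint(0, VOCAB, (64,))  # all flowing through the same
            y = perms[lang][x]                  # shared weights
            loss = loss + loss_fn(head(shared(x)), y)
        loss.backward()
        opt.step()

    print("final summed loss:", float(loss))

The only point of the sketch is the structure of the objective: the per-language losses are summed, so the shared encoder can't fit the quirks of one surface form without paying for it on the others.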
I.e. there is a lot of commonality between programming languages, just as there is between human languages, so training on one language would be beneficial to competency in the others.
I assumed that is what was catered for with "even when controlling for the size of the training set".
I.e. assuming I am reading it right: that it's better to get the same amount of data split 25% across 4 languages than 100% in one language.
Asking because I was looking at both Cloudflare and Bunny literally this week... and I feel like I don't know anything about either. Googling for it with "hackernews" as a keyword, to avoid all the blogspam, didn't bring up all that much.
(I ended up with Cloudflare and am sure that for my purposes it doesn't matter at all which I choose.)
They are improving this use case too with their enhanced blame. I think it was mentioned in their latest update blog.
You'll be able to hover over a line to see whether you wrote it or an AI did. If it was an AI, it will show which model and a reference to the prompt that generated it.
I do like Cursor quite a lot.
(if one already exists, someone needs to tell the public Cursor issue tracker)
We're making this better very soon! In the coming weeks hopefully.
I see in your public issue tracker that a lot of people are desperate simply for an option to turn that thing off ("Automatically accept all LLM changes"). Then we could use really any kind of plugin for reviews with git.
Personally, I found Cursor to be too inaccurate to be useful (possibly because I use Julia, which is relatively obscure) – Opus has been roughly the right level for my "pair programming" workflow.
I will very quickly @-reference the relevant parts of the code to get the context up and running right away. That seems harder in Claude.
(They also have their own model, "Composer 1", which is just lightning fast compared to the others... and sometimes feels as smart as Opus, but now and then it doesn't find the solution if the problem is too complicated and I have to ask Opus to clean it up. For simple stuff I switch to it.)
Reading HN I feel a bit out of touch since I seem to be "stuck" on Cursor. Tried to make the jump further to Claude Code like everyone tells me to, but it just doesn't feel right...
It may be due to the size of my codebase -- I'm 6 months into a solo-developer bootstrapped startup, so there isn't all that much there, and I can iterate very quickly with Cursor. And it's mostly SPA browser click-tested stuff. Comparatively, it feels like Claude Code takes an eternity to do anything.
(That said, Cursor's UI does drive me crazy sometimes. In particular the extra layer of diff review of AI changes (red/green), which is not integrated into git -- I would have preferred it to actively use something integrated with git (staged vs. unstaged hunks). It's more important to have a good code review experience than to remember which changes I made vs. which changes the AI made.)
You can instruct it to make edits, or say "Use SVG gradients for the windows" and so on, and then iterate further on the SVG.
It can be frustrating at times, but the end result was worth it for me.
Though for some images I've done 2-3 round trips between manual editing, Nano Banana, svgai.org ...
The advantage is that it produces sane output paths that I can edit easily for final manual touches in Inkscape.
Some of the other "AI" tools are often just plain bitmap-to-vector algorithms, and the paths/curves they produce are harder to work with, and also give the vector art a specific feel.