jbonatakis (u/jbonatakis)

jbonatakis commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

davedx · a day ago

My process has organically evolved towards something similar but less strictly defined:

- I bootstrap AGENTS.md with my basic way of working and occasionally one or two project specific pieces

- I then write a DESIGN.md. How detailed or well specified it is varies from project to project: the other day I wrote a very complete DESIGN.md for a time tracking, invoice management and accounting system I wanted for my freelance biz. Because it was quite complete, the agent almost one-shot the whole thing

- I often also write a TECHNICAL-SPEC.md of some kind. Again how detailed varies.

- Finally I link to those two from AGENTS.md. I also usually put in AGENTS.md that the agent should maintain the docs and keep them in sync with newer decisions I make along the way.

This system works well for me, but it's still very ad hoc and definitely doesn't follow any kind of formally defined spec standard. And I don't think it should, really? IMO, technically strict specs should be in your automated tests not your design docs.

jbonatakis · a day ago

I have been building this in my free time and it might be relevant to you: https://github.com/jbonatakis/blackbird

I have the same basic workflow as you outlined, then I feed the docs into blackbird, which generates a structured plan with tasks and subtasks. Then you can have it execute tasks in dependency order, with options to pause for review after each task or to run an automated review when all child tasks of a given parent are complete.

It’s definitely still got some rough edges but it has been working pretty well for me.
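
To illustrate what "execute tasks in dependency order" means in general, here is a minimal sketch using Python's standard `graphlib`. This is not blackbird's actual code or plan format; the `plan` dictionary and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
plan = {
    "schema": set(),
    "api": {"schema"},
    "ui": {"api"},
    "docs": {"api"},
}


def execution_order(plan):
    """Return the tasks so that each one appears after all of its dependencies."""
    return list(TopologicalSorter(plan).static_order())


order = execution_order(plan)
print(order)
```

A cycle in the plan makes `static_order()` raise `graphlib.CycleError`, which is a reasonable point to reject a malformed plan before running anything.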

jbonatakis commented on Ask HN: What Are You Working On? (March 2026) · Posted by u/david927

jbonatakis · 5 days ago

Very much an MVP, but I just got this all set up: https://www.pginbox.dev/

Downloaded and parsed a bunch of the pgsql-hackers mailing list. Right now it’s just a pretty basic alternative display, but I have some ideas I want to explore around hybrid search and a few other things. The official site for the mailing list has a pretty clean thread display but the search features are basic so I’m trying to see how I can improve on that.

The repo is public too: https://github.com/jbonatakis/pginbox

I’ve mostly built it using blackbird [1] which I also built. It’s pretty neat having a tool you built build you something else.

[1] https://github.com/jbonatakis/blackbird

jbonatakis commented on GPT-5.4 openai.com/index/introduc... · Posted by u/mudkipdev

bethekidyouwant · 8 days ago

What dependency could possibly be tied to a non-deterministic AI model? Just include the latest one at your price point.

jbonatakis · 8 days ago

Well it’s not even performance (define that however you will), but behavior is definitely different model to model. So while whatever new model is released might get billed as an improvement, changing models can actually meaningfully impact the behavior of any app built on top of it.

jbonatakis commented on GPT-5.4 openai.com/index/introduc... · Posted by u/mudkipdev

__jl__ · 8 days ago

What a model mess!

OpenAI now has three price points: GPT 5.1, GPT 5.2 and now GPT 5.4. Their version numbers jump across different model lines, with Codex at 5.3 and what they now call Instant also at 5.3.

Anthropic are really the only ones who managed to get this under control: Three models, priced at three different levels. New models are immediately available everywhere.

Google essentially only has Preview models! The last GA is 2.5. As a developer, I can either use an outdated model or have zero assurance that the model won't be discontinued within weeks.

jbonatakis · 8 days ago

Google is already sending notices that the 2.5 models will be deprecated soon while all the 3.x models are in preview. It really is wild and peak Google.

jbonatakis commented on Show HN: Steerling-8B, a language model that can explain any token it generates guidelabs.ai/post/steerli... · Posted by u/adebayoj

jacquesm · 17 days ago

I would rather see that it does not rely on open source projects that have not given permission to be used to train that particular AI.

jbonatakis · 17 days ago

Doesn’t the nature of most open source licenses allow for AI training though?

Example — MIT:

> Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions

jbonatakis commented on Show HN: Micasa – track your house from the terminal micasa.dev... · Posted by u/cpcloud

jbonatakis · 22 days ago

Just want to say, I appreciate your work on Ibis. I’ve been looking into building sort of a dbt-esque alternative on top of it and noticed how involved you’ve been with its development. I think it’s a cool piece of tech that deserves more attention.

jbonatakis commented on Show HN: GitHub "Lines Viewed" extension to keep you sane reviewing long AI PRs chromewebstore.google.com... · Posted by u/somesortofthing

jbonatakis · 24 days ago

I built (using AI) a small CLI that provides a breakdown of the changes in a PR across docs, source, tests, etc.

https://github.com/jbonatakis/differ

It helps when there’s a massive AI PR and it’s intimidating…seeing that it’s 70% tests, docs, and generated files can make it a bit more approachable. I’ve been integrating it into my CI pipelines so I get that breakdown as a comment on the PR
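
A rough sketch of how such a breakdown could be computed, for anyone curious. This is not how differ is implemented; the path rules in `categorize` and the sample `changes` list are assumptions made up for illustration.

```python
from collections import Counter
from pathlib import PurePosixPath


def categorize(path):
    """Assign a changed file to a category using simple, hypothetical path rules."""
    p = PurePosixPath(path)
    parts = set(p.parts)
    if "tests" in parts or p.name.startswith("test_"):
        return "tests"
    if p.suffix in {".md", ".rst"} or "docs" in parts:
        return "docs"
    if p.name.endswith(".lock") or "generated" in parts:
        return "generated"
    return "source"


def breakdown(changes):
    """Map each category to its percentage of changed lines.

    `changes` is an iterable of (path, lines_changed) pairs.
    """
    totals = Counter()
    for path, lines in changes:
        totals[categorize(path)] += lines
    total = sum(totals.values())
    if total == 0:
        return {}
    return {cat: round(100 * n / total, 1) for cat, n in totals.items()}


# Made-up sample data standing in for a PR's per-file line counts.
changes = [
    ("src/app.py", 30),
    ("tests/test_app.py", 50),
    ("README.md", 10),
    ("poetry.lock", 10),
]
result = breakdown(changes)
print(result)
```

In practice the per-file line counts could come from something like `git diff --numstat`, and the resulting dictionary could be rendered as the PR comment.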

jbonatakis commented on Anthropic tries to hide Claude's AI actions. Devs hate it theregister.com/2026/02/1... · Posted by u/beardyw

small_model · 25 days ago

I always get Claude Code to create a plan unless it's trivial. It will describe all the changes it's going to make and to which files, then I let it rip in a new context.

jbonatakis · 25 days ago

(Mildly) shameless plug, but you might be interested in a tool I’ve been building: https://github.com/jbonatakis/blackbird

It breaks a spec (or freeform input) down into a structured JSON plan, then kicks off a new non-interactive session of Claude or Codex for each task. Sounds like it could fit your workflow pretty well.

jbonatakis commented on MySQL foreign key cascade operations finally hit the binary log readyset.io/blog/mysql-9-... · Posted by u/marceloaltmann

jbonatakis · a month ago

This is excellent. In the past when replicating via Debezium from a system making heavy use of cascade deletes I’ve had to write a layer that infers these deletes by introspecting the database schema, building a graph of all cascades (sometimes several layers) and identifying rows that should have corresponding delete records. These can then be excluded in whatever downstream system via an anti-join. It works but it will be better to not have to do that and instead have first class support for cascades.
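
The graph-building step could look roughly like this. It is a simplified sketch, not the layer described above: the `foreign_keys` list is hypothetical and would in practice be read from the database's schema metadata, and it only finds which tables are affected, not which rows.

```python
from collections import deque

# Hypothetical foreign keys: (child_table, parent_table, on_delete_rule).
foreign_keys = [
    ("orders", "customers", "CASCADE"),
    ("order_items", "orders", "CASCADE"),
    ("invoices", "customers", "RESTRICT"),
]


def cascade_targets(foreign_keys, root):
    """Return every table reachable from `root` through ON DELETE CASCADE keys."""
    # Build parent -> children edges, keeping only cascading constraints.
    children = {}
    for child, parent, rule in foreign_keys:
        if rule == "CASCADE":
            children.setdefault(parent, []).append(child)

    # Breadth-first walk so multi-level cascades are followed layer by layer.
    seen, queue, order = {root}, deque([root]), []
    while queue:
        table = queue.popleft()
        for child in children.get(table, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


targets = cascade_targets(foreign_keys, "customers")
print(targets)
```

For each table in the result, the rows whose foreign key points at a deleted parent row are the ones that need inferred delete records.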

jbonatakis commented on Show HN: Agent Alcove – Claude, GPT, and Gemini debate across forums agentalcove.ai... · Posted by u/nickvec

jbonatakis · a month ago

Neat. I started building something similar[1] but focused more on agents having a conversation around whatever I feed them, e.g. a design doc. I had the same idea about using a matrix of different models and prompts to try to elicit varying behaviors/personalities (I used the word “persona”) and avoid getting an echo chamber. It seemed to work well-ish but after the POC phase I got bored and stopped.

Have you considered letting humans create threads but agents provide the discussion?

[1] https://github.com/jbonatakis/panel