abreslav (u/abreslav)

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

the_duke · 16 hours ago

This doesn't make too much sense to me.

* This isn't a language, it's some tooling to map specs to code and re-generate

* Models aren't deterministic - every time you try to re-apply, you'd likely get different output (unless you feed the current code into the re-apply step and let the model just recommend changes)

* Models are evolving rapidly; this month's flavour of Codex/Sonnet/etc. would very likely generate different code from last month's

* Text specifications are always under-specified, lossy and tend to gloss over a huge amount of details that the code has to make concrete - this is fine in a small example, but in a larger code base?

* Every non-trivial codebase would be made up of hundreds of specs that interact and influence each other - very hard (and context-heavy) to read all specs that impact functionality and keep it coherent

I do think there are opportunities in this space, but what I'd like to see is:

* write text specifications

* model transforms text into a *formal* specification

* then the formal spec is translated into code which can be verified against the spec

Steps 2 and 3 could be merged into one if there were practical/popular languages that also support verification, in the vein of Ada/SPARK.

But you can also get there by generating tests from the formal specification that validate the implementation.
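As a rough illustration of that last point, here is a minimal Python sketch of generating tests from a formal specification. The spec here is just a predicate for "sorting" (output is ordered and is a permutation of the input); the function names and the deliberately buggy implementation are hypothetical and not taken from any tool discussed in this thread.

```python
import random
from collections import Counter

def satisfies_sort_spec(inp, out):
    """Formal spec for sorting: the output is ordered and is a permutation of the input."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_elements = Counter(inp) == Counter(out)
    return ordered and same_elements

def check_against_spec(impl, trials=200, seed=0):
    """Generate random inputs; return the first counterexample found, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        inp = [rng.randint(-5, 5) for _ in range(rng.randint(0, 10))]
        out = impl(list(inp))  # pass a copy so impl cannot mutate our input
        if not satisfies_sort_spec(inp, out):
            return inp
    return None

def buggy_sort(xs):
    # Hypothetical faulty implementation: silently drops duplicates.
    return sorted(set(xs))

print(check_against_spec(sorted))      # None: no counterexample found
print(check_against_spec(buggy_sort))  # some list containing duplicates
```

Random testing like this only samples the input space, so it validates rather than proves; a verifier in the Ada/SPARK style would aim for the stronger guarantee.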

abreslav · 11 hours ago

> * model transforms text into a formal specification

formal specification is no different from code: it will have bugs :)

There's no free lunch here: the informal-to-formal transition (be it words-to-code or words-to-formal-spec) comes through the non-deterministic models, period.

If we want to use the immense power of LLMs, we need to figure out a way to make this transition good enough

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

siscia · 11 hours ago

What I found more useful is an extra step. Spec to tests, and then red tests to code and green tests.

LLMs work on both translation steps. But you end up with a healthy amount of tests.

I tagged each test with the id of the spec, so I do get spec-to-test coverage as well.

Besides the standard code coverage given by the tests.

abreslav · 11 hours ago

When you translate spec to tests (if those are traditional unit tests or any automated tests that call the rest of the code), that fixes the API of the code, i.e. the code gets designed implicitly in the test generation step. Is this working well in your experience?

abreslav · 11 hours ago

Very much agree on coverage. We're actually doing something in that area: https://codespeak.dev/blog/coverage-20260302

For now, it's only about test coverage of the code, but the spec coverage is coming too.

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

Garlef · 12 hours ago

I think this is 100% the right direction:

Instead of imperatively letting the agents hammer your codebase into shape through a series of prompts, you declare your intent, observe the outcome and refine the spec.

The agents then serve as a control plane, carrying out the intent.

abreslav · 11 hours ago

Very much agree. I like the imperative vs declarative angle you take here. Thank you!

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

cube2222 · 12 hours ago

This is actually... pretty cool?

Definitely won't use it for prod ofc but may try it out for a side-project.

It seems that this is more or less:

  - instead of modules, write specs for your modules
  - on the first go it generates the code (which you review)
  - later, diffs in the spec are translated into diffs in the code (the code is *not* fully regenerated)

this actually sounds pretty usable, esp. if someone likes writing. And wherever you want to dive deep, you can delve down into the code and do "microoptimizations" by rolling something on your own (with what seems to be called here "mixed projects").

That said, not sure if I need a separate tool for this, tbh, instead of just having markdown files and telling Claude to see the md diff and adjust the code accordingly.

abreslav · 11 hours ago

We'd love to hear your feedback! Feel free to come to our discord to ask questions/share experience: https://l.codespeak.dev/discord

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

my_throwaway23 · 14 hours ago

Who is writing the tests?

abreslav · 11 hours ago

There are different kinds of tests:

* regression tests – can be generated

* conformance tests – often can be generated

* acceptance tests – are another form of specification and should come from humans.

Human intent can be expressed as

* documents (specs, etc)

* review comments, etc

* tests with clear yes/no feedback (data for automated tests, or just manual testing)

And this is basically all that matters, see more here: https://www.linkedin.com/posts/abreslav_so-what-would-you-sa...

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

newsoftheday · 15 hours ago

> Eventually, we'll end up in a world where humans don't need to touch code, but we are not there yet.

Will we though? Wouldn't AI need to reach a stage where it is a tool, like a compiler, which is 100% deterministic?

abreslav · 11 hours ago

Two things to mention here:

1. You are right that we can redefine what is code. If code is the central artefact that humans are dealing with to tell machines and other humans how the system works, then CodeSpeak specs will become code, and CodeSpeak will be a compiler. This is why I often refer to CodeSpeak as a next-level programming language.

2. I don't think being deterministic per se is what matters. Being predictable certainly does. Human engineers are not deterministic yet people pay them a lot of money and use their work all the time.

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

le-mark · 16 hours ago

This concept assumes that a formalized language would somehow make things easier for an LLM. That's making some big assumptions about the neuroanatomy of LLMs. This [1] from the other day suggests surprising things about how LLMs are internally structured; specifically, that encoding and decoding are distinct phases with other stuff in between. That suggests the language, once the model is trained, isn't that important.

[1] https://news.ycombinator.com/item?id=47322887

abreslav · 16 hours ago

We are not trying to make things easier for LLMs. LLMs will be fine. CodeSpeak is built for humans, because we benefit from some structure, knowing how to express what we want, etc.

abreslav commented on Kotlin creator's new language: talk to LLMs in specs, not English codespeak.dev/... · Posted by u/souvlakee

lifis · 16 hours ago

As far as I can tell it's not a new language, but rather an alternative workflow for LLM-based development along with a tool that implements it.

The idea, IIUC, seems to be that instead of directly telling an LLM agent how to change the code, you keep markdown "spec" files describing what the code does and then the "codespeak" tool runs a diff on the spec files and tells the agent to make those changes; then you check the code and commit both updated specs and code.
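To make the described workflow concrete, here is a minimal Python sketch that diffs two versions of a markdown spec with the standard `difflib` module and wraps the result in an instruction for an agent. This is an assumption about the general shape of such a workflow, not how the actual `codespeak` tool is implemented; `spec_diff`, `build_agent_prompt`, and the prompt wording are hypothetical.

```python
import difflib

def spec_diff(old_spec: str, new_spec: str, path: str = "spec.md") -> str:
    """Return a unified diff between two versions of a spec file ('' if unchanged)."""
    return "".join(difflib.unified_diff(
        old_spec.splitlines(keepends=True),
        new_spec.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))

def build_agent_prompt(diff: str) -> str:
    """Wrap a spec diff in an instruction for a coding agent (hypothetical wording)."""
    if not diff:
        return ""  # nothing changed, so there is nothing to ask the agent to do
    return (
        "The specification changed as shown in this diff. "
        "Update the existing code to match; do not regenerate unrelated code.\n\n"
        + diff
    )

old = "# Login\n- Lock the account after 3 failed attempts\n"
new = "# Login\n- Lock the account after 5 failed attempts\n"
print(build_agent_prompt(spec_diff(old, new)))
```

Sending only the diff, rather than the whole spec, is what keeps the change to the code incremental; the resulting code still has to be reviewed before both spec and code are committed.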

It has the advantage that the prompts are all saved along with the source rather than lost, and in a format that lets you also look at the whole current specification.

The limitation seems to be that you can't modify the code yourself if you want the spec to reflect it (and also can't do LLM-driven changes that refer to the actual code), and also that in general it's not guaranteed that the spec actually reflects all important things about the program, so the code does also potentially contain "source" information (for example, maybe you want the background of a GUI to be white and it is so because the LLM happened to choose that, but it's not written in the spec).

The latter can maybe be mitigated by doing multiple generations and checking them all, but that multiplies LLM and verification costs.

Also it seems that the tool severely limits the configurability of the agentic generation process, although that's just a limitation of the specific tool.

abreslav · 16 hours ago

> Also it seems that the tool severely limits the configurability of the agentic generation process, although that's just a limitation of the specific tool.

Working on that as well. We need to be a lot more flexible and configurable

abreslav · 16 hours ago

> The limitation seems to be that you can't modify the code yourself if you want the spec to reflect it

Eventually, we'll end up in a world where humans don't need to touch code, but we are not there yet. We are looking into ways to "catch up" the specs with whatever changes happen in the code not through CodeSpeak (agents or manual changes or whatever). It's an interesting exercise. In the case of agents, it's very helpful to look at the prompts users gave them (we are experimenting with inspecting the sessions from ~/.claude).

More generally, `codespeak takeover` [1] is a tool to convert code into specs, and we are teaching it to take prompts from agent sessions into account. Seems very helpful, actually.

I think it's a valid use case to start something in vibe coding mode and then switch to CodeSpeak if you want long-term maintainability. From "sprint mode" to "marathon mode", so to speak

[1] https://codespeak.dev/blog/codespeak-takeover-20260223