theamk · 3 months ago
KSON proudly claims "no whitespace sensitivity", which means "misleading indentation" is back. And it's pretty light on syntax, so there are going to be plenty of mistakes made here.

Here is an example I made in a few minutes:

    ports:
       - 80
       - 8000 - 10000
       - 12000 -
       - 14000

Guess how it parses? Answer:

    {"ports":[80,8000,10000,12000,[14000]]}

andrewla · 3 months ago
I actually prefer a syntax that is whitespace-sensitive, but should not give meaning to whitespace. That is, the whitespace should not perform a semantic duty, but should be required to be correct.

This is roughly equivalent to saying that a linter can transform the AST of the language into a canonical representation, and the syntax will be rejected unless it matches the canonical representation (modulo things like comments or whitespace-for-clarity).
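
As a minimal sketch of that "reject unless canonical" idea, the Python below uses JSON as a stand-in format (KSON itself is not involved, and JSON has no comments to preserve). The names `canonical` and `check_canonical` are hypothetical: the input is parsed, re-serialized in one fixed layout, and rejected if the original text differs from that layout.

```python
import json


def canonical(text: str) -> str:
    """Parse the text and re-serialize it in one fixed layout."""
    return json.dumps(json.loads(text), indent=2) + "\n"


def check_canonical(text: str) -> None:
    """Reject input that parses fine but is not laid out canonically."""
    if text != canonical(text):
        raise ValueError("input is valid but not in canonical form")


well_formed = '{\n  "ports": [\n    80,\n    8000\n  ]\n}\n'
misleading = '{"ports": [80,\n        8000]}\n'

check_canonical(well_formed)  # accepted: already canonical
try:
    check_canonical(misleading)  # same data, different layout
except ValueError as err:
    print(err)
```

Whitespace carries no meaning here (both inputs parse to the same value), yet only one layout is accepted.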

wofo · 3 months ago
This sounds like a stricter version of KSON's current warnings for misleading indentation... maybe KSON should have an opt-in feature for this. Thanks for the idea!
giveita · 3 months ago
So, opposite of python?
wofo · 3 months ago
Hmmm that's interesting. KSON actually shows a warning when there's misleading indentation, exactly to prevent this sort of thing! It seems like the detection logic only considers indents after a new line, so it doesn't complain in this case. I just opened an issue to see if things can be tightened up a bit (https://github.com/kson-org/kson/issues/221).

To see misleading indentation warnings in action, try the following snippet in the playground (https://kson.org/playground/); it correctly produces a warning:

    ports:
       - 80
       - 8000
         - 10000
       - 12000
       - 14000

Besides that, note that KSON already has an autoformatter, which also helps prevent misleading indentation in files.

theamk · 3 months ago
It's a config file - I might be editing it with "sudo vi", possibly inside a docker container, possibly on a remote server with an awkward connection procedure. Or maybe it'll get generated via some templating mechanism by terraform rules. Or embedded in the code.

If your config format requires an autoformatter and/or linter to detect trivial mistakes, it is junk.

afiori · 3 months ago
The only sane configuration language is something similar to gron

https://github.com/tomnomnom/gron

SebastianKra · 3 months ago
Assuming this takes off, there would be a prettier-plugin that corrects any weird formatting.

When I think about it, any language should come with a strict, non-configurable built-in formatter anyways.

stronglikedan · 3 months ago
> any language should come with a strict, non-configurable built-in formatter

Would that be on the language, or the IDEs that support it? Seems out of scope to the language itself, but maybe I'm misunderstanding.

theamk · 3 months ago
Or you can have languages which do not depend on plugins or formatters to be correct.

If your supposedly human-writable config file format is unusable without external tools, there is something wrong with it.

cryptonector · 3 months ago
Yeah, I'm going to say NO WAY to that.
kookamamie · 3 months ago
Yes, the syntax makes no intuitive sense, whatsoever.
comex · 3 months ago
Sounds interesting as a format, but the implementation is a big supply-chain attack risk if you're not already in the JVM ecosystem.

This is because the only implementation is written in Kotlin. There are Python and Rust packages, but they both just link against the Kotlin version.

How do you build the Kotlin version? Well, let's look at the Rust package's build.rs:

https://github.com/kson-org/kson/blob/main/lib-rust/kson-sys...

It defaults to simply downloading a precompiled library from GitHub, without any hash verification.
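
For illustration, hash verification of a downloaded artifact might look like the sketch below. It is written in Python rather than as a change to the actual build.rs, the function name `verify_sha256` is hypothetical, and the artifact bytes are a placeholder rather than a real libkson release.

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> bytes:
    """Return the data unchanged if its SHA-256 matches the pinned digest."""
    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError(f"checksum mismatch: expected {expected_hex}, got {actual}")
    return data


# Placeholder bytes standing in for a downloaded prebuilt library.
artifact = b"pretend this is a prebuilt library"
# In a real build script the digest would be a constant checked into the repo,
# not computed from the download itself as done here for the demo.
pinned = hashlib.sha256(artifact).hexdigest()

verify_sha256(artifact, pinned)
```

The point of pinning is that the digest lives next to the source, so a swapped release asset fails the build instead of being linked in silently.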

You can instead pass an environment variable to build libkson from source. However, this will run the ./gradlew script in the repo root, which… downloads a giant OpenJDK binary from GitHub and executes it. Later in the build process it does the same for pixi and GraalVM.

The build scripts also only support a small list of platforms (Windows/Linux/macOS on x86_64/arm64), and don't seem to handle cross-compilation.

The compiled library is 2MB for me, which is actually a lot less than I was expecting, so props for that. But that's fairly heavy by Rust standards.

wofo · 3 months ago
Glad you liked the format. I hope we can close the implementation gaps as development advances, and I'd love to see native libraries sprout for all conceivable programming languages!

Edit: point taken about verifying checksums, just created an issue for it (https://github.com/kson-org/kson/issues/222)

Terr_ · 3 months ago
Past-me had hoped that by the Future Year 2025, it'd be typical to publish a parser grammar file for this kind of thing.

Both to bootstrap making a parser in a new language, and also as a kind of living spec document.

wofo · 3 months ago
I think the current grammar should be precise enough for that, though it's embedded in the source code as a comment and not in its own file (see https://github.com/kson-org/kson/blob/857d585ef26d9f73e080b5...). It probably can't be fed verbatim into a parser generator, but anyone who reads the parser's source code should have an easy time writing a parser by hand for their programming language of choice (heck, they might even have an LLM translate the original parser into whatever language they want, once there is a comprehensive conformance test suite to validate the resulting code).

All in all, I'm confident that KSON can become ubiquitous despite the limitations of the current implementation (provided it catches on, of course).

kccqzy · 3 months ago
Configuration files need to be powerful programming languages (in terms of expressiveness) while being restricted (in terms of network and I/O and non-determinism). We need to aim very high for configuration languages especially when we treat them like user interfaces. Look at Cue (https://cuelang.org/), Starlark or Dhall (https://dhall-lang.org/) for inspiration, not JSON, unless your configuration file is almost always written programmatically.
madeofpalk · 3 months ago
Any configuration language that doesn't support strict/user/explicit types is worthless (ahem jsonnet).

The idea of configuring something but not actually having any sort of assurances that what you're configuring is correct is maddening. Building software with nothing but hopes and dreams.

candiddevmike · 3 months ago
Or Jsonnet (https://jsonnet.org), if you do like JSON but want less quoting.
arccy · 3 months ago
expressiveness unfortunately usually means that while you can read the output value, you lose the ability to modify it programmatically...
foota · 3 months ago
I don't know that I would particularly _want_ to modify starlark programmatically, but it's certainly not impossible. Build files in bazel for instance are nicely modifiable: https://bazel.build/concepts/build-files.
ruuda · 3 months ago
In RCL you can edit source files programmatically with `rcl patch`: https://ruudvanasseldonk.com/2025/automating-configuration-u...

taeric · 3 months ago
Meanwhile, I peek at my emacs config and continue to wonder why people don't just embrace a programming language.

Yes, there are bad consequences that can happen. No, you don't dodge having problems by picking a different data format. You just pick different problems. And take away tools from the users to deal with them.

wsc981 · 3 months ago
Just something like an INI file might be fine for most use-cases as well and is easier to reason about.
slowmovintarget · 3 months ago
Elisp for Emacs, Lua for those on Neovim...

Definitely more control than guessing the right JSON, or breaking that YAML file. Plus, you get completion, introspection, and help while editing the config because you're in a code-writing environment. Bonus for having search and text manipulation tools under your fingertips instead of clicking checkboxes or tabbing through forms.

taeric · 3 months ago
Reading this does make me think about how exploring settings in emacs via `M-x config` is actually better than I would expect. So it isn't like you can't also have the checkboxes and forms.
Xss3 · 3 months ago
JSON5 is good enough that it works for frontend devs, backend, qa, firmware, data science, chemists, optical engineers, and the hardware team, in my org at least. Interns pick up on it quickly.

The comment option gives devs enough space to explain new options, flags, and objects to those familiar enough to be using it.

For customer facing configurations we build a UI.

stronglikedan · 3 months ago
In the kitchen sink example (https://json5.org/), they say:

> "backwardsCompatible": "with JSON",

But in that same example, they have a comment like this:

> // comments

Wouldn't that make it not compatible with JSON?

crazygringo · 3 months ago
It's confusing.

From what I understand, it's "backwards-compatible" with JSON because valid JSON is also valid JSON5.

But it's not "forwards-compatible" precisely because of comments etc.

kiitos · 3 months ago
Backwards-compatible means the new thing can handle the old things. Here JSON5 is backwards-compatible with JSON.

Forwards-compatible means the old thing can handle the new things. Here JSON is not forwards-compatible with JSON5.
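
The two directions can be shown with a small Python sketch. It does not use a real JSON5 library: `parse_new` is a hypothetical stand-in for a superset format that only adds `//` comments to JSON, and the standard `json` module plays the role of the old parser.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                out.append(text[i + 1])  # keep the escaped character as-is
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            while i < len(text) and text[i] != "\n":
                i += 1
            continue  # the newline itself is kept by the next iteration
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_new(text: str):
    """Hypothetical 'new' format: JSON plus // comments."""
    return json.loads(strip_line_comments(text))


old_doc = '{"url": "http://example.com"}'
new_doc = '{\n  // comments\n  "backwardsCompatible": "with JSON"\n}'

# Backwards-compatible: the new parser accepts old documents.
assert parse_new(old_doc) == json.loads(old_doc)

# Not forwards-compatible: the old parser rejects new documents.
try:
    json.loads(new_doc)
except json.JSONDecodeError:
    print("plain JSON parser rejects the comment")
```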

rapfaria · 3 months ago
Your existing JSON < 5 will work with json5, not the other way around
arvindh-manian · 3 months ago
It’s a superset of JSON. I guess they mean it’s backwards compatible in terms of reading existing JSONs?
ruuda · 3 months ago
From the application point of view, recently I'm converging on this: define data structures for your config. Ensure it can be deserialized from json and toml. (In Rust this is easy to do with Serde; in Python with Pydantic or dataclasses.) Users can start simple and write toml by hand. If you prefer KSON, sure, write KSON and render to json. If config is UI, I think the structure of the data, and names of fields and values, matter much more than the syntax. (E.g. `timeout = 300` is meaningless regardless of syntax; `timeout_ms = 300` or `timeout = "300 ms"` are self-documenting.)

When the configuration grows complex, and you feel the need to abstract and generate things, switch to a configuration language like Cue or RCL, and render to json. The application doesn't need to force a format onto the user!

squirrellous · 3 months ago
We use protobuf as schemas for json config. Protobuf has builtin json support and works across languages. It’s great for multi-language projects.
fireflash38 · 3 months ago
This is what I have been debating using for a project at work. You can also nest messages within other messages easily with protobuf, so you can aggregate/deaggregate configs as you want. Combine that with protobuf validation plugins for your language and you get a rather neat package.
jdwyah · 3 months ago
the duration one in particular bugs me. I work on a dynamic configuration system and i was super happy when we added proper duration support. we took the approach of storing in iso duration format as a string. so myconfig = `5s` then you get a duration object and can call myconfig.in_millis. so much better imo.
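
A rough Python sketch of the duration idea, using the `5s` shorthand from the comment rather than full ISO 8601 durations such as `PT5S`. The helper names `parse_duration` and `in_millis` and the set of supported units are assumptions for illustration, not the commenter's actual system.

```python
import re
from datetime import timedelta

# Supported suffixes mapped to timedelta keyword arguments (an assumed set).
_UNITS = {"ms": "milliseconds", "s": "seconds", "m": "minutes", "h": "hours"}
_PATTERN = re.compile(r"(\d+)\s*(ms|s|m|h)")


def parse_duration(text: str) -> timedelta:
    """Turn a string such as '5s' or '250ms' into a timedelta."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def in_millis(duration: timedelta) -> int:
    """Whole milliseconds in the duration."""
    return duration // timedelta(milliseconds=1)


timeout = parse_duration("5s")
print(in_millis(timeout))  # 5000
```

A bare `300` is rejected outright, which is the benefit: the unit can no longer be left to guesswork.
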
wofo · 3 months ago
I like this take! Have you used Cue or RCL yourself? How was the experience?
ruuda · 3 months ago
I’ve toyed with Cue, but never in a production setting. I like the ideas behind it, it’s very elegant that the same mechanism enables constraining values and reducing boilerplate. It’s somewhat limited compared to Jsonnet, RCL, Dhall, etc., you don’t get user-defined functions, but the flip side of that is that when you see something being defined, you can be confident that it ends up in the output like that, that it’s not just an input to a series of intractable transformations. I haven’t used it in large enough settings to get a feeling for how much that matters. Also, I find the syntax a bit ugly.

We did a prototype at work to try different configuration languages for our main IaC repository, and Cue was the one I got furthest with, but we ended up just using Python to configure things. Python is not that bad for this: the syntax is light, you get types, IDE/language server support, a full language. One downside is that it’s difficult to inspect a single piece of configuration, you run the entry point and it generates everything.

As for RCL, I use it almost daily as a jq replacement with easier to remember syntax. I also use it in some repositories to generate GitHub Actions workflows, and to keep the version numbers in sync across Cargo.toml files in a repository. I’m very pleased with it, but of course I am biased :-)

diarrhea · 3 months ago
Not the OP but I’m a big fan of this pattern as well.

At work we generate both k8s manifests as well as application config in YAML from a Cue source. Cue allows both deduplication, being as DRY as one can hope to be, as well as validation (like validating a value is a URL, or greater than 1, whatever).

The best part is that we have unit tests that deserialize the application config, so entire classes of problems just disappear. The generated files are committed in VCS, and spell out the entire state verbatim - no hopeless Helm junk full of mystery interpolation whose values are unknown until it’s too late. No. The entire thing becomes part of the PR workflow. A hook in CI validates that the generated files correspond to the Cue source (run make target, check if git repo has changes afterwards).
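
The CI check described above boils down to "regenerate, then compare with what is committed". A small Python sketch under loud assumptions: `render` is a stand-in that emits JSON instead of exporting from Cue, and the file name is invented. The real setup runs a make target and checks for git changes instead.

```python
import json
import tempfile
from pathlib import Path


def render(source: dict) -> str:
    """Stand-in generator; the real setup would export from the Cue source."""
    return json.dumps(source, indent=2, sort_keys=True) + "\n"


def is_up_to_date(source: dict, generated_file: Path) -> bool:
    """True if the committed file matches a fresh render of the source."""
    return generated_file.exists() and generated_file.read_text() == render(source)


with tempfile.TemporaryDirectory() as tmp:
    committed = Path(tmp) / "app-config.json"
    source = {"replicas": 2, "url": "https://example.com"}
    committed.write_text(render(source))
    assert is_up_to_date(source, committed)

    # Changing the source without regenerating is what the CI hook catches.
    source["replicas"] = 3
    assert not is_up_to_date(source, committed)
```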

The source of truth is native structs in Go. These native Go types can be imported into Cue and used there. That means config is always up to date with the source of truth. It also means refactoring becomes relatively easy. You rename the thing on the Go side and adjust the Cue side. Very hard to mess up and most of it is automated via tooling.

The application takes almost its entire config from the file, and not from CLI arguments or env vars (shudder…). That means most things are covered by this scheme.

One downside is that the Cue tooling is rough around the edges and error messages can be useless. Other than that, I fully intend to never build applications differently anymore.

mholt · 3 months ago
This is why Caddy has config adapters: bring any config file language you like, and Caddy will run it. It's built into the binary and just takes a command line flag to switch languages: https://caddyserver.com/docs/config-adapters
kevmo314 · 3 months ago
This makes it difficult to configure Caddy in anything except the native Caddyfile language due to a lack of thorough documentation. It's an interesting idea, but configuring Caddy with a yaml config that someone prior deemed a great idea was quite painful.

Curiously, LLMs have made it a lot easier. One step away from an English adapter that routes through an LLM to generate the config.

kiitos · 3 months ago
those formats aren't bijective with each other, right? so there's no way for you to say that foo.cue can be equivalently transformed to any foo.json or any foo.nginx or whatever representation, because those transformations are necessarily lossy, no?
JohnMakin · 3 months ago
Having now worked with terraform for 8 years, I could not agree more. Now, also because of having worked with terraform for 8 years and seeing how that's played out, I've heard and become tired of the whole "superset of json, transcribable to YAML, whitespace is not significant (which has never been a gripe of mine ever, not sure why every product cares so much about that)" promise of a silver bullet, and you very much face the same exact problems, just in different form. Terraform (HCL, to be specific) in particular can become fantastically ugly and verbose and "difficult to modify."

Configuration is difficult; the tooling is rarely the problem (at least in my experience).