ivan4th · 7 years ago
From my experience, while YAML itself is something one can learn to live with, the true horror starts when people start using text template engines to generate YAML. Like it's done in Helm charts, for example, https://github.com/helm/charts/blob/master/stable/grafana/te... Aren't these "indent" filters beautiful?
DonHopkins · 7 years ago
I developed Yet Another JSON Templating Language, whose main virtue was that it was extremely simple to use and implement, and it could be easily implemented in JavaScript or any other language that supports JSON.

We had joy, we had fun, we had seasons in the sun, but as I added more and more features and syntax to cover specific requirements and uncommon edge cases, I realized I was on an inevitable death-march towards my cute little program becoming sufficiently complicated to trigger Greenspun's tenth rule.

https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

There is no need for Yet Another JSON Templating Language, because JavaScript is the ultimate JSON templating language. Why, it even supports comments and trailing commas!

Just use the real thing to generate JSON, instead of trying to build yet another ad-hoc, informally-specified, bug-ridden, slow implementation of half of JavaScript.

PopeDotNinja · 7 years ago
> the true horror starts when people start using text template engines to generate YAML

I just had a shiver recalling a Kubernetes wrapper wrapper wrapper wrapper at a former job. I think there were at least two layers of mystical YAML generation hell. I couldn't stop it, and it tanked much joy in my work. It was a factor in me moving on.

msoad · 7 years ago
Oh my god! I'm working on the same wrapper wrapper wrapper
thomasfedb · 7 years ago
Surely the right approach needs to be generating the desired data programmatically, rendering back to YAML if needed, rather than building these files with text macros.
api · 7 years ago
Kubernetes works well but pretty it is not.
codeduck · 7 years ago
> Kubernetes wrapper wrapper wrapper wrapper at a former job

oh god why

regecks · 7 years ago
Why would they have chosen to use template/text to generate YAML? That seems insane.

Surely using an encoder on an object/structure hierarchy (like people do with encoding/json) is the way to go?
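
To make the "encoder" idea concrete, here is a minimal Python sketch. The `deployment` function, its fields, and the image name are invented for illustration and are not taken from any real chart. The config is built as ordinary data and a serializer handles nesting and escaping. JSON output is used so that only the standard library is needed; YAML 1.2 is intended to accept JSON documents, though individual parsers may differ.

```python
import json


def deployment(name: str, image: str, replicas: int = 1) -> dict:
    """Build a (hypothetical) deployment config as plain data."""
    return {
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "containers": [{"name": name, "image": image}],
        },
    }


def render(config: dict) -> str:
    """Encode the structure; the encoder owns indentation and escaping."""
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # A value containing a colon and quotes needs no manual escaping.
    print(render(deployment("web", 'registry.example/web:"1.0"', replicas=3)))
```

Compared with a text template, there is no "indent" filter to get wrong: the nesting lives in the data structure, not in the whitespace of a string.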

On the other hand, the quality of the yaml libraries in Go wasn't great, last time I had to choose a configuration file format.

dharmab · 7 years ago
A lot of people working with YAML have an ops background and aren't familiar with basic data structures.
wmil · 7 years ago
It probably starts with an existing YAML config file that you only need to pass one or two variables to. Then things get out of hand.
Noumenon72 · 7 years ago
What is "an encoder"? Like a function that takes the same variables as the template would but does some work itself generating things?
dastx · 7 years ago
At my old place we developed a small tool that wraps CloudFormation with a templating language (jinja2). This was actually great, as CloudFormation is extremely verbose and often unnecessarily complex. Templating it out and adding custom functions to jinja2 made the cfn templates much easier to understand.

I think it all depends. Most of the time I would agree that you shouldn't template yaml, but sometimes, it's the lesser of two evils.

zwkrt · 7 years ago
Templating CFN is really good practice once you hit a certain scale. Say you have 5 DDB tables deployed to multiple regions, and on each of them you want to specify keys, attributes, throughput, and usage alarms, at a minimum. That’s already 30-40 values that need to be specified, depending on table schemas. Add EC2, auto scaling, networking, load balancer, and SQS/SNS, and untemplated CloudFormation becomes really unpleasant to work with.

Some of the values like DDB table attributes are common across all regions, other values like tags are common across all infra in the same region. Some values are a scalar multiple of others, or interpolated from multiple sources. For example, a DDB capacity alarm for a given region is a conjunction of the table name (defined at the application level), a scalar multiple of the table capacity (defined at the regional deployment level), and severity (owned by those that will be on-call).

To add insult to injury, a stack can only have 60 parameters, which you butt up against quickly if you try to naively parameterize your deployment.

Given all these gripes, auto-generating CFN templates was easiest for me. I used a hierarchical config (global > application > region > resource) so the deployment params could be easily manipulated, maintained, and where “exceptions to the rule” would be obvious instead of hidden in a bunch of CFN yaml. To generate CFN templates I used ERB instead of jinja, but to similar effect.
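
A rough Python sketch of the layered-config idea described above (global > application > region > resource). This is not the ERB setup from the comment; the layer contents, key names, and numbers are all made up for illustration. More specific layers override less specific ones, and the alarm threshold is derived as a scalar multiple of the regional capacity.

```python
from copy import deepcopy


def merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins; nested dicts merge recursively."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def resolve(*layers: dict) -> dict:
    """Fold layers from least to most specific (global first, resource last)."""
    resolved: dict = {}
    for layer in layers:
        resolved = merge(resolved, layer)
    return resolved


# Hypothetical layers: all names and numbers are invented for the example.
GLOBAL = {"tags": {"owner": "team-a"}, "alarm": {"threshold_ratio": 0.8}}
APPLICATION = {"table": {"name": "orders", "hash_key": "order_id"}}
REGION = {"tags": {"region": "eu-west-1"}, "table": {"read_capacity": 100}}
RESOURCE = {"alarm": {"severity": "page"}}

config = resolve(GLOBAL, APPLICATION, REGION, RESOURCE)
# The alarm threshold is a scalar multiple of the regional table capacity.
config["alarm"]["threshold"] = (
    config["table"]["read_capacity"] * config["alarm"]["threshold_ratio"]
)
```

The resolved `config` dict can then be handed to whatever renders the final template, so an "exception to the rule" shows up as a small override in one layer.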

A side benefit of this is side-stepping additional vendor lock-in in the form of the weird and archaic CFN operators for math, string concatenation, etc. I don’t have a problem learning them, but it’s one of those things that one person learns, then everyone who comes after them has to re-learn. My shop already uses ruby, so templating in the same language is a no-brainer.

archgoon · 7 years ago
For cloudformation; my team a few years ago got a lot of mileage out of using troposphere.

https://github.com/cloudtools/troposphere

The basic type checking done was quite helpful, and avoided some of the dumb errors that we had run into when we attempted to do everything by hand.

auslander · 7 years ago
> This was actually great, as CloudFormation is extremely verbose and often unnecessarily complex

I think it's the opposite: the leanest way to deploy AWS resources. Did you write it yourself, in a text editor? I have been doing it for 5 years now. You can omit values if you're fine with defaults; you only state what needs to be different. Another tip is to use Export and ImportValue to link stacks.

I kept on using JSON, even after all my buddies jumped on YAML. JSON is just more reliable, it's harder to miss syntax errors, and it can be made readable by not using linters and keeping long lines that belong on one line. Also, the brackets are exactly what they are in Python :)

> wraps CloudFormation with a templating language (jinja2)

Not sure if it is a good idea. Everyone's use case is different, though. A well written CFN template is like a rubber stamp, just change the Parameters. The template itself doesn't need to change.

sethammons · 7 years ago
k8s and helm is where I learned to dislike yaml. I now want a compiled and type safe language that generates whatever config a system needs.

I'm pretty much thinking I want Go as a pre-config where I can set variables, loops, and conditionals and that my editor can help with auto-complete. Maybe I can "import github.com/$org/helmconfig" and in the end write one or more files for config.

halfmatthalfcat · 7 years ago
Helm 3 is moving to Lua, that may be better or worse.
felixschl · 7 years ago
You should check out dhall-lang
swaroop · 7 years ago
Are you looking for something like https://jsonnet.org/ ?
ljackman · 7 years ago
Some templating languages such as Jsonnet[0] add built-in templating and just enough programmability to cover basic operations like templating and iteration.

I originally felt it was overly complex, but after seeing some of the Go text/template and Ansible Jinja examples in the wild, it actually seems like a good idea.

Perhaps we should more strongly distinguish between “basic” data definition formats and ones that need to be templated. JSON5 for the former and Jsonnet for the latter, for example.

cppforlife · 7 years ago
agreed, text templating of yaml (or any structured content) does not make sense. too much context (actual config structure) is lost if plain text is used.

i've collaborated on ytt (https://get-ytt.io) - yaml templating tool. it works directly with yaml structure to bind templating directives. for example setting a value is associated with a specific yaml node so that you dont have to do any manual indenting etc. like you would with plain text templating. defining functions that return yaml structures becomes very easy as well. common problems such as improperly escaped values are gone.

i'm also experimenting with a "strict" mode [1] that raises error for questionable yaml features, for example, using NO to mean false.
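
to illustrate the kind of check such a strict mode could make, here is a deliberately naive Python sketch. It is not how ytt implements it; the function name, the regex, and the sample text are assumptions made up for this example. It flags unquoted values that YAML 1.1 resolves to booleans, other than plain `true`/`false`.

```python
import re

# Plain scalars that YAML 1.1 resolves to booleans (the YAML 1.1 bool type).
YAML11_BOOLS = {
    "y", "Y", "yes", "Yes", "YES", "n", "N", "no", "No", "NO",
    "true", "True", "TRUE", "false", "False", "FALSE",
    "on", "On", "ON", "off", "Off", "OFF",
}
# Spellings that are unlikely to surprise anyone.
UNAMBIGUOUS = {"true", "false"}

# Deliberately naive: matches only single-line `key: value` pairs.
PAIR = re.compile(r"^\s*(?:-\s+)?([^:#]+):\s+(\S+)\s*$")


def questionable_lines(text: str) -> list:
    """Return (line_number, key, value) for unquoted boolean-like values."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = PAIR.match(line)
        if not match:
            continue
        key, value = match.group(1).strip(), match.group(2)
        if value in YAML11_BOOLS and value not in UNAMBIGUOUS:
            findings.append((number, key, value))
    return findings


sample = 'country: NO\nenabled: true\nname: "NO"\nverbose: yes\n'
findings = questionable_lines(sample)
```

a real tool would do this on the parsed node tree rather than on lines of text, which is exactly the point about working with the yaml structure instead of plain text.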

i think that yaml is here to stay (at least for some time) and it's worth investing in making tools that make dealing with yaml and its common uses (templating) easier.

[1] https://github.com/k14s/ytt/blob/master/docs/strict.md

xxxpupugo · 7 years ago
Shoot, it will be hell to test such a monster, say you have a single typo somewhere, lol
jchw · 7 years ago
The issue is, I think most people (myself included) enter YAML into their lives as basically a JSON alternative with lighter syntax. Without really realizing, or perhaps without internalizing, the rather ridiculous number of different ways to represent the same thing, the painful subtle syntax differences that lead to entirely different representations, the sometimes difficult to believe number of features that the language has that are seldom used..

It's not just an alternate skin for JSON, and yet that's what most people use it for. Some users also want things like map keys that aren't strings, which is actually pretty useful.

I recall there being CoffeeScript Object Notation as well... perhaps that would've been better for many use cases, all things said.

norcalli · 7 years ago
I've never understood this. JSON is really not that difficult to work with manually. I tend to write my config files as JSON for utilities I write. What is it with peoples' innate aversion to braces?
zzo38computer · 7 years ago
I don't have an aversion to braces. Rather, my issues with JSON are that it doesn't have comments and that you cannot use an optional trailing comma.
nine_k · 7 years ago
JSON is serviceable as an intermediate format, machine-generated and machine-consumed.

It is outright bad as a human-operated format. It explicitly lacks comments, it does not allow trailing commas, it lacks namespaces, to name a few pain points.

YAML is much more human-friendly, with all its problems.

voidfunc · 7 years ago
The lack of comments is the real problem. When you need to explain why a particular parameter in the config file is set a certain way JSON becomes a real problem.
ivalm · 7 years ago
Comments, comments, comments.

Seriously, our batch jobs, for better or worse, have configs with a bunch of parameters that are passed around as JSON. Most variable names are intuitive, there is documentation on the wiki, and most often the config can be autogenerated by other tools, but it would still be better if, when I manually open the config itself, I could easily see the difference between n_run_threads and n_reg_threads, etc...

brokensegue · 7 years ago
json's lack of int types is what ruins it for me
crdoconnor · 7 years ago
JSON alternative with lighter syntax and comments is basically what I tried to make StrictYAML.

I made it largely because I saw a disconnect with what YAML was, and what people - including me - thought it was (which is what it should be).

Don't agree with non-string map keys though... they're a complication I never saw a use for.

zenexer · 7 years ago
They’re fairly useful in applications that use numeric IDs. For example, if I’m using SQL, and I have a table with an AUTOINCREMENT primary key, I’m going to have a lot of numeric IDs. If I want to reference these in a config file of some kind, I don’t want to have to read them as strings and handle the parsing on my end.

Even if you’re of the opinion that IDs shouldn’t be numeric, there are a lot of cases where you’re stuck with integers—on Linux, user IDs, group IDs, and inodes are just a few examples.
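
A small Python illustration of the string-key limitation, using only the standard `json` module (the user names and IDs are made up). Integer keys survive in memory but come back as strings after a round trip, so the conversion has to be done by hand.

```python
import json

users_by_id = {1001: "alice", 1002: "bob"}  # integer keys in memory

encoded = json.dumps(users_by_id)
decoded = json.loads(encoded)

# JSON object keys are always strings, so the integers come back as text.
print(encoded)        # {"1001": "alice", "1002": "bob"}
print(list(decoded))  # ['1001', '1002']

# The conversion has to be done by hand after parsing.
restored = {int(key): value for key, value in decoded.items()}
```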

DonHopkins · 7 years ago
I was suspicious of YAML from day one, when they announced "Yet Another Markup Language (YAML) 1.0", because it obviously WASN'T a markup language. Who did they think they were fooling?

https://yaml.org/spec/history/2001-08-01.html

XML and HTML are markup languages. JSON and YAML are not markup languages. So when they finally realized their mistake, they had to retroactively do an about-face and rename it "YAML Ain’t Markup Language". That didn't inspire my confidence or look to me like they did their research and learned the lessons (and definitions) of other previous markup and non-markup languages, to avoid repeating old mistakes.

If YAML is defined by what it Ain't, instead of what it Is, then why is it so specifically obsessed with not being a Markup Language, when there are so many other more terrible kinds of languages it could focus on not being, like YATL Ain't Templating Language or YAPL Ain't Programming Language?

https://en.wikipedia.org/wiki/YAML#History_and_name

>YAML (/ˈjæməl/, rhymes with camel) was first proposed by Clark Evans in 2001, who designed it together with Ingy döt Net and Oren Ben-Kiki. Originally YAML was said to mean Yet Another Markup Language, referencing its purpose as a markup language with the yet another construct, but it was then repurposed as YAML Ain't Markup Language, a recursive acronym, to distinguish its purpose as data-oriented, rather than document markup.

https://en.wikipedia.org/wiki/Markup_language

>In computer text processing, a markup language is a system for annotating a document in a way that is syntactically distinguishable from the text. The idea and terminology evolved from the "marking up" of paper manuscripts (i.e., the revision instructions by editors), which is traditionally written with a red or blue pencil on authors' manuscripts. In digital media, this "blue pencil instruction text" was replaced by tags, which indicate what the parts of the document are, rather than details of how they might be shown on some display. This lets authors avoid formatting every instance of the same kind of thing redundantly (and possibly inconsistently). It also avoids the specification of fonts and dimensions which may not apply to many users (such as those with varying-size displays, impaired vision and screen-reading software).

airencracken · 7 years ago
https://noyaml.com/

YAML is bad.

Every YAML parser is a custom YAML parser.

https://matrix.yaml.io/valid.html

takeda · 7 years ago
The problem is with parsers, how they are implemented or used. YAML actually has a way to specify the type of the data; alternatively, the application is supposed to suggest the desired type. What this is showing is which types are assumed when they are not specified.
traderjane · 7 years ago
Oh Puppet, why did you use your own executable YAML.
felixfbecker · 7 years ago
I'll say it: I think YAML is great and a joy to use for configuration files. I can write it even with the dumbest editor, I can write comments, multi-line strings, I can get autocompletion and validation with JSON schema, I can share and reference other values. It allows tools to have config schemas that read like a natural domain specific language, but you already know the syntax. I haven't had problems with it at all.
martinpw · 7 years ago
This was me too - until yesterday, when I made a minor change to one of our YAML config files and everything broke. On investigation it turned out that all of our YAML files had longstanding errors but those errors happened to be valid syntax and also did not cause any bad side effects, so we had been getting away with it by pure luck until I made a change that happened to expose the problem.

So now no longer a YAML fan...

dragonwriter · 7 years ago
That would make me not a fan of the particular parsers/validators I've been using, rather than not a fan of YAML.

The big strike against YAML I see there is that it needs a good conformance test suite and implementations need to be tested against it. But that's not a problem with the format but a fairly easy to fix ecosystem problem.

meowface · 7 years ago
I agree. As long as you're using a strict parser, I've found YAML to be much nicer for configuration than JSON. I use Python's ruamel.yaml library, and have never had any weird type problems. Once the nesting gets too deep, it can be a pain. but that's the same for JSON.

I have found myself using TOML more and more for configuration, though. It helps a lot with keeping things flat and easy to read. I'll still prefer YAML over JSON for human-writable files, but I'm starting to prefer TOML over YAML.

morpheuskafka · 7 years ago
I've got to say it is the most frustrating config file format ever to write. The only time I have to use it is for Docker Compose and I am constantly fighting vim on indentation and trying to make sense of confusing errors about "unexpected block start." Do you have any suggested vimrc for YAML?
ridiculous_fish · 7 years ago
fish shell is looking for a new text serialization format for its history file (currently it uses an ad-hoc broken pseudo-YAML).

Boxes to check:

1. Self describing format

2. SAX-style parser available to C++

3. Easy for users to understand and ad-hoc parse using command-line tools

4. No document closing necessary, so appending is trivial

YAML looks pretty good:

    - cmd: git checkout file.txt
      when: 1565133286
      pwd: /home/me/dir/
      paths:
      - file.txt

protobuf is also an option:

    entry {
      cmd: "git checkout file.txt"
      when: 1565133286
      paths: "file.txt"
    }

though I am unsure of how well its text serialization is supported.

Any suggestions?

breck · 7 years ago
Disclaimer: I work on Tree Notation. (https://github.com/treenotation/jtree)

Here's a proposal: use a Tree Language.

I created a demo for you called "Fished": https://github.com/breck7/fished.

Took me just a few minutes but already get type check, autocomplete, syntax highlighting, and more.

Tree Notation is early, and there will be kinks until the community is bigger, but I think it may be useful for you.

http://treenotation.org/designer/#grammar%0A%20fishedNode%0A...

dfee · 7 years ago
Wow. This looks really cool. Is there a sort of design defense on how this was designed (tree notation)?

akx · 7 years ago
How about JSONL (JSON Lines)? http://jsonlines.org/

Ps. Thanks for (all the) fish, it's my daily driver shell and keeps me that much more sane c.f. the alternatives.

j0057 · 7 years ago
That's really close to [RFC 7464](https://tools.ietf.org/html/rfc7464), JSON Text Sequences. It uses U+001E RECORD SEPARATOR. The `jq` tool supports those if you pass a flag.
mjevans · 7 years ago
(suggestion) Drop the 4th requirement.

Having to close contexts is a VERY good 'sanity check' to see if something is malformed or not.

If appending is necessary make the parser handle multiple copies of the namespace and merge them upon output. Unknown keys and sections should also always be copied from input to output (this is how you embed comments).

ridiculous_fish · 7 years ago
I'm interested, can you explain more about the merging idea?

To clarify the requirement, history could be a JSON array of objects:

    [
        {"cmd": "git checkout", "when": 1234 },
        {"cmd": "vagrant up", "when": 4567 }
    ]

To append an entry to this file and keep it valid, one must locate the closing square bracket and overwrite it. That work is what I hope to avoid.

scintill76 · 7 years ago
> Unknown keys and sections should also always be copied from input to output (this is how you embed comments).

Better than nothing I guess, but I'd say just use a syntax that supports comments.

Mathnerd314 · 7 years ago
S-Expressions are quite simple, there are some parsers floating around in well-known projects, although I'm not sure they're SAX-style: https://leon.bottou.org/projects/minilisp

I also wonder if you need a text format, or if SQLite or systemd's journal API would work.

bhawks · 7 years ago
I love proto, but the text format was an afterthought. The binary format is rigorously defined, portable, extensible and optimized. The text format was reverse engineered from the C++ implementation after the fact when folks found textproto useful. Unfortunately there are discrepancies between languages around the corner cases of the text format and that's the sad world we live in. Avoid letting textproto be part of your user exposed interface.
theli0nheart · 7 years ago
TOML?
ziotom78 · 7 years ago
TOML would be great, if not for an annoying obscure detail in the specification that makes it hard to use for my typical use cases (scientific computation) [1]. Moreover, I find it quite unintuitive how you are supposed to specify arrays of tables [2]: this kind of thing is much easier in JSON (which is the format I am currently using, although it is far from perfect).

[1] https://github.com/toml-lang/toml/issues/356

[2] https://github.com/toml-lang/toml#user-content-array-of-tabl...

pitaj · 7 years ago
Can't recommend TOML enough. I use it for everything. Super simple and easy to edit.

It fulfills all of the requirements. There are several available C++ TOML parsers, including one from Boost.

nine_k · 7 years ago
TOML is great when your data are mostly flat.
orf · 7 years ago
Obligatory "thanks for fish shell".

Try just using line-delimited JSON objects (http://jsonlines.org/). It ticks all of your boxes, especially 3: "jq '.cmd' fish_history | histogram".

Neither YAML nor Protobufs is quite as easy as that.

All in all it's ridiculously simple, easy to parse in a variety of languages and each row is a single line that's simple to iteratively parse without loading the whole thing into memory.
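
As a rough Python sketch of why this fits the "appending is trivial" requirement: each record is one line, so adding an entry is a plain append and reading is a line-by-line loop. The file name and helper names here are hypothetical, not anything fish actually uses.

```python
import json
import os
import tempfile
from typing import Iterator


def append_entry(path: str, entry: dict) -> None:
    """Append one record as a single line; nothing earlier is rewritten."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")


def read_entries(path: str) -> Iterator[dict]:
    """Yield records one at a time without loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


# Hypothetical history file in a temporary directory.
history_path = os.path.join(tempfile.mkdtemp(), "history.jsonl")
append_entry(history_path, {"cmd": "git checkout file.txt", "when": 1565133286})
append_entry(history_path, {"cmd": "vagrant up", "when": 4567})
commands = [entry["cmd"] for entry in read_entries(history_path)]
```

There is no closing bracket to locate and overwrite, and `json.dumps` escapes newlines inside strings, so a multi-line command still occupies one line of the file.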

keithnz · 7 years ago
this seems like it makes json useful for logging, but not too useful as configuration. For instance, it doesn't support commenting, and it seems like every line needs to have all its children compressed onto one line?
kevin_thibedeau · 7 years ago
Tcl with control structure commands disabled and infix assignment for convenience. Jim Tcl is a lightweight implementation if the main line isn't workable.
pjc50 · 7 years ago
I'm tempted to suggest CSV.
fabian2k · 7 years ago
I've used YAML as the format for a config file, and I certainly regret that choice. Trying to explain to someone that doesn't know YAML how to edit it without setting them up for failure is quite annoying. There are too many non-obvious ways to screw up, like forgetting the space after the colon or of course bad indentation.
bastardoperator · 7 years ago
YAML is easier to read and write. That's the benefit. It's also always going to be smaller than the equivalent JSON or XML. Maybe it's not as correct, maybe some people don't like it, I don't really mind it. I don't see it really going anywhere soon either considering Kubernetes and the lack of alternatives in widespread usage.

I've never had someone that needed extensive help understanding YAML and that's besides reviewing work for people just coming up to speed. Find me an IDE or editor that doesn't have YAML support. Also, YAML supports comments so if you have pitfalls people need to know about you can document them inline.

Your argument is people who don't know things might screw stuff up. Well Yeah! This applies to everything.

jtdev · 7 years ago
>”YAML is easier to read and write.”

You may be surprised to find that there’s significant disagreement on that point.

booleandilemma · 7 years ago
Giving meaning to whitespace causes so many headaches and yet people still embrace Python, for some reason. I don’t understand it.
whatshisface · 7 years ago
Your editor makes a world of difference here. Since you shouldn't be writing brace-language code without indents anyways, the biggest issue remaining is mixing tabs and spaces. Gedit makes this a big pain with its default config (it doesn't even auto-indent) but Atom and IDLE handle it well.
codetrotter · 7 years ago
Admittedly I’ve been writing code in Python for many years now but, even from the start I never had a problem with the significance of whitespace.

Quite the opposite.

I like to format my code nicely anyways (or rather, mostly my editor does it for me because I’ve asked it to do so).

I indent with two spaces usually, regardless of language. And have my editors configured to insert two spaces when I press tab.

JavaScript, Rust, Python, C. Same difference, in terms of how I use whitespace.

ben509 · 7 years ago
The main headaches are due to people either wanting to copy and paste code from various sites, or wanting to write really deeply nested code.

If you're writing well structured, original code in Python, it's generally cleaner and easier than other languages because the syntax avoids ambiguities that other languages have.

DangitBobby · 7 years ago
The difference in my experience is that once you know what's wrong with your whitespaces in Python, you're out of the woods. The interpreter is your friend from that point onward. YAML parsers, on the other hand, give you these really strange errors that are pretty difficult to understand, and it doesn't end with whitespaces.
h1d · 7 years ago
There are quite a few comments saying they don't like python even from 10+ year users.

A language becomes popular largely through the library ecosystem and resources around it, not just how the language looks. I think Google embracing it played a good role in acquiring mindshare.

https://news.ycombinator.com/item?id=20672051

h1d · 7 years ago
YAML is so bad for human writing. Every time I write ansible tasks, I get confused with indentation and how to do arrays etc. JSON and YAML are frankly a generation behind compared to TOML or JSON5.
ljm · 7 years ago
I’m not keen on how so many tools and services opt for YAML by default, either. Both JSON and YAML are a nightmare to handle once you’ve got 3000 line files and several layers of nesting.

CI would be a lot nicer to use if it didn’t rely on a single YAML file to work. And if you want to switch, suddenly you have a build step to convert back to YAML.

js2 · 7 years ago
I keep my YAML CI files as minimal as possible by putting the logic into a Makefile and/or shell scripts and just have the YAML invoke that.
al_form2000 · 7 years ago
As an ansible user, I hate YAML and its broken parsers with a passion, but the security objection does not make much sense. It does apply verbatim to any parser of anything if the implementation decides that a given label means "eval this content right away". I fail to see how this can be a fault of the DDL rather than the parser's.
JelteF · 7 years ago
The reason this is a fault of the DDL and not the parser is that the DDL spec decides that it has a label that evaluates a command. The parser then has two options: either implement it or not conform to the spec (and essentially implement a different DDL). For programming languages it makes sense to have an eval label/command. For configuration/serialization DDLs I think it's a terrible choice.
al_form2000 · 7 years ago
And terrible it is indeed, but I cannot find it specified - the strings eval, exec, command, statement do not even occur in the official specs (shallow doc perusal, I know)
SignalsFromBob · 7 years ago
> DDL spec decides that it has label that evaluates a command

This is simply wrong. There is nothing in the spec stating that.

SignalsFromBob · 7 years ago
> As an ansible user, I hate YAML and its broken parsers with a passion

Could you elaborate on this? I use Ansible daily and I've never had a problem with YAML once I took some time to understand it. What do you mean by broken parsers? I'm assuming that's something Ansible specific you are referring to.

al_form2000 · 7 years ago
I intensely dislike yaml's whitespace-based syntax because whitespace is white, and it gives very little visual context, especially in long, nested documents. Editors that expand/collapse branches do help some, but are no match for highlighting matching pairs of braces in other saner formats/languages (I am also not a fan of syntactic whitespace in python, if you get my drift.)

And ansible's parser is broken, in more ways than I can remember (haven't been writing playbooks and stuff for a couple of months now). If you like pointless pain, try embedding ":" in task names for a demo (or one of several other "meta" characters: the colon is just the one that tends to recur most).

I will give a passing mention to the smug, vague error message "You have an error at position (somewhere in the middle of the file) It seems that you are missing... (something) we may be wrong (they almost always are) but it appears it begins in position (some position close to the first line)" that sets off a hunt for the missing brace/colon/space/whatever and makes me want to do stuff to the person who devised it.

This compounds with the confusion brought on by the weaving of yaml's and jinja2's syntaxes and ansible's own flakiness in deciding what is evaluated when, which decides when and if a variable does indeed change, and when yes means "yes" rather than true or 1, but not '1' or "true" (try prompting the user for a boolean variable, and find yourself writing if ( (switch == "true") or (switch == "1")) in short order).
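
For what it's worth, the normalization one ends up writing looks roughly like this in plain Python. This is not an Ansible API; the function name and the accepted spellings are assumptions chosen for the sketch.

```python
TRUE_WORDS = {"1", "true", "yes", "on", "y"}
FALSE_WORDS = {"0", "false", "no", "off", "n"}


def parse_bool(value) -> bool:
    """Normalize a prompt answer or config value to a real boolean."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    # Refuse to guess: an unknown spelling is an error, not silently false.
    raise ValueError(f"not a recognizable boolean: {value!r}")
```

Doing the conversion once, at the point where the value enters the system, avoids scattering `== "true"` and `== "1"` comparisons everywhere.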

Pity that ansible is so damn convenient, or I would have ditched it a long time ago for anything - bash included (OK, maybe not bash).

apple4ever · 7 years ago
Funny, as an Ansible user I love YAML. It works so well for me.
crazygringo · 7 years ago
So what's the HN consensus on the best format for config files?

Is it TOML as the author seems to prefer at the end?

philwelch · 7 years ago
My vote is yes. Most configuration doesn’t need anything more sophisticated than key-value pairs, perhaps with namespaces. INI can manage that and TOML is basically a better-specified INI.
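
As an illustration of "key-value pairs, perhaps with namespaces", here is a minimal sketch using Python's standard `configparser`. The section and key names are made up. Note that INI values arrive as strings and must be converted explicitly, which is one of the gaps a typed format like TOML fills.

```python
import configparser

# Hypothetical settings; section names act as namespaces.
INI_TEXT = """
[server]
host = example.internal
port = 8080

[logging]
level = info
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

host = parser["server"]["host"]
port = parser.getint("server", "port")  # INI values are strings until converted
level = parser.get("logging", "level")
```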
hardwaresofton · 7 years ago
I can't tell if I've spent too much time on HN or if I came to this conclusion on my own, but TOML is my language of choice for configuration now. It's flexible in the right ways and sectioning of config is so important.
h1d · 7 years ago
No reason to use INI over TOML. INI doesn't even have a standard specification.
0xbadcafebee · 7 years ago
.ini, followed by TOML, followed by an identical implementation of some other app's config format.

The biggest problem with config formats is they mislead users into thinking they understand the format. The user tries to edit it by hand, and chaos ensues. So only formats that are stupidly simple, or whose warts are already familiar and well documented, are good choices.

Apache had a great configuration format. Nothing else used it (that I knew of) but you could in theory implement "Apache configs" and then people'd just have to look up how to write those, which there's lots of examples of.

JSON and YAML and XML are data formats; they should only be written by machines, and read by humans. Same with protocols like HTTP, Telnet, FTP... You're not supposed to write it yourself, but it's readable to make troubleshooting easier.

Data formats are nice for expressing nested data structures, but then they don't (usually) support logical expressions; at that point you need a template/macro/programming language, and at that point you're writing code, which will need to be tested, and at that point you should just write modules and use a config format to give them arguments. Every complex tool goes through the same evolution.

If you care about your users, write a tool to generate configs based on a wizard. Good CLI tools do this, and it really makes life better. (It's also a great way to document all your config features in code, and test them)

majewsky · 7 years ago
If possible, prefer what tools in your vicinity use. My team uses Kubernetes and Concourse extensively, which both use YAML, so I tend to stick with YAML since people are already familiar with it.

(More recently, I've come around to prefer plain environment variables for configuration, but that only works nicely when the amount of configuration is fairly limited, say 20 values instead of 1000 values.)
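
A minimal sketch of the environment-variable approach in Python, assuming a handful of hypothetical `APP_*` variables (the names and defaults are invented for the example). With a few dozen values this stays readable; with a thousand it would not.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    debug: bool


def load_settings(env=os.environ) -> Settings:
    """Read a handful of (hypothetical) APP_* variables with defaults."""
    return Settings(
        host=env.get("APP_HOST", "localhost"),
        port=int(env.get("APP_PORT", "8080")),
        debug=env.get("APP_DEBUG", "false").lower() in {"1", "true", "yes"},
    )
```

Passing the mapping in as a parameter keeps the function easy to test, for example `load_settings({"APP_PORT": "9000"})`.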

For my own use, I do prefer TOML.

kissgyorgy · 7 years ago
Perfect comment! I agree 100%.
mcphage · 7 years ago
In the scale world, HOCON is very nice. It’s a format designed explicitly for config files, and has a lot of niceties (like you can append files together and they merge correctly, so you don’t have to end up with giant config files)
twic · 7 years ago
What's the "scale" world?
SeanA208 · 7 years ago
I agree with HOCON being nice based on personal usage but I haven't seen an in depth analysis of it. This is the canonical parser for JVM based languages — https://github.com/lightbend/config, are there many other implementations that are widely used?
rusk · 7 years ago
I think it's horses for courses. JSON I guess is the best for interchange, i.e. machine to machine, but I never want to edit it by hand; XML is relatively easy to read but can be quite painful to edit raw, though it can be quite easy to develop a structured editor for it. I’d favour it for document persistence. YAML is fine for configuration files but I would be careful about how I apply it and would always provide it as a heavily documented templated config file. YAML when used correctly is by far the easiest to edit in the clear, with a plain text editor. With that said, I would try to get away with basic namespaced properties files first before I’d go that far ...
geerlingguy · 7 years ago
ini if needs are crazy simple, YAML if you need a structure like JSON's and a human will ever need to interact with it. JSON if humans aren't in the loop.

TOML, in my opinion, is like a weird mishmash of JSON, ini, and bashisms. Though I have worked with it a lot less than the other formats, so YMMV.

crdoconnor · 7 years ago
The main issue I had with TOML is how much more syntactically noisy it is. Equivalent files with 2-3 levels of nesting usually become at least 50% longer than equivalent YAML.

More here : https://hitchdev.com/strictyaml/why-not/toml/

meowface · 7 years ago
This is a different use case, I think. This example is defining content, not configuration. In this case the content is user stories. I agree for creating sequences of documents/content in this way, YAML often is nicer. But for configuration, TOML is designed to specify it in a simple and flat way, and that can be very helpful.

I have some projects where I'm frequently writing and modifying content that resembles the example here, and I use YAML there and plan to keep using YAML. For most other things, I'm just doing configuration, so I use TOML. No reason you need to stick to one or the other.

monocularvision · 7 years ago
Been working more with Dhall and have really enjoyed it so far.

https://dhall-lang.org/

twic · 7 years ago
Putting commas at the start of the line is the toe shoes of syntax.
bmurphy1976 · 7 years ago
There's no white knight here, they all suck in some way. Personally I've had decent success with YAML as simple configuration, but I would never use it as an interchange format. If you know its caveats and you're targeting one language so you can become familiar with the parser, it's serviceable.
markmark · 7 years ago
I say just use JSON. Everyone knows it already and it's good enough. Use a parser in your app that allows comments and trailing commas like vscode does.
dagenix · 7 years ago
That's not JSON anymore, that's some custom format that's JSON inspired.
h1d · 7 years ago
It's called JSON5. Don't bend JSON to confuse parsers.

https://json5.org/

jaredklewis · 7 years ago
I dislike json with comments or trailing commas as even if your parser can handle them, it surprises many text editors.

Aside from lack of comments, the other major thing that can sometimes make JSON a bad config choice is lack of multi-line strings.

sigzero · 7 years ago
JSON is for data. Not documents. Not config files. I don't agree with any "add this to JSON" comments. It's fine just as it is....for data.

edoceo · 7 years ago
I'm still using INI files like it's 1999.
h1d · 7 years ago
Everything being a string is a bit of a joke now.
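
To illustrate the "everything is a string" point, here is a minimal sketch using Python's standard `configparser`. The section and key names are made up for the example. The parser hands back raw strings, and any typing has to be requested by the caller.

```python
import configparser

# Hypothetical INI content, invented for illustration.
INI_SOURCE = """
[server]
port = 8080
debug = false
"""

parser = configparser.ConfigParser()
parser.read_string(INI_SOURCE)

# Plain lookups always return str, whatever the value looks like.
raw_port = parser["server"]["port"]

# Types only exist if the reading code asks for them explicitly.
port = parser.getint("server", "port")
debug = parser.getboolean("server", "debug")

print(type(raw_port).__name__, type(port).__name__, type(debug).__name__)
```

So the schema effectively lives in the application code rather than in the file, which is the trade-off being poked at here.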
chucky_z · 7 years ago
I use JSON in the end. I prefer to write TOML, then parse that into JSON. This seems to strike a nice balance between human/machine write/read. It's simple enough to reason about TOML, even if it gets verbose. If I have to write YAML, after 2 layers I usually write it as JSON and include the JSON in the 2nd level of key.
falcolas · 7 years ago
It's telling that the responses to this question are broad and varied. Still not a well solved problem, it seems.
bobbyi_settv · 7 years ago
It's also telling that even with every other possible answer being given by someone, there's still no one who wants XML.
wodenokoto · 7 years ago
My cursory survey of config / serialisation formats concluded that nothing is close to being good.

It's overly verbose and hard-to-understand XML, it's no-comments JSON, the horrors of YAML, or some okay format that doesn't have parsers for the languages (plural) you are using on your project.

wdroz · 7 years ago
For an in-house, Python-only project, my go-to is to create a "config.py". Then I declare a bunch of module variables that can be overridden by environment variables as a bonus.
davedx · 7 years ago
J S O N all the way.

apple4ever · 7 years ago
YAML for me, then TOML