It's a case of someone designs a tool to do A, then a lot of people who have problem B see it and go, that's almost what I want, with just a few adaptations.
JSON in its original form is fine if you want to pass data from a server to a browser's fetch request. No need for comments.
(If you're passing data from one process to another, and neither of the processes is a browser, have you considered protobuf?)
For humans writing config files, such as the VS Code one where you sometimes need to go in and do things by hand, both the ability to temporarily comment stuff out and the ability to put comments above particular settings are extremely useful, if not essential. The "comment: '...'" approach doesn't work here or anywhere else where you have a schema to validate against.
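To make the schema point concrete, here is a minimal sketch in Python. The key names and the `validate` helper are made up for illustration and are not VS Code's actual schema or validator; the only assumption is that the schema is strict about unknown keys.

```python
import json

# Hypothetical schema: the only keys this config is allowed to contain.
ALLOWED_KEYS = {"editor.fontSize": int, "editor.wordWrap": str}


def validate(config):
    """Return a list of problems; an empty list means the config is valid."""
    problems = []
    for key, value in config.items():
        expected = ALLOWED_KEYS.get(key)
        if expected is None:
            # A strict schema has no slot for an ad hoc "comment" key.
            problems.append(f"unknown key: {key}")
        elif not isinstance(value, expected):
            problems.append(f"{key}: expected {expected.__name__}")
    return problems


text = '{"comment": "bumped for the projector", "editor.fontSize": 18}'
print(validate(json.loads(text)))  # ['unknown key: comment']
```

A looser schema could whitelist a "comment" key, but then every nested object needs the same exception.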
Comments are not part of how JSON was originally defined. It's technically correct (the best kind of correct, some say) that comments are not part of the spec. But comments are sure how most people who edit JSON as a human [want to] use it. If the spec says A but everyone is doing B because that's what best solves their problem, then either you write a new spec and call it JSON++ or JSON-C or something and watch most people switch to that, or you adapt the spec in the first place.
We've seen this with markdown (ok, commonmark) and HTML already, among other things.
+1 for using protobufs. I'll go a step further and suggest using textprotos for configuration files. You get validation out of the box for free (making it hard to mis-type your keys), along with a strict contract for your configuration. Comments too. And no quotes on key names. I don't think trailing commas are supported in condensed repeated fields, but since there aren't commas between fields you can use the long format to avoid them.
Not only does it support comments and trailing commas, it lets you avoid commas altogether (EDN simply ignores them). Commas indeed are just redundant and incur visual noise most of the time.
"Commas indeed are just redundant and incur visual noise most of the time."
I would argue semicolons are, and commas only sometimes. Personally I prefer [1, 2, 3] much more over [1 2 3]. With long numbers and dots and strings and whatever, the separator is helpful for me.
> If the comment is important enough to be put in the text and remembered, then make it an explicit part of the data structure. That will also make it easier for tools to find it and process it in a useful way. {"comment": ...}
> Use a preprocessor tool such as jsmin to remove the commentary before passing it to a JSON parser.
I'm not super invested in this debate one way or the other but arguing a comment should be part of your data rather than documenting what data is contained is a bizarre take on the question. Either be of the opinion it's a data format and comments are just out of scope (and using JSON for e.g. config is just a bad choice), or agree that allowing comments in JSON should be part of the spec since right or wrong it is used for config and other human edited use cases. JSON5's adoption seems to imply people are mostly on the latter side of the fence.
Isn't this the problem that JSON5 (and probably other similar projects) is supposed to solve?
Both JSON (as defined in the RFC) and JSON5 have a nice property of being well-defined, meaning that you can use different libraries in different languages on different platforms to parse them, and expect the same result. "JSON but parser behaves reasonably (as defined by the speaker)" does not have this property.
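As one concrete case of a parser "behaving reasonably" outside the spec: Python's standard `json` module accepts `NaN` by default, which RFC 8259 does not allow. The sketch below shows that, plus one way to opt out using the `parse_constant` hook. The names `strict_loads` and `reject_constant` are mine, not part of the library.

```python
import json
import math

# Python's json module accepts NaN by default, although RFC 8259 does not allow it.
lenient = json.loads('{"x": NaN}')
print(math.isnan(lenient["x"]))  # True


def reject_constant(name):
    # json.loads calls this hook for the literals NaN, Infinity and -Infinity.
    raise ValueError(f"not valid JSON: {name}")


def strict_loads(text):
    """Parse text, refusing the non-standard numeric constants."""
    return json.loads(text, parse_constant=reject_constant)


try:
    strict_loads('{"x": NaN}')
except ValueError as err:
    print(err)  # not valid JSON: NaN
```

A parser in another language may reject the first document outright, which is exactly the interop gap being described.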
JSON5 would be ok if that's all it did. They added so much additional unnecessary complication that it undermines the simplicity of JSON that makes it good.
Nothing will probably ever top Markdown in my mind for bullshit specifications.
And Gruber wouldn’t give Jeff Atwood permission to call his variant <something> Markdown, or it seems anybody else, so we ended up with CommonMark, and GFM.
JSON5 is good for JSON at rest, as others have mentioned already.
But they're not two different formats—they're two different jobs being done by the same format.
JSON as currently spec'd is honestly quite bad at both jobs, but the most rational defense of its use as a data format is that it's (mostly) human readable. Given that that's its main value proposition, what exactly is the reason for saying that JSON-as-data-format should not have comments? What do we lose if we allow them?
> Given that that's its main value proposition, what exactly is the reason for saying that JSON-as-data-format should not have comments? What do we lose if we allow them?
Because JSON originally did have comments, and people were putting pragmas into them, and so different parsers would act differently depending on whether they understood them or not. Comments ended up being an anti-feature in JSON because people were abusing them.
Source:
> I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't. […]
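A toy illustration of that interoperability worry, in Python. The `@scale` directive, the document, and both parser functions are invented for this sketch; they do not come from any real JSON parser. The point is only that the same text yields different data depending on whether the reader honors the comment.

```python
import json
import re

# Hypothetical document: a comment carries a directive that some parsers honor.
DOC = '/* @scale: 10 */ {"width": 3}'

# Naive block-comment matcher; good enough here because no string contains "/*".
COMMENT = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def parse_ignoring_comments(text):
    """Parser A: treats comments as noise and drops them."""
    return json.loads(COMMENT.sub("", text))


def parse_honoring_directives(text):
    """Parser B: reads the invented '@scale' directive out of the comment."""
    data = parse_ignoring_comments(text)
    match = re.search(r"@scale:\s*(\d+)", text)
    if match:
        data["width"] *= int(match.group(1))
    return data


print(parse_ignoring_comments(DOC))    # {'width': 3}
print(parse_honoring_directives(DOC))  # {'width': 30}
```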
But there are dangers there - look at how horribly comments get abused in code:
* doctests are nonsense, just write tests. (doctests like Rust's that just validate example snippets are the closest thing to good I've seen so far, but still make me nervous).
* load bearing comments that code mangling/generation tools rely on (see a whole bunch of generated scripts in your linux system - DO NOT EDIT BELOW THIS LINE)
* things like modelines in editors that affect how programs interact with the code
* things like html or xml comments that on parsing affect end user program logic.
Comments can be abused, and in something like JSON on the wire I can see systems which take additional info from the comments as part of the primary data input. Often a completely different format... and you end up with something like the front-matter on your markdown files as found in static site generators.
Point being, comments are not a purely benign addition.
The problem is using JSON as a file format in the first place. It’s not designed for humans to edit. (Then again, it’s better than the Norway-sceptic YAML.)
I disagree. At least in an ought vs is sense: it's entirely the kind of format that I would create as an editable format. As witnessed by the fact that my workmates and I did create very nearly JSON previously as a file format in the 90s (but for C code programs)
Can I confirm that the reason comments aren't wanted in data formats is that those formats are meant to be machine-read only, and so should be as efficient as possible and not contain information that won't be used?
Seeing as I can only see the use case as a file format to be read/written by humans in the loop, then maybe the conversation should be about compiling the file format to a data format for compatibility outside of the user tooling.
The argument is that comments are often used as an escape hatch from specified formats to carry further instructions. So you've got a properly specified format and then want to add vendor extensions without breaking other implementations ... just make your extensions a comment. Then other parsers ignore it and you can do your thing.
The idea is that this forces better formats.
How well this works? Well, then I got an "x-comment" property or non-standard comments. Nonetheless. If people see the need to hack some extension in, they'll find a way.
I think in the JSON case it's because you can't have true comments: any comments are intrinsically part of the data structure, and you invite problems by including irrelevant information.
JSON is awful for writing manually because it requires typing too many quotes, commas etc. I think JSON is meant to be machine-generated and machine-read and therefore doesn't need any comments.
Having a mess where JSON parsers sometimes do, and sometimes don't allow comments is a bad outcome.
JSON5 is advantageous because it's explicitly separate - it has a different file extension, it needs different libraries. Also it supports a few other utility features people want (like trailing commas) without bringing in the whole dangerous kitchen sink like YAML does.
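For reference, this is what "explicitly separate" means for a strict reader. The sketch below uses only Python's standard `json` module, which rejects both trailing commas and comments; the `accepts` helper and the sample strings are mine.

```python
import json


def accepts(text):
    """Return True if Python's standard json module parses the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


samples = {
    "plain JSON": '{"a": [1, 2]}',
    "trailing comma": '{"a": [1, 2,]}',
    "line comment": '{"a": 1} // note',
}

for label, text in samples.items():
    print(label, "->", "accepted" if accepts(text) else "rejected")
# plain JSON -> accepted
# trailing comma -> rejected
# line comment -> rejected
```

So a file that relies on either feature needs a different parser, and a different extension makes that visible up front.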
But if you already decide to forgo the benefits of JSON (mainly the ecosystem and widespread support) and use a niche solution, why would you go for JSON5?
That looks like a worst-of-both-worlds solution IMHO.
You are probably better off with a TOML -> JSON mapping, and would be better off with YAML too if YAML didn't have really stupid ambiguity pitfalls.
You don't lose all the benefits. Ecosystem often isn't a problem (I don't need high interop, I need this JSON5 file for one specific purpose), and you certainly keep these:
- JS syntax compatibility (low cognitive overhead)
- Decent balance between machine-readable and human-readable
- High familiarity for developers (if they know JSON, which they likely do, they can work with JSON5 with near-0 learning curve)
Plus, JSON5 support is _somewhat_ widespread (maybe ~50% of tools I use support json5 for config?)
I need to send it to the author, but ASP.NET Core’s JSON configuration parser also does json + comments by default. Confuses the hell out of my colleagues and VSCode when I do it though :D
https://json5.org/
https://protobuf.dev/reference/protobuf/textformat-spec/
The EDN spec: https://github.com/edn-format/edn/blob/master/README.md
Examples on GitHub (261k files found at the moment): https://github.com/search?q=path%3A*.edn+&type=code
GitHub search only shows the first 5 pages of results. To extract more results, split the search using more specific path qualifiers.
https://layer8.space/@douglascrockford/113595316189101091
> If the comment is important enough to be put in the text and remembered, then make it an explicit part of the data structure. That will also make it easier for tools to find it and process it in a useful way. {"comment": ...}
> Use a preprocessor tool such as jsmin to remove the commentary before passing it to a JSON parser.
I kind of wish JSON had trailing commas though
either it's a data transfer format and comments should be stripped before transmission
or the comments are part of the data model
makes you have to think clearly whether the comments are just "notes to self" at authoring time, or something relevant to the consumer
OTOH of course there are plenty of 'greyer' cases
https://json5.org/
"Despite the clarifications they bring, RFC 7159 and 8259 contain several approximations and leaves many details loosely specified."
Source for the Crockford quote above ("I removed comments from JSON ..."): https://web.archive.org/web/20190112173904/https://plus.goog...
I would call out portability instead, which is not dependent on the byte ordering or endianness issues of binary data formats.
sort of like: javascript is portable code, json is portable data.
That said, I don't like it as a config file read/written by humans.
Why did they bother making it text-only ASCII then?
1. comments are metadata (specifically Human/LLM-readable metadata vs machine-readable metadata)
2. general-purpose data formats should support metadata
What no-comments saved us from was stuff like this in our data interchange:
And who knows what deeper layers of hell we avoided.
Frankly, VSCode shows that all this time people were complaining about no comments in JSON config and how hard it was to write config in JSON, they could have just written their apps to strip comments at read time.
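A minimal sketch of "strip comments at read time", in Python. This is not how VS Code does it; `strip_comments` is a hypothetical helper, it assumes `//` and `/* */` comment syntax, and it does nothing about trailing commas. It tracks string literals so that a `//` inside a URL survives.

```python
import json


def strip_comments(text):
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so an escaped quote does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Line comment: skip to the end of the line, keep the newline itself.
            while i < n and text[i] != "\n":
                i += 1
            continue
        elif text.startswith("/*", i):
            # Block comment: skip past the closing marker (or to the end if unterminated).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


config_text = """{
  // local override
  "url": "http://example.com/a", /* keep the slashes in the string */
  "retries": 3
}"""
print(json.loads(strip_comments(config_text)))
# {'url': 'http://example.com/a', 'retries': 3}
```

The result is plain JSON, so any standard parser can take it from there.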
So we do have the best of both worlds.
Files shouldn't be labeled as compliant with a standard and then not be. Full stop.
The standard that is JSON does not support comments. Don't call something JSON that isn't.
Use one of the many existing JSON extensions or create a new standard. DO NOT, however, just add ad hoc crap like this suggests; that's a road to hell.
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/c...