JSON parsers that can accept comments

It's a case of someone designs a tool to do A, then a lot of people who have problem B see it and go, that's almost what I want, with just a few adaptations.

JSON in its original form is fine if you want to pass data from a server to a browser's fetch request. No need for comments.

(If you're passing data from one process to another, and neither of the processes is a browser, have you considered protobuf?)

For humans writing config files, such as the VS Code one where you sometimes need to go in and do things by hand, both the ability to temporarily comment stuff out and the ability to put comments above particular settings are extremely useful, if not essential. The "comment: '...'" approach doesn't work here or anywhere else where you have a schema to validate against.
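To illustrate why a "comment" key clashes with schema validation, here is a minimal sketch. It assumes a strict validator that rejects unknown keys; the `SCHEMA` table, its setting names, and the `validate` function are hypothetical and not taken from any real tool.

```python
import json

# Hypothetical schema: every allowed key and the type its value must have.
SCHEMA = {
    "editor.tabSize": int,
    "editor.wordWrap": str,
}


def validate(settings: dict) -> list:
    """Return a list of error messages; an empty list means the config is valid."""
    errors = []
    for key, value in settings.items():
        expected = SCHEMA.get(key)
        if expected is None:
            # A strict schema has no slot for free-form keys such as "comment".
            errors.append(f"unknown key: {key}")
        elif not isinstance(value, expected):
            errors.append(f"wrong type for {key}")
    return errors


# The comment-as-data workaround trips the validator.
config = json.loads('{"comment": "wrap long lines", "editor.tabSize": 4}')
errors = validate(config)
print(errors)
```

The schema could be loosened to tolerate a "comment" key, but then every schema has to remember to do so, and you still can't comment out a setting temporarily.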

Comments are not part of how JSON was originally defined. It's technically correct (the best kind of correct, some say) that comments are not part of the spec. But comments are sure how most people who edit JSON as a human [want to] use it. If the spec says A but everyone is doing B because that's what best solves their problem, then either you write a new spec and call it JSON++ or JSON-C or something and watch most people switch to that, or you adapt the spec in the first place.

We've seen this with markdown (ok, commonmark) and HTML already, among other things.

michaelmior · a year ago

If you want JSON with comments (and a couple other extras), use JSON 5 instead. (It's effectively the JSON++/JSON-C you mention.)

https://json5.org/

shaftway · a year ago

+1 for using protobufs. I'll go a step further and suggest using textprotos for configuration files. You get validation out of the box for free (making it hard to mis-type your keys), along with a strict contract for your configuration. Comments too. And no quotes on key names. I don't think trailing commas are supported in condensed repeated fields, but since there aren't commas between fields you can use the long format to avoid them.

https://protobuf.dev/reference/protobuf/textformat-spec/

JSON as a data-format should not have comments. JSON as a file-format should allow comments. The problem is this conflation between the two.

lolinder · a year ago

But they're not two different formats—they're two different jobs being done by the same format.

JSON as currently spec'd is honestly quite bad at both jobs, but the most rational defense of its use as a data format is that it's (mostly) human readable. Given that that's its main value proposition, what exactly is the reason for saying that JSON-as-data-format should not have comments? What do we lose if we allow them?

throw0101d · a year ago

> Given that that's its main value proposition, what exactly is the reason for saying that JSON-as-data-format should not have comments? What do we lose if we allow them?

Because JSON originally did have comments, and people were putting pragmas into them, and so different parsers would act different depending on whether they understood them or not. Comments ended up being an anti-feature in JSON because people were abusing them.

Source:

> I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't. […]

* https://web.archive.org/web/20190112173904/https://plus.goog...

m463 · a year ago

> but the most rational defense of its use as a data format is that it's (mostly) human readable

I would call out portability instead, which is not dependent on the byte ordering or endianness issues of binary data formats.

sort of like: javascript is portable code, json is portable data.

sophacles · a year ago

I think json should allow comments.

But there are dangers there - look at how horribly comments get abused in code:

* doctests are nonsense, just write tests. (doctests like Rust's that just validate example snippets are the closest thing to good I've seen so far, but still make me nervous).

* load-bearing comments that code mangling/generation tools rely on (see a whole bunch of generated scripts in your Linux system - DO NOT EDIT BELOW THIS LINE)

* things like modelines in editors that affect how programs interact with the code

* things like html or xml comments that on parsing affect end user program logic.

Comments can be abused, and in something like JSON on the wire I can see systems which take additional info from the comments as part of the primary data input. Often a completely different format... and you end up with something like the front-matter on your markdown files as found in static site generators.

Point being, comments are not a purely benign addition.

Kwpolska · a year ago

The problem is using JSON as a file format in the first place. It’s not designed for humans to edit. (Then again, it’s better than the Norway-sceptic YAML.)

peterashford · a year ago

I disagree. At least in an ought vs is sense: it's entirely the kind of format that I would create as an editable format. As witnessed by the fact that my workmates and I did create very nearly JSON previously as a file format in the 90s (but for C code programs)

Jare · a year ago

What example(s) of file format would you say are designed for humans to edit and still represent the kind of structured contents that json does?

Cthulhu_ · a year ago

But it happens. 'npm install' will edit your json file, but so can I.

That said, I don't like it as a config file read/written by humans.

martin-adams · a year ago

Can I confirm that the reason it's not preferred to have comments in data formats is that they're meant to be machine-read only, and as such should be as efficient as possible and not contain information that won't be used?

Seeing as I can only see the use case as a file format to be read/written by humans in the loop, then maybe the conversation should be about compiling the file format to a data format for compatibility outside of the user tooling.

johannes1234321 · a year ago

The argument is that comments are often used as an escape hatch from specified formats to carry further instructions. So you've got a properly specified format and then want to add vendor extensions without breaking other implementations ... just make your extensions a comment. Then other parsers ignore it and you can do your thing.

The idea is that this forces better formats.

How well this works? Well, then I got an "x-comment" property or non-standard comments. Nonetheless. If people see the need to hack some extension in, they'll find a way.

ur-whale · a year ago

> is because it's to be machine read only

Why did they bother making it text-only ASCII then ?

burnished · a year ago

I think in the JSON case it's because you can't have true comments: any comments are intrinsically part of the data structure, and you invite problems by including irrelevant information.

nivertech · a year ago

Thinking from the first principles:

1. comments are metadata (specifically Human/LLM-readable metadata vs machine-readable metadata)

2. general-purpose data formats should support metadata

jimmaswell · a year ago

Disallow comments and now you just have {"comment": "the quick brown fox.."}, the worst of both worlds.

hombre_fatal · a year ago

That's a harmless example and a tiny price to pay.

What no-comments saved us from was stuff like this in our data interchange:

    {  
        "count": 123 // bigint
        "price": 10.99 // @precision=2
        "date": "2024-08-12" // @format=YY-MM-dd
        "data": /* !transform(rot13) */ "uryyb" 
        "storage": 5 // Unit(TB)
    }

And who knows what deeper layers of hell we avoided.

Frankly, VSCode shows that all this time people were complaining about no comments in JSON config and how hard it was to write config in JSON, they could have just written their apps to strip comments at read time.

So we do have the best of both worlds.
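A minimal sketch of the "strip comments at read time" idea, assuming only `//` and `/* */` comments need handling. `strip_json_comments` and `load_config` are hypothetical names, and this is not how VS Code itself does it.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character too, so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip up to (not including) the newline.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Block comment: skip past the closing */, or to the end if unterminated.
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
            out.append(" ")  # keep neighbouring tokens separated
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_config(text: str):
    """Parse JSON-with-comments by stripping comments, then using the strict parser."""
    return json.loads(strip_json_comments(text))


source = """{
  // editor settings
  "url": "http://example.com", /* slashes inside strings are kept */
  "tabSize": 4
}"""
config = load_config(source)
print(config)
```

The comments are discarded before the real parser ever sees them, so nothing in them can act as a directive. Trailing commas are not handled here.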

codedokode · a year ago

> JSON as a file-format should allow comments.

JSON is awful for writing manually because it requires typing too many quotes, commas etc. I think JSON is meant to be machine-generated and machine-read and therefore doesn't need any comments.

leptons · a year ago

You're a programmer and you're against writing quotes and commas? You must really hate coding. I've never found JSON to be too much typing.

xnorswap · a year ago

If you're entirely machine writing and reading, but still want to be human-legible, then XML does a much better job while also allowing for schema.

pjc50 · a year ago

The only reason it became popular is that conflation!

Someone · a year ago

I think it’s more “it would be nice if JSON intended to be read or written by humans allowed comments”.

- JS syntax compatibility (low cognitive overhead)
- Decent balance between machine-readable and human-readable
- High familiarity for developers (if they know JSON, which they likely do, they can work with JSON5 with near-0 learning curve)

red_admiral · a year ago

avodonosov · a year ago

Just use EDN.

Not only does it support comments and trailing commas, it lets you avoid commas altogether (they are simply ignored by EDN). Commas indeed are just redundant and incur visual noise most of the time.

    [1, 2, 3]
    [1 2 3]       ; much better

The spec: https://github.com/edn-format/edn/blob/master/README.md

Examples at github (261k files found at the moment) : https://github.com/search?q=path%3A*.edn+&type=code

GitHub search only shows the first 5 pages of results. To extract more results, split the search with more specific path qualifiers:

    path:*a.edn
    path:*b.edn
    path:*c.edn

lukan · a year ago

"Commas indeed are just redundant and incur visual noise most of the time."

I would argue semicolons are and only sometimes commas. Personally I prefer [1, 2, 3] much more over [1 2 3]. With long numbers and dots and strings and whatever, the separator is helpful for me.

If and when commas improve readability for you, feel free to use them in EDN.

After adopting EDN, don't fall into the deceptive feeling that Douglas Crockford is your son.

anentropic · a year ago

He's not my Dad, but I do follow him on Mastodon... where he had this to say on the topic a few days ago:

https://layer8.space/@douglascrockford/113595316189101091

> If the comment is important enough to be put in the text and remembered, then make it an explicit part of the data structure. That will also make it easier for tools to find it and process it in a useful way. {"comment": ...}

> Use a preprocessor tool such as jsmin to remove the commentary before passing it to a JSON parser.

I kind of wish JSON had trailing commas though

remon · a year ago

I'm not super invested in this debate one way or the other but arguing a comment should be part of your data rather than documenting what data is contained is a bizarre take on the question. Either be of the opinion it's a data format and comments are just out of scope (and using JSON for e.g. config is just a bad choice), or agree that allowing comments in JSON should be part of the spec since right or wrong it is used for config and other human edited use cases. JSON5's adoption seems to imply people are mostly on the latter side of the fence.

I don't find the argument incoherent at all

either it's a data transfer format and comments should be stripped before transmission

or the comments are part of the data model

makes you have to think clearly whether the comments are just "notes to self" at authoring time, or something relevant to the consumer

OTOH of course there are plenty of 'greyer' cases

JSON5 has both comments and trailing commas. Of course you can't rely on JSON5 if you're ever passing through anywhere expecting JSON.

vasilvv · a year ago

Isn't this the problem that JSON5 (and probably other similar projects) is supposed to solve?

Both JSON (as defined in the RFC) and JSON5 have a nice property of being well-defined, meaning that you can use different libraries in different languages on different platforms to parse them, and expect the same result. "JSON but parser behaves reasonably (as defined by the speaker)" does not have this property.

donatj · a year ago

JSON5 would be ok if that's all it did. They added so much additional unnecessary complication that it undermines the simplicity of JSON that makes it good.

avmich · a year ago

http://seriot.ch/projects/parsing_json.html

"Despite the clarifications they bring, RFC 7159 and 8259 contain several approximations and leaves many details loosely specified."

hinkley · a year ago

Nothing will probably ever top Markdown in my mind for bullshit specifications.

And Gruber wouldn’t give Jeff Atwood permission to call his variant <something> Markdown, or it seems anybody else, so we ended up with CommonMark, and GFM.

Json5 is good for JSON at rest, as others have mentioned already.

model-15-DAV · a year ago

Instead of inventing "JSON with comments" format, why not simply use JSON5?

chippiewill · a year ago

I agree, it's better to just use JSON5.

Having a mess where JSON parsers sometimes do, and sometimes don't allow comments is a bad outcome.

JSON5 is advantageous because it's explicitly separate - it has a different file extension, it needs different libraries. Also it supports a few other utility features people want (like trailing commas) without bringing in the whole dangerous kitchen sink like YAML does.

dathinab · a year ago

But if you already decide to forgo the benefits of JSON (mainly the ecosystem/wide spread support) and use a niche solution why would you go for JSON5?

That looks like a worst of both world solution IMHO.

You are probably better off with a TOML -> JSON mapping, and would be better off with YAML too if YAML didn't have really stupid ambiguity pitfalls.

williamdclt · a year ago

You don't lose all benefits. Ecosystem often isn't a problem (I don't need high interop, I need this JSON5 file for one specific purpose), and you certainly don't lose all benefits:

Plus, JSON5 support is _somewhat_ widespread (maybe ~50% of tools I use support json5 for config?)

iddan · a year ago

It's already invented and JSON w Comments is a lot simpler than JSON5 to parse

Maybe it should have been though.

Files shouldn't be labeled as compliant with a standard and then not be. Full stop.

The standard that is JSON does not support comments. Don't call something JSON that isn't.

Use one of the many existing JSON extensions or create a new standard. DO NOT, however, just do ad hoc crap like this suggests; that's a road to hell.

easton · a year ago

I need to send it to the author, but ASP.NET Core’s JSON configuration parser also does json + comments by default. Confuses the hell out of my colleagues and VSCode when I do it though :D

https://learn.microsoft.com/en-us/aspnet/core/fundamentals/c...