Why I built Skir: https://medium.com/@gepheum/i-spent-15-years-with-protobuf-t...
Quick start: npx skir init
All the config lives in one YML file.
Website: https://skir.build
GitHub: https://github.com/gepheum/skir
Would love feedback, especially from teams running mixed-language stacks.
1. Dense JSON
Interesting idea. You can also just keep the compact binary if you just tag each payload with a schema id (see Avro). This also allows a generic reader to decode any binary format by reading the schema and then interpreting the binary payload, which is really useful. A secondary benefit is you never ever misinterpret a payload. I have seen bugs with protobufs misinterpreted since there is no connection handshake and interpretation is akin to 'cast'.
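To make the schema-id idea concrete, here is a minimal sketch in the spirit of what Avro does, not a copy of it. Everything in it is an assumption for illustration: the in-memory `REGISTRY`, the 8-byte SHA-256 prefix used as the id, and the JSON array standing in for a real compact binary body.

```python
import hashlib
import json

# Hypothetical in-memory registry: fingerprint -> schema (here just field names).
REGISTRY: dict[bytes, list[str]] = {}

def register(schema: list[str]) -> bytes:
    """Store a schema and return an 8-byte fingerprint that identifies it."""
    fingerprint = hashlib.sha256(json.dumps(schema).encode()).digest()[:8]
    REGISTRY[fingerprint] = schema
    return fingerprint

def encode(fingerprint: bytes, values: list) -> bytes:
    """Prefix the compact payload with the fingerprint of its schema."""
    return fingerprint + json.dumps(values, separators=(",", ":")).encode()

def decode(payload: bytes) -> dict:
    """Generic reader: look up the schema by fingerprint, then label the values."""
    fingerprint, body = payload[:8], payload[8:]
    schema = REGISTRY.get(fingerprint)
    if schema is None:
        # Refuse to "cast" a payload whose schema is not known.
        raise LookupError("unknown schema fingerprint")
    return dict(zip(schema, json.loads(body)))

user_v1 = register(["id", "name"])
payload = encode(user_v1, [42, "Ada"])
print(decode(payload))  # {'id': 42, 'name': 'Ada'}
```

The point is the failure mode: a payload with an unrecognized id raises instead of being silently interpreted under the wrong schema.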
2. Compatibility checks
+100, there's no reason to allow breaking changes by default.
3. Adding fields to a type: should you have to update all call sites?
I'm not so sure this is the right default. If I add a field to a core type used by 10 services, this requires rebuilding and deploying all of them.
4. enum looks great. what about backcompat when adding new enum fields? or sometimes when you need to 'upgrade' an atomic to an enum?
0. Yes, I looked at Avro, Ion. I like Protobuf much better because I think using field numbers for field identity, meaning being able to rename fields freely, is a must.
1. Yes. Skir also supports that with the binary format (you can serialize and deserialize a Skir schema to JSON, which then allows you to convert from binary format to readable JSON). It just requires building many layers of extra tooling, which can be painful. For example, if you store your data in some SQL engine X, you won't be able to quickly visualize your data with a simple SELECT statement; you need to build the tooling which will allow you to visualize the data. Now dense JSON is obviously not ideal for this use case, because you don't see the field names, but for quick debugging I find it's "good enough".
3. I agree there are definitely cases where it can be painful, but I think the cases where it actually is helpful are more numerous. One thing worth noting is that you can "opt-out" of this feature by using `ClassName.partial(...)` instead of `ClassName()` at construction time. See for example `User.partial(...)` here: https://skir.build/docs/python#frozen-structs I mostly added this feature for unit tests, where you want to easily create some objects with only some fields set and not be bothered if new fields are added to the schema.
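For readers who have not seen the pattern, here is a rough sketch of the strict-versus-partial tradeoff using a hand-written dataclass. This is not Skir's generated code: the `User` fields and the `_DEFAULTS` table are made up, and only the name `partial` comes from the comment above. Note that plain Python only fails at call time; a static type checker is what would flag the call site earlier.

```python
from dataclasses import dataclass, fields

# Assumed per-type defaults, for illustration only.
_DEFAULTS = {int: 0, str: ""}

@dataclass(frozen=True)
class User:
    user_id: int
    name: str
    email: str  # pretend this field was just added to the schema

    @classmethod
    def partial(cls, **kwargs) -> "User":
        """Opt-out constructor: unspecified fields get a default value."""
        values = {f.name: _DEFAULTS[f.type] for f in fields(cls)}
        values.update(kwargs)
        return cls(**values)

# Strict path: a call site that predates `email` now fails.
try:
    User(user_id=1, name="Ada")
except TypeError as err:
    print("strict constructor rejected the call:", err)

# Opt-out path, e.g. in a unit test that only cares about `name`.
print(User.partial(name="Ada"))  # User(user_id=0, name='Ada', email='')
```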
4. Good question. I guess you mean "forward compatibility": you add a new variant to the enum, not all binaries are deployed at the same time, and some old binary encounters the new variant it doesn't know about? I do what Protobuf does: I default to the UNKNOWN enum. More on this: - https://skir.build/docs/schema-evolution#adding-variants-to-... - https://skir.build/docs/schema-evolution#default-behavior-dr... - https://skir.build/docs/protobuf#implicit-unknown-variant
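A minimal sketch of that fallback, assuming an old binary that only knows three variants. The `Status` enum and `decode_status` helper are hypothetical and only illustrate the general "unknown value maps to UNKNOWN" idea, not Skir's or Protobuf's actual generated code.

```python
from enum import Enum

class Status(Enum):
    """Variants known to the OLD binary."""
    UNKNOWN = 0
    ACTIVE = 1
    SUSPENDED = 2

def decode_status(raw: int) -> Status:
    """Map a wire value to a variant, falling back to UNKNOWN."""
    try:
        return Status(raw)
    except ValueError:
        # A newer writer sent a variant this binary has never heard of.
        return Status.UNKNOWN

print(decode_status(1))  # Status.ACTIVE
print(decode_status(7))  # Status.UNKNOWN
```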
avro supports field renames though.
3. on second thought i believe you'd only have to deploy when you choose. the next build will force you to provide values (or opt into the default). so forcing inspection of construction sites seems good.
> Cap'n Web is a spiritual sibling to Cap'n Proto (and is created by the same author), but designed to play nice in the web stack.
It's just JSON, which has up and down sides. But things like promise pipelining are such a huge upside versus everything else: you can refer to results (and maybe send them around?) and kick off new work based on those results, before you even get the result back.
This is far far far superior to everything else, totally different ball-game.
I've been a little rebuffed by wasm when I try, keep getting too close to some gravitational event horizon & get sucked in & give up, but for more data-throughput oriented systems, I'm still hoping wrpc ends up being a fantastic pick. https://github.com/bytecodealliance/wrpc . Also Apache Arrow Flight, which I know less about, has mad traction in serious data-throughput systems, which being adjacent to amazingly popular Apache Arrow makes sense. https://arrow.apache.org/docs/format/Flight.html
https://news.ycombinator.com/user?id=kentonv
Maybe I'm missing some additional features but that's exactly what https://buf.build/plugins/typescript does for Protobuf already, with the advantage that you can just keep Protobuf and all the battle hardened tooling that comes with it.
If Skir wants to be more than prettier syntax it needs concrete wins, including well-specified schema evolution rules that map cleanly to the wire, clear prescriptions for numeric tag management and reserved ranges, first-class reflection and descriptor compatibility, a migration checker, and a canonical deterministic encoding for signing and deduplication. Otherwise you get another neat demo format that becomes a painful migration when ops and clients disagree on tag semantics.
In the "dense JSON" format, isn't representing removed/absent struct fields with `0` and not `null` backwards incompatible?
If you remove or are unaware of a `int32?` field, old consumers will suddenly think the value is present as a "default" value rather than absent
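To spell out the concern, here is a small sketch that assumes the behavior exactly as the question describes it (a removed slot written as `0`). I have not checked it against Skir's actual encoder; `read_age` and the two-slot layout are hypothetical.

```python
import json
from typing import Optional

def read_age(dense: str) -> Optional[int]:
    """Old consumer: slot 1 is an optional int32, where null means absent."""
    slots = json.loads(dense)
    return slots[1] if len(slots) > 1 else None

written_with_null = '["Ada", null]'
written_with_zero = '["Ada", 0]'  # placeholder for a removed field, per the question

print(read_age(written_with_null))  # None
print(read_age(written_with_zero))  # 0
```

In the second case the old consumer cannot tell "absent" from "present and equal to 0", which is the incompatibility being asked about.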
GH style import. This is a big one I wish proto had in the first place. The entire idea of a proto registry feels reactive to me when, ideally, you want to pull in a versioned shared file to import that is verified by the compiler long before server or client verifies the payload schema.
Schema validation and compatibility checks on CI. Again a big one and critical to catch issues early.
Enums done right... No further comment required.
I think with some more attention to details e.g. hammering out the gaps some other comments have identified and more language support e.g. Rust, Go, C# this can actually work out over time.
Here is an idea to contemplate as a side gig with your favorite Ai assistant: A tool to convert proto to Skir. Or at least as much as possible. As someone who had to maintain larger and complex proto files, a lot of proto specific pain points are addressed.
The only concern I have is timing. Ten years ago this would have been a smash hit. These days, we have Thrift and similar, meaning the bar is definitely higher. That's not necessarily bad, but one needs to be mindful about differentiation from the existing proto alternatives.
I hope this project gains traction and community, especially from the frustrated proto folks.
I completely agree with you about timing, wish I had done this 10 years ago :)
"Here is an idea to contemplate as a side gig with your favorite Ai assistant: A tool to convert proto to Skir. Or at least as much as possible. As someone who had to maintain larger and complex proto files, a lot of proto specific pain points are addressed." < I tried asking Claude: "Migrate this project from protobuf to Skir, see https://skir.build/" and it works pretty well. I created http://skir.build/llms.txt which helps with this. The pain point is data migration though, and as much as I want Skir to succeed: I cannot recommend people migrating from protobuf to Skir if they have some persisted data to migrate, the effort is probably not worth it.
The best thing Skir does is strict generated constructors. You add a field, every construction site lights up. Protobuf's "silently default everything" model has caused mass production incidents at real companies. This is a legitimately better default.
Dense JSON is interesting but the docs gloss over the tradeoff: your serialized data is [3, 4, "P"]. If you ever lose your schema, or a human needs to read a payload in a log, you're staring at unlabeled arrays. Protobuf binary has the same problem but nobody markets binary as "easy to inspect with standard tools." The "serialize now, deserialize in 100 years" claim has a real asterisk. Compatibility checking requires you to opt into stable record IDs and maintain snapshots. If you skip that (and the docs' own examples often do), the CLI literally warns you: "breaking changes cannot be detected." So it's less "built-in safety" and more "safety available if you follow the discipline." Which is... also what Protobuf offers.
The Rust-style enum unification is genuinely cleaner than Protobuf's enum/oneof split. No notes there, that's just better language design.
Minor thing that bothered me disproportionately: the constant syntax in the docs (x = 600) doesn't match what the parser actually accepts (x: 600).
The weirdest thing that bugged the heck out of me was the tagline, "like protos but better", that's doing the project no favors.
I think this would land better if it were positioned as "Protobuf, but fresh" rather than "Protobuf, but better." The interesting conversation is which opinions are right, not whether one tool is universally superior.
Quite frankly, I don't use protobuf because it seems like an unapproachable monolith, and I'm not at FAANG anymore, just a solo dev. No one's gonna complain if I don't. But I do love the idea of something simpler that's easy to wrap my mind around.
That's why "but fresh" hits nice to me, and I have a feeling it might be more appealing than you'd think - e.g. it's hard to believe a 2 month old project is strictly better than whatever mess and history Protobuf's gone through with tons of engineers paid to use and work on it. It is easy to believe it covers 99% of what Protobuf does already, and any crazy edge cases that pop up (they always do, eventually :), will be easy to understand and fix.
For dense JSON: the idea is that it is often a good "default" choice because it offers a good tradeoff across 3 properties: efficiency (where it's between binary and readable JSON), persistability (safe to evolve schema without losing backward compatibility), and readability (it's low for the reasons you mentioned, but it's not as bad as a binary string). I tried to explain this tradeoff in this table: https://skir.build/docs/serialization#serialization-formats
I hear your point about the tagline "like protos but better" which I hesitated to put because it sounds presumptuous. But I am not quite sure what idea you mean to convey by "fresh"?
You’re a better man than me. If the docs can’t even get the syntax right, that’s a hard no from me.
Also, fwiw, you’ve got a few points wrong about protos. Inspecting the binary data is hard, but the tag numbers are present. You need the schema, but at least you can identify each element.
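To illustrate the point about tag numbers: each field on the Protobuf wire starts with a varint key equal to `(field_number << 3) | wire_type`, so a schema-less scan can still recover field numbers. Below is a minimal sketch that handles only wire types 0 (varint) and 2 (length-delimited); the sample bytes are hand-built for the example.

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7

def scan(buf: bytes) -> list[tuple[int, object]]:
    """List (field_number, raw value) pairs without knowing the schema."""
    pos, out = 0, []
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise NotImplementedError(f"wire type {wire_type} not handled in this sketch")
        out.append((field_number, value))
    return out

print(scan(bytes.fromhex("0896011203416461")))  # [(1, 150), (2, b'Ada')]
```

You still need the schema to know that field 2 is a name and not a nested message, but every element is at least identifiable by number.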
Also, I disagree on the constructor front. Proto forces you to grapple with the reality that a field may be missing. In a production system, when adding a new field, there will be a point where that field is present on only one side of the network call. The compiler isn’t saving you.
Fresh is more honest than better, and personally, I wouldn’t change it.
I agree it's important for users to understand that newer fields won't be set when they deserialize old data -- whether that's with Protobuf or Skir. I disagree with the idea that not forcing you to update all constructor call sites when you add a field will help (significantly) with that. Are you saying that because Protobuf forces you to manually search for all call sites when you add a field, it forces you to think about what happens if the field is not set at deserialization, hence, it's a good thing? I'm not sure that outweighs the cost of bugs introduced by cases where you forget to update a constructor call site when you add a field to your schema.
protobuf solved serialization with schema evolution back/forward compatibility.
Skir seems to have great devex for the codegen part, but that's the least interesting aspect of protobufs. I don't see how the serialization this proposes fixes it without the numerical tagging equivalent.
The implicit version is brittle design for backwards compatibility.
People/LLMs will keep adding fields out of order and whatever has been serialised (both in client/server interaction, and stored in dbs) will be broken.
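Here is a sketch of that failure mode for a generic positional encoding where a slot's position is derived from declaration order. Whether it applies to Skir depends on whether field numbers are explicit, which the author says they are earlier in the thread; the `encode`/`decode` helpers and field lists are hypothetical.

```python
def encode(field_order: list[str], record: dict) -> list:
    """Positional encoding: values are written in declaration order."""
    return [record[name] for name in field_order]

def decode(field_order: list[str], slots: list) -> dict:
    """Positional decoding: slot i is assumed to be field_order[i]."""
    return dict(zip(field_order, slots))

v1 = ["id", "name"]
v2_inserted = ["id", "email", "name"]  # new field added in the middle
v2_appended = ["id", "name", "email"]  # new field added at the end

stored = encode(v1, {"id": 7, "name": "Ada"})  # data persisted under v1

print(decode(v2_inserted, stored))  # {'id': 7, 'email': 'Ada'}
print(decode(v2_appended, stored))  # {'id': 7, 'name': 'Ada'}
```

With explicit, stable field numbers the declaration order stops mattering, which is the property being asked for.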