Posted by u/gepheum 4 days ago
Show HN: Skir – like Protocol Buffer but better (skir.build/...)
Why I built Skir: https://medium.com/@gepheum/i-spent-15-years-with-protobuf-t...

Quick start: npx skir init

All the config lives in one YML file.

Website: https://skir.build

GitHub: https://github.com/gepheum/skir

Would love feedback especially from teams running mixed-language stacks.

shalabhc · 4 days ago
Did you look at other formats like Avro, Ion etc? Some feedback:

1. Dense json

Interesting idea. You can also just keep the compact binary if you tag each payload with a schema id (see Avro). This also allows a generic reader to decode any binary payload by reading the schema and then interpreting the payload, which is really useful. A secondary benefit is that you never misinterpret a payload. I have seen bugs where protobufs were misinterpreted, since there is no connection handshake and interpretation is akin to a 'cast'.

2. Compatibility checks

+100, there's no reason to allow breaking changes by default

3. Adding fields to a type: should you have to update all call sites?

I'm not so sure this is the right default. If I add a field to a core type used by 10 services, this requires rebuilding and deploying all of them.

4. enum looks great. what about backcompat when adding new enum fields? or sometimes when you need to 'upgrade' an atomic to an enum?
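The schema-id tagging idea from point 1 can be sketched like so (hypothetical registry and framing, loosely inspired by Avro's single-object encoding, which actually uses a schema fingerprint; this is not Avro's real wire format):

```python
import json
import struct

# Hypothetical sketch: prefix every payload with a schema id so a generic
# reader can never misinterpret it.
SCHEMA_REGISTRY = {
    1: ["id", "name"],  # schema id -> field names (stand-in for a real schema)
}

def encode(schema_id, values):
    # 4-byte big-endian schema id, then the payload (JSON standing in
    # for a compact binary encoding)
    return struct.pack(">I", schema_id) + json.dumps(values).encode()

def decode(payload):
    (schema_id,) = struct.unpack(">I", payload[:4])
    fields = SCHEMA_REGISTRY[schema_id]  # unknown id fails loudly, no blind 'cast'
    return dict(zip(fields, json.loads(payload[4:])))

msg = encode(1, [42, "alice"])
print(decode(msg))  # {'id': 42, 'name': 'alice'}
```

The lookup step is what turns deserialization from a blind 'cast' into a checked operation: a payload with an unregistered id raises instead of being silently misread.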

gepheum · 4 days ago
Thanks for the feedback.

0. Yes, I looked at Avro and Ion. I like Protobuf much better because I think using field numbers for field identity (meaning you can rename fields freely) is a must.

1. Yes. Skir also supports that with the binary format (you can serialize and deserialize a Skir schema to JSON, which then allows you to convert from the binary format to readable JSON). It just requires building many layers of extra tooling, which can be painful. For example, if you store your data in some SQL engine X, you won't be able to quickly visualize it with a simple SELECT statement; you need to build the tooling that lets you visualize the data. Now, dense JSON is obviously not ideal for this use case, because you don't see the field names, but for quick debugging I find it's "good enough".

3. I agree there are definitely cases where it can be painful, but I think the cases where it actually is helpful are more numerous. One thing worth noting is that you can "opt-out" of this feature by using `ClassName.partial(...)` instead of `ClassName()` at construction time. See for example `User.partial(...)` here: https://skir.build/docs/python#frozen-structs I mostly added this feature for unit tests, where you want to easily create some objects with only some fields set and not be bothered if new fields are added to the schema.

4. Good question. I guess you mean "forward compatibility": you add a new field to the enum, not all binaries are deployed at the same time, and some old binary encounters the new variant it doesn't know about? I do what Protobuf does: I default to the UNKNOWN variant. More on this:

- https://skir.build/docs/schema-evolution#adding-variants-to-...
- https://skir.build/docs/schema-evolution#default-behavior-dr...
- https://skir.build/docs/protobuf#implicit-unknown-variant
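The UNKNOWN-variant fallback can be sketched with a toy Python decoder (hypothetical; not Skir's generated code):

```python
from enum import Enum

# An unrecognized wire value decodes to UNKNOWN instead of crashing
# the old binary.
class Color(Enum):
    UNKNOWN = 0
    RED = 1
    GREEN = 2

def decode_color(wire_value):
    try:
        return Color(wire_value)
    except ValueError:
        # A newer peer added a variant this binary doesn't know about yet
        return Color.UNKNOWN

print(decode_color(2))  # Color.GREEN
print(decode_color(7))  # Color.UNKNOWN
```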

shalabhc · 3 days ago
> meaning being able to rename fields freely, is a must.

avro supports field renames though.

3. on second thought i believe you'd only have to deploy when you choose. the next build will force you to provide values (or opt into the default). so forcing inspection of construction sites seems good.
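The `ClassName.partial(...)` opt-out mentioned earlier can be pictured with a rough Python analogy (plain dataclasses standing in for Skir's generated code; this is not Skir's actual API):

```python
from dataclasses import dataclass

# A strict constructor fails at every call site when a field is added,
# while a partial-style constructor fills unset fields with defaults.
@dataclass(frozen=True)
class User:
    user_id: int
    name: str  # adding a field here breaks every strict User(...) call

    @classmethod
    def partial(cls, **kwargs):
        # Unset fields fall back to the type's zero value, handy in unit tests
        defaults = {"user_id": 0, "name": ""}
        defaults.update(kwargs)
        return cls(**defaults)

strict = User(user_id=1, name="alice")  # must pass every field
loose = User.partial(user_id=1)         # tolerates future fields
print(loose)  # User(user_id=1, name='')
```

The strict form is the default, so adding a field forces an inspection of every construction site; `partial` is the explicit escape hatch.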

drathier · 4 days ago
https://capnproto.org/ has been my goto since forever. Made by the protobuf inventor
jauntywundrkind · 4 days ago
I've been dabbling with the newer Cap'n Web, whose nicely descriptive README's first line says:

> Cap'n Web is a spiritual sibling to Cap'n Proto (and is created by the same author), but designed to play nice in the web stack.

It's just JSON, which has upsides and downsides. But things like promise pipelining are such a huge advantage over everything else: you can refer to results (and maybe send them around?) and kick off new work based on those results before you even get the result back.

This is far far far superior to everything else, totally different ball-game.
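Promise pipelining can be illustrated with a toy asyncio sketch (this only shows the idea of chaining work onto a pending result; it is not the capnweb API, and the function names are made up):

```python
import asyncio

async def remote_get_user(name):
    # Simulated remote call with network latency
    await asyncio.sleep(0.01)
    return {"name": name, "team_id": 7}

async def remote_get_team(user_promise):
    # Pipelining: accept the *promise* of a user, not the resolved value,
    # so this call can be issued before the first result arrives
    user = await user_promise
    await asyncio.sleep(0.01)
    return {"team_id": user["team_id"], "team": "infra"}

async def main():
    user_promise = asyncio.ensure_future(remote_get_user("alice"))
    # The dependent call is kicked off against the pending result
    team = await remote_get_team(user_promise)
    print(team["team"])  # infra

asyncio.run(main())
```

In a real pipelining RPC system the second call is sent to the server referencing the first call's result, saving a full round trip; the sketch above only mimics that shape locally.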

I've been a little rebuffed by wasm when I try, keep getting too close to some gravitational event horizon & get sucked in & give up, but for more data-throughput oriented systems, I'm still hoping wrpc ends up being a fantastic pick. https://github.com/bytecodealliance/wrpc . Also Apache Arrow Flight, which I know less about, has mad traction in serious data-throughput systems, which being adjacent to amazingly popular Apache Arrow makes sense. https://arrow.apache.org/docs/format/Flight.html

k_g_b_ · 4 days ago
*Made by the proto2 implementor, Kenton Varda

https://news.ycombinator.com/user?id=kentonv

dewey · 4 days ago
> Skir is a universal language for representing data types, constants, and RPC interfaces. Define your schema once in a .skir file and generate idiomatic, type-safe code in TypeScript, Python, Java, C++, and more.

Maybe I'm missing some additional features but that's exactly what https://buf.build/plugins/typescript does for Protobuf already, with the advantage that you can just keep Protobuf and all the battle hardened tooling that comes with it.

nine_k · 4 days ago
The entire original post, it seems, is dedicated to explaining why Skir is better than plain Protobuf, with examples of all the well-known pain points. If these are not persuasive for you, staying with Protobuf (or just JSON) should be a fine choice.


gepheum · 4 days ago
Skir has exactly the same goals as Protobuf, so yes, that sentence can apply to Protobuf as well (and buf.build). I listed some of the reasons to prefer Skir over Protobuf, in my humble opinion, here: https://medium.com/@gepheum/i-spent-15-years-with-protobuf-t... Examples include built-in compatibility checks, the ability to import dependencies from other projects (buf.build offers this, but makes you pay for it), some language design choices (around enum/oneof, and the fact that adding fields forces you to update all constructor call sites), and the dense JSON format.
hrmtst93837 · 3 days ago
Buf plus Protobuf already give you multi-language codegen, a compact tag-based binary format with varint encoding, gRPC service generation, and practical tools like protoc, descriptor sets, ts-proto, and buf's breaking-change checks.

If Skir wants to be more than prettier syntax it needs concrete wins, including well-specified schema evolution rules that map cleanly to the wire, clear prescriptions for numeric tag management and reserved ranges, first-class reflection and descriptor compatibility, a migration checker, and a canonical deterministic encoding for signing and deduplication. Otherwise you get another neat demo format that becomes a painful migration when ops and clients disagree on tag semantics.

gepheum · 3 days ago
Thanks for the comment. I am very familiar with Buf+Protobuf. I think it's a great system overall, but it has many limitations which I think can be overcome by redesigning the language from scratch instead of building on top of the .proto syntax. In the Skir vs Protobuf part of the blog post [https://medium.com/@gepheum/i-spent-15-years-with-protobuf-t...], only 2 out of 10 points pertain to "syntax" (and they're a bit more than syntax). Since you mention compatibility checks: Buf's compatibility check prevents message renaming, which is a huge limitation. With Skir, that's not the case. You also get the compatibility checks verified in the IDE.


curtisf · 3 days ago
> For optional types, 0 is decoded as the default value of the underlying type (e.g. string? decodes 0 as "", not null).

In the "dense JSON" format, isn't representing removed/absent struct fields with `0` and not `null` backwards incompatible?

If you remove or are unaware of an `int32?` field, old consumers will suddenly think the value is present as a "default" value rather than absent.
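The hazard can be shown with a toy dense-JSON decoder (hypothetical; not Skir's implementation): an optional int32 slot of 0 decodes to the type's default value, so "absent" and "explicitly set to 0" collapse for an old consumer.

```python
def decode_struct(dense, field_names):
    decoded = {}
    for name, slot in zip(field_names, dense):
        # 0 marks an unset slot, but it decodes as the default (0),
        # not as None -- the reader can't tell the two apart
        decoded[name] = 0 if slot == 0 else slot
    return decoded

# A newer writer removed `age` and now encodes its slot as 0
payload = [3, 0]
print(decode_struct(payload, ["id", "age"]))  # {'id': 3, 'age': 0}, not None
```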

gepheum · 3 days ago
That is correct, and a good catch. The idea, though, is that when you remove a field, you typically do so only after making sure that no code reads the removed field anymore and that all binaries have been deployed.
joshuamorton · 3 days ago
How does this work if, for example, you persist the data in a database?
marvin-hansen · 3 days ago
I had my fair share of frustration with proto as well. Things I appreciate in Skir:

GH-style imports. This is a big one I wish proto had in the first place. The entire idea of a proto registry feels reactive to me when, ideally, you want to pull in a versioned shared file to import that is verified by the compiler long before the server or client verifies the payload schema.

Schema validation and compatibility checks on CI. Again a big one and critical to catch issues early.

Enums done right... No further comment required.

I think with some more attention to detail (e.g. hammering out the gaps some other comments have identified) and more language support (e.g. Rust, Go, C#), this can actually work out over time.

Here is an idea to contemplate as a side gig with your favorite AI assistant: a tool to convert proto to Skir, or at least as much as possible. As someone who has had to maintain large and complex proto files, I can say a lot of proto-specific pain points are addressed.

The only concern I have is timing. Ten years ago this would have been a smash hit. These days we have Thrift and similar, meaning the bar is definitely higher. That's not necessarily bad, but one needs to be mindful about differentiation from the existing proto alternatives.

I hope this project gains traction and community, especially from the frustrated proto folks.

gepheum · 2 days ago
Hey, thanks a lot for the comment! I share your frustration with protobuf: although I think it's great, it carries a few design flaws which are hard to fix at this point and they create pain points which are not going away.

I completely agree with you about timing, wish I had done this 10 years ago :)

> Here is an idea to contemplate as a side gig with your favorite Ai assistant: A tool to convert proto to Skir. Or at least as much as possible. As someone who had to maintain larger and complex proto files, a lot of proto specific pain points are addressed.

I tried asking Claude: "Migrate this project from protobuf to Skir, see https://skir.build/" and it works pretty well. I created http://skir.build/llms.txt which helps with this. The pain point is data migration, though, and as much as I want Skir to succeed, I cannot recommend migrating from protobuf to Skir to anyone with persisted data to migrate; the effort is probably not worth it.

refulgentis · 4 days ago
I spent some time in the actual compiler source. There's real work here, genuinely good ideas.

The best thing Skir does is strict generated constructors. You add a field, every construction site lights up. Protobuf's "silently default everything" model has caused mass production incidents at real companies. This is a legitimately better default.

Dense JSON is interesting but the docs gloss over the tradeoff: your serialized data is [3, 4, "P"]. If you ever lose your schema, or a human needs to read a payload in a log, you're staring at unlabeled arrays. Protobuf binary has the same problem but nobody markets binary as "easy to inspect with standard tools."

The "serialize now, deserialize in 100 years" claim has a real asterisk. Compatibility checking requires you to opt into stable record IDs and maintain snapshots. If you skip that (and the docs' own examples often do), the CLI literally warns you: "breaking changes cannot be detected." So it's less "built-in safety" and more "safety available if you follow the discipline." Which is... also what Protobuf offers.
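The readable-vs-dense contrast is easy to reproduce with a made-up record (illustrative field names; not Skir's exact output):

```python
import json

record = {"user_id": 3, "points": 4, "grade": "P"}

readable = json.dumps(record)               # self-describing, largest
dense = json.dumps(list(record.values()))   # positional slots, no field names

print(readable)  # {"user_id": 3, "points": 4, "grade": "P"}
print(dense)     # [3, 4, "P"]
```

The dense form is only decodable back into named fields by a reader that holds the schema and knows the slot order.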

The Rust-style enum unification is genuinely cleaner than Protobuf's enum/oneof split. No notes there, that's just better language design.

Minor thing that bothered me disproportionately: the constant syntax in the docs (x = 600) doesn't match what the parser actually accepts (x: 600).

The weirdest thing that bugged the heck out of me was the tagline, "like protos but better", that's doing the project no favors.

I think this would land better if it were positioned as "Protobuf, but fresh" rather than "Protobuf, but better." The interesting conversation is which opinions are right, not whether one tool is universally superior.

Quite frankly, I don't use protobuf because it seems like an unapproachable monolith, and I'm not at FAANG anymore, just a solo dev. No one's gonna complain if I don't. But I do love the idea of something simpler that's easy to wrap my mind around.

That's why "but fresh" hits nice to me, and I have a feeling it might be more appealing than you'd think. For example, it's hard to believe a 2-month-old project is strictly better than whatever mess and history protobuf's gone through, with tons of engineers paid to use and work on it. It is easy to believe it covers 99% of what Protobuf does already, and any crazy edge cases that pop up (they always do, eventually :) will be easy to understand and fix.

gepheum · 4 days ago
Thank you so much for taking the time to dig into the compiler source code, and for the thorough comment you left.

For dense JSON: the idea is that it is often a good "default" choice because it offers a good tradeoff across 3 properties: efficiency (where it sits between binary and readable JSON), persistability (it's safe to evolve the schema without losing backward compatibility), and readability (it's low for the reasons you mentioned, but not as bad as a binary string). I tried to explain this tradeoff in this table: https://skir.build/docs/serialization#serialization-formats

I hear your point about the tagline "like protos but better", which I hesitated to use because it sounds presumptuous. But I am not quite sure what idea you mean to convey by "fresh"?

computomatic · 4 days ago
Not the parent but I infer “fresh” as meaning a new approach to an old problem (with the benefits of experience baked in). A synonym of “modern” without the baggage.
refulgentis · 4 days ago
Cheers :) (other replier was right on "fresh"; "better" definitely wasn't right)
gepheum · 4 days ago
Also, thank you for flagging the constant syntax problem (x = 600) on the website. Fixed.
vineyardmike · 4 days ago
> Minor thing that bothered me disproportionately: the constant syntax in the docs (x = 600) doesn't match what the parser actually accepts (x: 600).

You’re a better man than me. If the docs can’t even get the syntax right, that’s a hard no from me.

Also, fwiw, you’ve got a few points wrong about protos. Inspecting the binary data is hard, but the tag numbers are present. You need the schema, but at least you can identify each element.

Also, I disagree on the constructor front. Proto forces you to grapple with the reality that a field may be missing. In a production system, when adding a new field, there will be a point where that field isn’t present on only one side of the network call. The compiler isn’t saving you.

Fresh is more honest than better, and personally, I wouldn’t change it.

gepheum · 4 days ago
> Also, I disagree on the constructor front. Proto forces you to grapple with the reality that a field may be missing. In a production system, when adding a new field, there will be a point where that field isn’t present on only one side of the network call. The compiler isn’t saving you.

I agree it's important for users to understand that newer fields won't be set when they deserialize old data, whether that's with Protobuf or Skir. I disagree with the idea that not forcing you to update all constructor call sites when you add a field helps (significantly) with that. Are you saying that because Protobuf forces you to manually search for all call sites when you add a field, it forces you to think about what happens if the field is not set at deserialization, and hence it's a good thing? I'm not sure that outweighs the cost of bugs introduced when you forget to update a constructor call site after adding a field to your schema.

poly2it · 4 days ago
I would recommend exploring OpenRPC for those who have not yet seen it. It brings protocol-buffer-like definitions (components), RPC definitions and centralised error definitions.
ndr · 3 days ago
This seems like a Chesterton's fence fail.

Protobuf solved serialization with schema evolution and backward/forward compatibility.

Skir seems to have great devex for the codegen part, but that's the least interesting aspect of protobufs. I don't see how the serialization it proposes fixes this without a numerical-tagging equivalent.

gepheum · 2 days ago
Hey, Skir does have numerical tagging, see https://skir.build/docs/language-reference#structs
ndr · 2 days ago
This seems new and retrofitted.

The implicit version is a brittle design for backwards compatibility.

People/LLMs will keep adding fields out of order, and whatever has been serialized (both in client/server interactions and stored in DBs) will be broken.