gepheum (u/gepheum) - Readit News

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

kentonv · 2 days ago

> flatbuffers and capnproto are in the game of trying to make serialization to binary format as efficient as possible.

Little-understood fact about Cap'n Proto: Serialization is not the game at all. The RPC system is the whole game, the serialization was just done as a sort of stunt. Indeed, unless you are mmap()ing huge files, the serialization speed doesn't really matter. Though I would say the implementation of Cap'n Proto is quite a bit simpler than Protobuf due to the serialization format just being simpler, and that in itself is a nice benefit.

The recently-released Cap'n Web jettisons the whole serialization side and focuses just on the RPC system: https://blog.cloudflare.com/capnweb-javascript-rpc-library/

(I'm the author of Cap'n Proto and Cap'n Web.)

gepheum · a day ago

I stand corrected.

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

karteum · 3 days ago

Apart from the comparison with Protobuf, how does it compare to flatbuffers, capnproto, messagepack, jsonbinpack... ?

gepheum · 2 days ago

flatbuffers and capnproto are in the game of trying to make serialization to binary format as efficient as possible. Their goal is to beat benchmarks: how long it takes to convert an object to bytes and vice versa. It's cool, but I personally think that for most use cases (not all), serialization efficiency shouldn't be the primary goal: serialization time is often negligible compared to the time it takes to send data over the wire, and it's less important than other features (e.g. quality of the generated API) that some of these techs might neglect. I have an example to illustrate this. With Proto3, Google decided that when encoding a `string` field in C++, it would not perform UTF-8 validation. This leads to better benchmark metrics. It has also been a horrible mistake that led to many bugs which have cost so many eng hours, since for example the same protobuf C++ API fails at deserialization when it encounters an invalid UTF-8 string.
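
The asymmetry described above can be sketched in a few lines of Python. This is not Protobuf's actual code: `encode_string_field`, `decode_string_field`, and the one-byte length prefix are hypothetical, chosen only to show how skipping validation on the write path moves the failure to whoever reads the data later.

```python
def encode_string_field(raw: bytes, validate: bool) -> bytes:
    """Length-prefix the raw bytes (one-byte length, for brevity).

    With validate=False, invalid UTF-8 is written without complaint.
    """
    if validate:
        raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    if len(raw) > 255:
        raise ValueError("this toy format only supports up to 255 bytes")
    return bytes([len(raw)]) + raw


def decode_string_field(payload: bytes) -> str:
    """Read the length prefix, then strictly decode the bytes as UTF-8."""
    length = payload[0]
    return payload[1:1 + length].decode("utf-8")


bad = b"\xff\xfe"  # not valid UTF-8
payload = encode_string_field(bad, validate=False)  # the writer succeeds
# decode_string_field(payload) would now raise UnicodeDecodeError in the reader
```

With `validate=True`, the same input is rejected at the point where the bug was introduced, at the price of one extra pass over the bytes.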

As for messagepack and jsonbinpack, these seem to be layers on top of JSON to make JSON more compact. They still use field names for field identity, which I think can be problematic for long-term data persistence since it prevents renaming fields. I think the Protobuf/Thrift approach of using meaningless field numbers in serialized forms is better.
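
A small Python sketch of the renaming argument, under stated assumptions: the `SCHEMA_V1`/`SCHEMA_V2` tables and the helper functions are made up for illustration and do not correspond to any real Protobuf, Thrift, or Skir API. Data keyed by field number survives a rename, while data keyed by field name loses the renamed field.

```python
import json

# Hypothetical schemas mapping field number -> field name.
# In V2, the field "name" has been renamed to "full_name".
SCHEMA_V1 = {1: "id", 2: "name"}
SCHEMA_V2 = {1: "id", 2: "full_name"}


def encode_by_number(record, schema):
    """Serialize a record using field numbers as keys."""
    name_to_number = {name: number for number, name in schema.items()}
    return json.dumps({str(name_to_number[k]): v for k, v in record.items()})


def decode_by_number(payload, schema):
    """Map each stored field number to the reader's current field name."""
    return {
        schema[int(k)]: v
        for k, v in json.loads(payload).items()
        if int(k) in schema
    }


def decode_by_name(payload, schema):
    """Keep only the stored names that the reader's schema still knows."""
    known = set(schema.values())
    return {k: v for k, v in json.loads(payload).items() if k in known}


old_record = {"id": 10, "name": "john"}
stored_by_number = encode_by_number(old_record, SCHEMA_V1)
stored_by_name = json.dumps(old_record)

# Read the old data with the new schema.
after_rename_numbered = decode_by_number(stored_by_number, SCHEMA_V2)
after_rename_named = decode_by_name(stored_by_name, SCHEMA_V2)
```

Here `after_rename_numbered` still carries the value under its new name, while `after_rename_named` has silently dropped it.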

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

cyberax · 2 days ago

Definitely interesting, and it seems to be a nice improvement over Protobuf, especially for Python: the Protobuf bindings for Python were probably made after taking a lot of hallucinogenic drugs.

I like constants, great addition.

Things that I'll miss:

1. Oneof fields. There are enums, but it looks like it's not possible to have ad-hoc oneofs?

2. Streaming requests/responses.

3. Introspection and annotations.

4. Go bindings.

gepheum · 2 days ago

Thanks for the comment! Agree with you about the horrible Protobuf-to-Python bindings; they were a big frustration and definitely contributed to me wanting to build Skir.

1. You can create an enum with just "wrapper" fields, that's exactly like a oneof.

2. Totally fair, I'm planning to work on this later this year, probably Q3 (priority is adding support for 4 more languages, and then I'll get to it).

3. So there is introspection in the 6 targeted languages, and I think I did it a bit better than protobuf because it generally has better type safety. Example in C++: https://github.com/gepheum/skir-cc-example/blob/main/string_... TypeScript: https://skir.build/docs/typescript#reflection I realize I haven't documented it in Python (although it is available and generally the same API as TypeScript), will fix that. However, you're right that there is no support yet for annotations. Still trying to gauge whether that's needed.

4. Assuming you mean the Go language: working on that now, hoping to have C#, Go, Rust and Swift in the next 2-3 months.

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

marvin-hansen · 3 days ago

I had my fair share of frustration with proto as well. What I appreciate in Skir:

GH-style imports. This is a big one I wish proto had in the first place. The entire idea of a proto registry feels reactive to me when, ideally, you want to pull in a versioned shared file to import that is verified by the compiler long before the server or client verifies the payload schema.

Schema validation and compatibility checks on CI. Again a big one and critical to catch issues early.

Enums done right... No further comment required.

I think with some more attention to details e.g. hammering out the gaps some other comments have identified and more language support e.g. Rust, Go, C# this can actually work out over time.

Here is an idea to contemplate as a side gig with your favorite Ai assistant: A tool to convert proto to Skir. Or at least as much as possible. As someone who had to maintain larger and complex proto files, a lot of proto specific pain points are addressed.

The only concern I have is timing. Ten years ago this would have been a smash hit. These days, we have Thrift and similar, meaning the bar is definitely higher. That's not necessarily bad, but one needs to be mindful about differentiation from the existing proto alternatives.

I hope this project gains traction and community, especially from the frustrated proto folks.

gepheum · 2 days ago

Hey, thanks a lot for the comment! I share your frustration with protobuf: although I think it's great, it carries a few design flaws which are hard to fix at this point and they create pain points which are not going away.

I completely agree with you about timing, wish I had done this 10 years ago :)

"Here is an idea to contemplate as a side gig with your favorite Ai assistant: A tool to convert proto to Skir. Or at least as much as possible. As someone who had to maintain larger and complex proto files, a lot of proto specific pain points are addressed." < I tried asking Claude: "Migrate this project from protobuf to Skir, see https://skir.build/" and it works pretty well. I created http://skir.build/llms.txt which helps with this. The pain point is data migration though, and as much as I want Skir to succeed: I cannot recommend people migrating from protobuf to Skir if they have some persisted data to migrate, the effort is probably not worth it.

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

ndr · 3 days ago

This seems a Chesterton's fence fail.

protobuf solved serialization with schema evolution back/forward compatibility.

Skir seems to have great devex for the codegen part, but that's the least interesting aspect of protobufs. I don't see how the serialization this proposes fixes it without the numerical tagging equivalent.

gepheum · 2 days ago

Hey, Skir does have numerical tagging, see https://skir.build/docs/language-reference#structs

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

joshuamorton · 3 days ago

How does this work if, for example, you persist the data in a database?

gepheum · 2 days ago

Let's imagine you have this:

```
struct User {
  id: int64;
  email: string?;
  name: string;
}
```

You store some users in a database: [10,"john@gmail.com","john"], [11,null,"jane"]

You remove the email field later:

```
struct User {
  id: int64;
  name: string;
  removed;
}
```

Presumably you remove a field only after you have migrated all code that uses the field and you have deployed all binaries.

In your DB, you still have [10,"john@gmail.com","john"], [11,null,"jane"], which you are able to deserialize fine (the email field is ignored). New values that you serialize are stored as [12,0,"jack"]. If you happen to have old binaries which still use the old email field and which are still running (which you shouldn't, but let's imagine you accidentally didn't deploy all your binaries before you removed the field), these old binaries will indeed decode the email field for new values (Jack) as an empty string instead of null.
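
The scenario above can be replayed with a short Python sketch. It is not Skir-generated code: the `OLD_SLOTS`/`NEW_SLOTS` tables and the `encode`/`decode` helpers are hypothetical, and the only rule taken from the thread is that a `0` in the array is decoded as the default value of the field's underlying type.

```python
import json

# Hypothetical slot layouts, one entry per array position:
# (field name, underlying type, is_optional). None marks a removed slot.
OLD_SLOTS = [("id", "int64", False), ("email", "string", True), ("name", "string", False)]
NEW_SLOTS = [("id", "int64", False), None, ("name", "string", False)]

DEFAULTS = {"int64": 0, "string": ""}


def decode(payload, slots):
    """Decode a positional JSON array according to the reader's slot layout."""
    values = json.loads(payload)
    record = {}
    for index, slot in enumerate(slots):
        if slot is None:
            continue  # removed field: whatever is stored there is ignored
        name, type_name, optional = slot
        raw = values[index] if index < len(values) else 0
        if raw is None and optional:
            record[name] = None
        elif raw == 0:
            record[name] = DEFAULTS[type_name]  # 0 means "default of the type"
        else:
            record[name] = raw
    return record


def encode(record, slots):
    """Encode a record as a positional JSON array, writing 0 into removed slots."""
    values = [0 if slot is None else record[slot[0]] for slot in slots]
    return json.dumps(values, separators=(",", ":"))


# A new binary reads old data: the email slot is skipped.
john = decode('[10,"john@gmail.com","john"]', NEW_SLOTS)

# A new binary writes new data: the removed slot is filled with 0.
jack_payload = encode({"id": 12, "name": "jack"}, NEW_SLOTS)

# A stale binary reads the new data: email comes back as "" rather than None.
jack_seen_by_old_binary = decode(jack_payload, OLD_SLOTS)
```

The last line is the case to watch for: a stale reader cannot tell "no email" apart from "empty email" once the writer has stopped emitting `null`.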

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

maxloh · 3 days ago

Looks really like Prisma to me: https://www.prisma.io/docs/orm/prisma-schema/overview#exampl...

Why build another language instead of extending an existing one?

gepheum · 3 days ago

I looked at Prisma, I very much prefer the Protobuf/Thrift model of using numbers to identify fields, which allows 2 important things: fields to be renamed without breaking backward compatibility, and a compact wire format.

I think the Protobuf language (which Skir is heavily influenced by) has some flaws in its core design, e.g. the enum/oneof mess, the fact that it allows sparse field numbers which makes the "dense JSON" format (core feature of Skir) harder to get, and the fact that it does not allow users to optionally specify a stable identifier for a message to get compatibility checks to work.
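
One way to read the point about field numbers and the "dense JSON" format, sketched in Python. This is an interpretation, not Skir's implementation: `to_dense_array` is a hypothetical helper that places each value at the array index given by its field number and fills the gaps with `0`.

```python
import json


def to_dense_array(fields):
    """Place each value at the index given by its field number; gaps become 0."""
    if not fields:
        return []
    size = max(fields) + 1
    return [fields.get(index, 0) for index in range(size)]


# Consecutive field numbers: the array is exactly as long as the field count.
dense_numbers = {0: 10, 1: "john@gmail.com", 2: "john"}

# Field numbers with gaps (allowed in Protobuf): most positions are filler.
sparse_numbers = {1: 10, 5: "john@gmail.com", 20: "john"}

dense_payload = json.dumps(to_dense_array(dense_numbers), separators=(",", ":"))
sparse_payload = json.dumps(to_dense_array(sparse_numbers), separators=(",", ":"))
```

With consecutive numbers the payload is just the three values; with the gapped numbering, the same three values sit in an array of 21 entries, 18 of which are filler.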

I get your point about "why build another language", but that point, taken too far, means that we would all be programming in Haskell.

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

hrmtst93837 · 3 days ago

Buf plus Protobuf already give you multi-language codegen, a compact tag-based binary format with varint encoding, gRPC service generation, and practical tools like protoc, descriptor sets, ts-proto, and buf's breaking-change checks.

If Skir wants to be more than prettier syntax it needs concrete wins, including well-specified schema evolution rules that map cleanly to the wire, clear prescriptions for numeric tag management and reserved ranges, first-class reflection and descriptor compatibility, a migration checker, and a canonical deterministic encoding for signing and deduplication. Otherwise you get another neat demo format that becomes a painful migration when ops and clients disagree on tag semantics.
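
For readers unfamiliar with the varint encoding mentioned above, here is a minimal Python sketch of the base-128 scheme Protobuf uses for unsigned integers: 7 payload bits per byte, least-significant group first, with the high bit set on every byte except the last. The function names are illustrative and not taken from any library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)  # last byte: high bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


encoded_300 = encode_varint(300)  # two bytes: 0xAC 0x02
```

Small numbers take a single byte, which is where much of the compactness of tag-based formats comes from.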

gepheum · 3 days ago

Thanks for the comment. I am very familiar with Buf+Protobuf. I think it's a great system overall, but it has many limitations which I think can be overcome by redesigning the language from scratch instead of building on top of the .proto syntax. In the Skir vs Protobuf part of the blog post [https://medium.com/@gepheum/i-spent-15-years-with-protobuf-t...], only 2 out of 10 points pertain to "syntax" (and they're a bit more than syntax). Since you mention compatibility checks: Buf's compatibility check prevents message renaming, which is a huge limitation. With Skir, that's not the case. You also get the compatibility checks verified in the IDE.

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

curtisf · 3 days ago

> For optional types, 0 is decoded as the default value of the underlying type (e.g. string? decodes 0 as "", not null).

In the "dense JSON" format, isn't representing removed/absent struct fields with `0` and not `null` backwards incompatible?

If you remove or are unaware of a `int32?` field, old consumers will suddenly think the value is present as a "default" value rather than absent

gepheum · 3 days ago

That is correct, and that is a good catch. The idea, though, is that when you remove a field, you typically do that after having made sure that no code reads from the removed field anymore and that all binaries have been deployed.

gepheum commented on Show HN: Skir – like Protocol Buffer but better skir.build/... · Posted by u/gepheum

nine_k · 4 days ago

If you are fine enough with protobufs so that you're not actively looking for alternatives, maybe you should not spend the effort.

gepheum · 4 days ago

+1

Copying from blog post [https://medium.com/@gepheum/i-spent-15-years-with-protobuf-t...]:

""" Should you switch from Protobuf?

Protobuf is battle-tested and excellent. If your team already runs on Protobuf and has large amounts of persisted protobuf data in databases or on disk, a full migration is often a major effort: you have to migrate both application code and stored data safely. In many cases, that cost is not worth it.

For new projects, though, the choice is open. That is where Skir can offer a meaningful long-term advantage on developer experience, schema evolution guardrails, and day-to-day ergonomics. """