mnutt · 3 years ago
I was curious how this performed server-side versus protobufjs. (what we're currently using) I hastily wired it up to protobufjs's benchmark suite. (https://github.com/protobufjs/protobuf.js/tree/master/bench) The suite is pretty ancient so getting buf-compiled esm added was a challenge.

Granted, the benchmark is created for protobufjs and they probably optimize against it. Protobuf-ES was about 5.1x slower than protobufjs for encoding and 14.8x slower than protobufjs for decoding.

This was run on my M1 with node 16.14; not particularly scientific, etc, etc.

still_grokking · 3 years ago
Any benchmarks against other PB implementations? Would love to see numbers against the JVM, and then against C++ or Rust.

My gut feeling, anyway, is that PB in JS likely loses out to just using JSON (which the JS runtime implements in efficient native code).

mnutt · 3 years ago
No, but anecdotally the others are probably a lot faster. And you can do even better than that with zero-copy serializations.

But protobuf has very wide support and is decent enough in js, at least server-to-server. Having a schema is very valuable, and there are substantial size wins over JSON, even with gzip.

lowbloodsugar · 3 years ago
Protobuf is really easy to write a working but slow version for.
francislavoie · 3 years ago
I gave up on protobufs years ago. The protobuf team has no idea how to write PHP and JS libraries. I got segfaults from using the PHP extension. The built-in toJSON would return invalid JSON (missing braces for binary types). Ridiculous stuff.

I really just prefer to use JSON for everything. It's much easier to debug and observe traffic (browser Network tab). I like JSON-RPC, very simple spec (basically one page long). I don't like REST.

All that said, I'm really glad to see the community take things into their own hands.

onion2k · 3 years ago
> It's much easier to debug and observe traffic (browser Network tab).

The DX for JSON things is much better. The UX for protobufs is much better (faster, less data over the wire, etc). Which you optimize for is up to you, but there isn't a straightforward "Use this tech because it's the best one."

lucideer · 3 years ago
> faster, less data over the wire, etc.

I've always wondered about this. First, I'm fairly sure client-side JSON parsing is significantly faster than protobuf decoding. But even for data over the wire: JSON is pretty compressible, so surely the gains there are going to be marginal. Surely never enough UX benefit to warrant the DX trade-off, right?

izacus · 3 years ago
protobufs have the great property of having a schema (and then generating code). Which means it's pretty easy to set up a system where an accidental API change fails CI tests for mobile apps and web.

This is doable with JSON, but I've never seen a JSON-based setup actually work well at catching these kinds of regressions.

ZiiS · 3 years ago
Assuming your developer time is constrained, improved DX often also leads to better UX (more features). So even if you are optimizing for UX, you may well be better off with JSON.
nlnn · 3 years ago
I don't develop in JS so can't comment on DX there, but I've found the DX to be pretty good when using protobuf in other languages.

That's mostly been down to having IDE autocompletion for data structures and fields once the protobuf code's been generated.

For many JSON APIs I've worked with there's only been human-readable documentation, making them more error-prone to work with (e.g. having to either craft JSON manually for requests, or write a client library if one doesn't already exist).

mike_hock · 3 years ago
There's also msgpack. Best of both worlds.
halfmatthalfcat · 3 years ago
So does that make GraphQL the best then? JSON + faster/less data over the wire.
asim · 3 years ago
I think protobuf really works well on the backend, specifically with compiled languages like Go or C++, as seen from the usage at Google and the adoption of gRPC in Go-based cloud tooling. Beyond that it's a huge failure. The generated code and usage for other languages is not idiomatic. In fact it's a hindrance, and you can see that in the lack of adoption except by the largest orgs, who enforce it using some sort of grpc-web bridge with types for the frontend. Ultimately you can just convert proto to OpenAPI specs and do a much better job with custom client libs from that.

I'm not a frontend dev. Most of my time was spent on the backend, but what I'll say is I much prefer the fluidity and dynamic nature of JavaScript and the built-in ability to deal with JSON, which naturally becomes objects. All the type stuff is easy to do, but with docs you can get away with not needing it.

My feeling: protobuf lives on for gRPC server-side stuff, but for everywhere else OpenAPI is winning.

bufbuild · 3 years ago
It's worth checking out our take on a lot of these problems: https://buf.build/blog/connect-web-protobuf-grpc-in-the-brow...
fsaintjacques · 3 years ago
JSON parsing is a minefield, especially in cross-platform scenarios (language and/or library). You won't encounter those problems on toy projects or simple CRUD applications. For example, as soon as you deal with (u)int64 values greater than 2^53, a simple round-trip through JavaScript can wreak silent havoc.

See http://seriot.ch/projects/parsing_json.html
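The int64 pitfall is easy to demonstrate in Node (a minimal sketch; the `id` field name and values are invented for illustration):

```javascript
// JavaScript numbers are IEEE-754 doubles: integers above 2^53 - 1
// (Number.MAX_SAFE_INTEGER) cannot be represented exactly.
const raw = '{"id": 9007199254740993}'; // 2^53 + 1

const parsed = JSON.parse(raw);
console.log(parsed.id); // 9007199254740992 -- silently off by one

// A common workaround: transmit 64-bit values as strings and
// parse them into BigInt on the receiving side.
const safe = JSON.parse('{"id": "9007199254740993"}');
const id = BigInt(safe.id);
console.log(id); // 9007199254740993n -- exact
```

This is also why protobuf's canonical JSON mapping encodes int64/uint64 fields as strings.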

Protobuf support for google's first-class citizen languages is usually very good, i.e. C++, Java, Python and Go. For other languages, it depends on each implementation.

RedShift1 · 3 years ago
Though you're not wrong, in what common cases are integers larger than 2^53 required?
arein3 · 3 years ago
Nice article
capableweb · 3 years ago
As always, each protocol/data format has its place. You need to maximize the amount of data you send in each packet? Then protobuf is better than JSON. Need to support a large number of clients without any fuss? Then JSON is better. Wanna pass around data you don't know the schema of? JSON again.

Context matters, there are no silver bullets, everything has trade-offs, and so on.

speedgoose · 3 years ago
JSON messages in a compressed websocket stream are surprisingly tiny. Bigger than compressed protobuf packets but not by much, and much smaller than uncompressed protobuf packets.
ninepoints · 3 years ago
Honestly, gzipped json is likely much smaller than uncompressed protobuf.

If you were going to use a binary protocol, why choose one that has no partial parsing / table of contents these days? There are much better alternatives IMO (flatbuffers being one of them)

maccard · 3 years ago
> Wanna pass around data you don't know the schema of? JSON again.

This is a red herring. If you don't know the schema on the receiving (or sending, for that matter) side, then you can't do anything with the data, other than pass it on. If you _do_ know what it looks like, then it has an implicit schema whether you call it a schema or not.

francislavoie · 3 years ago
At the time, we needed interop with C. So that's why we chose protobufs. But it was a nightmare to work with in other languages. Including C++ for cross platform desktop apps where cross compiling became a problem too.

JSON in C is unfortunately way harder than in other modern languages (e.g. Go which makes it a breeze with struct tags and a great stdlib).

depr · 3 years ago
Surely the technical requirements of my specific use case are applicable to any use case.
fuzzy2 · 3 years ago
The problem I see with JSON is its limited set of “native” types. I really wish it had specified support for proper numeric types (int, uint, various widths) and not just doubles. A timestamp type would be great as well.

What I really like about Protocol Buffers is that you must write a schema to get started. No more JSON.stringify anything. Everything else sucks though.

robertlagrant · 3 years ago
I think we could remove about a quarter of all Javascript programming time if JSON had a native Date type.
haberman · 3 years ago
Hi there, I am the primary maintainer of the PHP library as of the last few years. I have heard that there used to be a lot of crashes; the code was almost completely rewritten in 2020 and is in a much better state now. If you find a segfault and you have a repro, file a bug and we will fix it.
bitwize · 3 years ago
I recommend Capnproto. Parsing time is zero, you can pretend you're a Microsoft programmer in the early 90s and just use the in-RAM struct as your wire format. Maybe it doesn't make sense for in-browser JS applications (though WASM is a different story) but for IPC and RPC in the general case, all parsing and unparsing does is generate waste heat.

ALWAYS favor a binary format unless you have a really good reason otherwise.

kccqzy · 3 years ago
Capnproto is designed by Kenton, a former Google engineer who did a lot of work with protobufs at Google. I see Capnproto as the spiritual successor of protobuf, fixing many issues in protobufs.

Also, Capnproto is quite extensively used in some Cloudflare products.

sa46 · 3 years ago
I like protobufs but I was also disappointed at the JS protobuf options. I disliked both the JS object representation and RPC transport.

grpc-web in particular requires an Envoy proxy which seems absurdly heavyweight. I ended up using Twirp because Buf connect wasn't yet released or planned.

I rolled my own JS representation. The major differences from Connect:

- Avoid undefined if the message is not present on the wire and use an empty instance of the object instead. For recursive types, find the minimal set of fields to initialize as undefined instead of empty.

- Transparently promote some protobuf types, like google.protobuf.Timestamp to a proper Instant type (from js-joda or similar library). This makes a surprisingly large difference on reducing the number of jumps from the UI to the API.
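The Timestamp promotion described above can be sketched in a few lines (a hypothetical helper; a decoded google.protobuf.Timestamp carries seconds plus nanos, and here it is promoted to a plain Date rather than a js-joda Instant):

```javascript
// Hypothetical helper: promote a decoded google.protobuf.Timestamp
// ({ seconds, nanos }) into a native JS Date (millisecond precision,
// so sub-millisecond nanos are truncated).
function timestampToDate(ts) {
  return new Date(Number(ts.seconds) * 1000 + Math.floor(ts.nanos / 1e6));
}

console.log(timestampToDate({ seconds: 1, nanos: 500000000 }).getTime()); // 1500
```

In a generated-code wrapper you'd apply this transparently for every Timestamp-typed field so the UI never sees the raw pair.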

tough · 3 years ago
What about tRPC?
francislavoie · 3 years ago
I would use tRPC if I used TypeScript in the backend. But I use PHP, so it's not viable.
artursapek · 3 years ago
your problem is that you're using PHP
francislavoie · 3 years ago
Bad take. Modern PHP is great.
arein3 · 3 years ago
Why should regular developers use protobuf instead of JSON? You are just making your life harder.

If using compression the size is in the same ballpark (protobuf can be between 20% and 50% smaller). For 99% of users it should not make a difference. https://nilsmagnus.github.io/post/proto-json-sizes/#gzipped-...

Thaxll · 3 years ago
JSON/REST does not declare its schema; it's like comparing statically vs dynamically typed languages.
arein3 · 3 years ago
A subjective opinion, but it's much easier to read some documentation and maybe check an OpenAPI spec than to deal with protobuf.

You also have solutions like GraphQL that define a schema, or you can publish some kind of schema (a good thing to do) but use JSON instead of a binary format.

morelisp · 3 years ago
Protobuf also does not declare its schema. Message parsers can be generated from a schema, but that's also true for REST over JSON. Even ad hoc REST APIs often have better self-declaration of resource types than protobuf.

(I still like protobuf, but the schemas are a terrible reason to like it.)

arriu · 3 years ago
It has everything to do with automatic validation on both sides and little to do with the transfer size.
MrJohz · 3 years ago
But you can do automatic validation fairly easily with JSON Schema. You don't need to choose a binary format to get validation.

The principal benefit is that you can use the schema to define the data format, which means you can pack the data in more tightly (you don't need a byte to say "this is an object" if you know that the input data must be an object at this point). That's a big benefit in certain situations, but if you're using this sort of stuff just to get validation then you're probably better off using JSON Schema and having a wire transfer format that you can read easily without additional tools.
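For reference, a minimal JSON Schema for the kind of validation described above might look like this (the field names are invented):

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["id", "name"],
  "properties": {
    "id": { "type": "integer" },
    "name": { "type": "string" }
  },
  "additionalProperties": false
}
```

The wire format stays plain JSON you can read in the Network tab; the schema is only consulted at validation time.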

arein3 · 3 years ago
Introducing a binary format for payload validation is like shooting yourself in the foot because you have an itch.
onion2k · 3 years ago
> For 99% of users it should not make a difference.

The link you included shows that protobufs are at least 15% better for all users, and as much as 57% better for cases where the data is small. Doesn't that mean for 100% of users it will actually make a difference?

Your users might not care about the difference but it will be there.

arein3 · 3 years ago
Usually when visiting a website, saving a few kilobytes on the client side on requests to backend does not make any difference.
jameshart · 3 years ago
A feature that never ships has value for 0% of users.

Actually realizing that speed-up for your users will take time away from delivering features.

Engineering is a trade off, always will be.

marcosdumay · 3 years ago
> 57% better for cases where the data is small

You don't optimize things for the cases when they are fast. (Unless the gain is a couple of orders of magnitude; certainly not for a 50% speedup.)

The 15% gain is the one that matters. In practice, it comes at the expense of a more complex (thus larger, negating some of it) and less reliable system. It is very rare that this trade-off is worth it.

jtolmar · 3 years ago
You'd also have to compare this against the download size of the protobuf library itself.

Deleted Comment

endtime · 3 years ago
protobuf is much more concise and readable than OAS. You can define API contracts in protobuf and still serve JSON APIs via the standard-ish gRPC/JSON transcoding enabled by google.api annotations.
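That transcoding is driven by HTTP annotations in the proto file itself. A sketch of what the google.api.http option looks like (the service, message, and paths are invented):

```protobuf
syntax = "proto3";

import "google/api/annotations.proto";

service UserService {
  // Exposed as `GET /v1/users/{id}` to plain JSON/HTTP clients,
  // while gRPC clients call the same method natively.
  rpc GetUser(GetUserRequest) returns (User) {
    option (google.api.http) = {
      get: "/v1/users/{id}"
    };
  }
}

message GetUserRequest {
  string id = 1;
}

message User {
  string id = 1;
  string name = 2;
}
```

A transcoding gateway (e.g. Envoy's gRPC-JSON transcoder or grpc-gateway) reads these options and serves both surfaces from the one contract.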
dboreham · 3 years ago
To talk to a server that doesn't speak json.
lolinder · 3 years ago
This only makes sense if you have a server that someone else put together that for some reason only speaks protobuf. I'm not aware of any language ecosystem that has protocol buffers but no json support, so if you're building a server from scratch this isn't a good reason to use protobufs.

And if you are faced with a server that only speaks protobuf, the same question applies to the original devs: why did they make that decision?

arein3 · 3 years ago
For non-niche use cases that is a bad developer experience.

If you are designing your own solution that uses protobuf instead of JSON, say goodbye to a range of useful tools that the whole industry uses. From testing to automation it will be harder at every step, and you will have to find custom solutions instead of the usual no-customization solutions that work OOTB with JSON.

It is a good way to frustrate your developers and generate sometimes brittle solutions related to testing/automation/infrastructure.

Deleted Comment

tekkk · 3 years ago
I am using it for sending data between game server and client. Encoding the messages in JSON would be just silly, although I wonder what is the standard in the game industry.
depr · 3 years ago
Protocol buffers are used in Dark Souls 3, Pokemon GO, Hearthstone and I'm sure many other games.
cdelsolar · 3 years ago
we use it at https://woogles.io for pretty much all communication (server-to-server and client-to-server). I do loathe dealing with the JS aspect of it and am very excited to move over to Protobuf-ES after reading this article (and shaving off a ton of repeated and generated code).
arein3 · 3 years ago
Your case is one of those 1% if you have a real time game where a fraction of a second is important.
soylentgraham · 3 years ago
Large blocks of data. (Eg 10,000 floats)

Otherwise personally json wins

cdelsolar · 3 years ago
nothing to do with the size, but with having robust schemata.
yawnxyz · 3 years ago
I keep trying to understand and use protobuf but every time I look at it and its API (this article included) I get more confused and have absolutely no idea how to implement it.

I can't tell whether I'm just dumb or a really terrible developer, or if the docs or the thing itself is really hard to use?

izacus · 3 years ago
There are a few tricks to make them successful:

1. Your schema is the source of truth.

2. protoc should generate code as part of your build (try not to check in generated code if at all possible).

3. Use generated code to output bytes/parse bytes (this depends on your HTTP/RPC library).

The other trick is that you should use the exact same (!) schema file for your frontend and backend projects. This means that changing it should trigger regeneration of generated code for your clients and servers and then run CI on them.

So if you accidentally introduce a breaking API change, the CI for broken client will fail before you deploy it.
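For point 2, a typical setup checks in only the schema plus a generation config, and regenerates on every build. A sketch of a buf.gen.yaml for Protobuf-ES (output path and options are assumptions):

```yaml
# buf.gen.yaml -- run `buf generate` as a build step instead of
# committing generated code.
version: v1
plugins:
  - plugin: buf.build/bufbuild/es
    out: src/gen
    opt: target=ts
```

Because clients and servers regenerate from the same file, a breaking schema change surfaces as a compile/CI failure rather than a runtime surprise.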

quietbritishjim · 3 years ago
> The other trick is that you should use the exact same (!) schema file for your frontend and backend projects. This means that changing it should trigger regeneration of generated code for your clients and servers and then run CI on them.

You do not need to have the exact same schema file, in fact protobuf is carefully designed to avoid needing this. You need to follow some rules about what to do when fields are added or removed:

* Generally, roll out the server side first then, once that is complete, start rolling out the client afterwards.

* If a field is added (on the server side), make sure that it can be ignored on the client side, so old clients are not impacted. For example, don't add a "units" field that changes the meaning of existing "temperature" field (previously had to be fahrenheit, now can be celsius or fahrenheit). Instead add a separate field "temperature_celsius" and send both. (You can always remove the old one later on the server if new clients don't need it and you have 100% finished roll out of clients.) Note that receiving unexpected field data is not an error in protobuf, so the extra field won't cause any problems so long as it's not a problem at application level.

* You can equally remove a field so long as the client isn't relying on it (in this case you may need to roll out client update first). More accurately (with proto3 syntax) it will appear as empty/zero so this needs to be OK.

* You can't change a field's type e.g. from integer to double (or from one message type to another, but just adding a field to a message according to the above is OK). If you want to do that, go through a controlled process of adding a new field with the new type you want then removing the old field.

* You are free to reorganise the order fields appear in the proto file but don't renumber the fields - the field number is what defines it in the binary encoding. In particular, if you remove field number 2 (for example) you should leave a gap (fields 1, 3, 4,... remaining) rather than renumbering the remaining ones to be contiguous.

Depending on the application, it is often actually a good idea to have a completely separate copy of the proto file in the client and server applications, with the client proto typically lagging behind the server one.
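The renumbering rule in particular is worth illustrating. A sketch (message and fields invented, following the temperature example above) of removing a field safely:

```protobuf
syntax = "proto3";

message Weather {
  // Field 2 used to be `int32 temperature = 2;` (fahrenheit only).
  // It was removed rather than repurposed; `reserved` makes protoc
  // reject any future reuse of the number or name.
  reserved 2;
  reserved "temperature";

  string station_id = 1;
  // Added later with a fresh number; old clients simply ignore it.
  double temperature_celsius = 3;
}
```

The field number, not the field's position in the file, is what appears in the binary encoding, which is why numbers must never be recycled.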

smaye81 · 3 years ago
I can empathize. I was the same way at first. What is it that you find confusing? Perhaps we can help clear it up or link you to helpful documentation (or improve our own docs).
osigurdson · 3 years ago
Maybe what could be added is a debug header when using grpc. If it is present, the proto schema is sent with each request / response. Then the tooling can be enhanced to look for this.

I suspect this would not be much heavier than json so it could be always left on for those who are ok with the overhead.

Win win?

SpaghettiX · 3 years ago
Protobufjs is good, but I can't use it because it's only a protobuf library, not a gRPC library. I end up having to use grpc-web, with all the problems it comes with.

I was hoping Buf could solve that problem... Maybe in the future! :)

conroy · 3 years ago
They already have! Connect (https://github.com/bufbuild/connect-web) is what you're looking for, as it's grpc-web compatible.
f_devd · 3 years ago
The same reason, along with the fact that you had to generate code and usually needed to convert it to a class afterward, was why I wrote my own typescript-native binary serializer[0] (mostly based on the C FFI for compatibility) a few years ago.

[0]: https://github.com/i404788/honeybuf

kamilafsar · 3 years ago
Shameless plug to my project Phero [0]. It’s a bit like gRPC but specifically for full stack TypeScript projects.

It has a minimal API, literally one function, with which you can expose your server's functions. It will generate a type-safe SDK for your frontend(s), packed with all the models you're using. It will also generate a server which will automatically validate input & output to your server.

One thing I’ve seen no other similar solution do is the way we do error handling: throw an error on the server and catch it on the client as if it was a local error.

As I said, it’s only meant for teams who have full stack TypeScript. For teams with polyglot stacks an intermediate like protobuf or GraphQL might make more sense. We generate a TS declaration file instead.

[0] https://github.com/phero-hq/phero

throwthere · 3 years ago
tRPC is another similar library.

https://trpc.io/docs/v10/quickstart

jasperpressplay · 3 years ago
There are some key differences though, one being that you can use plain TypeScript types to define your models, instead of a validation lib like Zod :)