twiss · 4 years ago
> Whether it’s because I was wrong, or failed to make the argument [for HTTP trailers support], I strongly suspect organizational boundaries had a substantial effect. The Area Tech Leads of Cloud also failed to convince their peers in Chrome, and as a result, trailers were ripped out [from the WHATWG fetch specification].

FWIW, I personally think it's a good thing that other teams within Google don't have too much of an "advantage" for getting features into Chrome compared to other web developers. However, I also think it's very unfortunate that a single Chrome engineer gets to decide not only that a feature shouldn't be implemented in Chromium, but also, in effect, that it gets removed from the specification. (The linked issue [1] was also opened by a Google employee.)

Of course, you might reasonably argue that, without consensus among the browsers to implement a feature, having it in the spec is useless. But nevertheless, with Chromium being an open source project, I think it would be better if it had a more democratic process of deciding which features should be supported (without, of course, requiring Google specifically to implement them, but also without, ideally, giving Google the power to veto them).

[1]: https://github.com/whatwg/fetch/issues/772

modeless · 4 years ago
It's clear that the "single engineer" thing is a lie. Many engineers commented on the Chrome issue with opposing viewpoints, and even the original post describes it being escalated to tech leads on both sides, getting more people involved. I guarantee if it was only one person standing alone opposed to trailers then they would have been overruled. As you say, it's a good thing that Chrome resists adding the pet features of every other Google team to the web.
twiss · 4 years ago
Sure, that's fair enough. But I'm not sure if characterizing this feature as a pet feature of the gRPC team is accurate either - after all, it's simply exposing an HTTP 1.1 & H2 feature, and it was already in the WHATWG fetch spec. There, the security concerns were apparently discussed as well [1], and adding the trailer headers as a separate object was deemed safe. I haven't read the entire discussion and don't have a vested interest in it, but the WHATWG spec seems like a better place to have this discussion, and come to a conclusion, than the Chromium issue tracker.

Apparently, there is a new issue for it, so that might yet happen: [2].

[1]: https://github.com/whatwg/fetch/issues/34

[2]: https://github.com/whatwg/fetch/issues/981

guenthert · 4 years ago
Jeebus. Just because it's not true doesn't mean it's a lie.
sneak · 4 years ago
Why should the decisionmaking process for Chromium be “democratic” simply because it is open source?

Anyone who wants to pay can implement whatever they want in the codebase. That’s in a way as democratic as it gets: equality of opportunity [to invest money and time].

If Google is paying for the implementors’ time, Google should have 100% say in what code they write. You and everyone else are free (thanks to Google’s generosity) to fork it at any point in the commit history and individually veto any specific change.

twiss · 4 years ago
> Anyone who wants to pay can implement whatever they want in the codebase. That’s in a way as democratic as it gets: equality of opportunity [to invest money and time].

Leaving aside whether that's how it should work, I'm not sure if that's in fact how it works for Chromium today. If I write a high-quality patch adding support for trailers, will it get accepted? As I understand it, the answer is no. (But I would be happy to be wrong.)

So that's my main point: it would be good to have a democratic decision making process, not for what code Googlers should write, but for what patches would get accepted into Chromium. Not just because it's open source, but also because it's the basis not just of Google's browser, but of a bunch of other browsers as well.

(And note that https://www.chromium.org/ seemingly aims to give the project an air of independence from Google. Thus, I'm merely questioning whether it is, in fact, independent, and arguing that it should be, if it isn't.)

Brian_K_White · 4 years ago
The democratic process is that anyone who wants to pay for the ad campaign can try to convince everyone else that it's a good spec everyone should adopt, not merely pay a developer to code it.

If everyone else is not convinced then it should not become a thing no matter how much one party with money wants it.

fijiaarone · 4 years ago
I wanted my road to swoop up and down like a roller coaster across the gorge but a single structural engineer on the bridge team overruled my obvious benefit.
joe_guy · 4 years ago
I had never heard of HTTP trailers, so FYI:

> The Trailer response header allows the sender to include additional fields at the end of chunked messages in order to supply metadata that might be dynamically generated while the message body is sent, such as a message integrity check, digital signature, or post-processing status.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Tr...
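To make that concrete, here's a rough sketch of what a chunked response body carrying a trailer looks like on the wire, plus a minimal parser. (The `X-Checksum` trailer name and the values are made up for illustration.)

```python
import hashlib

# A hypothetical chunked HTTP/1.1 response body: two data chunks, then the
# zero-length chunk, then a trailer field (a checksum that could only be
# computed after the body was streamed), then the final blank line.
checksum = hashlib.md5(b"Hello, world").hexdigest().encode()
raw = (
    b"5\r\nHello\r\n"
    b"7\r\n, world\r\n"
    b"0\r\n"
    b"X-Checksum: " + checksum + b"\r\n"
    b"\r\n"
)

def parse_chunked(data: bytes):
    """Parse a chunked body; return (payload, trailers)."""
    body, trailers, pos = b"", {}, 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore any chunk extensions after ";" when reading the size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip chunk data plus its trailing CRLF
    # Everything after the zero-size chunk is trailer fields.
    for line in data[pos:].split(b"\r\n"):
        if b":" in line:
            name, _, value = line.partition(b":")
            trailers[name.strip().decode()] = value.strip().decode()
    return body, trailers

payload, trailers = parse_chunked(raw)
# The trailer arrives only after the full payload has been streamed.
```

The point is that the sender commits to the body before it can know the checksum; the trailer is the only standard HTTP/1.1 place to put that late metadata.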

lloeki · 4 years ago
I suppose even fewer people have heard that Transfer-Encoding: chunked supports chunk extensions, which allows one to supply arbitrary metadata without trailers.

https://datatracker.ietf.org/doc/html/rfc2616#section-3.6.1

Ever gone to some site that generates compressed downloads or database exports on the fly, gotten no progress bar as a result, and been severely annoyed by the lack of feedback? I was, so I wrote a draft that uses chunk extensions to emit progress information dynamically:

https://datatracker.ietf.org/doc/html/draft-lnageleisen-http...

As noted at the end of the draft, this could be generalized and extended to add capabilities such as in-flight integrity checks, or whatever else you can think of.
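For reference, a chunk-size line with extensions looks like `1000;name=value` (RFC 2616 §3.6.1). Here's a tiny parser sketch; the `progress` extension name is illustrative, not necessarily what the draft specifies:

```python
def parse_chunk_size_line(line: str):
    """Split an RFC 2616 chunk-size line into (size, extensions).

    Grammar: chunk-size *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) CRLF
    """
    parts = line.strip().split(";")
    size = int(parts[0], 16)  # chunk size is hexadecimal
    exts = {}
    for ext in parts[1:]:
        name, _, value = ext.strip().partition("=")
        # Extensions without a value map to None; quoted values are unquoted.
        exts[name] = value.strip('"') or None
    return size, exts

# A hypothetical in-band progress annotation on a chunk:
size, exts = parse_chunk_size_line("1000;progress=25\r\n")
```

Since the extension rides on each chunk header rather than at the end of the message, a client can surface it while the transfer is still in flight — exactly what a progress bar needs.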

k__ · 4 years ago
Haven't headers and trailers been renamed in the recent past?
akshayshah · 4 years ago
Yes, slightly - RFC 9110 ("HTTP Semantics") calls them "header fields" and "trailer fields," and it calls "headers" and "trailers" colloquialisms. In a nod to gRPC-style usage, the section on trailer fields even says, "Trailer fields can be useful for supplying...post-processing status information."

https://www.rfc-editor.org/rfc/rfc9110.html#header.fields

https://www.rfc-editor.org/rfc/rfc9110.html#trailer.fields

thamer · 4 years ago
A few years ago I worked on a service that had to stream data out using protobuf messages, in a single request that could potentially transfer several gigabytes of data. At the HTTP level it was chunked, but above that I used a protobuf message that contained data plus a checksum of that data, with the last message of the stream containing no data but a checksum of the entire dataset (a flag was included to differentiate between the message types).

This simple design led us to find several bugs in clients of this API (e.g. messages dropped or processed twice), and gave us a way to avoid some of the issues mentioned in this article. Even if you don't use HTTP trailers, you can still use them one layer above and benefit from similar guarantees.
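A sketch of that scheme, with plain length-prefixed frames standing in for the protobuf messages (the flag/length layout here is illustrative, not the actual wire format that service used): each data frame carries its own checksum, and a final frame carries a checksum of the whole dataset.

```python
import hashlib
import struct

def frames(chunks):
    """Frame each chunk as: flag byte (0=data, 1=final), 4-byte length,
    payload, then a SHA-256 digest — of the payload for data frames,
    of the entire dataset for the final frame."""
    total = hashlib.sha256()
    for chunk in chunks:
        total.update(chunk)
        yield struct.pack(">BI", 0, len(chunk)) + chunk + hashlib.sha256(chunk).digest()
    yield struct.pack(">BI", 1, 0) + total.digest()

def read_stream(data: bytes) -> bytes:
    """Reassemble a framed stream, verifying every checksum along the way."""
    out, total, pos = b"", hashlib.sha256(), 0
    while pos < len(data):
        flag, length = struct.unpack_from(">BI", data, pos)
        pos += 5
        payload = data[pos:pos + length]
        digest = data[pos + length:pos + length + 32]
        pos += length + 32
        if flag == 1:
            assert digest == total.digest(), "dataset checksum mismatch"
            return out
        assert digest == hashlib.sha256(payload).digest(), "chunk checksum mismatch"
        total.update(payload)
        out += payload
    raise ValueError("stream ended without a final checksum frame")

data = b"".join(frames([b"part one, ", b"part two"]))
result = read_stream(data)
```

A truncated stream fails loudly (no final frame), which is precisely the class of silent-truncation bug the parent comment describes catching in clients.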

vidarh · 4 years ago
Inserting metadata in the protobuf itself seems like the obvious, simple solution to avoid having to depend on what the transport layer supports. Just defining a message to provide the metadata they wanted to insert in trailers would have avoided a whole lot of pain.
remram · 4 years ago
> As an aside, HTTP/2 is technically superior to WebSockets. HTTP/2 keeps the semantics of the web, while WS does not.

WTF is this? Those are different layer protocols. WebSocket can run on top of HTTP/2.

It's like saying TLS is technically superior to TCP, or IP is superior to copper cables.

Reference: https://www.rfc-editor.org/rfc/rfc8441.html

Matthias247 · 4 years ago
The idea here is that HTTP provides something like request/response semantics, methods, paths, status codes, etc. - which are all also useful for gRPC. WebSockets provide none of that - they are just message streams.

WebSockets over HTTP/2 are a new thing, and weren't even available at the time gRPC was conceived.

the_mitsuhiko · 4 years ago
I'm not sure websockets over HTTP/2 are actually a new thing. Firefox implemented support years ago, but it was disabled almost immediately afterwards because it doesn't work with proxies, and it remains disabled. I think the only engine implementing it is Chrome.

As far as I know for HTTP/3 there is no way to use websockets yet.

stefan_ · 4 years ago
That is their whole point. That is why they exist. Dumb reliable pipe, please keep your silly semantics away.
rektide · 4 years ago
For sure there are good reasons to abandon or find alternative resourceful protocols!

But in general, http is & could be the de-facto really good resourceful protocol. It's already 90% there.

Alas, the browser has been a major point of obstruction & difficulty & tension in making the most obvious, most successful, most clearly winning resourceful protocol at all better. The browser has sat on its haunches & pissed around & prevented obvious & straightforward incremental growth that has happened everywhere else except the browser, such as with http trailers, such as with http2+ push. The browser has kept the de-facto resourceful protocol from developing.

You don't have to believe in http as the way to see what an oppressive & stupid shitshow this is. Being able to make the de-facto protocol of the web better should be in everyone's interest, in a way that doesn't prevent alternatives/offshoots. But right now only alternatives & offshoots have any traction, because the browsers have all shot down & rejected doing anything to support modern http 2+ in any real capacity. Their http implementations are all frozen in time.

jayd16 · 4 years ago
HTTP/2 provides features that websockets don't. Even if you were to use websockets over HTTP/2, you'd lose features like being able to multiplex requests _because_ it's a higher-level protocol. Why is it wrong to say it's better to use a more featureful, lower-level protocol?
Thorrez · 4 years ago
They provide different APIs. Websockets provide a bidirectional stream of messages. The messages in a given direction are always delivered in order. If they were to be suddenly reordered that would cause a lot of headaches.

While the individual messages can't be multiplexed, different websocket streams over a single HTTP/2 connection can be multiplexed.

I think websockets also provide a feature that HTTP/2 doesn't: the ability to easily push data from the server to browser javascript.

funny_falcon · 4 years ago
TCP can't multiplex. HTTP/2 runs over TCP and does multiplexing.

WebSocket can't multiplex. Nothing prevents gRPC over WebSocket from implementing multiplexing itself.

remram · 4 years ago
Again, you're saying fiber optics provides features that telephony doesn't. The fact that telephony usually runs over copper cables, and telephony over fiber optics is a recent thing, doesn't change the fact that the comparison makes no sense.

Telephony (websockets) runs over the copper cables or fiber optics (HTTP/1 or HTTP/2).

anyfoo · 4 years ago
I’m not a web developer, but that RFC, which talks about bootstrapping, talks about using the CONNECT method to “transition” to the WebSockets protocol. Which matches what I thought the CONNECT method does: Switch to a protocol that is not HTTP?

But I only skimmed the introduction, did I miss something?

blibble · 4 years ago
normally uses GET and the Upgrade header, not CONNECT
kevinmgranger · 4 years ago
That's in a section specifically about picking the right transport. Per your example, it's like saying "TLS is technically superior to TCP, because it means our protocol can offload encryption and authentication to it".

jjtheblunt · 4 years ago
> WebSocket can run on top of HTTP/2

Isn't a "websocket" just a standard TCP socket, instantiated via a comparatively ephemeral HTTP (of whatever version) request and outliving that request - so it isn't on top of anything other than TCP?

wmf · 4 years ago
No, despite the name WebSockets are not plain TCP.
aaaaaaaaaaab · 4 years ago
TCP is a stream of bytes. WebSockets are message-based.
chucky_z · 4 years ago
From my perspective, I think the biggest issue with gRPC is its use of HTTP/2. I understand that there are a lot of reasons to say "No, HTTP/2 is far superior to HTTP/1.1." However, in terms of proxying _outside Google_, HTTP/2 has lagged, and continues to lag, at the L7 proxy layer. I recently performed a lot of high-throughput proxying tests comparing HAProxy, Traefik, and Envoy. HTTP/1.1 outperformed HTTP/2 (even h2c) by a pretty fair margin - enough that if gRPC used HTTP/1.1 we could use noticeably less hardware. I could see this holding true even with a service mesh.
thayne · 4 years ago
Also, http/2 over cleartext is not well supported by a lot of tooling - which is probably a good thing when going over the open internet, but it means you have to deal with setting up certificates even when just developing locally, and it makes HTTP/2 more difficult to use for IPC on a single host.
mort96 · 4 years ago
My preferred setup is to have an unencrypted service running on 127.0.0.1 (so not publicly available), and then have nginx in front to handle certificates. This lets me do all certificate stuff across all virtual hosts in one place. HTTP/2 makes this impossible due to its ridiculous TLS requirement, so I, and everyone who does it the way I do, must keep using HTTP/1.1 forever.

It's my belief that requiring TLS for HTTP/2 is what killed the protocol. It just causes too much friction during both development and deployment, for little to no (or negative) performance gain.
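For reference, the setup described above looks roughly like this in nginx (hostnames and certificate paths are illustrative):

```nginx
# nginx terminates TLS and speaks HTTP/2 to clients; the backend stays
# plain, unencrypted HTTP/1.1 on loopback.
server {
    listen 443 ssl http2;
    server_name example.com;

    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8080;   # local service, no TLS
        proxy_http_version 1.1;             # nginx proxies upstream over HTTP/1.x
    }
}
```

Note the upstream hop is HTTP/1.1: nginx does not proxy to backends over HTTP/2, which is part of why the backend cannot benefit from HTTP/2 features in this arrangement.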

IshKebab · 4 years ago
I agree. HTTP/2 is a huge requirement to push everywhere you want to use RPC. Want to do RPC to a microcontroller? Tough luck. Want to make RPC calls from a web page? Have fun figuring out the two incompatible gRPC-web systems, setting up complicated proxies, and finding a gRPC library that actually supports them fully.

Thrift has a much more sane design where everything is pluggable including the transport layer.

Bit of a shame that Thrift never became more popular.

arriu · 4 years ago
Yup, and this is also why many end up proxying gRPC over HTTP/1.1 after giving up on making HTTP/2 work with systems that don’t support it…

Matthias247 · 4 years ago
> In this flow, what was the length of the /data resource? Since we don’t have a Content-Length, we are not sure the entire response came back. If the connection was closed, does it mean it succeeded or failed? We aren’t sure.

I don’t get that argument. gRPC uses length-prefixed protobuf messages. It is obvious to the peer whether a complete message (inside a stream or a single response) has been received - with or without trailers.

The only thing trailer support adds is the ability to send an additional late response code. That could also have been added without trailers: just put another length-prefixed block inside the body stream, with a flag in front that differentiates trailers from messages. Essentially protobuf (application messages) in protobuf (definition of the response body stream).

I assume someone thought trailers would be a neat fit, since they're already part of the spec and can do the job. But the bet didn’t work out, since browsers and most other HTTP libraries didn’t find them worthwhile enough to fully support.
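A minimal sketch of that alternative: reuse gRPC's length-prefixed framing (flag byte, 4-byte big-endian length, payload) and let one bit of the flag byte distinguish an in-band trailer block from a message. I believe this is essentially what gRPC-Web later did; the exact bit and payload contents here are illustrative.

```python
import struct

TRAILER_BIT = 0x80  # high bit of the flag byte marks a trailer frame

def frame(payload: bytes, trailer: bool = False) -> bytes:
    """Wrap a payload as: flag byte + 4-byte big-endian length + payload."""
    flag = TRAILER_BIT if trailer else 0
    return struct.pack(">BI", flag, len(payload)) + payload

def deframe(data: bytes):
    """Yield (is_trailer, payload) pairs from a response body."""
    pos = 0
    while pos < len(data):
        flag, length = struct.unpack_from(">BI", data, pos)
        pos += 5
        yield bool(flag & TRAILER_BIT), data[pos:pos + length]
        pos += length

# Two messages followed by an in-band "trailer" carrying the late status:
body = (
    frame(b"message 1")
    + frame(b"message 2")
    + frame(b"grpc-status: 0", trailer=True)
)
decoded = list(deframe(body))
```

With this design, the late status code travels inside the body itself, so no HTTP-level trailer support is needed from browsers or intermediaries.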

afc · 4 years ago
He offers two facts that I think explain this well enough:

> Additionally, they chose to keep Protobuf as the default wire format, but allow other encodings too.

And:

> Since streaming is a primary feature of gRPC, we often will not know the length of the response ahead of time.

These make sense; you'd enable servers to start streaming back responses as they generate them, before the length of the response can be known. Not requiring servers to buffer the entire response can have a drastic latency and memory/performance benefit for large responses.

Thorrez · 4 years ago
This doesn't match what I see in the gRPC spec. It says every message must be length-prefixed.

https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2....

Disclaimer: I don't know much about gRPC.

zimpenfish · 4 years ago
> It is obvious for the peer if a complete message (inside a stream or single response) is received

If I'm reading [1] correctly, you can't distinguish between [repeated element X is empty] and [message truncated before repeated element X was received] because "A packed repeated field containing zero elements does not appear in the encoded message." You'd need X to be the last part of the message, but even that isn't guaranteed, because "When a message is serialized, there is no guaranteed order [...] parsers must be able to parse fields in any order".

[1] https://developers.google.com/protocol-buffers/docs/encoding...
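A hand-rolled illustration of that ambiguity, using a minimal subset of the protobuf wire format (no protobuf library needed; field numbers and contents are made up):

```python
def tag(field_no: int, wire_type: int) -> bytes:
    """Encode a protobuf field tag (valid for field numbers < 16)."""
    return bytes([field_no << 3 | wire_type])

def len_delimited(field_no: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2, short payloads only)."""
    return tag(field_no, 2) + bytes([len(payload)]) + payload

# Field 1: a string; field 2: a packed repeated int (also wire type 2).
full      = len_delimited(1, b"hi") + len_delimited(2, bytes([1, 2, 3]))
truncated = full[:4]                  # stream cut off just before field 2
empty     = len_delimited(1, b"hi")   # field 2 legitimately empty

# The truncated stream is byte-for-byte identical to a valid message
# whose repeated field is empty - a parser cannot tell them apart.
```

This is exactly why the end of a protobuf message must be signalled by the surrounding protocol (a length prefix, a connection close, etc.) rather than by the encoding itself.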

Thorrez · 4 years ago
Yes, the protobuf format makes the end ambiguous, meaning the end needs to be indicated by the protocol containing the protobuf.

But it looks to me like the gRPC spec says that everything must be prefixed by a length at the gRPC layer. So then it doesn't matter that protobuf doesn't internally indicate the end, since the gRPC transport will indicate the end.

https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2....

Disclaimer: I don't know much about gRPC.

alexcpn · 4 years ago
I use gRPC between microservices instead of REST, and for that it is really great. The deficiencies of REST - no versioning, no typing - go away with gRPC, and the protobuf becomes the official interface for all microservices. No problems with this approach for over two years now, and the multi-language support is great too - we have Go, Java, Python, and TypeScript microservices happily talking and getting new features and new interface methods. Maybe it met its demise in the web space, but it's a jewel in the microservice space.
marcyb5st · 4 years ago
This is more or less what stubby is/was for Google and so the original driving force in implementing it. Now, if you add a catch all service that translates the requests from the outside to Protobuffers and then forwards the translated requests to the correct service you have a GFE (Google Front-End) equivalent.

Should you do it? Probably not, as it's not just a dumb translation layer - it's extremely complex (e.g. it needs to support streams, which is non-trivial in this situation). For Google it's worth it, because this way you only have to handle protobuffers beyond the GFE layer.

withinboredom · 4 years ago
With gRPC, you lose the ability to introspect the data on the wire. You lose the ability to create optimized data formats for YOUR application (who said you have to use JSON?). Most people can't implement REST correctly, so it has been a shitshow for the last 20 or so years. gRPC isn't a magic bullet; it just forces you to solve problems (or helps you solve them) that you should have been solving in the first place. You can do all of these things without gRPC - there is no power it grants you that can't be done just as well, or better, in your own libraries and specs.
ackfoobar · 4 years ago
I suppose you mean "inspect" the data on-the-wire

https://grpc.io/blog/wireshark/

Wireshark can load proto files and decode the data for you.

BTW, "The Internet is running in debug mode".

rswail · 4 years ago
Personal opinion: RPC is a failed architectural style, independent of what serialization/marshalling of arguments is used. It failed with CORBA, it failed with ONC-RPC, it failed with Java RMI.

Remote Procedure Calls attempt to abstract away the networked nature of the function and make it "look like" a local function call. That's Just Wrong. When two networked services are communicating, the network must be considered.

REST relies on the media type, links and the limited verb set to define the resource and the state transfer operations to change the state of the resource.

HTTP explicitly incorporates the networked nature of the server/client relationship, independent of, and irrespective of, the underlying server or client implementation.

Media types, separated from the HTTP networking, define the format and serialization of the resource representation independent of the network.

HTTP/REST doesn't really support streaming.

akshayshah · 4 years ago
That's true of CORBA, for sure. I'm not familiar with ONC-RPC or Java RMI.

It's not true of gRPC. It's not "RPC" in any traditional sense - it's just a particular HTTP convention, and the clients reflect that. They're asynchronous, make deadlines and transport errors first-class concepts, and make it easy to work with HTTP headers (and trailers, as the article explains). Calling a gRPC API with a generated client often doesn't feel too different from using a client library for a REST API.

It's definitely a verb-oriented style, as opposed to REST's noun orientation. That's sometimes a plus, and sometimes a minus; it's the same "Kingdom of Nouns" debate [0] that's been going on about Java-style OOP for years.

0: http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom...

rswail · 4 years ago
The verb-oriented style is part of the problem. Too many verbs is the problem. Java/OOP problems are not the same as the REST style, which is entirely about Nouns. There's none of the Java ManagerFactoryManager problems.

The generated client from an IDL that wraps the network protocol with a function call is also part of the problem.

REST APIs that have function calls that aren't "Send Request and wait for Response" aren't REST. ("wait for response" doesn't imply synchronous implementation, but HTTP is a request/response oriented protocol).

atombender · 4 years ago
CORBA and RMI are quite different from gRPC. (I have not used ONC-RPC.)

Both of those are explicitly centered on objects and location transparency. The idea is that you get back references to objects, not mere copies of data. Those references act as if they're local, but the method calls actually translate into RPC calls, as the local reference is just a "stub". Objects can also contain references, meaning that you are working on entire object graphs, which can of course be cyclical, too.

These technologies (as well as Microsoft's DCOM) failed for many reasons, but it was in part because pretending remote objects are local leads to awful performance. Pretty magical and neat, but not fast. I built a whole clustered system on DCOM back in the late 1990s, and it was rather amazing, but we were also careful to not fall into the traps. One of the annoying bugs you can create is to accidentally hold on to a reference for too long in the client (by mis-implementing ref counting, for example); as long as you have a connection open to the server, this creates a server memory leak, because the server has to keep the object alive for as long as clients have references to them.

Ultimately, gRPC and other "simple" RPC technologies like Thrift are much easier to reason about precisely because they don't do this. An RPC call is just passing data as input and getting data back as output. It maps exactly to an HTTP request and response.

As for REST, almost nobody actually implements REST as originally envisioned, and APIs today are merely "RESTful", which just means they try to use HTTP verbs as intended, and represent URLs paths that map as cleanly to the nouns as possible. But I would argue that this is just RPC. Without the resource-orientation and self-describability that comes with REST, you're just doing RPC without calling it RPC.

I don't believe in REST myself (and very few people appear to, otherwise we'd have actual APIs), so I lament the fact that we haven't been able to figure out a standard RPC mechanism for the web yet. gRPC is great between non-browser programs, mind you.

rswail · 4 years ago
I don't "believe" in REST, except that when you do it "properly" with media types and links, and proper thought about the resources you identify and what the state transitions are, it all works very nicely as a request/response API style.

The difference is during the design phase, where you focus on those resources and their state, instead of the process for changing that state.