madan · 5 years ago
Hey all, one of the authors here. I wanted to provide some more context on the questions asked here.

- Most of the tech described in the article is something you will have to do whether you choose gRPC or websockets. Most of the tech was about sharding and maintaining sticky connections and load balancing those connections across servers based on load.

- The heartbeat introduced on top of HTTP was needed to detect broken connections faster and recover quickly, as some of the payloads were very latency sensitive. Note that these connections are not 1:1 connections but have multiple hops across low-bandwidth mobile networks.

- When we first developed this in 2014, most WebSocket libraries would fall back to long polling when network connections were unstable. We explicitly moved away from long polling. Since server->mobile payloads make up the larger share of the volume, we settled on SSE. Netty and its libraries provided most of the SSE implementation out of the box.
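
For readers who want to see the heartbeat idea concretely, here is a minimal client-side sketch in Python. It is not RAMEN's code: the `heartbeat` event name, the `HeartbeatMonitor` class, and the 7-second timeout used below are all assumptions made for illustration. It parses a `text/event-stream` and flags the connection as stale when nothing has arrived within the timeout.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class Event:
    event: str = "message"
    data: str = ""
    id: Optional[str] = None


def parse_sse(lines: Iterable[str]) -> Iterator[Event]:
    """Parse text/event-stream lines; a blank line dispatches the event."""
    event, data, last_id = "message", [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # events without any data field are not dispatched
                yield Event(event, "\n".join(data), last_id)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)
        elif name == "id":
            last_id = value  # the last event id persists across events


class HeartbeatMonitor:
    """Hypothetical helper: tracks when the stream last showed signs of life."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen = clock()

    def touch(self) -> None:
        """Call on every received event, heartbeat or payload."""
        self._last_seen = self._clock()

    def is_stale(self) -> bool:
        """True when no event arrived within the timeout; time to reconnect."""
        return self._clock() - self._last_seen > self.timeout_s
```

A client loop would call `touch()` for each event yielded by `parse_sse` and tear down and reopen the connection once `is_stale()` returns true, instead of waiting for the TCP stack to notice the broken link.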

lsb · 5 years ago
At a certain mobile security shop ;) there was a push service rollout, back in 2013, and it was basically pings over the network, and mailboxes in Cassandra, no Redis layer atop. The volume was much smaller (~1 Hz), so it's interesting to see how the ecosystem has evolved, and how the volume requirements change the shape of the system.
madan · 5 years ago
Good times!
perfectstorm · 5 years ago
On iOS, what happens if the user declines to allow notifications? Can you send silent notifications even when the user declines permission?
samat · 5 years ago
Yes, except when the user disables 'background app refresh'. https://stackoverflow.com/questions/30644343/is-silent-remot...
madan · 5 years ago
This push is not a native push like FCM/APNs that shows notifications to users. These are payloads that update the app (e.g., the location of the cars on the map, or the ETA) and never show any notification to the user.
pjmlp · 5 years ago
> This first generation of RAMEN server was written in Node.js using Uber’s in-house consistent hashing/sharding framework called “Ringpop.”

> ...

> Additionally, Node.js workers were single threaded and would have elevated levels of event loop lag resulting in a further delay in the convergence of membership information. These issues could result in topology information that is inconsistent and lead to message loss, timeouts, and errors.

> In early 2017, we decided that it was time for a reboot of the server implementation of RAMEN protocol to continue to scale. For this iteration we used the following technologies: Netty, Apache Zookeeper, Apache Helix, Redis and Apache Cassandra.

Yet another example of why starting with Java, .NET or similar stacks avoids rewrites.
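
For context on the consistent hashing mentioned in the quote, here is a minimal sketch in Python. It is not Ringpop's API; `HashRing`, its methods, and the `vnodes` parameter are hypothetical names chosen for illustration. Each node owns several points on a ring, and a key is routed to the first point at or after the key's hash.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 64):
        self._vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Several points per node smooth out the key distribution.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        """Return the node owning the first point at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring
```

Removing a node only remaps the keys that node owned, which is what keeps connections sticky. The flip side is that two servers holding different membership lists can route the same key to different owners, which is the kind of inconsistency the quoted passage describes.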

me551ah · 5 years ago
I don't get the point of this article. What they achieved with the RAMEN service can easily be achieved by anyone using gRPC. At the very end of the article they even talk of a version 2 of their service which uses gRPC and seems promising.

Realtime push technically exists in every chat app on your mobile phone, so it's not a great technical feat either. Am I missing something?

malandrew · 5 years ago
This was published today, but these technical decisions appear to be as old as 2014 and there were other considerations that madan commented on here. gRPC was definitely far from mature up until like 2017 or so.
xiwenc · 5 years ago
Is it just me, or have they overestimated the problem?

For Uber’s use case I don’t see a reason to build a system that requires global scale. Users usually reside in the same geographic area. Why build this massive system that works for everyone instead of building smaller decentralized edge systems that do slower syncs in the background?

Very nice read nonetheless.

bavell · 5 years ago
Resume-driven development?
bluesign · 5 years ago
Ramen v3 prediction:

- we separated payload from notification, now client is pulling payload on notification.

mooted1 · 5 years ago
as someone that has complained about a lot of uber's resume driven development (see comments), I can attest that this system is

a.) necessary, and saves the company millions

b.) a simplification over the ... unusually ... architected system that came before and did not scale.

there's a lot of junk technology at many large tech companies, but scaling problems /are/ real when you have hundreds of millions of users and have demanding performance and reliability requirements. RAMEN is a differentiated and necessary part of Uber's infrastructure, even if that can't be said of a lot of uber eng blog posts.

technicolorwhat · 5 years ago
What would be the second-best way to cover this use case with minimal effort instead of doing it from scratch? For, let's say, Amsterdam-sized cab hailing?
atonse · 5 years ago
I didn’t see them mention why they didn’t go with the more obvious websocket for the actual last mile communication with the device, since they wanted it to be bidirectional.

My guess is that doing this in Java is a huge pain, whereas it’s practically a breeze in something like Elixir.

ofrzeta · 5 years ago
Using websockets with Java surely is painless, for instance with Vert.x