Posted by u/sewen 2 years ago
Show HN: Restate – Low-latency durable workflows for JavaScript/Java, in Rust (restate.dev/...)
We'd love to share our work with you: Restate, a system for workflows-as-code (durable execution), with SDKs in JS/Java/Kotlin and a lightweight runtime built in Rust/Tokio.

https://github.com/restatedev/

https://restate.dev/

It is free and open: the SDKs are MIT-licensed, the runtime is under a permissive BSL (basically just a minimal Amazon defense). We have been working on this for a bit over a year. A few points I think are worth mentioning:

- Restate's runtime is a single binary, self-contained, with no dependencies aside from a durable disk. It contains basically a lightweight, integrated version of a durable log, a workflow state machine, state storage, etc. That makes it very compact and easy to run both on a laptop and on a server.

- Restate implements durable execution not only for workflows; the core building block is durable RPC handlers (or event handlers). It adds a few concepts on top of durable execution, like virtual objects (which turn RPC handlers into virtual actors), durable communication, and durable promises. Here are more details: https://restate.dev/programming-model

- A core design goal for the APIs was to keep a familiar style. An app developer should look at Restate examples and say "hey, that looks quite familiar". You can let us know if that worked out (a small sketch follows below this list).

- Basically every operation (handler invocation, step, ...) goes through a consensus layer, for a high degree of resilience and consistency.

- The lightweight, log-centric architecture still gives Restate good latencies: for example, around 50ms roundtrip (invoke to result) for a 3-step durable workflow handler (Restate on EBS with fsync for every step).
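
To give a feel for the API style, here is a minimal sketch of a durable handler with the TypeScript SDK (the service, handler names, and the chargeCard/sendReceipt stubs are purely illustrative):

```ts
import * as restate from "@restatedev/restate-sdk";

// Placeholder side effects, just for illustration.
async function chargeCard(paymentId: string, amount: number) { /* call payment API */ }
async function sendReceipt(orderId: string) { /* call email API */ }

const payments = restate.service({
  name: "payments",
  handlers: {
    // An ordinary-looking async handler. Each ctx.run step is journaled,
    // so after a crash the handler replays completed steps instead of
    // re-executing them.
    process: async (ctx: restate.Context, order: { id: string; amount: number }) => {
      const paymentId = ctx.rand.uuidv4(); // stable across retries/replays
      await ctx.run("charge", () => chargeCard(paymentId, order.amount));
      await ctx.run("receipt", () => sendReceipt(order.id));
      return { paymentId };
    },
  },
});

restate.endpoint().bind(payments).listen(9080);
```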

We'd love to hear what you think of it!

BenoitP · 2 years ago
For context (because he's too good to brag) OP is among the original creators of Apache Flink.

Question for OP: I'd bet Flink's Stateful Functions come into Restate's story. Could you please comment on this? Maybe Stateful Functions were sort of a plugin, and you guys wanted to rebuild that idea as the core of a distributed functions system?

sewen · 2 years ago
Thank you!

Yes, Flink Stateful Functions were a first experiment to build a system for the use cases we have here. Specifically in Virtual Objects you can see that legacy.

With Stateful Functions, we quickly realized that we needed something built for transactions, while Flink is built for analytics. That manifests in many ways, maybe most obviously in the latency: Transactional durability takes seconds in Flink (checkpoint interval) and milliseconds in Restate.

Also, we could give Restate a very different developer experience, more compatible with modern app development. Flink comes from the data engineering side, with a very different set of integrations, tools, etc.

mikeqq2024 · 2 years ago
Does the efficiency come from the Raft implementation of distributed transactions, or something else?
pavel_pt · 2 years ago
I hope @sewen will expand on this but from the blog post he wrote to announce Restate to the world back in August '23:

> Stateful Functions (in Apache Flink): Our thoughts started a while back, and our early experiments created StateFun. These thoughts and ideas then grew to be much much more now, resulting in Restate. Of course, you can still recognize some of the StateFun roots in Restate.

The full post is at: https://restate.dev/blog/why-we-built-restate/

sewen · 2 years ago
A few links worth sharing here:

- Blog post with an overview of Restate 1.0: https://restate.dev/blog/announcing-restate-1.0-restate-clou...

- Restate docs: https://docs.restate.dev/

- Discord, for anyone who wants to chat interactively: https://discord.com/invite/skW3AZ6uGd

yaj54 · 2 years ago
How do tools like this handle evolving workflows? E.g., if I have a "durable workflow" that sleeps for a month and then performs its next actions, what do I do if I need to change the workflow during that month? I really like the concept, but this seems like an issue for anything except fairly short workflows. If I keep my data and algorithms separate, I can modify my event-handling code while workflows are "active."
p10jkle · 2 years ago
I wrote two blog posts on this! It's a really hard problem

https://restate.dev/blog/solving-durable-executions-immutabi...

https://restate.dev/blog/code-that-sleeps-for-a-month/

The key takeaways:

1. Immutable code platforms (like Lambda) make things much more tractable - old code being executable for 'as long as your handlers run' is the property you need. This can also be achieved in Kubernetes with some clever controllers.

2. The ability to make delayed RPCs and span time that way allows you to make your handlers very short-running, but take action over very long periods. This is much superior to just sleeping over and over in a loop - instead, you do delayed tail calls.
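
Roughly, a delayed tail call looks like this (a sketch with the TypeScript SDK; the delayed-send option shape is an approximation, so check the docs for the exact signature in your SDK version):

```ts
import * as restate from "@restatedev/restate-sdk";

// Stand-in for an external status check.
async function checkStatus(jobId: string): Promise<boolean> {
  return false;
}

const poller = restate.service({
  name: "poller",
  handlers: {
    poll: async (ctx: restate.Context, jobId: string): Promise<void> => {
      const done = await ctx.run("check", () => checkStatus(jobId));
      if (!done) {
        // Instead of sleeping in a loop inside one long-lived invocation,
        // finish now and schedule a fresh invocation of this same handler
        // for tomorrow. Nothing hangs around except a single pending request.
        // (The { delay } option is an assumption; see the SDK docs.)
        ctx.serviceSendClient(poller, { delay: 24 * 60 * 60 * 1000 }).poll(jobId);
      }
    },
  },
});
```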

delusional · 2 years ago
> Immutable code platforms (like Lambda) make things much more tractable

My job is admittedly very old-school, but is that actually doable? I don't think my stakeholders would accept a version of "well, we can't fix this bug for our current customers, but the new ones won't have it". That just seems like chaos nobody wants to deal with.

yaj54 · 2 years ago
ah! this took me a second to grok, but from #2 above: "we just want to send the email service a request that we want to be processed in a month. The thing that hangs around ‘in-flight’ wouldn’t be a journal of a partially-completed workflow, with potentially many steps, but instead a single request message."

I'll have to think through how much that solves, but it's a new insight for me - thanks!

I like that you're working on this. Seems tricky, but figuring out how to clearly write workflows using this pattern could tame a lot of complexity.

rockostrich · 2 years ago
My org solved this problem for our use case (handling travel booking) by versioning workflow runs. Most of our runs are very short-lived, but there are cases where we have a run that lasts for days because of some long-running polling process, e.g. waiting on a human to perform some kind of action.

If we deploy a new version of the workflow, we just keep around the existing deployed version until all of its in-flight runs are completed. Usually this can be done within a few minutes but sometimes we need to wait days.

We don't actually tie service releases 1:1 with the workflow versions just in case we need a hotfix for a given workflow version, but the general pattern has worked very well for our use cases.

p10jkle · 2 years ago
Yeah, this is pretty much exactly how we propose it's done (Restate services are inherently versioned; you can register new code as a new version and old invocations will go to the old version).

The only caveat is that we generally recommend you keep it to just a few minutes, and use delayed calls and our state primitives to have effects that span longer than that. E.g., to poll repeatedly, a handler can delay-call itself over and over, and to wait for a human, we have awakeables (https://docs.restate.dev/develop/ts/awakeables/)

More discussion: https://restate.dev/blog/code-that-sleeps-for-a-month/
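
A rough sketch of the awakeable pattern with the TypeScript SDK (the service and helper names are illustrative):

```ts
import * as restate from "@restatedev/restate-sdk";

// Placeholder: notify the reviewer and hand them the awakeable ID
// (e.g. embedded in an approval link).
async function notifyReviewer(requestId: string, awakeableId: string) {}

const approvals = restate.service({
  name: "approvals",
  handlers: {
    requestApproval: async (ctx: restate.Context, requestId: string) => {
      // An awakeable is a durable promise plus an ID that can be handed
      // to the outside world.
      const approval = ctx.awakeable<boolean>();
      await ctx.run("notify", () => notifyReviewer(requestId, approval.id));
      // The invocation suspends here; days later a human (or another
      // service) resolves the awakeable by its ID and execution resumes.
      const approved = await approval.promise;
      return approved ? "approved" : "rejected";
    },
  },
});
```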

delusional · 2 years ago
Conceptually, I think the only thing these tools add to the mental model of separating data and logic is that they also store the name of the next routine to call. The name is late-bound, so migration would amount to switching out the implementation of that procedure.
pavel_pt · 2 years ago
Restate also stores a deployment version along with other invocation metadata. FaaS platforms like AWS Lambda make it very easy to retain old versions of your code, and Restate will complete a started invocation with the handlers that it started with. This way, you can "drain" older executions while new incoming requests are routed to the latest version.

You still have to ensure that all versions of handler code that may potentially be activated are fully compatible with all persisted state they may be expected to access, but that's not much different from handling rolling deployments in a large system.

p10jkle · 2 years ago
Not necessarily - we store the intermediate states of your handler, so it can be replayed on infrastructure failures. If the handler changes in what it does, those intermediate states (the 'journal') might no longer match the code. The best solution is to route replayed requests to the version of the code that originally executed the request, but: 1. many infra platforms don't allow you to execute previous versions, and 2. after some duration (maybe just minutes), executing old code is dangerous, e.g. because of insecure dependencies.
senorrib · 2 years ago
Looks very interesting, but calling it Open Source is misleading. BSL is not "minimal Amazon defense". It effectively prevents any meaningful dynamic functionality from being built on top of it without a commercial subscription.
stsffap · 2 years ago
We tried to design the additional usage grant (https://github.com/restatedev/restate/blob/39f34753be0e27af8...) to be as permissive as possible. Our intention is only to prevent the big cloud service providers from offering Restate as a managed service, as has happened in the past with other open-source projects. If you still find the additional usage grant too restrictive, then let's talk about how to adjust it to enable your use case while still maintaining our original intention.
senorrib · 2 years ago
Our use case is to allow users to customize workflows based on a few building blocks. Think of an ERP that would allow users to add or remove steps or different paths to their payroll workflow, for example.

The wording in the additional grant labels software like this as an Application Platform Service -- which is fair, and perhaps intended, but we're still not a big cloud service provider.

hintymad · 2 years ago
I'm not sure "In Rust" serve any marketing value. A product's success rarely has to do with the use of a programming language, if not at all. I understand the arguments made by Paul Graham on the effectiveness of programming languages, but specifically for a workflow manager, a user like me cares literally zero about which programming language the workflow system uses even if I have to hack into the internal of the system, and latency really matters a lot less than throughput.
tempaccount420 · 2 years ago
You are free to ignore it. Personally I like to see new projects be made in Rust, because it means they're easier to contribute to than projects in other unmanaged non-GC languages.
threeseed · 2 years ago
Having spent a lot of time recently writing Rust, it's a major negative for me.

It's a terrible language for concurrency and transitive dependencies can cause panics which you often can't recover from.

Which means the entire ecosystem is like sitting on old dynamite waiting to explode.

JVM really has proven itself to be by far the best choice for high-concurrency, back-end applications.

swyx · 2 years ago
it does if it makes HNers click upvote...
bilalq · 2 years ago
Could you share details on limits to be mindful of when designing workflows? Some things I'd love to be able to reference at a glance:

1. Max execution duration of a workflow

2. Max input/output payload size in bytes for a service invocation

3. Max timeout for a service invocation

4. Max number of allowed state transitions in a workflow

5. Max Journal history retention time

stsffap · 2 years ago
1. There is no maximum execution duration for a Restate workflow. Workflows can run for only a few seconds or span months. One thing to keep in mind for long-running workflows is that you might have to evolve the code over their lifetime. That's why we recommend writing them as a sequence of delayed tail calls (https://news.ycombinator.com/item?id=40659687)

2. Restate currently does not impose a strict size limit on input/output messages by default (though it has the option to limit them to protect the system). Nevertheless, it is recommended not to go overboard with input/output sizes, because Restate needs to send the input messages to the service endpoint in order to invoke it. Thus, the larger the input/output sizes, the longer it takes to invoke a service handler and send the result back to the user (increasing latency). Right now we issue a soft warning whenever a message becomes larger than 10 MB.

3. If the user does not specify a timeout for their call to Restate, then the system won't time it out. Of course, for long-running invocations it can happen that the external client fails or its connection gets interrupted. In this case, Restate allows you to re-attach to an ongoing invocation or to retrieve its result if it has completed in the meantime.

4. There is no limit on the max number of state transitions of a workflow in Restate.

5. Restate keeps the journal history around for as long as the invocation/workflow is ongoing. Once the workflow completes, we will drop the journal but keep the completed result for 24 hours.

sewen · 2 years ago
For many of those values, the answer would be "as much as you like", but with awareness of the tradeoffs.

You can store a lot of data in Restate (workflow events, steps). Logged events move quickly to an embedded RocksDB, which is very scalable per node. The architecture is partitioned, and while we have not finished all the multi-node features yet, everything internally is built in a partitioned, scalable manner.

So it is less a question of what the system can do, maybe more what you want:

- if you keep tens of thousands of journal entries, replays might take a bit of time. (Side note: you also don't need that - Restate's support for explicit state gives you an intuitive alternative to the "forever-running infinite journal" workflow pattern some other systems promote; see the sketch at the end of this comment.)

- Execution duration for a workflow is not limited by default. It's more a question of how long you want to keep instances of older versions of the business logic around.

- History retention (we do this only for tasks of the "workflow" type right now) is as much as you are willing to invest in storage. RocksDB is decent at letting old data flow down the LSM tree and not get in the way.

Coming up with the best possible defaults is something we'd appreciate feedback on, so we would love to chat more on Discord: https://discord.gg/skW3AZ6uGd

The only one where I think we need (and have) a hard limit is the message size, because this can adversely affect system stability if you have many handlers with very large messages active. This would eventually need a feature like out-of-band transport for large messages (e.g., through S3).
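
Here is a small sketch of the explicit-state alternative, using a Virtual Object with the TypeScript SDK (the object and handler names are illustrative):

```ts
import * as restate from "@restatedev/restate-sdk";

// Keyed state lives in the Virtual Object's K/V store instead of an
// ever-growing journal; each object key is processed with single-writer
// semantics.
const cart = restate.object({
  name: "cart",
  handlers: {
    addItem: async (ctx: restate.ObjectContext, item: string) => {
      const items = (await ctx.get<string[]>("items")) ?? [];
      items.push(item);
      ctx.set("items", items);
      return items.length;
    },
    checkout: async (ctx: restate.ObjectContext) => {
      const items = (await ctx.get<string[]>("items")) ?? [];
      ctx.clear("items");
      return items;
    },
  },
});
```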

bilalq · 2 years ago
I still haven't gotten around to adopting Restate yet, but it's on the radar. One thing that Step Functions probably has over Restate is the diagram visualization of your state machine definition and execution history. It's been really neat to be able to zero in on a root cause at the conceptual level instead of the implementation level.

One big hangup for me is that there's only a single-node orchestrator as a CDK construct. Having an HA setup would be a must for business-critical flows.

I stumbled on Restate a few months ago and left the following message on their discord.

> I was considering writing a framework that would let you author AWS Step Functions workflows as code in a typesafe way when I stumbled on Restate. This looks really interesting and the blog posts show that the team really understands the problem space.

> My own background in this domain was as an early user of AWS SWF internally at AWS many, many years ago. We were incredibly frustrated by the AWS Flow framework built on top of SWF, so I ended up creating a meta Java framework that let you express workflows as code with true type-safety, arrow function based step delegations, and leveraging Either/Maybe/Promise and other monads for expressiveness. The DX was leaps and bounds better than anything else out at the time. This was back around 2015, I think.

> Fast-forward to today, I'm now running a startup that uses AWS Step Functions. It has some benefits, the most notable being that it's fully serverless. However, the lack of type-safety is incredibly frustrating. An innocent looking change can easily result in States.Runtime errors that cannot be caught and ignore all your catch-error logic. Then, of course, is how ridiculous it feels to write logic in JSON or a JSON-builder using CDK. As if that wasn't bad enough, the pricing is also quite steep. $25 for every million state transitions feels like a lot when you need to create so many extra state transitions for common patterns like sagas, choice branches, etc.

> I'm looking forward to seeing how Restate matures!

p10jkle · 2 years ago
A visualisation/dashboard is a top priority! Distributed architecture (to support multiple nodes for HA and horizontal scaling) is being actively worked on and will land in the coming months
bilalq · 2 years ago
That's exciting!

Out of curiosity, have you explored the possibility of a serverless orchestration layer? That's one of the most appealing parts of Step Functions. We have many large workflows that run just a couple times a day and take several hours alongside a few short workflows that run under a minute and are executed more frequently during peak hours. Step Functions ends up being really cost effective even through many state transitions because most of the time, the orchestrator is idle.

Coming from an existing setup where everything is serverless, the fixed cost to add serverful stuff feels like a lot. For an HA setup, it'd be 3 EC2 instances and 3 NAT gateways spread across 3 AZs. Then multiply that for each environment and dev account, and it ends up being pretty steep. You can cut costs a bit by going single-AZ for non-prod envs, but still...

I couldn't find a pricing model for Restate Cloud, but I'm including "managed services" under the definition of serverless for my purposes. Maybe that offering can fill the gap, but then it does raise security concerns if the orchestration is not happening on our own infra.

aleksiy123 · 2 years ago
Looks really awesome. Always been looking for some easy to use async workflows + cronjobs service to use with serverless like Vercel.

Also something about this area always makes me excited. I guess it must be the thought of having all these tasks just working in the background without having to explicitly manage them.

One question I have: does anyone have experience building data pipelines on this type of architecture?

Does it make sense to fan out to lots of small tasks? Or is it better to batch things into bigger tasks to reduce the overhead?

stsffap · 2 years ago
While Restate is not optimized for analytical workloads, it should be fast enough for simpler ones. Admittedly, it currently lacks a fluent API to express a dataflow graph, but that is something that can be added on top of the existing APIs. As @gvdongen mentioned, a scatter-gather-like pattern can easily be expressed with Restate.

Regarding whether to parallelize or to batch, I think this strongly depends on what the actual operation involves. If it involves some CPU-intensive work like model inference, for example, then running more parallel tasks will probably speed things up.

gvdongen · 2 years ago
Here is a fan-out example for async tasks: https://docs.restate.dev/use-cases/async-tasks#parallelizing... First, a number of tasks are scheduled, and then their results are collected (fan-in). This probably comes closest to what you are looking for. Each of those tasks gets executed durably, and their execution is tracked by Restate.
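
Roughly, the pattern looks like this with the TypeScript SDK (a sketch; the service names and the summing logic are illustrative):

```ts
import * as restate from "@restatedev/restate-sdk";

// Worker service that handles one sub-task.
const worker = restate.service({
  name: "worker",
  handlers: {
    process: async (ctx: restate.Context, chunk: string) => chunk.length,
  },
});

// Coordinator that fans out to the worker and gathers the results.
const coordinator = restate.service({
  name: "coordinator",
  handlers: {
    run: async (ctx: restate.Context, chunks: string[]) => {
      // Fan-out: kick off one durable call per chunk.
      const calls = chunks.map((chunk) => ctx.serviceClient(worker).process(chunk));
      // Fan-in: collect the results. Each call is tracked by Restate,
      // so a crash of the coordinator doesn't lose completed work.
      const results: number[] = [];
      for (const call of calls) {
        results.push(await call);
      }
      return results.reduce((a, b) => a + b, 0);
    },
  },
});

restate.endpoint().bind(worker).bind(coordinator).listen(9080);
```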