We provide everything needed to create production-grade agents in your codebase and deploy, run, monitor, and debug them. You can use just our primitives or combine them with tools like Mastra, LangChain, and the Vercel AI SDK. You can self-host or use our cloud, where we take care of scaling for you. Here’s a quick demo: (https://youtu.be/kFCzKE89LD8).
We started in 2023 as a way to reliably run async background jobs/workflows in TypeScript (https://news.ycombinator.com/item?id=34610686). Initially we didn’t deploy your code; we just orchestrated it. But we found that most developers struggled to write reliable code under implicit determinism constraints, found it tricky to break their work into small “steps”, and wanted to install whatever system packages they needed. Serverless timeouts made this even more painful.
We also wanted to let you wait for things to happen: external events, other tasks finishing, or just time passing. Those waits can take minutes, hours, or, in the case of events, forever, so you can’t just keep a server running.
The solution was to build and operate our own serverless cloud infrastructure. The key breakthrough was realizing we could snapshot a process’s CPU and memory state. This allowed us to pause running code, store the snapshot, then restore it later on a different physical server. We currently use Checkpoint/Restore In Userspace (CRIU), which Google has been using at scale inside Borg since 2018.
Since then, our adoption has really taken off, especially for AI agents/workflows. This has opened up a ton of new use cases: compute-heavy tasks like generating videos using AI (Icon.com), real-time computer use (Scrapybara), AI enrichment pipelines (Pallet, Centralize), and vibe coding tools (Hero UI, Magic Patterns, Capy.ai).
You can get started with Trigger.dev cloud (https://cloud.trigger.dev), self-hosting (https://trigger.dev/docs/self-hosting/overview), or read the docs (https://trigger.dev/docs).
Here’s a sneak peek at some upcoming changes: 1) warm starts for self-hosting, and 2) switching to microVMs for execution, which will be open source, self-hostable, and will include checkpoint/restore.
We’re excited to be sharing this with HN and are open to all feedback!
Both of them are focused more on being workflow engines.
Temporal is a workflow engine – if you use their cloud product you still have to manage, scale, and deploy the compute.
With Temporal you need to write your code in a very specific way for it to work, including how you handle the current time, randomness, process.env, setTimeout, and so on. This means you have to be careful using popular packages, because they often use these common functions internally. Otherwise you need to wrap all of these calls in side effects or activities.
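To illustrate why those constraints exist (this is a self-contained sketch of the replay idea, not Temporal’s actual API): a replay-based engine re-executes your workflow code after a crash and feeds back previously recorded results, so anything non-deterministic — time, randomness, network calls — has to be captured as a side effect the first time around.

```typescript
// Hypothetical sketch of record/replay execution (NOT Temporal's API).
// First run: side effects actually execute and their results are logged.
// Replay: recorded results are returned instead of re-executing.
type Log = unknown[];

class Replayer {
  private cursor = 0;
  constructor(private log: Log) {}

  // Wrap anything non-deterministic in here.
  async sideEffect<T>(fn: () => T | Promise<T>): Promise<T> {
    if (this.cursor < this.log.length) {
      return this.log[this.cursor++] as T; // replay the recorded result
    }
    const result = await fn(); // first execution: really run it
    this.log.push(result);
    this.cursor++;
    return result;
  }
}

// A "workflow" that would be non-deterministic without the wrapper:
async function workflow(ctx: Replayer): Promise<number> {
  const started = await ctx.sideEffect(() => Date.now());
  const roll = await ctx.sideEffect(() => Math.random());
  return roll > 0.5 ? started : started + 1;
}

(async () => {
  // Running twice against the same log yields identical results, even
  // though Date.now()/Math.random() would normally differ between runs.
  const log: Log = [];
  const first = await workflow(new Replayer(log));
  const replayed = await workflow(new Replayer(log));
  console.log(first === replayed); // true
})();
```

If `Date.now()` were called directly inside the workflow body, the replay would diverge from the original run, which is exactly why replay-based engines make you route such calls through activities or side-effect helpers.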
Restate is definitely simpler than Temporal, in a good way. You wrap any code that's non-deterministic in their helpers so it won't get executed twice. I don't think you can install the system packages you need, though, which has been surprisingly important for a lot of our users.
I haven't tried Trigger, planning to give it a spin this weekend!
https://news.ycombinator.com/item?id=37750763
We use it as an extension of our node app, for all things asynchronous (long or short). The fact that it's the same codebase on our server and trigger cloud is a huge plus.
For me, it's the most accessible incarnation of serverless. You can add it to your stack for one task and gradually use it for more and more tasks (long or short). Testing and local development is easy as can be. The tooling is just right. No complex configurations. You can incrementally use the queuing, wait points, batch triggers for more power.
We've had some issues with migrating from v3 to v4. The transition felt rushed (some of the docs / examples are still showing v3 code, that is deprecated in v4). I understand that it might take some time to update the docs and examples, because there is a lot of content.
Sorry you had some issues migrating. You're right, it was our biggest docs update so far, and unfortunately a few things did get missed which we have (hopefully) since rectified. Please do let us know if there's anything else we missed and we'll get it sorted.
Question: is a first-class Supabase/Postgres integration on the roadmap so we can (a) start Trigger jobs from SQL functions and (b) read job status via a foreign data wrapper? That "SQL-native job control (invoke from SQL, query from SQL)" path would make Trigger.dev feel native in Supabase apps.
Disclosure: I'm building pgflow, a Postgres-first workflow/background jobs layer for Supabase (https://pgflow.dev).
Listing Hero lets e-commerce brands generate consistent templated infographics, so I ended up reinventing all of this by sharing data between Django, Celery processes, Prefect, and webhooks. Users can start multiple generations at the same time; they all run in parallel in Prefect, with real-time progress visible in the frontend via webhooks.
I will try playing with Trigger next weekend and probably integrate it with a static stack like a Cloudflare Worker. Excited to try it out!
One thing I did notice though from looking through the examples is this:
Uncaught errors automatically cause tasks to be retried according to your settings. Plus there are helpers for granular retrying inside your tasks.
This feels like one of those gotchas that's absolutely prone to a benign refactor causing a huge screwup; at the very least, someone will find they've pinged a pay-for service 50x by accident without realising.
Ergonomics like your `await retry.onThrow` helper feel like they should be the developer-friendly default "safe" approach rather than just an optional helper, though granted it's not as magic-feeling when you're trying to convert eyeballs into users.
When you set up your project you choose the default number of retries and the back-off settings. Generally people don't go as high as 50, and they set up alerts for when runs fail. Then you can use the bulk replaying feature when things do go wrong, or when services you rely on have long outages.
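For anyone unfamiliar with the shape of those settings, here's a self-contained sketch of retry-with-back-off semantics (hypothetical names, not the Trigger.dev SDK): a maximum attempt count plus an exponentially growing, capped delay between attempts.

```typescript
// Hypothetical sketch of retry semantics (NOT the Trigger.dev SDK).
interface RetrySettings {
  maxAttempts: number;  // total attempts, including the first
  minTimeoutMs: number; // delay before the first retry
  maxTimeoutMs: number; // cap on any single delay
  factor: number;       // exponential growth per attempt
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function withRetries<T>(
  fn: (attempt: number) => Promise<T>,
  settings: RetrySettings
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= settings.maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err; // uncaught error -> schedule a retry
      if (attempt === settings.maxAttempts) break;
      const delay = Math.min(
        settings.minTimeoutMs * settings.factor ** (attempt - 1),
        settings.maxTimeoutMs
      );
      await sleep(delay);
    }
  }
  throw lastError; // attempts exhausted: surface the failure (alerts fire here)
}

(async () => {
  // Example: fails twice, then succeeds on the third attempt.
  let calls = 0;
  const result = await withRetries(
    async () => {
      calls++;
      if (calls < 3) throw new Error("transient failure");
      return "ok";
    },
    { maxAttempts: 5, minTimeoutMs: 10, maxTimeoutMs: 100, factor: 2 }
  );
  console.log(result, calls); // prints: ok 3
})();
```

With conservative defaults (a handful of attempts, capped delays) plus alerting on final failure, an accidental 50x hammering of a paid service is unlikely unless someone deliberately raises the limits.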
I think on balance it is the correct behaviour.