I strongly believe that being obvious about steps with `step.run` is important: it improves o11y, makes things explicit, and you can see transactional boundaries.
`Wait()` also does that. And the examples in the package documentation don't show the context as a way for the caller to be notified that things are done (that's what `Wait()` is for), but as a way for the callees (the callbacks passed to Do) to abort early.
This is mostly confirmed by the discussion dantillberg linked above, where someone suggests passing the errgroup's context down to the callbacks as a parameter, and the package author replies they don't do that because the lack of inference makes for nasty boilerplate (https://github.com/golang/go/issues/34510#issuecomment-53961...).
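To make that concrete, here's a rough sketch of the errgroup pattern being discussed (my own example, not lifted from any docs): the derived context exists so the callbacks can abort early once a sibling fails, while the caller learns that everything finished via `Wait()`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"

	"golang.org/x/sync/errgroup"
)

func main() {
	g, ctx := errgroup.WithContext(context.Background())

	g.Go(func() error {
		// First failure cancels ctx for the sibling goroutines.
		return errors.New("boom")
	})

	g.Go(func() error {
		select {
		case <-time.After(time.Second): // pretend to do some work
			return nil
		case <-ctx.Done(): // callee notices the cancellation and aborts early
			return ctx.Err()
		}
	})

	// Wait(), not the context, is how the caller finds out everything is done.
	fmt.Println(g.Wait())
}
```

Note the callbacks capture `ctx` via closure rather than receiving it as a parameter, which is exactly the boilerplate trade-off the package author mentions in that issue.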
I just figured it was worth calling out that the exactly-once semantics don't extend to external side effects (which is what orchestration is for), which is a big caveat.
Want to send an email, but the app crashes before committing? Now you're at-least-once.
You can compress the window that causes at-least-once semantics, but it's always there. For this reason, this blog post oversells the capabilities of these types of systems as a whole. DBOS (and Inngest, see the disclaimer below) try to get as close to exactly once as possible, but the risk always exists, which is why you should always use idempotency keys in external API requests when the API supports them. Defense in layers.
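Rough sketch of what I mean by layering idempotency on top (the endpoint, helper, and key format are made up; the Stripe-style `Idempotency-Key` header is just one example of what a provider might accept):

```go
package notify

import (
	"bytes"
	"fmt"
	"net/http"
)

// sendEmail calls a hypothetical email API. The idempotency key is derived
// from the run/step identity, so if the app crashes after the request but
// before committing the step result, the retried request dedupes server-side.
func sendEmail(runID, stepID string, body []byte) error {
	req, err := http.NewRequest(http.MethodPost, "https://api.example.com/emails", bytes.NewReader(body))
	if err != nil {
		return err
	}
	// Same run + step => same key on every retry.
	req.Header.Set("Idempotency-Key", fmt.Sprintf("%s:%s:send-welcome-email", runID, stepID))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("send email: %s", resp.Status)
	}
	return nil
}
```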
Disclaimer: I built the original `step.run` APIs at https://www.inngest.com, which offers similar things on any platform... without being tied to DB transactions.
First, let’s set aside the separate question of whether monopolies are bad. They are not good but that’s not the issue here.
As to architecture:
Cloudflare has had some outages recently. However, what's their uptime over the longer term? If an individual site took on the infra challenges itself, would it do better? I don't think so.
But there’s a more interesting argument in favour of the status quo.
Assuming Cloudflare's uptime is above average, having outages hit everything at once is actually better for the average internet user.
It might not be intuitive, but think about it.
How many Internet services does someone depend on to get something done, say an hour of their work? Maybe 10 directly, and another 100 indirectly? (Make up your own numbers, but it's probably quite a few.)
If everything goes offline for one hour per year at the same time, then a person is blocked and unproductive for an hour per year.
On the other hand, if each service experiences the same hour per year of downtime but at different times, then the person is likely to be blocked for closer to 100 hours per year.
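Back-of-envelope version of that, using the made-up numbers above and assuming you're blocked whenever any one service you depend on is down:

```go
package main

import "fmt"

func main() {
	services := 100.0
	downtimePerService := 1.0 // hours per year, per service

	// All outages overlap (everyone is on the same provider).
	correlated := downtimePerService
	// Worst case: outages never overlap, so blocked time adds up.
	independent := services * downtimePerService

	fmt.Printf("correlated outages:  ~%.0f blocked hours/year\n", correlated)
	fmt.Printf("independent outages: ~%.0f blocked hours/year\n", independent)
}
```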
It's not really a bad end-user experience that every service uses Cloudflare. It's more a question of why Cloudflare's stability seems to be going downhill.
And that’s a fair question. Because if their reliability is below average, then the value prop evaporates.