Readit News
Posted by u/gusowen 3 months ago
Show HN: I made a down detector for down detector (downdetectorsdowndetector...)
After down detector went down with the rest of the internet during the Cloudflare outage today I decided to build a robust, independent tool which checks if down detector is down. Enjoy!!
spyridonas · 3 months ago
As a European solo developer, I’ve switched entirely to European alternatives for all my infrastructure since the beginning of the year.

Cloudflare > Bunny.net

AWS > Hetzner

Business email > Infomaniak

Not a single client site has experienced downtime, and it feels great to finally decouple from U.S. services.

graemep · 3 months ago
Those are all much smaller. Smaller providers have a much stronger incentive to be reliable, as they will lose customers if they are not. In a corporate setting, management will say "this would not have happened if you had gone with AWS". It's the current version of "no one ever got fired for buying IBM" (we had MS and others in between).

Hetzner provides a much simpler set of services than AWS. Less complexity to go wrong.

A lot of people want the brand recognition too. It's also become the standard way of doing things and is part of the business culture. I have sometimes been told it's unprofessional or looks bad to run things yourself instead of using a managed service.

pksebben · 3 months ago
There is this weird thing that happens with hyperscale - the combination of highly central decision-making, extreme interconnection / interdependence of parts, and the attractiveness of lots of money all conspire to create a system pulled by unstable attractors to a fracturing point (slowed / mitigated at least a little by the inertia of such a large ship).

Are smaller scale services more reliable? I think that's too simple a question to be relevant. Sometimes yes, sometimes no, but we know one thing for sure - when smaller services go down the impact radius is contained. When a corrupt MBA who wants to pump short term metrics for a bonus gains power, the damage they can do is similarly contained. All risk factors are boxed in like this. With a hyperscale business, things are capable of going much more wrong for many more people, and the recursive nature of vertical+horizontal integration causes a calamity engine that can be hard to correct.

Take the financial sector in 08. Huge monoliths that had integrated every kind of financial service with every other kind of financial service. Few points of failure, every failure mode exposed to every other failure mode.

There's a reason asymmetric warfare is hard for both parties - cellular networks of small units that can act independently are extremely fault tolerant and robust against changing conditions. Giants, when they fall, do so in spectacular fashion.

mbesto · 3 months ago
> Smaller providers have a much stronger incentive to be reliable, as they will lose customers if they are not.

Hard disagree. A smaller provider will think twice about whether they use a Tier I data center versus a Tier IV data center because the cost difference is substantial and in many cases prohibitive.

giancarlostoro · 3 months ago
> A lot of people want the brand recognition too.

Not to mention the familiarity of the company, its services and expectations. You can hire people with experience with AWS, Azure or GCP, but the more niche you go, the higher the possibility that some people you hire might not know how to work with those systems and their nuances. That is fine, since they can learn as they work, but it adds to ramp-up time and could lead to inadvertent mistakes.

hoppp · 3 months ago
I think Cloudflare has billions' worth of incentives to be reliable. However, they can slip up; it happens, and that's why centralization is bad.
codexon · 3 months ago
I've actually tried Hetzner on and off with 1 server for the past 2 years and keep running into downtime every few months.

First I used an ex101 with an i9-13900. Within a week it just froze. It could not be reset remotely. Nothing in kern.log. Support offered no solution but a hard reboot. No mention of what might be wrong other than user error.

A few months later, one of the drives just disconnected from the RAID by itself. It took support 1 hour to respond, and they said they found no issue so it must be my fault.

Then I changed to a Ryzen-based server and it also mysteriously had problems like this. Again, support blamed the user.

It was only several months after I cancelled the server that I saw this, so I know it isn't just me.

https://docs.hetzner.com/robot/dedicated-server/general-info...

Krutonium · 3 months ago
> I have sometimes been told it's unprofessional or looks bad to run things yourself instead of using a managed service.

That's an incredibly bad take lol.

There are times where "The Cloud" makes sense, sure. But in my experience the majority of the time companies over-use the cloud. On Prem is GOOD. It's cheaper, arguably more secure if you configure it right (a challenge, I know, but hear me out) and gives you data sovereignty.

I don't quite think companies realize how bad it would be if, e.g., AWS was hacked.

Any data you have on the cloud is no longer your data. Not really. It's Amazon's, Microsoft's, Apple's, whoever's.

amelius · 3 months ago
> Less complexity to go wrong.

This sounds like a good thing.

runjake · 3 months ago
> Smaller providers have a much stronger incentive to be reliable, as they will lose customers if they are not.

I disagree because, conversely, outages for larger providers cause millions or maybe even billions of dollars in losses for their customers. Those customers might be more "stuck" in their current providers' proprietary schemes, but these kinds of losses will cause them to move away, or at least diversify cloud providers. In turn, this will cause income losses to the cloud provider.

simultsop · 3 months ago
And they sell when they get big but can't afford to be.
nonethewiser · 3 months ago
AWS and Cloudflare don't actually experience more downtime, it's just a bigger story when they are down because so many people use them.

You can use whatever infrastructure you want for whatever reason, but you may not have an accurate picture of the availability.

monooso · 3 months ago
> AWS and Cloudflare don't actually experience more downtime, it's just a bigger story when they are down because so many people use them.

This may be true over a long enough timeframe, but GP stated that their clients had experienced no downtime since switching at the start of the year.

That is clearly better than both AWS and Cloudflare during that time.

lilydjwg · 3 months ago
Earlier this year, a Hetzner server I manage was shut down, and after I started it via the console, it booted to a rescue system. In the same month, it was rebooted without a reason. There was some maintenance notice but the server was not listed as impacted.

Note that I'm not saying Hetzner is bad, just that incidents happen in Europe too. The server didn't have a lot of issues like this over the years.

herbst · 3 months ago
Big fan of bunny.net as a CDN, however Cloudflare is my "smart" filter for all kinds of attacks, AI scrapers, malicious traffic, etc.

Am I missing something or is bunny.net not actually a replacement for that?

Stoo · 3 months ago
They've recently introduced bunny.net Shield to add a security layer. I've not made use of it yet so I don't know what the coverage is like or how effective it is: https://bunny.net/shield/
benatkin · 3 months ago
That component is what keeps me from using Cloudflare for anything. Not because it exists, but because the way it's run is terrible for the open web: https://www.theregister.com/2025/03/04/cloudflare_blocking_n...
valevk · 3 months ago
How does Infomaniak compare to Proton? I see they have more office productivity products, but regarding mail and drive?
baaron · 3 months ago
As an American solo developer, I am close to doing the same. These mega-corps are out of control.
bananalychee · 3 months ago
Out of control in what way?
buildfocus · 3 months ago
I've done something similar. It's worth noting Scaleway in the same space, for people looking for an AWS replacement with managed services (equivalents to fargate/lambda/sqs/s3/etc) instead of just bare instance hosting.
moooo99 · 3 months ago
+1 for Scaleway. I also use Hetzner for most of my compute. But some stuff just really benefits from using managed services. I've used Scaleway's Serverless compute offers and managed DBs and been quite happy with them.
supz_k · 3 months ago
We are also looking to migrate off Cloudflare. I thought Bunny.net was mostly a pure CDN, not a reverse proxy like Cloudflare. Am I wrong? One of the most important things for us would be DDoS protection.
sp4cec0wb0y · 3 months ago
American solo developer here. Moved to Hetzner two months ago. They have servers in Oregon for west coast people. My storage box is in Germany but that is okay, it is for backups.
INTPenis · 3 months ago
Do you have anything for device management? Like managing local admin accounts on Linux, Macintosh and Windows? I'm afraid we'll have to use InTune.
GordonS · 3 months ago
Are you using a US-based transactional email service like Twilio? Curious about EU-based alternatives.
pydubreucq · 3 months ago
Hello, you can test Sweego: https://www.sweego.io/. We (I'm the CTO) are fully European. Bye, Pierre-Yves
supz_k · 3 months ago
Hyvor Relay (https://github.com/hyvor/relay) can be self-hosted. We are planning a cloud version for 2026. (I am a co-founder)
albertgoeswoof · 3 months ago
https://mailpace.com is fully European based and independent
smashah · 3 months ago
There are self-hostable alternatives to Twilio
spiderfarmer · 3 months ago
Same here! I also got a nice peak in my traffic, because so many sites were down.
alecco · 3 months ago
This is worth its own post.
cortesoft · 3 months ago
I feel like a year is too short a time frame to measure reliability.
richardreeze · 3 months ago
Aren't costs higher though?
moffkalast · 3 months ago
> Bunny.net

Ah yes, the place for RabbitMQ endpoints.

everdev · 3 months ago
Wow looks like you don't have one of the most renowned benefits of capitalism: over-reliance on corporatist infrastructure.
jesperwe · 3 months ago
Yeah we had a good laugh when Downdetector was down during the Cloudflare outage yesterday. So this is appropriate. +1
cortesoft · 3 months ago
I remember when the CDN I was working for had to change our status page provider when our first one became our client.
mylons · 3 months ago
This is GOLD Jerry, Gold.

but who detects the down detector detecting the down detector detecting the down detector

eYrKEC2 · 3 months ago
You're on that site right now!
bombcar · 3 months ago
HN is the true down detector - if HN is down TCP is down.
falcor84 · 3 months ago
I know you were joking, but responding in seriousness - while in general it's worthwhile asking "Quis custodiet ipsos custodes?", in this particular case, I don't see any issue with Down Detector detecting the Down Detector Down Detector. Assuming they are in different availability zones, using different code, with a different deployment cadence, this approach works quite well in practice.
mylons · 3 months ago
haha, this is the exact comment I was hoping to see! Indeed, I was joking. The Watchmen graphic novel is very important to me as it opened my eyes to the concept of “who watches the watchmen”, which I was ultimately alluding to here, albeit extremely facetiously.
Waterluvian · 3 months ago
> Quis custodiet ipsos Custodes?

Arbites.

graemep · 3 months ago
Can down detector not detect whether down detector detector is down or not?

Maybe distributed down detection?

I know there are people here perfectly capable of running with that idea and we might just see a distributed down detector announced on HN :)

PunchyHamster · 3 months ago
See, that's the joke, all of them are on cloudflare/us-west-1 so they all go down together anyway
bryanrasmussen · 3 months ago
pervs.
joelhaasnoot · 3 months ago
Time for the META Down Detector - detecting which of the three is down
philipwhiuk · 3 months ago
Or "Quis custodiet ipsos custodes?"

mproud · 3 months ago
I think the original down detectors do
jl6 · 3 months ago
Mutually assured down-detection.
excalibur · 3 months ago
It's detectors all the way down.
state_less · 3 months ago
There's always another asking, "Are you down?" It's a bit of a bop.

https://youtu.be/DpMfP6qUSBo

mhb · 3 months ago
Three down detectors walk into a bar. The bartender asks them if they're all up. The first says "I don't know". The second says "I don't know". The third says "Yes".

oniony · 3 months ago
Presumably they're blind down detectors.
khasan222 · 3 months ago
Crying. I’m stealing this.

4ndrewl · 3 months ago
But we need another one to detect whether yours is still up.

It's downdetectorsdown all the way down.

thinkingemote · 3 months ago
wltr · 3 months ago
It was worth the laugh, thanks!
bell-cot · 3 months ago
Downdetection can be thought of as a directed graph, or digraph*.

From there, the "who's watching who?" can become mathematically interesting.

* https://en.wikipedia.org/wiki/Directed_Graph
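
To make the digraph framing concrete, here is a minimal sketch in Python. The detector names and the `watches` mapping are hypothetical, made up for illustration. An edge A -> B means "A watches B". Two questions fall out naturally: which detectors does nobody watch, and is there a cycle of mutual watching?

```python
from collections import defaultdict


def unwatched(watches):
    """Return the nodes that no *other* node watches (sorted for stable output)."""
    nodes = set(watches)
    watched = set()
    for watcher, targets in watches.items():
        nodes.update(targets)
        # Watching yourself does not count as being watched.
        watched.update(t for t in targets if t != watcher)
    return sorted(nodes - watched)


def has_cycle(watches):
    """Return True if the 'watches' digraph contains a directed cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = defaultdict(int)

    def visit(node):
        colour[node] = GREY
        for nxt in watches.get(node, ()):
            if colour[nxt] == GREY:
                return True  # back edge: we reached a node already on the path
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(watches))


# Hypothetical chain: a detector-detector watches a detector, which watches sites.
chain = {"detector-detector": ["detector"], "detector": ["sites"]}
# Hypothetical ring: two detectors watching each other.
ring = {"a": ["b"], "b": ["a"]}

print(unwatched(chain), has_cycle(chain))
print(unwatched(ring), has_cycle(ring))
```

In a chain there is always one node at the top that nobody watches, which is the "who detects the detector" regress. A ring closes that gap, at the cost of nobody outside the ring noticing if every member fails together.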

Nevermark · 3 months ago
Given enough of them, some fraction will always be down. It would be helpful if we had a site that could track that ratio.
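
As a rough back-of-the-envelope sketch of that ratio: assume each detector is up independently with the same probability (a strong assumption, since detectors sharing a provider tend to fail together). The function names and the 0.999 figure below are illustrative, not measurements.

```python
def expected_fraction_down(availability):
    """Expected share of detectors that are down at a random instant."""
    return 1.0 - availability


def prob_at_least_one_down(n, availability):
    """Probability that at least one of n independent detectors is down."""
    return 1.0 - availability ** n


# Illustrative numbers only: 0.999 is an assumed per-detector availability.
for n in (1, 10, 100, 1000):
    print(n, prob_at_least_one_down(n, 0.999))
```

Under these assumptions the expected fraction down stays constant, while the chance that at least one is down approaches 1 as the count grows.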
hirako2000 · 3 months ago
It's a centralization vs decentralization vs distributed system question.

Since down detectors serve to detect failures of centralized (and decentralized) systems, the idea would be to at least get that right: a distributed system to detect outages.

You basically run detectors that heartbeat each other. Just a few suffice.

Once you start to see clusters of detectors go silent, you can assume things are falling apart, which is fine so long as a few remain.

Self healing also helps to make the web of nodes resilient to inevitable infrastructure failures.
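
The heartbeat idea can be sketched roughly like this, as a bookkeeping class with no networking. `HeartbeatTable`, the 30-second timeout, and the 50% quorum are all assumptions chosen for illustration; a real deployment would feed `record` from whatever transport the detectors use.

```python
import time


class HeartbeatTable:
    """Tracks when each peer detector was last heard from."""

    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s
        self.last_seen = {}  # peer name -> timestamp of last heartbeat

    def record(self, peer, now=None):
        """Note that a heartbeat from `peer` arrived (at `now`, or right now)."""
        self.last_seen[peer] = time.monotonic() if now is None else now

    def silent_peers(self, now=None):
        """Peers whose last heartbeat is older than the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(p for p, t in self.last_seen.items()
                      if now - t > self.timeout_s)

    def verdict(self, now=None, quorum=0.5):
        """Classify the situation from this detector's point of view."""
        if not self.last_seen:
            return "no-peers"
        silent = self.silent_peers(now)
        if not silent:
            return "ok"
        # A large silent cluster suggests shared infrastructure is failing.
        if len(silent) / len(self.last_seen) > quorum:
            return "widespread-outage"
        return "isolated-failure"


table = HeartbeatTable(timeout_s=30.0)
table.record("peer-a", now=0.0)
table.record("peer-b", now=0.0)
table.record("peer-c", now=100.0)
print(table.silent_peers(now=110.0), table.verdict(now=110.0))
```

Passing `now` explicitly keeps the logic testable without sleeping. Note that a detector which sees most peers go silent cannot tell from this alone whether the peers failed or its own connectivity did.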

neoCrimeLabs · 3 months ago
It's down detectors all the way down
rozenmd · 3 months ago
here's a page that monitors that page: https://onlineornot.com/website-down-checker?requestId=jCfaD...

Looks like it's hosted in London?

meken · 3 months ago
We could create a linked list of these and just refer to the N’th one as N-down detector.

BrenBarn · 3 months ago
Sup dawg, I heard you like down detectors.
ZeroConcerns · 3 months ago
Thank you for your service! Now, for an even bigger challenge: since it seems the increased demand for the Cloudflare status page brought down Amazon CloudFront for a bit as well, build a new CDN capable of handling that load as well...
carstenhag · 3 months ago
Do you need a CDN for a static HTML page with no images? I would guess no, even if you are being bombarded with requests
ZeroConcerns · 3 months ago
I would guess yes, unless you have a server with unlimited file descriptors and flawless connectivity to every other AS...
_nickwhite · 3 months ago
I think an important caveat here is that down detector was not actually down; the Cloudflare human verification component was (AFAIK). I wonder if this downdetector down detector accounts for that aspect? It was technically "not down" but still unusable.
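
One way a checker could account for that distinction, sketched under assumptions: rather than treating any HTTP response as "up", it could look for a marker string that only the real page contains. `classify` and `expected_marker` are hypothetical names, and nothing here describes how the posted tool actually works.

```python
def classify(status_code, body, expected_marker):
    """Classify one probe result as 'up', 'down', or 'unusable'.

    status_code: HTTP status as an int, or None if no response arrived.
    body: response body as text.
    expected_marker: text that the genuine page is assumed to contain.
    """
    if status_code is None:
        return "down"  # connection failed or timed out
    if 500 <= status_code < 600:
        return "down"  # server-side error
    if 200 <= status_code < 300 and expected_marker in body:
        return "up"  # the real content was served
    # Something answered, but not with the real page
    # (e.g. a verification or block page in front of the site).
    return "unusable"


print(classify(200, "<h1>Current outages</h1>", "Current outages"))
print(classify(403, "Please verify you are human", "Current outages"))
print(classify(None, "", "Current outages"))
```

This still cannot tell whether a 5xx came from the site itself or from an intermediary in front of it; that would need more information than a single status code and body.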