rococode · 6 years ago
Discord is entirely down right now, both the website and the app itself. Amusingly, a lot of the sites that normally track outages are also down, which made me think it was my internet at first. Downdetector, monitortheinternet, etc.

Lots of other big sites that are down: Patreon, npmjs, DigitalOcean, Coinbase, Zendesk, Medium, GitLab (502), Fiverr, Upwork, Udemy

Edit: 15 min later, looks like things are starting to come back up

saagarjha · 6 years ago
Hacker News is an excellent status page for those cases.
blisseyGo · 6 years ago
Out of curiosity, does HN use any CDN or other form of DDoS protection? dang?
macNchz · 6 years ago
My iPhone actually popped up a message saying that my wifi didn't appear to have internet, which was strange and obviously false as I was actively using the internet on it and the laptop next to it, but now it makes sense that it must have been pinging something backed by cloudflare!

maxk42 · 6 years ago
Discord attempted to route me to: everydayconsumers.com/displaydirect2?tp1=b49ed5eb-cc44-427d-8d30-b279c92b00bb&kw=attorney&tg1=12570&tg2=216899.marlborotech.com_47.36.66.228&tg3=fbK3As-awso

(Visit at your own risk.)

Hack?

jeffus · 6 years ago
I'd be looking at your browser extensions or malware (if you use the Discord app).

iamtheyammer · 6 years ago
Sure you didn't misspell discord?
jpxw · 6 years ago
How can this be reproduced?
GeneralTspoon · 6 years ago
Same here!

I even checked to see if an AWS region was down once I realised it wasn't on my side (I thought it might have been my ISP's DNS servers or something).

The next move was to check Hacker News - thankfully it's not also hosted on Cloudflare, ha!

mackal · 6 years ago
I noticed discord being down, so I went to check downforeveryoneorjustme, also down. So I figured I'd check NANOG mailing list, also down :P
clairegraham · 6 years ago
Yep, we were down completely. We are quite dependent on Cloudflare (frontend + dns).
belltaco · 6 years ago
Deathmax · 6 years ago
And that is why you host your status page on separate infra.
Miner49er · 6 years ago
It's hosted by statuspage.io, and their own status page was also down (metastatuspage.com). It is now back up, but their page shows the outage.
bufferoverflow · 6 years ago
Works for me.
abafazi · 6 years ago
Works for me too... Australia
dafoex · 6 years ago
I can live without creepy instant messengers, but it's shocking just how much everything else relies on one central system. And furthermore, why is it always Cloudflare?
BrianHenryIE · 6 years ago
Cloudflare is free and has a nice UI. I manage ~40 domains from ~six domain registrars through it, the consistency is great. The caching is a bonus.
laughinghan · 6 years ago
Discord confirmed Cloudflare is also the reason they're down: https://twitter.com/discord/status/1284237737638461453
bloopernova · 6 years ago
Doordash too. My first order. On my wife's birthday.

Ah, well. This too shall pass.

mjayhn · 6 years ago
I did a pickup order a few weeks ago when of all things Tmobile SMS went down for 3+ hours. I couldn't go in the restaurant (covid) and I couldn't text them the parking # I was sitting at in a packed parking lot. I got a flood of about 50 texts a few hours later. Sat there for about an hour waiting for a $9 sandwich. I have no idea if they didn't get my order until late, or if they finally realized it was me or what. About 45 minutes in I decided to just give up on the day and take a nap, woke up to a door knock.
leon-z · 6 years ago
Kudos to the people at Discord. Just a few minutes after I got disconnected they already tweeted about the issue. Some minutes later and they have a message in their desktop app confirming it's an issue with Cloudflare. All while Cloudflare's statuspage says there are 'minor outages'.
mjayhn · 6 years ago
Every company rushes to report an outage when they can blame another vendor. Well, that might be hyperbolic, but it's sure a lot easier!
imron · 6 years ago
As a percentage of total traffic, a 'minor' outage for Cloudflare probably equates to a significant outage for a non-trivial amount of the internet.

It will also be especially noticeable to end-users, because sites using Cloudflare are typically high-traffic sites, and so a 'minor' issue that affects only a handful of sites is still going to be noticed by millions of people.

ljm · 6 years ago
I wonder if they are all using Cloudflare's free DNS stuff or if they're paying for business accounts?

My stuff is on Netlify (for the next week or so) and the rest is on a VPS bought from a local business who isn't reselling cloud resources. I'm kinda glad I moved all my stuff from cloudflare.

jorgenphi · 6 years ago
I think it's going to be everyone. Some of my free sites are dead, but also huge enterprise Cloudflare users (Discord/Patreon/4chan) are also dead.
RcouF1uZ4gsC · 6 years ago
> Amusingly, a lot of the sites that normally track outages are also down, which made me think it was my internet at first.

That is why if you have this question, you should go to google.com

My guess is that there are more resources invested in making sure google.com stays up than for any other site on the internet.

qmarchi · 6 years ago
Depending on what part we're talking about, it varies. But yeah, just a few.
e40 · 6 years ago
Crazily, my local name resolution started failing, because I have these name servers: 192.168.0.99, 1.1.1.1 and 8.8.8.8. The first does the local resolution, but macOS wasn't consulting it because 1.1.1.1 was failing?? Crazy. When I removed 1.1.1.1 from the list, everything started working.
projektfu · 6 years ago
DNS over HTTPS might bypass your local nameservers.
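
To make the resolver-list discussion above concrete, here is a minimal sketch of plain in-order fallback: ask each configured resolver in turn and use the first one that answers. This is an illustration only, not how macOS actually picks resolvers (the comment above suggests it behaves differently). `resolve_with_fallback`, `fake_query` and the `nas.lan` name are hypothetical, and the lookup is simulated rather than sent over the network.

```python
from typing import Callable, Dict, Sequence, Tuple


class ResolverTimeout(Exception):
    """Raised when a resolver does not answer in time."""


def resolve_with_fallback(
    name: str,
    resolvers: Sequence[str],
    query: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try resolvers in list order; return (resolver, answer) from the first that responds."""
    failures = []
    for resolver in resolvers:
        try:
            return resolver, query(resolver, name)
        except ResolverTimeout as exc:
            failures.append(f"{resolver}: {exc}")
    raise ResolverTimeout("all resolvers failed: " + "; ".join(failures))


# Hypothetical stand-in for real DNS traffic: only the local resolver
# knows the local name, and every other resolver is simulated as silent.
FAKE_ANSWERS: Dict[Tuple[str, str], str] = {
    ("192.168.0.99", "nas.lan"): "192.168.0.10",
}


def fake_query(resolver: str, name: str) -> str:
    try:
        return FAKE_ANSWERS[(resolver, name)]
    except KeyError:
        raise ResolverTimeout("no answer") from None


resolver, address = resolve_with_fallback(
    "nas.lan", ["1.1.1.1", "192.168.0.99", "8.8.8.8"], fake_query
)
print(resolver, address)
```

With this ordering a dead first resolver only costs a timeout before the next one is tried; a stub resolver that picks one server and sticks with it would fail outright.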
Cultmethod · 6 years ago
Thought something like this was going on. At first I thought it was my router and restarted everything - to no avail. Glad to see confirmation that it wasn't an issue on my end.

solarkraft · 6 years ago
Discord works for me, but https://redbubble.com/ prints "Service unavailable".
michel-slm · 6 years ago
Freenode's IRC servers were down which was unexpected for me. I was expecting old-school communication networks to not have a dependency on Cloudflare.
encom · 6 years ago
I've had no connection interruptions to the three IRC networks I'm connected to. Freenode, EFnet and Hackint.

I loathe Discord, and I can barely contain myself with schadenfreude at this news.

Reason077 · 6 years ago
Ironically, downdetector.com is also down.
mavdi · 6 years ago
Who watches the watchmen?
noobermin · 6 years ago
Would IRC be down?
emsy · 6 years ago
Same for the German downtime trackers.
thepete2 · 6 years ago
Same here. I checked whether Hacker News still worked and saw this.
ashleyn · 6 years ago
It really defies the original vision of the internet to have so many services depend on a single company. Almost every news site I was reading dropped off at once. I thought for a second that I lost internet in my own house.
jeremyjh · 6 years ago
Yes, it's really odd that core backbone providers can go down and everything works like it's supposed to. Even trans-Pacific cables can be cut and things will usually work with only increased latency. But there is not much redundancy for many companies at this layer; having redundant DNS providers is, I'm sure, possible but not something we think about very often, and of course many of the sites that are down are depending on the proxy and DoS mitigation services.

On my home network I use Google as a backup DNS provider so the whole internet didn't go dark for me, but I don't have a backup DNS host for my company's DNS records.

woolcap · 6 years ago
Redundant DNS is possible, but challenging when you're making use of features like geo DNS that don't lend themselves to easy replication via zone transfer.
kiobu · 6 years ago
I imagine most people would never expect something like this to happen, so having a fallback option when Cloudflare has a huge interruption of service like this is just unthinkable.

Meekro · 6 years ago
Agreed, but the real problem is DDoS and nobody seems to know how to globally solve it. Fighting DDoS is expensive, so you see consolidation. It's well and good to live in a tiny farming town but when raiders start attacking every week, those castle walls and guards start to look really appealing.
schoolornot · 6 years ago
It's nice that Cloudflare provides their services for free but scrubbing has existed for a long time. With your own address space and an appropriate budget it's not difficult to have Cloudflare/Akamai/AWS announce your IP space with a higher weight than a direct path to your infrastructure. That will give you a little bit more fault tolerance for incidents like these.
labawi · 6 years ago
That's what we get for externalizing costs. It's not hard to track down sources, but network operators usually let it be, hence the incentives are probably counter-productive.
hn_throwaway_99 · 6 years ago
Agreed, but I think people really underestimated the forces at work that would cause so much consolidation into a couple internet giants.

The original idea was that with the barrier to entry being so low, anyone and everyone could set up their own websites, mail servers, etc.

But with it being so easy to compare and contrast service (i.e. the market being so open), it means that the competitive forces naturally consolidate to a winner-take-all model. If when starting out Cloudflare was just 5% better than the competition, it could have easily taken the vast majority of the mindshare on the internet. Couple that with the fact that there are huge advantages with scale to a business like Cloudflare's, and it's not hard to see how so much of the internet has become dependent on it.

rickyc091 · 6 years ago
Same here. Rebooted the router and modem thinking it was me, but my phone was still on wifi, then I realized it was probably my Cloudflare DNS.
asadlionpk · 6 years ago
This! I got all sorts of alerts from pingdom and my laptop refused to get online. Pure Panic!
xen2xen1 · 6 years ago
Yup, reinforces the thought that you never have both DNS servers with the same service.
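
A rough way to check that rule of thumb against your own resolver list, as a sketch: map each configured address to its operator and flag the list if everything belongs to one company. The table below is a small hand-written sample of well-known public resolver addresses, not a complete registry, and `shares_single_operator` is a made-up helper name. Unknown addresses are each treated as their own operator, which is an assumption.

```python
from typing import Sequence, Set

# Small hand-maintained sample of well-known public resolver addresses.
KNOWN_OPERATORS = {
    "1.1.1.1": "Cloudflare",
    "1.0.0.1": "Cloudflare",
    "8.8.8.8": "Google",
    "8.8.4.4": "Google",
    "9.9.9.9": "Quad9",
    "208.67.222.222": "OpenDNS",
}


def operators(resolvers: Sequence[str]) -> Set[str]:
    """Return the set of operators behind the given resolver addresses."""
    return {KNOWN_OPERATORS.get(ip, f"unknown ({ip})") for ip in resolvers}


def shares_single_operator(resolvers: Sequence[str]) -> bool:
    """True when several resolvers are configured but all belong to one operator."""
    return len(resolvers) > 1 and len(operators(resolvers)) == 1


print(shares_single_operator(["1.1.1.1", "1.0.0.1"]))  # both Cloudflare
print(shares_single_operator(["1.1.1.1", "8.8.8.8"]))  # two operators
```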
spiritplumber · 6 years ago
Pihole is your friend.
remmargorp64 · 6 years ago
I consider DNS and the way top-level domains are handled to be one of the weakest parts of our current Internet design.

We REALLY need a truly decentralized, distributed DNS system that is not owned by private entities.

the8472 · 6 years ago
DNS is far less of a single point of failure and more decentralized than Cloudflare. Nameservers can be and are operated redundantly via simple, resolver-side round-robin scheduling, and the TLD servers should have longer TTLs that allow plenty of caching. The root zone even has anycast thanks to using UDP. Take a moment to look at DoH and laugh.

You can also register your domain on multiple TLDs.

q3k · 6 years ago
DNS worked just fine throughout this. You're barking up the wrong tree.
hpfr · 6 years ago
https://handshake.org is pretty interesting.
ghastmaster · 6 years ago
I just recently ran across this. I wonder how much performance would be degraded.

https://ieeexplore.ieee.org/document/7530014/authors#authors

> Unlike previous DNS replacement proposals, D3NS is reverse compatible with DNS and allows for incremental implementation within the current system.

xen2xen1 · 6 years ago
DNS is decentralized, it's just not when everyone goes with one big service.
tenebrisalietum · 6 years ago
I'm down for passing around a GPG signed hosts2.txt file. Let's get started.
Algent · 6 years ago
And the worst is that if you try to raise concerns about Cloudflare now, it gets brushed off as "CF already proxies half the internet; if it goes down, our stuff will be a minor concern".
Can_Not · 6 years ago
That's true, but what's a free or low cost alternative for DDoS protection for a small webapp?
cortesoft · 6 years ago
I don't understand why the big companies don't always have at least two CDN providers, so they can failover to another one if something like this happens.

I know a lot of big companies do, but I am always surprised when you see ones that don't.

LoSboccacc · 6 years ago
The DNS itself is not as easy to duplicate across multiple providers; with CF DNS down, having a backup CDN wouldn't have helped.
thathndude · 6 years ago
My CRM was nonfunctional. That’s some critical infrastructure for me. And then I’m wondering, is it me or is it my CRM. Turns out it’s door #3 - cloudflare
lumberingjack · 6 years ago
Same here. I'm working at an auto parts store looking through ASE parts sites, and it was like, well, close up the store, the catalogs are missing RN.
karlmcguire · 6 years ago
"All systems operational"

What's the point of a status page if it doesn't reflect the real status...

It's either the status page goes down with everything else or the status page is wrong. Great.

EDIT: Looks like it's accurate now, 20 minutes later.

Jasper_ · 6 years ago
The point of the status page is so you can point to it for your five nines SLA and go "look? we were only down for one hour". As soon as the money relies on the metric, the metric will reflect the money.
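
For scale, the arithmetic behind "five nines" is easy to sketch. The snippet below assumes a 365-day measurement window, which is an assumption (many SLAs are measured per month), and the function names are made up for illustration. Under that assumption, 99.999% allows a little over five minutes of downtime a year, so a single one-hour outage already lands below four nines.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(target_percent: float, period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Downtime budget implied by an availability target over the period."""
    return period_minutes * (1 - target_percent / 100)


def achieved_availability(downtime_minutes: float, period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Availability percentage actually achieved given total downtime in the period."""
    return 100 * (1 - downtime_minutes / period_minutes)


print(f"five nines budget: {allowed_downtime_minutes(99.999):.3f} min/year")
print(f"one hour down:     {achieved_availability(60):.3f}% over the year")
```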
mjlawson · 6 years ago
Goodhart's Law[1] in action.

[1]https://en.wikipedia.org/wiki/Goodhart%27s_law

parliament32 · 6 years ago
Despite their update, I like how they're saying only their recursive DNS had "degraded performance", while authoritative is "operational". The entire reason everything blew up was because their authoritative nameservers weren't responding.
Xenoamorphous · 6 years ago
IBM Cloud status is pretty much always green... although we have issues pretty much every week.
mathattack · 6 years ago
They’re still using Lotus Notes for the tracking.
Miner49er · 6 years ago
There are status page providers that actually monitor services and automatically update. Cloudflare just doesn't use them.
jeremyjh · 6 years ago
Let's start a betting pool. How many upvotes do you think OP will get before the status page acknowledges a problem? I say it's going to be 600.
jedberg · 6 years ago
You lost. ;) 476 points, status page says it's down now.
saagarjha · 6 years ago
This post is getting something like 30 upvotes a minute…might want to up that a bit ;)
HappyKasper · 6 years ago
And it looks like they started "investigating" at around 450!

hmmazoids · 6 years ago
Ahh I remember when AWS went down (think it was 2 years ago now?) or at least a data center in us-east? Majority of the internet went down and status page went down as well. Man good times.
dewey · 6 years ago
Status pages are a marketing channel not a channel for developers most of the time. It most likely has to go through some layers before someone updates the status page.
cellar_door · 6 years ago
This is an Atlassian Statuspage status page, so it's not hosted by Cloudflare.

smsm42 · 6 years ago
this-is-fine.gif
geerlingguy · 6 years ago
I don't think it's just Cloudflare; I just had a fun 10 minutes seeing servers start flipping on my Server monitoring service[1]. This has only happened once or twice per year, and is usually due to weird global DNS issues.

[1] https://servercheck.in/

(To give an update, I'm seeing from my monitoring systems (about 15 points around the globe) sporadic outages for Microsoft, Apple, Reddit, Bing, Node.js, Twitter, Yahoo, and YouTube. And my own servers (not behind CF at all) are also flipping up and down. It started around 21:14 UTC.)

cm2187 · 6 years ago
a DNS issue wouldn't cripple all of the internet at once, with all the caching.
RL_Quine · 6 years ago
Most sites set the absolute minimum TTL for every record, for no reason. There’s a lot less caching than you’re thinking.
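
The TTL point can be sketched with a toy cache. This is a simplified model, not a real resolver: `TtlCache` is a hypothetical class, the clock is passed in by hand, and the names and addresses are documentation placeholders. A record cached with a 60-second TTL is gone 15 minutes into an outage of the authoritative servers, while one cached with a one-hour TTL still resolves.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedRecord:
    address: str
    expires_at: float  # seconds on the caller-supplied clock


class TtlCache:
    """Toy DNS cache: entries are usable only until their TTL runs out."""

    def __init__(self) -> None:
        self._records: Dict[str, CachedRecord] = {}

    def store(self, name: str, address: str, ttl_seconds: float, now: float) -> None:
        self._records[name] = CachedRecord(address, now + ttl_seconds)

    def lookup(self, name: str, now: float) -> Optional[str]:
        record = self._records.get(name)
        if record is None or now >= record.expires_at:
            return None  # expired: a real resolver would have to ask upstream again
        return record.address


cache = TtlCache()
cache.store("short-ttl.example", "192.0.2.1", ttl_seconds=60, now=0)
cache.store("long-ttl.example", "192.0.2.2", ttl_seconds=3600, now=0)

fifteen_minutes_in = 15 * 60
print(cache.lookup("short-ttl.example", fifteen_minutes_in))
print(cache.lookup("long-ttl.example", fifteen_minutes_in))
```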
xtracto · 6 years ago
It was interesting that we saw our domains affected from the USA but from Mexico everything looked OK.

The crazier thing is that I tried to login to our CloudFlare account, it never sent me the 2FA code... I still haven't been able to login (Enterprise account)

jgrahamc · 6 years ago
This was a problem with our backbone network; wasn't caused by an attack. The effect was regional and not global. Naturally, we'll write it all up.
EE84M3i · 6 years ago
Was it a problem with a provider you use?
jgrahamc · 6 years ago
Looks like problem with one of our large routers in Atlanta.
clairegraham · 6 years ago
We were down (downforeveryoneorjustme.com) completely, but back up now (as of a few minutes ago). Our domain wasn't even resolving; we use Cloudflare for frontend and DNS.

We had a surge of people checking if Discord was down on our site, then I noticed everything went down shortly after. Discord is still the top check right now.

I can't remember ever hitting these kinds of traffic numbers before.

ricardo81 · 6 years ago
Interesting data you get in the face of adversity, providing your host resolves!
Avalaxy · 6 years ago
Funny, I tried to use your site because the website I was trying to access stopped working. But your site was also down so I figured it was just my internet being cranky :/
wcchandler · 6 years ago
I enjoy your service. Have you ever thought about expanding your offerings? I would love to see a recreation of "Internet Pulse"
clairegraham · 6 years ago
Thanks! Yep, we have a lot of things on the todo. We want to add more user-focused / location-based outage information since our site is still too reliant on simple HTTP checks to report downtime. This is especially a problem with a Discord outage, for example, where the frontend website is not down, but there might be problems with the API, apps, or other components.

And I'd like to be able to have our site communicate outages like this Cloudflare one, where more than one site might be affected by a larger provider. Automating that is difficult.

This is still a side project, though, so I mostly work on it when I get the urge :)

jchw · 6 years ago
Something’s wonky, because it’s not just Cloudflare. One of my personal sites is down that uses nothing but a VPS, and I noticed my Unifi AP disconnect from its controller a little bit ago. Fiber cut? Routing issues?
parliament32 · 6 years ago
If that VPS is on DO they're down too cause of CF. Or if you set the resolver on your VPS to 1.1.1.1 that's also down.
jchw · 6 years ago
Why are digital ocean VPSes down due to a Cloudflare outage? Hoping for a clarifying post mortem...
dpcx · 6 years ago
DO is still up as my machines are still up and accessible.
rconti · 6 years ago
Huh. My Ubiquiti was reporting WAN link down during this outage. I'm using ATT fiber. I'm wondering if "link down" doesn't mean what I think it means. Now that I check, it says "WAN iface [eth2] transition to state [inactive]". I'm wondering if that means link down or if it's doing service checking.

I actually have a WAN2 configured but not plugged in and it was set to "Load Balancing: Failover Only" ... I wonder if all of my 'connection issues' were software assuming my network link is down and switching interfaces to an unavailable one.

rconti · 6 years ago
to reply to myself, if you have a second interface configured for failover, it actually tests against ping.ubnt.com. I bet every single time my ATT fiber has "gone out" for a minute or two at a time, it's been bogus.

  root@USG-PRO-4:~# show load-balance watchdog
  Group wan_failover
  eth2
  status: Running
  pings: 2
  fails: 0
  run fails: 0/3
  route drops: 0
  ping gateway: ping.ubnt.com - REACHABLE

  eth3
  status: Waiting on recovery (0/3)
  failover-only mode
  pings: 1
  fails: 1
  run fails: 3/3
  route drops: 1
  ping gateway: ping.ubnt.com - DOWN
  last route drop   : Fri Jul 17 17:32:58 2020
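
A guess at the kind of logic behind counters like "run fails: 3/3" and "Waiting on recovery (0/3)", sketched as a small state machine: an interface is declared down after a run of consecutive failed pings and up again after a run of consecutive successes. This is not Ubiquiti's implementation; `LinkWatchdog` and its thresholds are assumptions made for illustration. The catch it shows is that the watchdog cannot tell "my link is down" apart from "the ping target is unreachable".

```python
class LinkWatchdog:
    """Marks a link down after N consecutive ping failures, up after M consecutive successes."""

    def __init__(self, fail_threshold: int = 3, recovery_threshold: int = 3) -> None:
        self.fail_threshold = fail_threshold
        self.recovery_threshold = recovery_threshold
        self.run_fails = 0
        self.run_successes = 0
        self.up = True

    def record_ping(self, success: bool) -> bool:
        """Feed one ping result; return whether the link is currently considered up."""
        if success:
            self.run_fails = 0
            self.run_successes += 1
            if not self.up and self.run_successes >= self.recovery_threshold:
                self.up = True
        else:
            self.run_successes = 0
            self.run_fails += 1
            if self.up and self.run_fails >= self.fail_threshold:
                self.up = False
        return self.up


# The WAN link itself is fine, but the ping target stops answering three times in a row.
watchdog = LinkWatchdog()
for reachable in [True, False, False, False]:
    state = watchdog.record_ping(reachable)
print("link considered up:", state)
```

Probing more than one independent target before declaring the link down would avoid that false positive, at the cost of slower detection.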

rozab · 6 years ago
We can't keep going on like this. The vulnerability of centralised internet infrastructure is a huge problem for everyone. Somebody, somewhere, really ought to sort it all out
Reedx · 6 years ago
> Somebody, somewhere, really ought to sort it all out

That could be the slogan for 2020

fivre · 6 years ago
10-20 minute router misconfigurations and subsequent fixes are sometimes a fact of life. big network infrastructure is complicated, and sometimes the best laid route tables of mice and men do go abloop and die.

Outages happen no matter what the infrastructure is. There's no solution, they're just something you need to recognize and handle, which Cloudflare seemingly did relatively quickly here.

lima · 6 years ago
Yes, but other providers are not a single point of failure for a significant percentage of the internet.

Level 3 or Telia going offline is perfectly survivable for any customer who has multiple upstreams.

iso947 · 6 years ago
Think back to the 80s. Imagine you’re watching the Super Bowl. Imagine it goes off for 15 minutes.
oliverobscure · 6 years ago
If it impacts enough wallets, things might change. I'm not holding my breath though.
parliament32 · 6 years ago
Why not you? Just don't use CF. The more people stay away from CF, the better.
adsjhdashkj · 6 years ago
I feel like for a lot of sites CF & CDNs are the only way to survive Reddit/HN/etc - do you disagree?

I definitely agree in concept with you, but then I think back to how frequently script kiddies took down sites ~10 years ago, or w/e. I feel like what has changed is the massive CDNs in front of so many sites.

So while i do want a better solution, i'm not sure what it looks like. Thoughts?

remmargorp64 · 6 years ago
Sounds like a problem for... us!
juancampa · 6 years ago
If only there was some website full of computerphiles...

atemerev · 6 years ago
Be the change you want to see in the world :) There are no somebodys somewheres.

One question is how to do DDoS protection without somebody like Cloudflare. Some new protocol for edge caching, perhaps?

toast0 · 6 years ago
DDoS has two components:

a) complexity: trick your servers into doing something hard

b) volumetric: overwhelm your servers with a lot of traffic

c) volumetric part two: overwhelm your servers with a lot of requests, so you respond with a lot of traffic

A and C are things you can work on yourself: try to limit the amount of work your server does in response to requests, and/or make resource-consuming responses require resource-consuming requests; and monitor and fix hotspots as they're found.

B is tricky. There are two ways to solve a volumetric attack: either have enough bandwidth to drop the packets on your end, or convince the other end to drop the packets (usually called null routing). Null routes work great, but usually drop all packets to a particular destination IP, which means you need to move your service to another IP if you want it to stay online; that's hard to do if your IP needs to stay fixed for a meaningful time (TTL for glue records at TLDs is usually at least a day); and IP space is limited, so if your attackers are quick at moving attacks, you could run out of IPs to use. Some attacks are going above 1 Tbps though, so that's a lot of bandwidth if you need to accept and drop; and of course, the more bandwidth people get so they can weather attacks, the more bandwidth that can be used to attack others if it's not well secured.
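
One common way to limit the work a server does per client (the A and C cases above) is a token bucket, sketched below. It is a swapped-in example technique, not something the comment names, and it does nothing for B, since the packets have already reached you by the time this code runs. `TokenBucket` and its parameters are hypothetical, and the clock is passed in explicitly to keep the sketch deterministic.

```python
class TokenBucket:
    """Per-client budget: tokens refill at a steady rate, each request spends some."""

    def __init__(self, rate_per_second: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity
        self.updated_at = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then admit the request if enough tokens remain."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate_per_second=1.0, capacity=2.0)
burst = [bucket.allow(now=0.0) for _ in range(3)]  # third request exceeds the burst size
later = bucket.allow(now=1.0)                      # one token has refilled by then
print(burst, later)
```

Charging a higher `cost` for expensive endpoints is one way to make heavy responses consume more of a client's budget than cheap ones.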

peterwwillis · 6 years ago
Or just stop using the internet. The majority of tech problems stem from people using tech. Don't rely on it alone and you don't have problems.
saagarjha · 6 years ago
Yells 'peterwwillis on the internet ;)