CrowdStrike caused millions of computers to be stuck in an infinite reboot-to-BSOD cycle.
The CrowdStrike CEO admitted fault and apologized publicly. CrowdStrike then sent out 'fuck you' $10 Uber Eats vouchers as an apology, further admitting fault to the places affected.
Didn't the CrowdStrike CEO do this at McAfee as well? As did their CTO.
I get the desire to assign a simple “blame” to a single entity, but reality has a habit of being more complex.
Even if we accept that CrowdStrike’s fuckup was the root cause of all the issues that Delta customers experienced, the level of impact CS’s fuckup was able to have is entirely on Delta. Their choice of platform, their level of resilience, the processes they have in place for rapid recovery, etc, are all Delta’s responsibility.
No, you cannot be resilient against a company directly injecting stuff into their software without user input.
There is zero complexity here. CrowdStrike accepted money for security services and offered a client. Then they didn't bother to test their stuff and literally caused the biggest IT outage in history to save a few bucks. They literally have a CEO with a history of not caring.
If you offer your stuff for use in critical infrastructure you cannot do stuff like that. This company needs to be sued into bankruptcy. This was plain bad engineering.
If the same happened with a bridge and it collapsed, no one would even hesitate to lay blame on the engineering company/architects. You have a choice whether to drive over a bridge or not, but when it collapses while you are on it, I doubt you'll think "I have to accept the consequences of my choices" on your way down.
In some industries, like vehicles, if the product has a malfunction, the government requires the manufacturer to issue a recall and repair the issue, even long after the warranty ends. The same goes for children's toys via the CPSC.
Yet in software this expectation is always absent.
There is no such thing as a root cause of the failure. There's over 40 years of academic research on the subject providing evidence, but still, here we are. Trying to reduce an incident into a single cause.
Sandbox rules simply don’t apply when real money is at stake: the contracts that sit behind these relationships are all that matter, plus a company’s ability to stop doing business with another.
Delta probably isn’t even entitled to a pro-rated refund of their prepaid CrowdStrike subscription. If Delta has a multi-year contract with CrowdStrike, Delta will most likely have to keep paying CS for some time into the future.
CrowdStrike breached, but almost certainly cured within the allowable period.
Maybe they sue for gross negligence which I think may circumvent contractual liability limits in certain situations.
People who gave a third party firm remote code execution control of millions of computers without any code review, reproducible builds, or accountability of any kind.
If you give third parties access to your systems with proprietary code without any checks or balances because you are too lazy to fight for the budget to deploy an in-house alternative, then whatever bad results happen are partly on you.
Yes, sue CrowdStrike, but also fire every sysadmin and IT manager that recommended or approved installing it on systems in the first place.
So when you go to a restaurant, you also go into the kitchen, review the origin seal of every ingredient and the tools used to prepare the dish, and perform a background check of each kitchen employee, before committing to anything on the menu...
I think the Uber Eats voucher was intended to be an internal thing, like “sorry everyone, we’ve had a rough Friday, probably things won’t be smooth again for a while, here, order yourself a pizza while we work this through together” rather than something to be offered to customers, and it leaked.
To a certain extent I get it: although the root cause was the CrowdStrike issue, the knock-on effects were exacerbated by Delta’s duct-tape-and-prayers existing IT services.
Makes it harder to accept responsibility for that when you’re being sued.
There is a pie chart of accountability, and some of it sits with Delta themselves. Microsoft carries some blame, but yes, the lion's share is carried by CS.
Delta took 6 days to recover from something as impactful as this, which is much worse than any other airline. So much so, the US Transportation Department is currently reviewing why it took so long.
It is vital that any critical infrastructure such as this has a low MTTR (mean time to recovery), which Delta clearly failed to plan and test for. The IMPACT that Delta felt therefore cannot be attributed solely to CS: CS did trigger it, but Delta failed to mitigate it.
Other airlines that were similarly affected by the CrowdStrike outage did not display the same prolonged issues as Delta.
CrowdStrike was clearly responsible for some of it. It’s pretty clear that Delta had other major issues that were also triggered by the CrowdStrike outage, which CrowdStrike wasn't responsible for, since other airlines didn’t face those issues.
> Other airlines that were similarly affected by the CrowdStrike outage did not display the same prolonged issues as Delta.
You raise a valid point, but imo the key question (to which I don't claim to have an answer) is whether Delta's extended problems are due more to their disorganization or simply to bad luck.
Here's a somewhat silly analogy: Imagine Kellogg's introduces a new ingredient in their cereals that triggers an adverse reaction only in people who have eaten sardines within the last hour. It would still be correct to hold Kellogg's entirely responsible for the adverse reactions experienced by those individuals. It wouldn't be a valid defense for Kellogg's to say, "Most people didn't experience any issues, so it can't be our product causing your adverse reactions".
I am looking forward to seeing how this goes in court, as it could be yet another step toward forcing companies into proper quality development workflows, and liability.
I'm hoping it results in ClownStrike being sued into bankruptcy and the whole company being dissolved, and the CEO never getting another job again except maybe as a janitor. They need to stand as an example of what not to do.
Delta's IT department is in for tough times ahead, considering these cases drag on for years.
Should Delta pursue this path, Delta will have to explain.. why CrowdStrike took responsibility for its actions—swiftly, transparently, and constructively while Delta did not.. Delta would have to preserve a series of documents, including those describing its information-technology infrastructure, IT business continuity plans and its handling of outages in the past five years
The crew location and status systems are all manual and phone-based. This is a chronic problem across many airlines.
When the system goes down it takes hours or days as crew all have to call in, wait on hold for hours and get their status back in the system.
If only some smart tech people gathered somewhere and someone could make a mobile app to allow crew to set their status and location instantly. They’d corner the market and save airlines billions.
We prosecute arsonists and look towards implementing better forest management to ensure future arsonists don't have as much impact. Just like with personal crime, we prosecute the perpetrators and teach potential victims how to better protect themselves from being in vulnerable positions. Delta isn't at fault for the failure but they do need to reevaluate their systems to ensure this kind of total system failure will be less likely.
Wait, was CrowdStrike seriously suggesting giving their staff physical access to Delta systems at airports across the world, having just nuked those same systems through incompetence or negligence... what....
If CrowdStrike isn't to blame, then who is?
Otherwise they were forced to NOT lock down the kernel at Vista time.
If push comes to shove, Delta can sue and/or stop using the product.
This is ultimately a question of contracts and liability limits, particularly if Delta secured consequential damages.
SaaS contracts are designed by default to NOT allow a customer to pursue consequential damages remedies.
https://en.m.wikipedia.org/wiki/Consequential_damages
This is a question of CrowdStrike’s Deal Desk contracting hygiene.
Deal Desks are the joint finance-legal-sales teams that work on enterprise contracts in scaled enterprise SaaS startups.
This is a SaaS CFO's nightmare.
But Delta for having terrible investment in modernizing their IT infrastructure.
This is very likely to be settled out of court or dropped once the CS outage falls out of the news cycle.
Society still blames the match based on recent legal outcomes, so Delta will probably win the argument.