This is so strange; this is exactly how you don't handle problems like this.
Write a blog post, explain what happened, explain who's affected and to what extent, explain if it can be fixed and what you're doing, and explain what you'll do to make sure it doesn't happen again.
Putting out a supposed hidden fix in the Drive for Desktop client, to see if it can recover files locally (?!) when the entire issue is files disappearing from the cloud, doesn't seem like it makes any sense.
Or if the problem really is solely with files that should have been uploaded but weren't, and nothing from the cloud ever actually got deleted, then explain and justify that as well -- because that's not what people are saying.
I don't understand what's going on at Google. If actual data loss occurred, trying to pretend it didn't happen is never the answer. As the saying goes, "the cover-up is worse than the crime". Why Google is not fully and transparently acknowledging this issue baffles me. The corporate playbook for these types of situations is well known, and it involves being transparent and accountable.
To preface, I do not intend to defend Google nor do I work with them or represent them.
That said, I have been in similar situations with large scale customers. It is hard. Some percentage of customers are pathological, and even after you fix their problem they refuse to stop spreading the rumors.
Once it’s fixed, I want all communication forward looking. Some percent of people are flat out insane, incompetent, or just assholes. Sometimes you have to lock the thread in order to stop a conversation about something that is already fixed.
Large scale customer bases are just a different beast. Once you experience it, you know what I mean. That doesn’t mean Google took the right path - only people with a comprehensive perspective can evaluate that, and I’m just some idiot on a forum who knows nothing about the specifics.
Of course. Doing the right thing in the moment is also hard. But that's the right thing. Google is famously under-communicative and opaque; locking a thread is par for the course.
Again, of course, their reputation loss doesn't show on their bottom line. (How would it? They let loose the whole CFO army, and we don't really have the convenience of a randomized trial.) But incidents like these are accumulating the kindling to slowly but surely chip users away from the behemoth.
If you lose customers their data, fail to recover it, and then want all communication to be "forward looking", you are either flat out insane, incompetent, or just an asshole.
Maybe the “right” grayhat/blackhat way to handle it is to use high-quality, convincing sock puppet accounts to manufacture consensus against the “conspiracy theorists”. It's not ethical, but it's the more effective alternative if you’re already at the point of locking threads where people continue to point out that you still haven't fixed the problem.
The part I quote below resonated with me. If you have an email I could reach you at, I would like to ask your opinion about how to handle a situation. It is very private.
> I have been in similar situations with large scale customers. It is hard. Some percentage of customers are pathological, and even after you fix their problem they refuse to stop spreading the rumors.
> Once it’s fixed, I want all communication forward looking. Some percent of people are flat out insane, incompetent, or just assholes. Sometimes you have to lock the thread in order to stop a conversation about something that is already fixed.
> Large scale customer bases are just a different beast. Once you experience it, you know what I mean. That doesn’t mean Google took the right path - only people with a comprehensive perspective can evaluate that, and I’m just some idiot on a forum who knows nothing about the specifics.
That’s what any normal company who is used to dealing with customers would do, but google isn’t that. Google is entirely unacquainted with the concept of “customer relations”. I’m half convinced that google-the-business sees customers as convenient peasants that purchase whatever it deigns to sell. The idea of supporting customers is basically antithetical to them: look at all the stories of people trying to get support for GCP as a great case in point.
Google, institutionally, still seems to have not realized that many users are now customers and not (or at least, in addition to) product. If you are receiving money from somebody, they are your customer, and you have a responsibility to provide at least vaguely good service.
And this is why no one should have anything to do with Google. If the last fifteen years haven't taught people that, well, there's just no hope for you. For everyone else: Stay away from their offerings as if they were the plague.
It is weird, it's off-putting, and it does come off as arrogant. Someone has convinced someone else that this lack of interaction is saving them something, or saving them from something. I think it costs them, but I have no data, so it's pure speculation. I know I wouldn't consider them for any serious service except maybe email at this point.
> I don't understand what's going on at Google. If actual data loss occurred, trying to pretend it didn't happen is never the answer.
They're not pretending it didn't happen though? As per the article they acknowledged it and published a help center article on it. They named the software versions affected (notifying the affected users seems impossible, since the entire problem was that the data had not been synced). Following the links in the help center article, during the incident they posted in a pinned article in the support forum (multiple times) on how to avoid triggering the bug and how to avoid making it worse.
That's pretty much what you wanted to see except for a blog post with an RCA, no?
> Or if the problem really is solely with files that should have been uploaded but weren't, and nothing from the cloud ever actually got deleted, then explain and justify that as well -- because that's not what people are saying.
So the suggestion is that in addition to the bug that they acknowledged, there's a totally different one that appeared at the same time affecting totally different functionality and with different symptoms, and that they're covering up despite not covering up the other bug? That seems like a complicated explanation when there's an obvious and simpler explanation around.
That's also the kind of thing that's pretty much impossible to prove categorically, let alone communicate the proof in a way that's understandable to the average user. What are you going to say? "We've checked really hard and can't confirm the reports"?
(I mean, I guess it's possible to do it. Collect 100 credible reports of files going missing that can reliably identify the supposedly missing file by name and creation date rather than say that it was probably a .doc file sometime in March. Then do an analysis on e.g. audit logs on what the reality is. How many files were never there at all? How many were explicitly deleted by the user? How many were accidentally uploaded to a user's work account rather than personal account? How many were still in the drive, and the user just couldn't find them? And yes, once you've exhausted all the possibilities, how many disappeared without a trace? Then publish the statistics. But while doing such an investigation privately to make sure whether there is a problem makes sense, publishing the results seems like a stunningly bad PR strategy even if no data was indeed lost.)
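If you wanted to sketch that triage in code (purely illustrative: the report fields, audit-event shape, and bucket names below are hypothetical, not anything Google actually exposes), it's basically a classifier plus a tally:

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical shapes for a user report and for server-side audit data;
# the real field names and systems are unknown here.
@dataclass
class Report:
    user: str
    filename: str
    created: str  # e.g. "2023-03-14"

def classify(report, audit_events, current_files):
    """Put one report into one of the buckets described above."""
    if (report.user, report.filename) in current_files:
        return "still in the drive, user just couldn't find it"
    events = [e for e in audit_events
              if e["user"] == report.user and e["filename"] == report.filename]
    if not events:
        return "never there at all"
    if any(e["action"] == "user_delete" for e in events):
        return "explicitly deleted by the user"
    if all(e["account"] != "personal" for e in events):
        return "accidentally uploaded to a work account"
    return "disappeared without a trace"  # the only bucket that indicates real loss

# Toy data just to show the tally; a real investigation would start from ~100 reports.
reports = [Report("alice", "thesis.doc", "2023-03-14"),
           Report("bob", "budget.xlsx", "2023-03-02")]
audit_events = [{"user": "alice", "filename": "thesis.doc",
                 "account": "personal", "action": "upload"}]
current_files = {("alice", "thesis.doc")}

print(Counter(classify(r, audit_events, current_files) for r in reports))
```

The hard part is obviously not the tally; it's getting reports precise enough to match against anything.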
From that help center article: "If you're among the small subset of Drive for desktop users on version 84 who experienced issues accessing local files that had yet to be synced to Drive"
Is "issues accessing local files" how anyone would describe deleting a user's local files?
Your question "why" is probably because you think google should know better. However, I am reminded of the post from a few weeks ago by the person who left google after decades of working there, who claims the culture has changed and attracted more incompetent corporate, political types.
EDIT: found it; not decades, but almost two: he left after working there 18 years.

https://ln.hixie.ch/?start=1700627373&count=1

See the HN comments: https://news.ycombinator.com/item?id=38381573
For google, the problem is either small enough that they're encouraging individuals to file small claims they'll gladly hand a check for, or big enough that google doesn't want to document shooting themselves in the foot.
I also think there's a long tail of Beavises out there, and you need to lock things down to stop the rumors.
Genuinely, I would love an answer from someone who believes in both "never talk to the cops" and "corporations should be open about their fuck-ups", explaining how they reconcile the two. For me they're the same side of the coin, but I'd enjoy being convinced otherwise.
I don't understand the point of Company hosting forums that aren't staffed by Company. Well I do. It doesn't help users at all. The only feature is Company's censorship. It's a hostile social hack on the user base who should be using a different forum host.
When I ask people if they have out-of-cloud backups of their data they look at me as if I'm mad. The cloud can't lose data. Until it does. And then what?
This makes me irate, in a way that’s hard to express.
They are not doing it wrong. What’s the threat model you’re saying people are not accounting for? eu-west-1 getting nuked?
“AWS” also isn’t a monolithic entity: for all intents and purposes AWS Backup is a separate vendor from AWS RDS, just with a unified billing and management pane.
I’d rather use that vendor, integrated with my AWS resources and managed with the same access controls, encryption, billing, etc that I use for everything else than ship it off to a random third party and maintain that connection.
Because the risk factor of multiple, isolated and separate AWS teams running different products with different infra having simultaneous large data loss incidents boils down to “nukes”.
So maybe people get irate in the same way as they might do with people who say stuff like “the cloud is a scam, why use it when you can host things on servers in a closet?”
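For what it's worth, the "separate vendor, same control plane" framing maps pretty directly onto how AWS Backup is actually driven. A rough boto3 sketch; the vault name, role ARN, and database ARN are placeholders, and error handling is omitted:

```python
import boto3

backup = boto3.client("backup")

# The vault is its own resource with its own access policy, so the backups can be
# locked down separately from the RDS instance they protect.
backup.create_backup_vault(BackupVaultName="prod-db-vault")

plan = backup.create_backup_plan(
    BackupPlan={
        "BackupPlanName": "daily-rds",
        "Rules": [{
            "RuleName": "daily",
            "TargetBackupVaultName": "prod-db-vault",
            "ScheduleExpression": "cron(0 5 * * ? *)",  # 05:00 UTC every day
            "Lifecycle": {"DeleteAfterDays": 35},
        }],
    }
)

# Point the plan at the database; same IAM, same billing, no third party to babysit.
backup.create_backup_selection(
    BackupPlanId=plan["BackupPlanId"],
    BackupSelection={
        "SelectionName": "prod-db",
        "IamRoleArn": "arn:aws:iam::123456789012:role/AWSBackupDefaultServiceRole",
        "Resources": ["arn:aws:rds:eu-west-1:123456789012:db:prod-db"],
    },
)
```

It's still one AWS account and one billing relationship at the end of the day, which is exactly the coupling the "your backups are also on AWS" objection is poking at.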
I think it is generally fine to not have out-of-cloud backups of data as long as you still have the primary copy locally (as opposed to data being only in the cloud), so you're screwed only if the cloud provider loses data the same day as you happen to lose your local data.
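Back-of-the-envelope, with completely made-up daily loss rates and an independence assumption, the "same day" coincidence is the only scary case:

```python
# Made-up illustrative rates; assumes the two failures are independent.
p_local_daily = 1 / 1_500     # laptop dies or is stolen roughly once every ~4 years
p_cloud_daily = 1 / 100_000   # provider loses your data on any given day (pure guess)

p_both_same_day = p_local_daily * p_cloud_daily
years_between = 1 / (p_both_same_day * 365)

print(f"P(both on the same day) = {p_both_same_day:.1e}")   # ~6.7e-09
print(f"Expected once every ~{years_between:,.0f} years")    # roughly 411,000 years
```

The real caveat is that the window is only "one day" if you notice the local loss and re-download promptly.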
I like the expression, "Your data doesn't exist unless it exists in three places, and at least one of those places should be under your direct control."
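In script form that rule is not much more than the following (the paths are placeholders; assumes a mounted external disk, a cloud-synced folder, and Python 3.8+):

```python
import shutil
from pathlib import Path

SOURCE = Path.home() / "Documents"              # copy 1: the working data
TARGETS = [
    Path("/mnt/external-drive/backup"),          # copy 2: a disk you physically control
    Path.home() / "CloudSync" / "backup",        # copy 3: whatever cloud folder you trust
]

for target in TARGETS:
    target.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets repeated runs refresh the copy in place.
    shutil.copytree(SOURCE, target / SOURCE.name, dirs_exist_ok=True)
    print(f"refreshed {target / SOURCE.name}")
```

In practice you'd reach for rsync, restic, Time Machine, etc. rather than copytree, but the shape is the same: one copy you work on, one on hardware you own, one somewhere else.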
I'm currently battling their support. Tried opening a Play account to publish an app to the app store but missed an email to verify my identity. Now the link in the email no longer works because "Google couldn't verify your identity" as I've had "too many tries" and my account is restricted meaning I can't publish the app.
Support just repeats the same things back that I've had too many tries and my account is restricted and I can't get a refund.
It's woeful how bad the support is to get such a simple thing sorted out. Don't miss that email if setting up a developer play account!
(If anyone can fix this my developer play account ID is: 7827257533299144892)
I like the text at the bottom of the page if you don't have javascript enabled:
> Hey NoScript peeps, (or other users without Javascript), I dig the way you roll. That's why this page is mostly static and doesn't generate the list dynamically. The only things you're missing are a progressive descent into darkness and a happy face who gets sicker and sicker as you go on. Oh, and there's a total at the bottom, but anyone who uses NoScript can surely count for themselves.
> Rock on!
I feel this is increasingly becoming the era of the NAS.
A lot of the time when I go looking for shows or movies, they're no longer offered on the same service, if quite literally at all. Many of my liked YouTube videos are now just [deleted].
Not to mention any data you store in the cloud when engineer(s) experience career-altering events.
Most Americans don't have enough data to warrant a dedicated NAS. And even if they do, the cost is a major dealbreaker: $400+ including drives is way more than most would ever consider spending on computer hardware.
So last time I saw this discussed on HN [1], people said there were tape backups of all these files and that they would be fully recovered. Some people said they worked adjacent to this tape backup system. I'm inclined to believe they weren't lying and that the tape backup system does exist.
So why weren't they able to recover from tape? Is the tape backup more limited than people reported, and this data wasn't backed up? Was it just too difficult and expensive to scan the tapes and decide which was the canonical version of each file?

[1] https://news.ycombinator.com/item?id=38427864
Are we sure this is limited to Drive? The Ars article mentions that some users reportedly lost data without using the desktop app at all, which seems to imply that (one of?) the bugs was inside Google's infra.
I wonder if they might have suffered some invisible data corruption issue in Colossus or whatever they use now, and the effects on Drive just happen to be the most visible. Though presumably whatever broke wasn't part of GCP or we would have noticed by now, right?
> Are we sure this is limited to Drive? The Ars article mentions that some users reportedly lost data without using the desktop app at all, which seems to imply that (one of?) the bugs was inside Google's infra.
Seems much more plausible that there's something wrong with the backend code for google drive (the product).
Hard to tell. Google does lose data, but they aren't going to own up to it.
Anecdotally, I use a different Google Drive client (Syncdocs) and haven't lost anything. However, I don't know what % of users are affected.
Seriously. The system in question is a distributed one with multiple participants, and the resolution, which is gated to specific versions of the Windows sync client, strongly suggests the bug is in that client.
Supply-side economics says this is what we supply, you buy. That's how it works.
Whereas in business, your public and private statements determine your entire company image.
Statements made by companies in public places cannot “only be held against them”. It’s completely different.
Is this monopolistic behaviour by the book?
Since they struggle to innovate, they use the strategy of cost cutting, which means moving development and support (if any) to low-cost countries.
Low-cost countries mean low-cost-country standards, both in code quality and in handling the disaster after the code breaks.
We used to run an ad on reddit that read something like:
"Your infra is on Amazon AWS and your backups are on Amazon AWS ... you're doing it wrong ..."
... and we had to stop because it made people angry.
They were quite irate and combative at the very notion that there was any non-zero risk whatsoever at AWS.
1) Western Digital (MyBook I think was the name?) with built-in HD -- notoriously buggy and unreliable pieces of junk
2) Then just... no NAS, as cloud seemed to be the answer
3) Now, Synology NAS enclosures and their custom OS (DSM) exist, which is just a dream of ease-of-use
If there is a new era of NAS, I'd say it's single-handedly being enabled by Synology, which is really quite remarkable.
Synology is cool but too limited in what it can do for a home server, IMO.
Charitably, the whole system is on tape and piecemeal restoration is impossible.
This is why professionals practice their restoration protocols regularly.
Not to imply that it's reasonable to treat customers of commodity SaaS as professionals.
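The cheapest version of "practice your restores" is just proving, with checksums, that what comes back matches what went in. A sketch with placeholder paths:

```python
import hashlib
from pathlib import Path

def manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 so two trees compare cleanly."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }

original = manifest(Path("/data/documents"))      # what was backed up
restored = manifest(Path("/tmp/restore-drill"))   # what the restore actually produced

missing = original.keys() - restored.keys()
corrupted = {p for p in original.keys() & restored.keys() if original[p] != restored[p]}

print(f"{len(missing)} missing, {len(corrupted)} corrupted, out of {len(original)} files")
```

Run it against an actual restore, not against the backup target, or it proves nothing.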