Posted by u/xstartup 8 years ago
Ask HN: How are you implementing GDPR-compliant soft deletes?
The idea: a customer requests account or item deletion, and you set it to "deleted" in the DB without actually deleting it; it helps for documentation purpose should the dispute arise over some issue in future.
snowwolf · 8 years ago
Soft deletes as you describe aren't allowed under GDPR.

But a possible solution may be to disassociate the data from the customer (as long as the data itself isn't considered 'personal data'). For example, if the reason the data falls under GDPR is because it is connected to the customer's email address, you could clear the email address. But that wouldn't let you ever re-associate it. But you could maybe (but don't take my word for it) one-way hash the email address (bcrypt, etc.) and, if a dispute arises in the future, scan the hashes for a match against the email address of the person raising the dispute in order to re-associate the data.

samsari · 8 years ago
My understanding, as someone implementing the GDPR-compliance for my company right now, is that if you could produce the same one-way hash a second time from the same input email address then the hash is still considered PI.
snowwolf · 8 years ago
Fair point.

I can see the purpose of "right to be forgotten", but I think in some circumstances it is going to be abused. Any service that "bans" users for fraudulent/abusive activity and stores data about the banned user to prevent them from creating new accounts is going to have a problem. Banned user can just request to be forgotten and then create a new account. Unless there is some exception within GDPR that will support this use case.

tyfon · 8 years ago
Yep, we've been through this discussion where I work just a few days ago and one way hashing even with salt is _not_ compliant as you can search for whatever you hashed and get a hit (SSNs, emails etc).
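
To make the objection concrete, here is a minimal sketch of why a salted one-way hash stays searchable. Everything in it is an assumption for illustration: the record layout, the `salted_hash` and `find_records` helpers, and the use of PBKDF2 with a low iteration count (chosen only so the demo runs quickly).

```python
import hashlib
import hmac
import secrets


def salted_hash(email: str, salt: bytes) -> bytes:
    """One-way hash of an email with a per-record salt (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", email.lower().encode("utf-8"), salt, 10_000)


# Hypothetical "forgotten" records: the email is gone, only salt + hash remain.
records = []
for order_id, email in [(1001, "alice@example.com"), (1002, "bob@example.com")]:
    salt = secrets.token_bytes(16)
    records.append({"order_id": order_id, "salt": salt, "email_hash": salted_hash(email, salt)})


def find_records(candidate_email: str, records: list) -> list:
    """Re-identify records by recomputing the hash with each stored salt."""
    return [
        r for r in records
        if hmac.compare_digest(salted_hash(candidate_email, r["salt"]), r["email_hash"])
    ]
```

Anyone who holds the table and a candidate email can call `find_records("alice@example.com", records)` and get Alice's order back, which is the "search for whatever you hashed and get a hit" problem: the salt slows bulk guessing but does not stop a targeted lookup.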
pc86 · 8 years ago
Do you know how you are supposed to handle disputes in the future? If I ask that all my information be deleted and I say n months later I was charged for something I never received, how does the company disprove that?
politelemon · 8 years ago
This is the guidance we have been given as well.

It'll depend on the kind of data you're talking about. If you are obligated to keep that data for legal purposes then there's no working around that. But then make an effort to only keep exactly the data you need.

However in the context you're asking - absolutely do delete where possible. If not possible then pseudo-anonymization is a way of dealing with this.

eb0la · 8 years ago
Consult your Data Protection Officer first.

GDPR says you must delete information about the customer; but there are cases where you still might need to have that data available.

If your customer can interact with another one inside your app/platform, he/she can commit a crime, and you might be required by court (and by law) to disclose some information (even conversations! inside the platform).

Setting something to "deleted" might not be the best way to do the actual soft delete.

Sometimes you can "delete" that user moving it to a separate part of an LDAP branch (where nobody except someone with authority can access).

In other cases, you can add the "deleted" flag on the table. If so, MAKE SURE your app accesses the data from a view of the table where the 'deleted' users are not present. Even better: partition the underlying table based on the "deleted" field to physically separate active and deleted users.
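
A rough sketch of the flag-plus-view idea, using SQLite only so that it runs anywhere. The table name `users_raw`, the view name `users`, and both helper functions are made up for this example; partitioning is not shown because SQLite has no equivalent. Note that this only hides rows from the application and does not erase anything.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Underlying table: only privileged code should touch this directly.
    CREATE TABLE users_raw (
        id      INTEGER PRIMARY KEY,
        email   TEXT NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0
    );
    -- The application reads from this view, which never shows flagged rows.
    CREATE VIEW users AS
        SELECT id, email FROM users_raw WHERE deleted = 0;

    INSERT INTO users_raw (id, email) VALUES (1, 'alice@example.com');
    INSERT INTO users_raw (id, email) VALUES (2, 'bob@example.com');
""")


def soft_delete(conn: sqlite3.Connection, user_id: int) -> None:
    """Flag a user as deleted; the row itself stays in users_raw."""
    conn.execute("UPDATE users_raw SET deleted = 1 WHERE id = ?", (user_id,))
    conn.commit()


def visible_emails(conn: sqlite3.Connection) -> list:
    """What the application sees when it queries the view."""
    return [row[0] for row in conn.execute("SELECT email FROM users ORDER BY id")]
```

After `soft_delete(conn, 1)`, `visible_emails(conn)` no longer includes Alice, while a query against `users_raw` still returns her row.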

But whatever you do, ask your Data Protection Officer first.

jacquesm · 8 years ago
I'm going to go out on a limb here and guess that 99% of the companies out there affected by the GDPR and the OP in particular do not have a DPO (yet), and may not realize they need one, and even if they do know that then they likely won't be able to fill the seat either in time or with someone competent.

Every year we look at quite a few companies, this is the first year that I've spotted a DPO in the wild, and impressively, they even knew their stuff.

ust · 8 years ago
Not every company needs a DPO though, e.g. check here:

https://www.eugdpr.org/key-changes.html

Maybe his company doesn't need one. Of course, whether he has a DPO or not, still the question remains of how to "properly" delete the personal data.

roel_v · 8 years ago
Is there anyone reading this whose company has a DPO already? Is it an internal or external person? How technical are they? I'm a developer and I have a law degree; would that put me in an advantageous position to become one? Is there a market for 'consulting DPO's', like companies hire accountants, if that's allowed? Or do the big consultancy firms have the GDPR market cornered already? I wouldn't want to go in a direction where I would become what today's 'security auditors' are: going through a checklist of mostly irrelevant topics and drumming up a list of 'recommendations' that usually aren't relevant or misunderstand the situation, while nobody cares anyway because it's all just busywork to get 'certified' for this or that (or insurance requires it). But if it would be actually working with technical teams on questions like this, that would be interesting.
jacquesm · 8 years ago
> Is there anyone reading this whose company has a DPO already?

I've seen one in all of 2017 (out of ~20 companies).

> Is it an internal or external person?

In that case it was internal

> How technical are they?

More legal than technical, but that's a very small sample.

> I'm a developer and I have a law degree; would that put me in an advantageous position to become one?

Yes. In fact that's probably one of the most lucrative combinations of fields.

> Is there a market for 'consulting DPO's', like companies hire accountants, if that's allowed?

YES! In fact if you are halfway decent this would be an extremely lucrative thing to do, but it probably will become less so over time as the knowledge gets diffused.

raptorcomp · 8 years ago
The answer varies depending on the size of the company. However, I have seen many IT-related professionals taking over the GDPR issues. In larger companies it is a more legal role. As to your career question: we are currently involved in many different and exciting projects that push the boundaries of law and technology with respect to data protection. Currently, everyone is a GDPR consultant, but the quality and nature of the work differ substantially. For the most part it is an exercise in producing documents and procedures to prove compliance. However, when it comes to implementing technical solutions you can really stand out. So if this is an area that you are interested in, you should move fast. There is also a growing number of international opportunities, as even non-EU companies require GDPR experts.
yread · 8 years ago
My employer (3000 employees, medical research) has an internal DPO. I only dealt with him once but was impressed with his technical as well as legal knowledge.
outsideoflife · 8 years ago
Yes, but it was a token gesture to DPA and in reality

1) It was not their main role

2) It was a token gesture, someone to address post to!

Every consulting firm I know is trying to sell GDPR services atm...

kyriakos · 8 years ago
As far as I know the data must be properly deleted not just marked as deleted. Essentially you can zero out all fields which include personal information. Keeping the data as soft deleted you still risk it in case of a leak due to a hack.
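
A minimal sketch of zeroing out the personal fields while keeping the row, again with SQLite so it runs as-is. The schema, the `PERSONAL_COLUMNS` tuple, and the function names are all hypothetical; which columns actually count as personal data is a decision for whoever owns the schema.

```python
import sqlite3

# Hypothetical list of columns that hold personal data in the users table.
PERSONAL_COLUMNS = ("name", "email")


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with one user and one order."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (
            id     INTEGER PRIMARY KEY,
            name   TEXT,
            email  TEXT,
            erased INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            user_id     INTEGER NOT NULL,
            total_cents INTEGER NOT NULL
        );
        INSERT INTO users (id, name, email) VALUES (1, 'Alice', 'alice@example.com');
        INSERT INTO orders (id, user_id, total_cents) VALUES (10, 1, 2500);
    """)
    return conn


def scrub_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Overwrite every personal column with NULL and mark the row as erased."""
    assignments = ", ".join(f"{col} = NULL" for col in PERSONAL_COLUMNS)
    conn.execute(f"UPDATE users SET {assignments}, erased = 1 WHERE id = ?", (user_id,))
    conn.commit()
```

The order row survives with its `user_id`, so totals and counts still add up, but nothing in the `users` row identifies the person any more. Logs, caches, and backups are outside what this sketch touches.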
Ezku · 8 years ago
> it helps for documentation purpose should the dispute arise over some issue in future.

If you are required to hold on to the data for legal purposes such as dispute settlement, there is no issue. The customer can request you delete such data but you have no obligation to do so. Issues arise when holding on to the data is no longer "necessary". At that point soft deletion is not enough and you must be able to remove personal data regarding that customer from your systems. Source: IP lawyer I talked with last week on this.

From what I can tell, one good way of going about this in case you really do not want to throw away data, such as when you're doing event journaling, would be to have all personal data encrypted. When the data reaches its expiration point or is requested for deletion, throw away the decryption key and all you're left with is the metadata which you are allowed to hold on to for purposes of running your business.

EDIT: This concept is called 'cryptographic deletion'. Here's one whitepaper on the subject. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.397....
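
A toy sketch of the idea, assuming one key per user and an append-only journal. The class and method names are invented for this example, and the cipher is a stand-in built from HMAC-SHA256 in counter mode purely so the code runs on the standard library; a real system would use a vetted authenticated cipher such as AES-GCM and a proper key management service.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key + nonce (HMAC-SHA256, counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


class CryptoShredStore:
    """Append-only journal whose personal payloads are encrypted per user."""

    def __init__(self):
        self.keys = {}    # user_id -> key; the ONLY place keys are kept
        self.events = []  # (user_id, nonce, ciphertext, metadata), never rewritten

    def append(self, user_id: int, personal_data: str, metadata: str) -> None:
        key = self.keys.setdefault(user_id, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)  # fresh nonce per event, so the keystream never repeats
        plaintext = personal_data.encode("utf-8")
        stream = _keystream(key, nonce, len(plaintext))
        ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
        self.events.append((user_id, nonce, ciphertext, metadata))

    def read(self, index: int):
        """Return (personal_data or None, metadata) for one journal entry."""
        user_id, nonce, ciphertext, metadata = self.events[index]
        key = self.keys.get(user_id)
        if key is None:
            return None, metadata  # key was shredded: payload is unreadable
        stream = _keystream(key, nonce, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode("utf-8"), metadata

    def forget(self, user_id: int) -> None:
        """'Delete' a user by discarding their key; the journal is left untouched."""
        self.keys.pop(user_id, None)
```

The scheme is only as good as the key handling: if the key store is itself backed up or replicated, every copy of the key has to go too.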

andriesvh · 8 years ago
Are you sure that throwing away encryption keys is sufficient to be GDPR compliant? Does this come from the IP lawyer as well?
rbf · 8 years ago
If the information can no longer be decrypted, it would no longer be considered personal data.
Ezku · 8 years ago
That's a good thing to point out, thanks. No, I'm afraid that bit is my own speculation.
zimpenfish · 8 years ago
I don't think "soft deletes" are actually allowed, are they?

https://gdpr-info.eu/art-17-gdpr/

> the controller shall have the obligation to erase personal data without undue delay

raptorcomp · 8 years ago
Some comments on the legal aspect of deletion under the GDPR: 1. Deletion can generally only be requested if the personal data is being processed on the basis of the individual's consent. Thus, other personal data, such as data processed under legitimate interest or for the execution of a contract, does not fall under it. 2. The rule on deleting the data is not absolute, as data retention laws prevail over this rule. Thus, only if no data retention law mandates the storing of the data (which is often the case for business communication) are you obliged to delete the data or anonymize it. Hope this clarifies the non-technical aspect. Dominic Staiger https://www.raptorcompliance.com/en
apw808 · 8 years ago
OK. Here goes... Advice...

Firstly, download the regulation itself. At the very least, read article 17. Article 17 concerns something called "The right to be forgotten". It is about 25 lines of legalese. Once you have read it, have a think about it. Then read it again and have a longer think about its implications. Then, just to be sure you have not gone stark staring bonkers, read the rest. Carefully. Be under no illusion, the GDPR is a game changer in information management terms, let alone anything else.

jjoergensen · 8 years ago
Deletion of backup-data is also an interesting topic
jakozaur · 8 years ago
The law itself was written by someone unaware of that. A lot of interpretations:

1. The most extreme, go back to all of your backups and delete them too.

2. You don't need to do anything, if you do not touch the backups and truly treat them for disaster recovery.

3. Your backups need to have reasonable retention (e.g. two years) and a way to re-apply past erasure requests after recovery.

4. A lot of in between.

5. My personal interpretation is that in the first year of GDPR there will be so many companies that are not even trying to be compliant that any company showing reasonable effort will just be left alone, and at worst hear some recommendations. Of course ad-tracking companies might get screwed, but their business model seems to be incompatible with GDPR.

Also, the right to erasure can be tricky (e.g. what if you keep records for support/warranty purposes?). What should you do if someone exercises their right to be forgotten and then asks you for a refund?

ZenoArrow · 8 years ago
In what world is two years' worth of backups required for reasonable retention? Either the backups are tiny or the company involved has got more money than sense. I'd see no reason, in any company, for backups to be held for longer than 6 months, and that would be an outside estimate (many companies could get away with only having a couple of months' worth of backups).
raptorcomp · 8 years ago
Answer to point 5: On first glance I would agree with this view, however, there is the factor of market competition you must take into account. If a company only receives a small fine for non-compliance (or is not prosecuted) then its competitor can make the argument that this is anti-competitive conduct as the non-compliant company has saved money through its non-compliance and the fine does not stand in relation to the money saved. Through this argument the fines could increase significantly over a very short timeframe placing great pressure on companies to observe the GDPR. As the money goes to the data protection authorities their ability to prosecute will grow steadily.
zimpenfish · 8 years ago
I believe (not a lawyer, etc.) that you don't have to delete from existing backups as long as you have a process to immediately wipe the customer details again if that backup is restored (and before any other processing can happen.)

[edit to clarify that you wipe the customer details]

dijit · 8 years ago
The way I'm handling it (probably non-compliant) is to keep a list of the internal keys we use for people, adding to it as GDPR requests come in.

If I restore a backup, it will go via this list and ensure that content in my backups which is keyed to deleted accounts is never restored.

In theory we have the data, but it's never reachable by internal systems. -- Anything else is essentially compromising the integrity of a backup.
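
Roughly, that restore path could look like the sketch below. The `ErasureLog` class, the row shape (dicts with a `user_id` field), and `restore_from_backup` are assumptions for the example and not a description of the actual system; the one hard requirement is that the list of erased keys lives outside the backups it is meant to filter.

```python
class ErasureLog:
    """Internal keys of people who asked to be forgotten (stored outside the backups)."""

    def __init__(self):
        self._erased_ids = set()

    def record(self, user_id: int) -> None:
        self._erased_ids.add(user_id)

    def is_erased(self, user_id: int) -> bool:
        return user_id in self._erased_ids


def restore_from_backup(backup_rows: list, erasure_log: ErasureLog) -> list:
    """Return only the rows that may be restored; erased users are skipped.

    The backup itself is not modified, so its integrity is preserved.
    """
    restored = []
    for row in backup_rows:
        if erasure_log.is_erased(row["user_id"]):
            continue  # keyed to a deleted account: never reaches live systems
        restored.append(row)
    return restored
```

For example, with a backup containing users 1 and 2 and an erasure request recorded for user 1, only user 2's rows come back from `restore_from_backup`.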

jacquesm · 8 years ago
Spot on. There are two major issues with the law as written, backups and conflicts with (possibly local) data retention laws, so right now the local data retention laws will likely take precedence and backups are not going to be in-scope until a lot of low hanging fruit has been plucked.
Kiro · 8 years ago
I won't touch my backups even if it means my company is killed by fines or I go to jail. Still worth it just to refuse submitting to this nonsense.
jacquesm · 8 years ago
It's not nonsense, it just isn't quite put down in a way that is practical, on top of that it just makes demands and does not even begin to give guidance on how to comply with those demands which does not help for smaller companies.
unkoman · 8 years ago
We're building a consent framework API so our customers can consent for personal data use. Data is then cleaned and transformed (ETL) from personally identifiable to pseudo-anonymised. The data is also separated into two separate encrypted storages for anonymous and pseudo-anonymised data for generalisation and separation. Random (important) hashed identifiers are created and put into a metadata service which is used as a lookup-table. If right to be forgotten is invoked, the data is disassociated from the pseudo-anonymised and personally identifiable data thus making it anonymous.
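
As a very simplified sketch of the lookup-table part only (not the actual framework): a mapping service hands out random identifiers, stored data carries only those identifiers, and forgetting a user means dropping their mapping entry. The class and method names are hypothetical, and whether the leftover data is truly anonymous depends on what else it contains.

```python
import secrets


class PseudonymService:
    """Central lookup table between real user ids and random pseudonyms."""

    def __init__(self):
        self._token_by_user = {}  # user_id -> random token
        self._user_by_token = {}  # random token -> user_id

    def token_for(self, user_id: int) -> str:
        """Return the user's pseudonym, creating a random one on first use."""
        token = self._token_by_user.get(user_id)
        if token is None:
            token = secrets.token_hex(16)  # random, so it cannot be recomputed from the user id
            self._token_by_user[user_id] = token
            self._user_by_token[token] = user_id
        return token

    def user_for(self, token: str):
        """Re-identify a pseudonym, or return None if the link was removed."""
        return self._user_by_token.get(token)

    def forget(self, user_id: int) -> bool:
        """Right to be forgotten: drop the mapping so stored data can no longer be linked."""
        token = self._token_by_user.pop(user_id, None)
        if token is None:
            return False
        del self._user_by_token[token]
        return True
```

Because the identifier is random rather than derived from the email or user id, it cannot be regenerated from the original input once the mapping entry is gone.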

How you handle data analytics is also important, which is why we're placing tight restrictions on raw data. Analytics can only be done through an analytics service, which gives employees access to only the parts of the data that are approved for the use case. We're using Apache Sentry for fine-grained role-based authorisation to data and metadata, and a directory service for user auth.

Things we've learned:

* Minimise data usage

* Don't use personally identifiable data

* You will need to be able to prove consent when it comes to data usage and it cannot be consent by default, it has to be opt-in

* Log all data access so that use cases can be proved. This needs to be evaluated and audited

* Encrypt in transit and at rest

* Centralise mapping for all data