I seem to recall that the white paper speaks of a "visual derivative" without specifying it further.
>The decrypted vouchers allow Apple servers to access a visual derivative – such as a low-resolution version – of each matching image.
https://www.apple.com/child-safety/pdf/Security_Threat_Model...
A threshold secret sharing scheme is used to drip-feed Apple the key: each time a positive match occurs, Apple receives one more share of your key. Below the threshold those shares reveal nothing usable. Once the threshold is reached, Apple has enough shares to recover your encryption key and can use it to decrypt all your matching thumbnails at once.
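For anyone curious what threshold secret sharing looks like in practice, here is a minimal sketch of Shamir's scheme in Python. To be clear, this is not Apple's implementation: the field modulus, the function names make_shares and recover_secret, and the numbers in the usage note are all assumptions made up for illustration.

```python
import secrets

# Hypothetical field modulus for illustration (the Mersenne prime 2^127 - 1);
# not a parameter taken from Apple's documentation.
PRIME = 2**127 - 1


def make_shares(secret, threshold, count):
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest coefficient first
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With a threshold of 30, calling recover_secret on any 30 of the shares returns the original key, while 29 shares interpolate to an unrelated value. Each "match" in this toy model corresponds to handing over one more (x, y) pair.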
https://www.apple.com/legal/privacy/law-enforcement-guidelin...
(Note: I have worked with law enforcement in the past, specifically on a case involving Apple and two iCloud accounts. You submit a PDF of the valid warrant to Apple. Apple sends two emails: one with the encrypted iCloud data, and a second with the decryption key.)
To me it's pretty clear they are doing the absolute minimum possible to keep Congress from regulating them into a corner where they lose decision-making control over their own privacy standards. The system they came up with is their answer for doing it in the most privacy-conscious way (e.g. not decrypting user data in iCloud) while balancing a lot of other threat model details, such as what happens if CSAM-hash-providing organizations supply image hashes for a burning American flag, and lots of other scenarios outlined in the white paper.
Surely that's just the data, but resized?
But I'm unsure that the thumbnail is included with every CSAM "voucher" -- it's likely only accessible once you pass the 30-image threshold. Need to read that section more carefully.
According to Apple, only images that will be uploaded to iCloud will be scanned.
If this is the case, there is zero reason to scan locally; you can just scan the uploaded image once it is on the server.
Apple has not implemented E2E encryption, nor has it released a statement indicating that it will be implemented in the future.
In summary, I’m guessing they tried to invent a scheme in which their server software never has to decrypt and analyze the original photos, so the photos stay encrypted at rest.
The phrase "not stealing" is almost exclusively used in this context on HN: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
Btw a lot of words in English have multiple meanings, and transform meaning over time, which can be confusing sometimes. For example, in baseball you steal a base, which was being protected by the other team, but you don’t remove the base from the field and run off with it.
I think "steal" works better than "copy" here: it more accurately conveys the meaning, the intention, and the unjust access.