I feel like phone number lookup is the textbook example of homomorphic encryption not actually working because there's so few keys you can simply enumerate them.
I think here the query exposes who called who, which isn't as enumerable. By encrypting the query homomorphically on the client, the answering service has no knowledge of what number the lookup is for, and so Apple can't build a database of who calls you.
It includes both numbers? That wasn't clear. It sounded like they're just looking up the calling number for fancy caller id. How does the recipient affect the query?
I'm not sure what enumeration attack you have in mind, but if you were to encrypt the same value many times you would not get the same ciphertext under most schemes.
Edge's Password Monitor feature uses homomorphic encryption to match passwords against a database of leaks without revealing anything about those passwords: https://www.microsoft.com/en-us/research/blog/password-monit... So not the first, but definitely cool to see more adoption!
I tried to look homomorphic encryption up casually earlier this year. I saw references that it was being used, but I don’t think they said where.
This is one topic I have a very hard time with, I just don’t know enough math to really grok it.
It just seems crazy a system could operate on encrypted data (which is effectively random noise from the server’s point of view) and return a result that is correctly calculated and encrypted for the client, despite never understanding the data at any point.
I sort of understand the theory (at a very simple level) but my brain doesn’t want to agree.
Maybe it’s the fact it can be done with multiple operators and strong encryption that is hard to grok, but at least here is a very simple example of a limited partially homomorphic encryption:
You have a 7-bit character representation (e.g. ASCII) and your encryption is to add 1 mod 128. E.g. 0 -> 1, 1 -> 2, ... 126 -> 127, 127 -> 0.
As it turns out, all your operations can be represented as adding or subtracting constants. You can now encrypt your data (+1), send it to a remote server, send all the adding and subtracting operations, pull back the processed data, decrypt the data (-1).
Of course, this example is neither useful encryption nor generally useful operation, but can be useful for grokking why it might be possible.
Let's say I want you to add two numbers, but I don't want you to know what those numbers are, nor what the result is. What I can do is multiply both numbers by some other number you don't know. I then give you the premultiplied numbers, you add them, and give back a premultiplied answer. I can then divide out the number to get the true result.
What we've done here is this:
(a * key) + (b * key) = (c * key)
The rules of elementary algebra allow us to divide out the key on both sides because of a few useful symmetries that addition and multiplication have. Namely, these two equations are always the same number:
(a + b) * key = (a * key) + (b * key)
This is known as the distributive property. Normally, we talk about it applying to numbers being added and multiplied, but there are plenty of other mathematical structures and pairs of operations that do this, too. In the language of abstract algebra, we call any number system and pair of operations that distribute like this a "field", of which addition and multiplication over real[0] numbers is just one of.
A simple example of a field that isn't the normal number system you're used to is a 'finite field'. To visualize these, imagine a number circle instead of a line. We get a finite field by chopping off the number line at some prime[1] number that we decide is the highest in the loop. But this is still a field: addition and multiplication keep distributing.
It turns out cryptography loves using finite fields, so a lot of these identities hold in various cryptosystems. If I encrypt some data with RSA, which is just a pair of finite field exponents, multiplying that encrypted data will multiply the result when I decrypt it later on. In normal crypto, this is an attack we have to defend against, but in homomorphic crypto we want to deliberately design systems that allow manipulation of encrypted data like this in ways we approve of.
[0] Also complex numbers.
[1] Yes, it has to be prime and I'm unable to find a compact explanation as to why, I assume all the symmetries of algebra we're used to stop working if it's not.
Second: Google Recaptcha Enterprise can use Homomorphic Encryption to check whether your password has been compromised (searching the set of all breached passwords without disclosing which individual password you want to check)
Now, in practice, HaveIBeenPwned does the exact same thing with a k-anonymity scheme based off of MD5 collisions, which is wayyyy easier in practice and what most people actually deploy, but the Google thing is cool too.
FTA: “Live Caller ID Lookup uses homomorphic encryption to send an encrypted query to a server that can provide information about a phone number without the server knowing the specific phone number in the request”
So, this would require a distributed Secure Enclave or one of them on Apple’s server communicating with one on an Apple device (likely, certainly over time, with lots of different Apple devices fo lots of different iCloud accounts)
I don't see why it would? IIUC, the promise of homomorphic encryption is that I can encrypt my database of contacts and send it to an untrusted server, later send the encrypted query to that untrusted server, and get back an encrypted response, without the server being able to tell anything that couldn't be told from the wire (some bounds on how much data, timing of communication, that sort of thing) or provide an incorrect answer.
What is the processing that the server does on the encrypted phone number? I am not sure I understand. I always thought that this type of encryption was (roughly and imprecisely) - you send some encrypted blob to the server, it does some side effect free number crunching on the blob and returns the output blob. You decrypt the blob and everyone is happy.
But to return information if some number is spam it has to be either plaintext or hashed condition somewhere outside of the phone?
The "side effect free number crunching" in this case is: is <encrypted_phone_number> in <set_of_encrypted_bad_numbers>
You're on the right track with the idea of hashing -- I find it helpful to explain any fancy encryption scheme beginning with "if it were just hashing", then extend to "well this is a very fancy kind of hash", and <poof> now I kind of understand what's going on. Or at least it's no longer magic.
As far as I'm aware homomorphic encryption can keep even a single bit safe, but maybe I missed something.
more like, "move a computation into an progressed, but still unknown, state"
This is one topic I have a very hard time with, I just don’t know enough math to really grok it.
It just seems crazy a system could operate on encrypted data (which is effectively random noise from the server’s point of view) and return a result that is correctly calculated and encrypted for the client, despite never understanding the data at any point.
I sort of understand the theory (at a very simple level) but my brain doesn’t want to agree.
You have a 7-bit character representation (e.g. ASCII) and your encryption is to add 1 mod 128. E.g. 0 -> 1, 1 -> 2, ... 126 -> 127, 127 -> 0.
As it turns out, all your operations can be represented as adding or subtracting constants. You can now encrypt your data (+1), send it to a remote server, send all the adding and subtracting operations, pull back the processed data, decrypt the data (-1).
Of course, this example is neither useful encryption nor generally useful operation, but can be useful for grokking why it might be possible.
What we've done here is this:
(a * key) + (b * key) = (c * key)
The rules of elementary algebra allow us to divide out the key on both sides because of a few useful symmetries that addition and multiplication have. Namely, these two equations are always the same number:
(a + b) * key = (a * key) + (b * key)
This is known as the distributive property. Normally, we talk about it applying to numbers being added and multiplied, but there are plenty of other mathematical structures and pairs of operations that do this, too. In the language of abstract algebra, we call any number system and pair of operations that distribute like this a "field", of which addition and multiplication over real[0] numbers is just one of.
A simple example of a field that isn't the normal number system you're used to is a 'finite field'. To visualize these, imagine a number circle instead of a line. We get a finite field by chopping off the number line at some prime[1] number that we decide is the highest in the loop. But this is still a field: addition and multiplication keep distributing.
It turns out cryptography loves using finite fields, so a lot of these identities hold in various cryptosystems. If I encrypt some data with RSA, which is just a pair of finite field exponents, multiplying that encrypted data will multiply the result when I decrypt it later on. In normal crypto, this is an attack we have to defend against, but in homomorphic crypto we want to deliberately design systems that allow manipulation of encrypted data like this in ways we approve of.
[0] Also complex numbers.
[1] Yes, it has to be prime and I'm unable to find a compact explanation as to why, I assume all the symmetries of algebra we're used to stop working if it's not.
Deleted Comment
Now, in practice, HaveIBeenPwned does the exact same thing with a k-anonymity scheme based off of MD5 collisions, which is wayyyy easier in practice and what most people actually deploy, but the Google thing is cool too.
Dead Comment
This is a massive announcement for AI and use cases related to PII.
Zama uses TFHE, which allows any operation (eg comparisons) with unlimited depth.
So if you only need add/mul, BFV, BGV and CKKS are good options. For anything else, you better use TFHE
I think the real fix is secure enclaves, and those have proven to be difficult as well.
“Cheddar: A Swift Fully Homomorphic Encryption Library for CUDA GPUs” - https://arxiv.org/pdf/2407.13055
We were a little worried, but quickly discovered that they used Swift as an adjective not as a programming language.
[Disclosure: I work on the team responsible for the feature]
FTA: “Live Caller ID Lookup uses homomorphic encryption to send an encrypted query to a server that can provide information about a phone number without the server knowing the specific phone number in the request”
So, this would require a distributed Secure Enclave or one of them on Apple’s server communicating with one on an Apple device (likely, certainly over time, with lots of different Apple devices fo lots of different iCloud accounts)
That makes HE anything but Swift (
Deleted Comment
But to return information if some number is spam it has to be either plaintext or hashed condition somewhere outside of the phone?
[Disclosure: I work on the team responsible for the feature]
You're on the right track with the idea of hashing -- I find it helpful to explain any fancy encryption scheme beginning with "if it were just hashing", then extend to "well this is a very fancy kind of hash", and <poof> now I kind of understand what's going on. Or at least it's no longer magic.