Secret Santa Protocol

This year, my friends and I are doing secret santa. Since it's so hard to get all of us together before the holidays to pick names out of a hat, I wrote a script to distribute the pairings over email. This gives me complete control over the pairings, so how do you trust that the results are fair? If you're a person who just received a pairing from this supposedly "fair" algorithm, you want to verify that the other people involved have fair-looking results. This is especially true if you get my name as the person for whom you have to buy the gift.

Suppose you are precisely such a suspicious person: you got a pairing, and want to be pretty sure that you will be the only person buying a present for your given person. One way would be to allow any 2 people, after receiving a name, to communicate and find out whether they were assigned the same name. There are 2 considerations here: if you are the asker, you do not want to disclose the name you have, for fear of ruining the game. Similarly, if you are the person being asked, you do not want to reveal the name assigned to you, especially if it is the name of the asker.

Stated more formally, there are N people who know each other, and each receives one message from a set of N messages {M_1, ..., M_N}, which is known to all participants. We want to design a protocol such that one participant in Secret Santa (named Peggy, who receives message M_i) can verify whether another participant (named Victor, who receives message M_j) has received the same message (i == j). The goal is for it to be impossible (or computationally infeasible) for Peggy to figure out M_j (when i ≠ j), or for Victor to figure out M_i (when i ≠ j), and for the protocol to work without any third party.

A naive protocol is for Peggy to compute SHA(M[i]) (or some other one-way function), and send the result to Victor. Because it's one-way, Victor cannot reverse the SHA, so he cannot figure out M_i directly. Victor would then compute SHA(M[j]) and compare it to Peggy's value. If it's a match, then the messages must match as well. Unfortunately, these sorts of protocols fail when the set of possible plaintexts is known. In particular, we can safely assume that each person participating in the secret santa knows the names of all the other participants. To figure out M_i, Victor can iterate over all N possible messages/names, compute SHA(M[index]) for each, and compare it to SHA(M[i]), thus figuring out Peggy's message in time O(N).
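Here is a minimal sketch of that brute-force attack in Python. It assumes SHA-256 stands in for "SHA", and the participant list and the helper names `sha` and `recover_name` are made up for illustration.

```python
import hashlib
from typing import Optional, Sequence


def sha(name: str) -> str:
    """Hex SHA-256 digest of a name (an assumed stand-in for SHA)."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def recover_name(digest: str, participants: Sequence[str]) -> Optional[str]:
    """Hash every known name and return the one that matches the digest."""
    for name in participants:
        if sha(name) == digest:
            return name
    return None


# Hypothetical participant list, known to everyone in the game.
participants = ["Alice", "Bob", "Carol", "Dave"]

# Peggy was assigned "Carol" and sends only the hash to Victor.
peggy_digest = sha("Carol")

# Victor recovers her assignment with at most N hash evaluations.
leaked = recover_name(peggy_digest, participants)
print(leaked)
```

The one-way function is never inverted here. Victor just hashes every candidate name, which is cheap because there are only N of them.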

My idea to solve this (and I'm guessing I am not even close to the first person to come up with this) is the following protocol, with a mysterious function F(a,b) , which I will explain later:

1. Peggy generates a random key K1, and Victor generates a random key K2.
2. Peggy computes C1 = F(M1, K1) and Victor computes C2 = F(M2, K2).
3. Peggy sends C1 to Victor, and Victor sends C2 to Peggy.
4. Peggy computes X1 = F(C2, K1) and Victor computes X2 = F(C1, K2).
5. Peggy sends X1 to Victor and Victor sends X2 to Peggy.
6. Both then check whether X1 and X2 are equal. If yes, then the messages are equal. If not, then the messages are different.

First, correctness. Peggy computes X1

X1 = F(C2, K1) = F(F(M2, K2), K1)

And Victor computes X2

X2 = F(C1, K2) = F(F(M1, K1), K2)

And these values are equal precisely when M1 == M2 , as long as F(F(a,b),c) == F(F(a,c),b) for any a, b, c.

Now, secrecy. This entirely depends on F. In particular, we want 2 properties:

1. Given F(a, b), it is not feasible for an attacker to determine either a, b, or both.
2. If an attacker can provide a and view F(a, b), it is not feasible to determine b.

If this is the case, then consider the case when Peggy is evil and wants to determine M2. Since Peggy does not know K2 (which is random), iterating over all possible messages and inputting them into F does not help, because the second argument is not known. To find K2, she must either break F(M2, K2) (which she can't do by assumption 1), or she must break F(x, K2), where x is some string which Peggy provides. This is precisely something that cannot be done for F, by assumption 2.

Similarly, if Victor is evil, he could find M1 either by breaking C1 from the first exchange (obtaining either K1 or M1), or else by providing a suitable string s to break X1 in the second exchange, obtaining K1, and then brute-forcing M1. The same two assumptions rule out both routes.

So now we come to the crux: what is this mysterious F which has the needed properties? I am sure there are quite a few, but the one that comes to mind is the following function, utilizing the fact that discrete log is (for now) hard to compute over the integers mod p:

F(a, b) = a^b mod p

where p is a known prime. First the correctness property:

F(F(a, b), c) = F(a, b)^c mod p
              = (a^b)^c mod p
              = a^(bc) mod p
              = (a^c)^b mod p
              = F(a, c)^b mod p
              = F(F(a, c), b)
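As a quick numeric sanity check of that identity, here is a small Python sketch that takes F(a, b) to be a raised to the power b, modulo p. The prime 2^127 - 1 is my own pick for illustration only, not a vetted parameter choice.

```python
import secrets

# Assumption: an illustrative prime (the Mersenne prime 2**127 - 1).
P = 2**127 - 1


def F(a: int, b: int) -> int:
    """a raised to the power b, modulo P."""
    return pow(a, b, P)


# Random base in [2, P-1] and random exponents in [1, P-2].
a = secrets.randbelow(P - 2) + 2
b = secrets.randbelow(P - 2) + 1
c = secrets.randbelow(P - 2) + 1

left = F(F(a, b), c)   # (a^b)^c mod P
right = F(F(a, c), b)  # (a^c)^b mod P
print(left == right)
```

Both sides reduce to a^(bc) mod P, so the order in which the two keys are applied does not matter.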

The first security property is trivial: if b is chosen at random and in secret, then given some number F(a, b), it would not be possible to learn a without knowing at least b. The second security property amounts to solving discrete log: the attacker picks a, sees a^b mod p, and has to recover b, and no efficient algorithm for this is known. Note that the discrete log portion is to determine the key K, which is a random number, rather than M, which is one of only N choices.
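Putting the pieces together, one run of the exchange could look roughly like the Python sketch below. Several things here are my assumptions rather than part of the protocol as stated: the prime 2^127 - 1, the `encode` helper that turns a name into a number by hashing it, and the function name `same_assignment`. Both parties are simulated in one process, so nothing is actually sent over a network.

```python
import hashlib
import secrets

# Assumption: an illustrative prime, not a vetted parameter choice.
P = 2**127 - 1


def F(a: int, b: int) -> int:
    """a raised to the power b, modulo P."""
    return pow(a, b, P)


def encode(name: str) -> int:
    """Hypothetical encoding: hash a name to a number in [2, P-2]."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 3) + 2


def random_key() -> int:
    """Random secret exponent in [1, P-2]."""
    return secrets.randbelow(P - 2) + 1


def same_assignment(peggy_name: str, victor_name: str) -> bool:
    """Simulate both sides of the exchange and compare X1 with X2."""
    m1, m2 = encode(peggy_name), encode(victor_name)

    # Each party generates a private random key.
    k1, k2 = random_key(), random_key()

    # First exchange: C1 goes to Victor, C2 goes to Peggy.
    c1, c2 = F(m1, k1), F(m2, k2)

    # Second exchange: each applies their own key to the other's value.
    x1, x2 = F(c2, k1), F(c1, k2)

    return x1 == x2


print(same_assignment("Carol", "Carol"))
print(same_assignment("Alice", "Bob"))
```

Only C1, C2, X1 and X2 ever cross the wire in a real run. The names and the keys K1 and K2 stay with their owners.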

As always, let me know what you think on Twitter! In particular, if this is some widely-known zero-knowledge proof algorithm, I would really like to know!