Problem: What is the probability that one specific person's PIN uses the same set of digits (assume the PIN is four digits long) as the PIN of the person next to him?

Motivation: The “obvious” answer, 4!/10^4, is wrong.

As a proof of concept, the following Python program counts the distinct digit sets that four-digit PINs can produce:

    pins = ['{0:04}'.format(i) for i in range(0, 10000)]
    pins_a = set()
    for p in pins:
        pins_a.add(frozenset(p))
    print(len(pins_a))

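The program prints 385, the number of distinct digit sets. Therefore, the probability is at least 1/385, or about 0.26%. It would equal 1/385 only if all sets were equally likely, and they are not: some sets are produced more often than others. UPDATE follows.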

Explanation:

If your PIN consists of only one digit, say 1111, then it is much less likely that the other person also uses only this digit to form his PIN (there is only one such PIN, so the probability is 1/10^4), while for, say, 1234 there are many matching PINs (4! to be exact).

From now on, we assume that we pick a PIN according to the uniform distribution.

Let’s count (explanation follows):

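Number of PINs with exactly 1, 2, 3 and 4 distinct digit(s):

1. $10$
2. $\binom{10}{2}\left(\frac{4!}{2!\,2!} + 2\cdot\frac{4!}{3!\,1!}\right) = 45\cdot 14 = 630$
3. $\binom{10}{3}\cdot 3\cdot\frac{4!}{2!\,1!\,1!} = 120\cdot 36 = 4320$
4. $\binom{10}{4}\cdot 4! = 210\cdot 24 = 5040$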

Sanity check: Total number is 10000.

Let’s explain how these numbers are produced:

1. With only one digit, you can form only 10 different PINs.

2. There are $\binom{10}{2} = 45$ ways to choose two digits out of ten (without ordering). Now, we replicate them to fill the four positions. In total we can have either {a, a, b, b} or {a, b, b, b} or {a, a, a, b}: three different ways to replicate them. Let’s take the first way, {a, a, b, b}. There are $\frac{4!}{2!\,2!} = 6$ possible orderings. For the other two (they are similar), we have $\frac{4!}{3!\,1!} = 4$ each. The way we count the orderings is 4! for the four possible positions, divided by the factorial of the size of every group of identical elements, due to symmetries. This gives 6 + 4 + 4 = 14 PINs per pair of digits.

3. If you understood the previous one, this one is a piece of cake. Again, there are $\binom{10}{3} = 120$ ways to choose three different digits. Now, we have three ways of replicating one of them, and each way has $\frac{4!}{2!} = 12$ orderings. Therefore, the possible orderings are $3\cdot 12 = 36$.

4. We pick four different digits ($\binom{10}{4} = 210$ ways) and we care about their ordering ($4! = 24$ orderings).
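The counting argument above can be checked with a short sketch. The helper `orderings` and the `patterns` table are names introduced here for illustration only; the table simply lists, for each number of distinct digits, the ways four positions can be split among them.

```python
from math import comb, factorial

def orderings(multiplicities):
    """Distinct arrangements of a multiset: n! divided by each group's factorial."""
    result = factorial(sum(multiplicities))
    for m in multiplicities:
        result //= factorial(m)
    return result

# For k distinct digits, every way to assign multiplicities that sum to 4
# (hypothetical table, written out by hand from the explanation above).
patterns = {
    1: [(4,)],
    2: [(2, 2), (1, 3), (3, 1)],
    3: [(2, 1, 1), (1, 2, 1), (1, 1, 2)],
    4: [(1, 1, 1, 1)],
}

counts = {}
for k, pats in patterns.items():
    pins_per_set = sum(orderings(p) for p in pats)  # PINs for one fixed digit set
    counts[k] = comb(10, k) * pins_per_set          # times the number of digit sets

print(counts)
print(sum(counts.values()))
```

The four values should add up to 10000, which is the same sanity check as in the text.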

In order to compute the probability, we need to know how many PINs produce one specific set of digits. We have already counted how many PINs there are for sets of size 1, 2, 3 and 4, and dividing by the number of sets of each size gives the numbers below. We omit the arguments since they are similar to what we have already seen.

A specific set with one digit is produced by 1 PIN. A specific set with two digits is produced by 14 PINs. A specific set with three digits is produced by 36 PINs. A specific set with four digits is produced by 24 PINs.
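These per-set numbers can also be confirmed by brute force. The following is a minimal sketch; the names `per_set` and `multiplicity_by_size` are introduced here for illustration.

```python
from collections import Counter

# How many of the 10000 PINs produce each digit set.
per_set = Counter(frozenset('{0:04}'.format(i)) for i in range(10000))

# Collect the PIN counts seen for each set size; a single value per size
# means every set of that size is produced equally often.
multiplicity_by_size = {}
for digits, n in per_set.items():
    multiplicity_by_size.setdefault(len(digits), set()).add(n)

print(multiplicity_by_size)
```

Each set size should map to exactly one count, showing that all sets of the same size are equally likely.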

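Therefore, the total probability is:

$$P = \frac{10\cdot 1^2 + 45\cdot 14^2 + 120\cdot 36^2 + 210\cdot 24^2}{10^8} = \frac{285310}{10^8} \approx 0.285\%$$

Here 10, 45, 120 and 210 are the numbers of digit sets of size 1, 2, 3 and 4, and each squared term counts the ordered pairs of PINs that both produce one specific set. The denominator is 10^8 because we now count matching pairs, and there are 10^8 pairs of PINs in total.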
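The exact value can be computed directly by summing, over all digit sets, the square of the number of PINs that produce the set, and dividing by the 10^8 possible pairs. This is a sketch under the uniform-PIN assumption made earlier; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Number of PINs that produce each digit set.
per_set = Counter(frozenset('{0:04}'.format(i)) for i in range(10000))

# An ordered pair of PINs matches when both map to the same digit set.
matching_pairs = sum(n * n for n in per_set.values())
probability = Fraction(matching_pairs, 10**8)

print(matching_pairs, float(probability))
```

Using `Fraction` keeps the result exact, so it can be compared against the hand computation without rounding concerns.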

If this analysis does not convince you, here are two Python programs: the first one counts the PINs whose digit set has a specific size, while the second one simulates the random process.

    pins = ['{0:04}'.format(i) for i in range(0, 10000)]
    counter1, counter2, counter3, counter4 = 0, 0, 0, 0
    for p in pins:
        p = set(p)
        if len(p) == 1:
            counter1 += 1
        elif len(p) == 2:
            counter2 += 1
        elif len(p) == 3:
            counter3 += 1
        elif len(p) == 4:
            counter4 += 1
    print(counter1, counter2, counter3, counter4)

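    from random import randint

    def rand_pin_digits():
        # randint includes both endpoints, so the upper bound is 9999;
        # randint(0, 10000) could return the five-digit value 10000.
        a = '{0:04}'.format(randint(0, 9999))
        return set(a)

    match = 0
    for i in range(10**6):
        a = rand_pin_digits()
        b = rand_pin_digits()
        if a == b:
            match += 1
    # Number of matching pairs; divide by 10**6 to estimate the probability.
    print(match)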