Adapted from The Improbability Principle: Why Coincidences, Miracles, and Rare Events Happen Every Day, by David J. Hand, by arrangement with Scientific American/Farrar, Straus and Giroux, LLC (North America), Transworld (UK), Ambo|Anthos (Holland), C.H. Beck (Germany), Companhia das Letras (Brazil), Grupa Wydawnicza Foksal (Poland), Locus Publishing Co. (Taiwan), AST (Russia). Copyright © 2014 by David J. Hand.
A set of mathematical laws that I call the Improbability Principle tells us that we should not be surprised by coincidences. In fact, we should expect coincidences to happen. One of the key strands of the principle is the law of truly large numbers. This law says that given enough opportunities, we should expect a specified event to happen, no matter how unlikely it may be at each opportunity. Sometimes, though, when there are really many opportunities, it can look as if there are only relatively few. This misperception leads us to grossly underestimate the probability of an event: we think something is incredibly unlikely, when it's actually very likely, perhaps almost certain.
How can a huge number of opportunities occur without people realizing they are there? The law of combinations, a related strand of the Improbability Principle, points the way. It says: the number of combinations of interacting elements increases exponentially with the number of elements. The “birthday problem” is a well-known example.
The birthday problem poses the following question: How many people must be in a room to make it more likely than not that two of them share the same birthday?
The answer is just 23. If there are 23 or more people in the room, then it's more likely than not that two will have the same birthday.
Now, if you haven't encountered the birthday problem before, this might strike you as surprising. Twenty-three might sound far too small a number. Perhaps you reasoned as follows: There's only a one-in-365 chance that any particular other person will have the same birthday as me. So there's a 364/365 chance that any particular person will have a different birthday from me. If there are n people in the room, with each of the other n − 1 having a probability of 364/365 of having a different birthday from me, then the probability that all n − 1 have a different birthday from me is 364/365 × 364/365 × 364/365 × 364/365 … × 364/365, with 364/365 multiplied together n − 1 times. If n is 23, this is 0.94.
Because that's the probability that none of them share my birthday, the probability that at least one of them has the same birthday as me is just 1 − 0.94. (This follows by reasoning that either someone has the same birthday as me or that no one has the same birthday as me, so the probabilities of these two events must add up to 1.) Now, 1 − 0.94 = 0.06. That's very small.
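The arithmetic behind this tempting line of reasoning can be checked with a minimal Python sketch. The function name `prob_someone_shares_my_birthday` is illustrative only, and the sketch assumes 365 equally likely birthdays with no leap years.

```python
def prob_someone_shares_my_birthday(n: int, days: int = 365) -> float:
    """Chance that at least one of the other n - 1 people has MY birthday.

    Assumes every birthday is equally likely and ignores leap years.
    """
    # Each of the other n - 1 people misses my birthday with
    # probability (days - 1) / days, independently of the rest.
    p_nobody_matches_me = ((days - 1) / days) ** (n - 1)
    return 1 - p_nobody_matches_me


print(round(prob_someone_shares_my_birthday(23), 2))  # 0.06
```

Note that this only counts the pairs that include you, so it answers a narrower question than the one the birthday problem asks.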
Yet this is the wrong calculation to consider because that probability—the probability that someone has the same birthday as you—is not what the question asked. It asked about the probability that any two people in the same room have the same birthday as each other. This includes the probability that one of the others has the same birthday as you, which is what I calculated above, but it also includes the probability that two or more of the other people share the same birthday, different from yours.
This is where the combinations kick in. Whereas there are only n − 1 people who might share the same birthday as you, there are a total of n × (n − 1)/2 pairs of people in the room. This number of pairs grows rapidly as n gets larger. When n equals 23, it's 253, which is more than 10 times as large as n − 1 = 22. That is, if there are 23 people in the room, there are 253 possible pairs of people but only 22 pairs that include you.
So let's look at the probability that none of the 23 people in the room share the same birthday. For two people, the probability that the second person doesn't have the same birthday as the first is 364/365. Then the probability that those two are different and that a third doesn't share the same birthday as either of them is 364/365 × 363/365. Likewise, the probability that those three have different birthdays and that the fourth does not share the same birthday as any of those first three is 364/365 × 363/365 × 362/365. Continuing like this, the probability that none of the 23 people share the same birthday is 364/365 × 363/365 × 362/365 × 361/365 … × 343/365.
This equals 0.49. Because the probability that none of the 23 people share the same birthday is 0.49, the probability that some of them share the same birthday is just 1 − 0.49, or 0.51, which is greater than half.
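The product above is tedious by hand but easy to reproduce. The following Python sketch, again assuming 365 equally likely birthdays, multiplies the same fractions and then searches for the smallest room size at which a shared birthday becomes more likely than not. The function names are hypothetical and chosen only for this illustration.

```python
def prob_shared_birthday(n: int, days: int = 365) -> float:
    """Chance that at least two of n people share a birthday."""
    p_all_different = 1.0
    for k in range(n):
        # Person k + 1 must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1 - p_all_different


def smallest_group(threshold: float = 0.5, days: int = 365) -> int:
    """Smallest n for which a shared birthday is more likely than `threshold`."""
    n = 1
    while prob_shared_birthday(n, days) <= threshold:
        n += 1
    return n


print(round(prob_shared_birthday(23), 2))  # 0.51
print(smallest_group())                    # 23
```

With 22 people the same function gives a probability just below one half, which is why 23 is the tipping point.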
Winning the Lottery
For another example of how a seemingly improbable event is actually quite probable, let's look at lotteries. On September 6, 2009, the Bulgarian lottery randomly selected as the winning numbers 4, 15, 23, 24, 35, 42. There is nothing surprising about these numbers. The digits that make up the numbers are all low values (1, 2, 3, 4 or 5), but that is not so unusual. Also, there is a consecutive pair of values, 23 and 24, although this happens far more often than is generally appreciated (if you ask people to randomly choose six numbers from 1 to 49, for example, they choose consecutive pairs less often than pure chance would produce them).
What was surprising was what happened four days later: on September 10, the Bulgarian lottery randomly selected as the winning numbers 4, 15, 23, 24, 35, 42—exactly the same numbers as the previous week. The event caused something of a media storm at the time. “This is happening for the first time in the 52-year history of the lottery. We are absolutely stunned to see such a freak coincidence, but it did happen,” a spokeswoman was quoted as saying in a September 18 Reuters article. Bulgaria's then sports minister Svilen Neikov ordered an investigation. Could a massive fraud have been perpetrated? Had the previous numbers somehow been copied?
In fact, this rather stunning coincidence was simply another example of the Improbability Principle, in the form of the law of truly large numbers amplified by the law of combinations. First, many lotteries are conducted around the world. Second, they occur time after time, year in and year out. This rapidly adds up to a large number of opportunities for lottery numbers to repeat. And third, the law of combinations comes into effect: each time a lottery result is drawn, it could contain the same numbers as produced in any of the previous draws. In general, as with the birthday situation, if you run a lottery n times, there are n × (n − 1)/2 pairs of lottery draws that could have a matching string of numbers.
The Bulgarian lottery that repeated numbers in 2009 is a six-out-of-49 lottery, so the chance of any particular set of six numbers coming up is one in 13,983,816. That means that the chance that any particular two draws will match is one in 13,983,816. But what about the chance that some two draws among three draws will match? Or the chance that some two draws among 50 draws will match?
There are three possible pairs among three draws but 1,225 among 50 draws. The law of combinations is coming into play. If we take it further, among 1,000 draws there are 499,500 possible pairs. In other words, if we multiply the number of draws by 20, increasing it from 50 to 1,000, the impact on the number of pairs is much greater, multiplying it by almost 408 and increasing it from 1,225 to 499,500. We are entering the realm of truly large numbers.
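The pair counts quoted here follow directly from the formula n × (n − 1)/2. As a quick check, this short Python sketch uses the standard library's `math.comb` to count pairs; the helper name `pairs` is just an illustrative label.

```python
from math import comb


def pairs(n: int) -> int:
    """Number of distinct pairs among n items, i.e. n * (n - 1) / 2."""
    return comb(n, 2)


for draws in (3, 50, 1000):
    print(draws, pairs(draws))

# Growth factor in pairs when the number of draws goes from 50 to 1,000.
print(pairs(1000) / pairs(50))
```

The number of draws grows by a factor of 20, while the number of pairs grows by a factor of roughly 408, because the pair count scales with the square of n.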
How many draws would be needed so that the probability of drawing the same six numbers twice was greater than one half—so that this event was more likely than not? Using the same method we used in the birthday problem results in an answer of 4,404.
If two draws occur each week, making 104 in a year, this number of draws will take less than 43 years. That means that after 43 years, it is more likely than not that some two of the sets of six numbers drawn by the lottery machine will have matched exactly. That puts a rather different complexion on the Bulgarian spokeswoman's comment that it was a freak coincidence!
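The same birthday-style calculation can be run for the lottery, with 13,983,816 possible results in place of 365 days. The sketch below is one way to do it, assuming every draw is independent and every set of six numbers is equally likely. The function name `draws_until_likely_repeat` is hypothetical.

```python
from math import comb


def draws_until_likely_repeat(outcomes: int, threshold: float = 0.5) -> int:
    """Smallest number of draws for which P(some two draws match) > threshold."""
    p_all_different = 1.0
    draws = 1
    while 1 - p_all_different <= threshold:
        # The next draw must avoid each of the `draws` results seen so far.
        p_all_different *= (outcomes - draws) / outcomes
        draws += 1
    return draws


outcomes = comb(49, 6)                    # ways to choose 6 numbers from 49
needed = draws_until_likely_repeat(outcomes)
print(outcomes)                           # 13983816
print(needed)                             # 4404
print(needed / 104)                       # years at two draws per week
```

Calling the same function with 365 outcomes returns 23, the answer to the birthday problem, which shows that the two calculations are the same one with different inputs.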
And that's just for one lottery. When we take into account the number of lotteries around the world, we see that it would be amazing if draws did not occasionally repeat. So you won't be surprised to learn that in Israel's Mifal HaPayis state lottery, the numbers drawn on October 16, 2010—13, 14, 26, 32, 33, 36—were exactly the same as those drawn a few weeks earlier, on September 21. You won't be surprised to learn that, but scores of people flooded Israel's talk radio programs with calls to complain that the lottery was fixed.
The Bulgarian lottery result was unusual in that the duplicate sets of numbers occurred in consecutive draws. But the law of truly large numbers, combined with the fact that there are many lotteries around the world regularly rolling out their numbers, means we shouldn't be too surprised—and so we shouldn't be taken aback to hear that it had happened before. For example, the North Carolina Cash 5 lottery produced the same winning numbers on July 9 and 11, 2007.
Another, rather frustrating way in which the law of combinations can generate lottery matches is illustrated by what happened to Maureen Wilcox in 1980. She bought tickets containing the winning numbers for both the Massachusetts State Lottery and the Rhode Island Lottery. Unfortunately for her, however, her ticket for the Massachusetts Lottery held the winning numbers for the Rhode Island Lottery, and vice versa. If you buy tickets for 10 lotteries, you have 10 chances of winning. But 10 tickets mean 45 pairs of tickets, so the chance that one of the 10 tickets will match one of the 10 lottery draws is more than four times larger than your chance of winning. For obvious reasons, this is not a recipe for obtaining a vast fortune because matching a ticket for one lottery with the outcome of the draw for another wins you nothing—apart from a suspicion that the universe is making fun of you.
The law of combinations applies when there are many interacting people or objects. Suppose, for example, that we have a class of 30 students. They can interact in various ways. They can work as individuals: there are 30 of them; they can work in pairs—there are 435 different pairs; they can work in triples—there are 4,060 possible different triples; and so on, up to, of course, them all working together—there is one set of all 30 students working together.
In total, the number of different possible groups of students that could be formed is 1,073,741,823. That's more than a billion, all just from 30 students. In general, if a set has n elements, there are $2^n - 1$ possible subsets that could be formed. If n = 100, the result is $2^{100} - 1$, which is approximately equal to $10^{30}$, a truly large number in anyone's terms.
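These counts can be verified by adding up the number of groups of each size. The Python sketch below does so with `math.comb`; the helper name `count_groups` is an illustrative choice. It then confirms that the total agrees with the shortcut of 2 to the power n, minus 1.

```python
from math import comb


def count_groups(n: int) -> int:
    """Number of non-empty groups that can be formed from n people."""
    # Add up the groups of size 1, size 2, ..., size n.
    return sum(comb(n, size) for size in range(1, n + 1))


print(comb(30, 2))        # 435 pairs
print(comb(30, 3))        # 4060 triples
print(count_groups(30))   # 1073741823

# The sum over all group sizes equals 2**n - 1.
print(count_groups(30) == 2**30 - 1)  # True
```

Python integers have arbitrary precision, so `count_groups(100)` can also be computed exactly; the result has 31 digits, in line with the approximation of 10 to the power 30.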
But if even $10^{30}$ isn't large enough for you, consider the implications of the World Wide Web, which has around 2.5 billion users, any and all of whom can interact with any of the others. This gives $3 \times 10^{18}$ pairs and $10^{750,000,000}$ possible groups of interacting members. Even events with very small probabilities become almost certain if you give them that many opportunities to happen.
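For a rough sense of where these figures come from, the sketch below takes the 2.5 billion users quoted above and works out the number of pairs exactly. The number of possible groups is far too large to print, so the sketch only estimates its power of ten using logarithms. The variable names are illustrative.

```python
from math import log10

users = 2_500_000_000  # figure quoted in the text

# Pairs of users: n * (n - 1) / 2, computed exactly with integers.
pairs = users * (users - 1) // 2
print(f"{pairs:.1e}")

# Groups of users: 2**users - 1. Rather than building that number,
# estimate its power of ten: log10(2**users) = users * log10(2).
group_exponent = users * log10(2)
print(f"about 10 to the power {group_exponent:,.0f}")
```

The pair count comes out a little above 3 followed by 18 zeros, and the exponent for the number of groups comes out at roughly 750 million, consistent with the rounded figures in the text.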
Next time you experience a seemingly strange coincidence, think of the Improbability Principle.
*Editor's Note (2/10/14): This article has been reposted. The original posting held incorrect information due to technical errors that resulted in the loss of superscript formatting.