The Boston Globe reported last week on an elderly couple who were making– and spending– huge amounts of money on the Massachusetts Cash WinFall state lottery. As is my habit, I started poking around in the mathematics involved and, also as usual, found more interesting effects than I expected.

The game is pretty simple: for each $2 ticket, you select 6 numbers from 1 to 46 (without replacement). The parimutuel semi-weekly jackpot for matching all 6 numbers varies from $500,000 to about $2.5 million, with additional fixed payouts of $2, $5, $150, and $4,000 for matching 2 to 5 numbers, respectively. As is usually the case, this game has negative expected return, with state lottery management taking about $1.10 from each $2 ticket sold.

What makes this game interesting is that when the jackpot reaches about $2 million without a winner– which happens several times a year– the excess jackpot is “rolled down” into larger fixed payouts for matching fewer than all 6 numbers. This last happened on the 14 July drawing, with payouts increased to $26, $802, $19,507, and $2,392,699 for matching 3 through 6 numbers, respectively. The result is a game with positive expected return for the ticket buyer… even if you never expect to win the jackpot!

But as Steve Jacobs used to say (or still says, for all I know), “Expectation isn’t everything.” You still have a lousy 2% probability of actually making any money at all with just one ticket. Expected value is only realized in the long run, considering the average result of a large number of repeated trials.

So the couple in the Globe article simply realized the long run, and actually executed that large number of repeated trials, buying about $614,000 worth of tickets. This strategy not only has a nice expected return of about $184,000, but it is extremely low risk; the probability of making money jumps from 2% with one ticket to over 98% with 307,000 tickets. The couple have already won nearly $1 million this year alone.

What got my attention and motivated this post was the following paragraph from the article:

Mark Kon, a professor of math and statistics at Boston University, calculated that a bettor buying even $10,000 worth of tickets would run a significant risk of losing more than they won during the July rolldown week. But someone who invested $100,000 in Cash WinFall tickets had a 72 percent chance of winning. Bettors like the Selbees, who spent at least $500,000 on the game, had almost no risk of losing money, Kon said.

The interesting question is: how do we calculate that 72% probability? As far as I can tell, this problem does not have a nice analytical solution. The “big hammer” approach is to estimate by simply simulating the drawing many times. But I wondered if there might be a more efficient way, either analytically or via a simpler form of estimation.

My first idea was to replace the actual game with a more tractable one: consider a lottery where each $2 ticket yields only two possible outcomes: either it wins a payout w with probability p, or it loses with probability 1 - p. We can compute the necessary values of w and p so that the single ticket outcome has the same (positive) expected value and variance as the actual game… but let us also conservatively assume that we never win the jackpot, so that the distribution is not so skewed.

In this simplified game, each ticket wins $4,510.95 with probability about 0.00052. The expected return on a single ticket is the same as the actual (jackpot-less) game, about $0.34, and the variance is the same as well. The actual distribution is different, of course, but we should see the same general behavior as we buy more and more tickets: the overall expected return will increase linearly, and more importantly, the probability of making money will also increase, from the lowly 0.00052 with one ticket, approaching 1 for a sufficiently large number of tickets.

Right?

Well, not exactly. The problem turned out to be more interesting than that. Yes, we can efficiently compute the probability of winning money from n tickets in our simplified game (I will leave it to the interested reader to work out the details). But I was surprised to find that that probability is not monotonic as a function of n, as the following plot shows.

The red curve corresponds to the actual lottery, and was generated via simulation; see the source code at the end of this post. The behavior is what I expected: the more tickets you buy, the higher the probability that you win money.

The blue curve, corresponding to the simplified game, is more interesting. First, just fixing the first and second moments of the distribution did indeed approximate the real behavior as well as I had hoped. We can efficiently estimate the probabilities in the article: even buying $10,000 worth of tickets still only wins money about half the time; buying $100,000 wins about 75% of the time; and buying $614,000 wins money about 97% of the time.

But the unexpected and interesting behavior is the jaggedness of the blue curve. There are many large jumps, at times nearly 20%, where the probability of an overall win decreases with the purchase of a single additional ticket.

ObPuzzle: what is going on here?

For reference, the following is the source code for simulating the actual game, with commented-out parameters for the simplified game as well.

```cpp
#include "math_Random.h"
#include <iostream>

int main()
{
    const int num_samples = 10000;

    // The simplified single-payout game.
    //const int num_outcomes = 2;
    //const double probability[] = {0.9994806467243879, 1.};
    //const double payoff[] = {-2, 4508.953581737101};

    // The actual game during the 14 July rolldown week.
    const int num_outcomes = 6;
    const double probability[] = {0.8312777261949867, 0.9776294385532591,
        0.9987251808751723, 0.999974270881075, 0.9999998932401705, 1.};
    const double payoff[] = {-2, 0, 24, 800, 19505, 2392697};
    math::Random rng;

    // Evaluate buying increasing numbers of tickets.
    for (int num_tickets = 25000; num_tickets <= 300000; num_tickets += 25000)
    {
        int count = 0;
        for (int i = 0; i < num_samples; ++i)
        {
            // Buy and cash in tickets.
            double win = 0;
            for (int j = 0; j < num_tickets; ++j)
            {
                double p = rng.nextDouble();
                for (int k = 0; k < num_outcomes; ++k)
                {
                    if (p <= probability[k])
                    {
                        win += payoff[k];
                        break;
                    }
                }
            }

            // Record whether we make or lose money.
            if (win >= 0)
            {
                ++count;
            }
        }

        // Display probability of winning (i.e., not losing money).
        std::cout << num_tickets << "\t" <<
            static_cast<double>(count) / num_samples << std::endl;
    }
}
```