In a previous post I wrote about this question:
On average, how many times do you have to flip a (fair) coin to get an equal number of heads and tails?
and suggested that the answer was probably infinite (on average), although half the time you get there in just two flips: HT or TH, which are half the possible cases for two flips: HH, HT, TH, TT.
To review: there are 4 possible combinations of two flips: HH, HT, TH, TT. Each is equally likely. Two of them (HT and TH) have equal numbers of heads and tails. Let’s call this “reaching delta zero”, where delta is the difference between the heads count and the tails count. Since two out of four equally likely sequences reach delta zero, there is a 50% chance of getting there in just two flips.
The next stage has four flips (three flips will never have delta zero), and there are sixteen possibilities for four flips. However, half of them were eliminated at two flips, leaving only the sequences that start with HH or TT:
HHHH, HHHT, HHTH, HHTT, TTHH, TTHT, TTTH, TTTT
Two of these sequences reach delta zero: HHTT and TTHH. So there is a 2 out of 8 (i.e., 1 out of 4) chance of reaching delta zero at four flips, given that the run got that far. Since half of all runs already ended at two flips, the overall chance of finishing at exactly four flips is 1/2 × 1/4 = 1 out of 8 = 12.5%.
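The counting argument above can be checked by brute force for small cases. This is only a sketch: the function name `first_return_fraction` is made up for illustration, and enumerating all 2**n sequences is practical only for small n.

```python
from itertools import product

def first_return_fraction(n_flips):
    """Fraction of all 2**n_flips sequences that first reach delta zero
    at exactly flip number n_flips."""
    hits = 0
    for seq in product((1, -1), repeat=n_flips):  # +1 = heads, -1 = tails
        delta = 0
        first_zero = None
        for i, step in enumerate(seq, start=1):
            delta += step
            if delta == 0:
                first_zero = i  # remember only the first time delta hits zero
                break
        if first_zero == n_flips:
            hits += 1
    return hits / 2 ** n_flips

for n in (2, 3, 4, 6):
    print(n, first_return_fraction(n))
```

An odd number of flips always gives a fraction of zero, matching the remark that three flips can never have delta zero.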
In the prior post I discussed the Catalan numbers and how they relate to the number of these sequences. From that discussion we can compute the probability of a given trial run ending (reaching delta zero) at any particular stage. This works out as follows:
| # flips | % reaching equality here |
|---|---|
| 2 | 50% |
| 4 | 12.5% |
| 6 | 6.25% |
| 8 | 3.91% |
| 10 | 2.73% |
| 12 | 2.05% |
| 14 | 1.61% |
Unlike the chart shown in the previous posting, these numbers are the chance of the trial ending at the specific level (e.g., a 3.91% chance any given trial run will take exactly 8 flips, no more / no fewer).
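For anyone who wants to reproduce those percentages, here is a small sketch. It assumes the standard first-return result: the number of 2n-flip sequences that first reach delta zero at flip 2n is twice the (n-1)th Catalan number. The function names `catalan` and `p_first_equal` are mine, not from the earlier post.

```python
from math import comb

def catalan(k):
    """k-th Catalan number: 1, 1, 2, 5, 14, 42, ..."""
    return comb(2 * k, k) // (k + 1)

def p_first_equal(flips):
    """Probability that heads and tails are first equal at exactly `flips` flips."""
    if flips <= 0 or flips % 2:
        return 0.0  # equality is impossible after an odd number of flips
    n = flips // 2
    # 2 * catalan(n - 1) qualifying sequences out of 2**flips equally likely ones
    return 2 * catalan(n - 1) / 2 ** flips

for flips in range(2, 16, 2):
    print(f"{flips:>2} flips: {100 * p_first_equal(flips):.2f}%")
```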
I wrote a simulation in Python to test this, and I have been running it for over three weeks. Highlights of the key results:
- Trials Completed: 3,414,318. This is how many runs were made, where each run consisted of Python code along these lines (except the actual code used some unrolling and optimization techniques I’ve written about before). This code is essentially simulating flipping a coin and counting the “delta” between the number of heads and tails. Each flip either adds one to or subtracts one from that running delta, and equality of heads and tails is reached when the delta becomes zero.
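    import random

    # First flip: the delta starts at +1 (heads) or -1 (tails).
    delta = random.choice((-1, 1))
    flips = 1
    # Keep flipping until heads and tails are equal again.
    while delta != 0:
        delta += random.choice((-1, 1))
        flips += 1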
- Total number of coin flips: 3,197,243,199,696. Yes, you are reading that correctly: over 3 trillion simulated flips. I will point out that Python, like many runtimes these days, uses the Mersenne Twister random number generator, with period 2**19937-1. Older-style random number generators, which have much shorter periods, would not be suitable for this sort of experiment.
- Longest run: 1,204,531,561,152. Yes, that’s another “you read that right”. The longest string of coin flips it took to reach equality was a little over 1.2 trillion. That is, after the first flip generated a delta of +1 or -1, in that sequence it wasn’t until 1.2 trillion flips later that the delta was back down to zero.
- Runs ending at 2 flips: 1,707,887 (50.02%). This is as we predicted. The experimental error is 728 out of over 3.4M trials, a difference of about 213 ppm from the perfect 50% (or however you want to think of that difference).
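The arithmetic behind that last bullet can be spelled out as below. The z-score at the end is my own addition, not part of the original experiment: it assumes each trial independently ends at two flips with probability exactly 0.5, and asks how many binomial standard deviations a 728-run excess represents.

```python
from math import sqrt

trials = 3_414_318        # trials completed (from the results above)
ended_at_two = 1_707_887  # runs that ended at exactly 2 flips

expected = trials / 2                  # perfect 50% split
deviation = ended_at_two - expected    # excess over the expected count
ppm = deviation / trials * 1_000_000   # deviation in parts per million

# Assumption: the count is Binomial(trials, 0.5), so its standard
# deviation is sqrt(trials * 0.5 * 0.5).
sigma = sqrt(trials * 0.5 * 0.5)
z = deviation / sigma

print(f"deviation: {deviation:.0f} runs, {ppm:.0f} ppm, z = {z:.2f}")
```

Under that assumption the deviation comes out below one standard deviation, so it is well within ordinary sampling noise.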
Here’s a breakdown of the percentages of trials completed in N flips, and the corresponding prediction:
|length (flips)||actual %||theoretical %|
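A comparison of this kind can be produced with a much smaller experiment. The sketch below is not the code I actually ran: it is unoptimized, and it adds a hypothetical `max_flips` cap (plus a fixed seed) so that it finishes in seconds. Capped-out runs are tallied under `None` rather than followed to the end.

```python
import random
from collections import Counter
from math import comb

def flips_until_equal(rng, max_flips):
    """Flips needed to reach delta zero, or None if the cap is hit first."""
    delta = rng.choice((-1, 1))
    flips = 1
    while delta != 0:
        if flips >= max_flips:
            return None  # assumption: give up on very long runs
        delta += rng.choice((-1, 1))
        flips += 1
    return flips

def theoretical(flips):
    """First-equality probability via Catalan numbers (even `flips` only)."""
    n = flips // 2
    catalan = comb(2 * (n - 1), n - 1) // n  # Catalan number with index n - 1
    return 2 * catalan / 2 ** flips

def tally(trials, max_flips=1_000, seed=12345):
    """Count how many trials ended at each length."""
    rng = random.Random(seed)
    return Counter(flips_until_equal(rng, max_flips) for _ in range(trials))

counts = tally(20_000)
total = sum(counts.values())
for length in (2, 4, 6, 8, 10):
    actual = 100 * counts[length] / total
    print(f"{length:>2} flips: actual {actual:.2f}%  theoretical {100 * theoretical(length):.2f}%")
```

With only 20,000 trials the actual percentages will wander noticeably more than in the multi-week run, but they should land near the theoretical column.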
I found this write-up, which nicely describes the math and, in particular, shows that the average does indeed diverge to infinity.
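One way to get a feel for the divergence numerically is to add up (flips × probability of ending at exactly that many flips) only up to some cutoff, and watch the partial sum keep growing as the cutoff is raised. This is a sketch of that idea, not taken from the write-up. The name `partial_expected_flips` is mine, and the recurrence for successive probabilities comes from simplifying the ratio of consecutive terms of the Catalan-based formula.

```python
def partial_expected_flips(max_flips):
    """Sum of flips * P(first equality at `flips`) over even flips <= max_flips."""
    p = 0.5   # P(first equality at 2 flips)
    n = 1     # current term is for 2 * n flips
    total = 0.0
    while 2 * n <= max_flips:
        total += 2 * n * p
        # P(2(n+1)) / P(2n) = (2n - 1) / (2(n + 1)), e.g. 0.5 -> 0.125 -> 0.0625
        p *= (2 * n - 1) / (2 * (n + 1))
        n += 1
    return total

for cap in (100, 10_000, 1_000_000):
    print(f"cutoff {cap:>9,}: partial average = {partial_expected_flips(cap):.1f}")
```

The partial sum grows roughly like the square root of the cutoff, so it never settles on a finite value.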
The results of seemingly simple probability questions are often counter-intuitive. Ask “How long will it take me to get an equal number of heads and tails while flipping a coin?” and the answer is “fairly often it could take a very, very long time.”