(An archive question of the week)
One of the discussions we looked at last time involved rolling three dice and getting at least one six. I didn’t go into detail on the calculation there; but I found another place where we discussed it at length. We’ll look at that here.
A wrong way and a right way
The question came in 2013:
At Least One "6" in Three Dice Rolls, Summed Simply
I know this is wrong, but not why.
If you have 3 fair six-sided dice, and you wish to find the probability of rolling at least one 6, why can't you add the individual probabilities of getting a 6? In other words, if 1/6 is the probability of getting a 6 in one throw, why is 1/6 + 1/6 + 1/6 not the probability of getting a 6 in three throws?
I know it is wrong to do so, but why, exactly? Where is the logical error in reasoning when adding the probabilities?
In case you wonder whether or not I know how to actually solve such a simple problem, the actual solution is
P(at least one 6 out of three throws) = 1 - P(no six) = 1 - (5/6)^3
And I know that calculating the probability for 6 tosses by adding them would give a probability of 1, which is impossible. But again, where is the logical error in adding them?
Eway has already made two comments I would have made if he hadn’t!
First, we can tell that adding probabilities can’t be right: whatever the (nonzero) probability of an event happening on one trial, if you add up enough copies of it, the total will exceed 1, and no probability can be greater than 1.
Second, the quickest way to find an “at least one” probability is to first find the probability of “none”. These are complementary events: If the outcome is not “at least one”, then it is “none”, and if it is not “none”, then it is “at least one”. So the probability we want is 1 minus the probability of none, which is easily found by multiplication if the individual trials are independent.
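The complement method can be checked with exact fractions. This is only a minimal sketch: the helper name p_at_least_one is made up for illustration, and it assumes the trials are independent with the same success probability each time.

```python
from fractions import Fraction

def p_at_least_one(p_single, trials):
    """P(at least one success) in independent trials, via the complement.

    Hypothetical helper: 1 minus the probability that every trial fails.
    """
    return 1 - (1 - p_single) ** trials

p_six = Fraction(1, 6)

answer = p_at_least_one(p_six, 3)   # 1 - (5/6)^3
naive = 3 * p_six                   # the tempting (wrong) sum 1/6 + 1/6 + 1/6

print(answer)   # 91/216
print(naive)    # 1/2
```

With six rolls the naive sum reaches exactly 1, while p_at_least_one(p_six, 6) stays below 1, as any probability of an uncertain event must.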
Why the wrong is wrong
I replied, starting with what you can’t do:
You can only add the probabilities of mutually exclusive events (unless you make adjustments, as I'll show). This is because if you add events that "overlap," you are counting some possible outcomes more than once.
I’ll have more to say about this below. But I can use a Venn diagram to illustrate the issue. Rather than put the number of items in each region of the diagram, as we usually do, I will put an example of a roll; for example, (1, 2, 3) will mean we roll first a 1, then a 2, then a 3, so that we rolled no sixes.
Because of the overlaps, adding the numbers (or probabilities) of rolls in which the first, the second, and the third die is a six will overcount. For example, the roll (6, 6, 3) would be counted twice: once because its first roll is a six, and once because its second is.
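The overcounting can be seen by brute force. The following sketch (variable names are just for illustration) lists all 216 ordered rolls and compares the number of rolls containing a six with what the naive sum effectively counts, namely one for every six that appears.

```python
from itertools import product

# All ordered outcomes of three dice: (first, second, third).
rolls = list(product(range(1, 7), repeat=3))

# Rolls that contain at least one six, each counted once.
at_least_one_six = sum(1 for r in rolls if 6 in r)

# What adding the three circles does: a roll is counted once per six in it,
# so (6, 6, 3) contributes 2 and (6, 6, 6) contributes 3.
naive_count = sum(r.count(6) for r in rolls)

print(len(rolls))          # 216
print(at_least_one_six)    # 91
print(naive_count)         # 108
```

The naive count of 108 out of 216 is exactly the 1/6 + 1/6 + 1/6 = 1/2 from the question; the 17 extra counts come from the rolls with two or three sixes.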
More right ways
But the Venn diagram suggests several other ways to do it:
There are, however, alternative ways that do involve adding. One way to break up the event "at least one 6" into mutually exclusive events is:
P(exactly one 6) + P(exactly two 6's) + P(exactly three 6's)
This gives
C(3,1)*(1/6)^1*(5/6)^2 + C(3,2)*(1/6)^2*(5/6)^1 + C(3,3)*(1/6)^3*(5/6)^0
= 3*25/216 + 3*5/216 + 1*1/216
= 91/216
What does this mean? The probability that we roll exactly one six is 3 (the number of ways to choose which one will be a six — the number of one-set regions in the diagram), times the probability that that roll will be a six (1/6), times the probability that the other two rolls will not be six (5/6*5/6). Similarly for the other two probabilities.
This takes some work, but is very straightforward.
This agrees with your calculation: 1 - (5/6)^3 = 216/216 - 125/216 = 91/216
This is what Eway had done, which is the method we usually teach. The probability that none of the three rolls is a six is 5/6 (the probability that an individual roll is not a six), raised to the 3rd power. The probability that this does not happen (that is, that at least one roll is a six) is 1 minus that.
So now we have two ways to calculate the same answer.
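As a quick check that the two methods agree, here is a small sketch using exact fractions. The function name p_exactly is an assumption made for this example; it is just the binomial formula described above, and math.comb requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def p_exactly(k, n, p):
    """P(exactly k successes in n independent trials), each with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = Fraction(1, 6)

# Mutually exclusive cases: exactly one, two, or three sixes.
terms = [p_exactly(k, 3, p) for k in (1, 2, 3)]
by_cases = sum(terms)

# Complement: 1 - P(no six).
by_complement = 1 - (1 - p) ** 3

print(terms[0] * 216, terms[1] * 216, terms[2] * 216)   # 75 15 1
print(by_cases, by_complement)                          # 91/216 91/216
```

Because the three cases cannot happen together, adding their probabilities is legitimate here, unlike adding P(first is 6), P(second is 6), and P(third is 6).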
You asked about calculating this probability this way:
P(first is 6) + P(second is 6) + P(third is 6)
This would count rolls with more than one 6 more than once; e.g., the roll 3, 6, 6 is counted as having the second roll be 6 AND having the third roll be 6. In order to correct for this, you would have to use the inclusion-exclusion principle, subtracting out the outcomes counted twice, but then adding back in the outcome that was added three times and then subtracted out three times:
P(1st is 6) + P(2nd is 6) + P(3rd is 6) - P(1st and 2nd are 6) - P(2nd and 3rd are 6) - P(1st and 3rd are 6) + P(all three are 6)
= 1/6 + 1/6 + 1/6 - 1/36 - 1/36 - 1/36 + 1/216
= 36/216 + 36/216 + 36/216 - 6/216 - 6/216 - 6/216 + 1/216
= 91/216
So there are three valid ways to do it. Which do you prefer?
Looking back at our Venn diagram, we are adding the three circles together, which counts each of the two-circle regions twice, and counts the middle region three times. Then we subtract each of the intersections of two circles to compensate for the overlaps; but this subtracts the middle region three times, totally removing it from our count (as it had been counted three times initially). So we have to add it back in.
This is an extension of the formula for the union of two sets:
\(P(A\cup B) = P(A)+P(B)-P(A\cap B)\)
For three sets, it is
\(P(A\cup B\cup C) = P(A)+P(B)+P(C)-P(A\cap B)-P(B\cap C)-P(C\cap A)+P(A\cap B\cap C)\)
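The three-set formula can be verified on the dice example by treating each event as a set of rolls. This is an illustrative sketch only: the function inclusion_exclusion is a hypothetical helper written for this check, and it works with counts of equally likely outcomes rather than probabilities directly.

```python
from fractions import Fraction
from itertools import combinations, product

rolls = set(product(range(1, 7), repeat=3))

# events[i] is the set of rolls whose (i+1)-th die shows a six: A, B, C.
events = [{r for r in rolls if r[i] == 6} for i in range(3)]

def inclusion_exclusion(sets):
    """Size of the union: add single sets, subtract pairs, add triples, ..."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = 1 if k % 2 == 1 else -1
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total

count = inclusion_exclusion(events)    # 3*36 - 3*6 + 1
direct = len(set.union(*events))       # union computed directly
probability = Fraction(count, len(rolls))

print(count, direct)    # 91 91
print(probability)      # 91/216
```

The alternating signs mirror the Venn diagram argument: each single circle has 36 rolls, each pairwise overlap has 6, and the central region has 1.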
For more on inclusion-exclusion, see