Conditional Probability and Multiple Choice

(A new question of the week)

A recent question about probability has ties to Venn diagrams, tables, and Bayes’ Theorem. Questions about answering multiple-choice questions are common; this one offers a twist that provided an opportunity to discuss several important concepts.

Here is the initial question, from August:

On a multiple choice question, only one answer is correct. A student can mark it knowingly or make a wild guess. Probability of a student knowing the answer is 2/3. If an answer is correct, find the probability that it was marked knowingly.

(Adapted from IUT 2016-17 Admission Test MCQ 85)

now P(K | C) = P(K ∩ C) / P (C)
How to find P(K ∩ C) ? I don’t think these are independent events.

The actual question from a sample test, as suggested in the description, is itself a multiple-choice question:

It is best to state an entire problem exactly as given, in case unrecognized details are important; in particular, the list of choices often plays an essential role in solving a problem, by suggesting possible approaches or indicating the kind of answer needed. Sometimes it turns out that the problem is defective, because the correct answer is not in the list; knowing that can save a lot of struggle. But in this case, we lost nothing by not being given the choices. What was lost was the number of choices in the hypothetical question! We’ll get that eventually. I also notice that some paraphrasing in the last clause slightly modified the question, though I don’t think it hindered our discussion.

Fida has defined two events:

  • K: Student knows the answer
  • C: Answer is correct

So we are told that P(K) = 2/3; and we want the probability of K, given C, which is the conditional probability P(K | C). Fida expects to find this using the definition of conditional probability, which requires knowing the probability that the student both knows the answer and is correct. (If you ponder what I just said long enough, you might realize that it is “obvious” based on real-world conditions that are not stated explicitly; the hard part of math is often to learn to see the obvious!) The non-obvious part will be to find P(C).

I replied; my first task was to fill in the gap in the question, while also giving a hint:

It can be helpful to start by writing out what we do know:

P(K) = 2/3
P(C | K’) = ?

That second one is implied, because if you don’t know, you are guessing randomly. But we need to know how many choices there are — did you omit that?

Once you have these, see if you can work out P(K ∩ C). It may take several steps.

For the sake of readers who are familiar with a different notation, Fida uses K’ to mean “not K”. Other authors use “~K” or \(\overline{K}\). Similarly, we have been using set notation’s “intersection” symbol where some authors would just use “and”.

We needed to know the number of answer choices for the question in order to work out any probabilities. Fida responded with the missing information from the problem:

I forgot to mention that there are 4 options.

I can’t figure out how to find the intersection.

This provided an additional probability to work with:

So now we know that P(C | K’) = 1/4, right? That is, if you don’t know the answer, you will guess one of the 4 options randomly.

Now, replace the left hand side with its definition, solve for P(C ∩ K’), and see where you can go from there.

This hint may not have been the most useful way to begin; I was at this point just “brainstorming”, thinking of possibilities as I commonly do in initially approaching a problem. If I had actually solved the problem at this point, I might not have considered this a good hint, but it is still worth encouraging a student to pursue any possibilities he might see.

But Fida followed my suggestion:

P (C | K’) =1/4
1/4 = P(C ∩ K’) / P(K’)
P(C ∩ K’) = (1/4) * (1/3) = 1/12

Now, P(C) = 1/4. Since only one is correct,
P (C ∩ K) = 1/4 – 1/12 = 1/6
P(K | C) = P(K ∩ C) / P(C)

So this equals (1/6) / (1/4) = 2/3, which can’t be true.

Here, Fida probably assumed my suggestion led directly to a next step, and assumed (optimistically but wrongly) that P(C) is 1/4, having just seen that in fact it is P(C | K’) that is 1/4. We have to take a step back.

It is wrong to say that P(C) = 1/4; what you know is that P(C | K’) = 1/4, which is different.

Have you thought about P(C’ ∩ K) or P(C’ | K)?

I had been too busy to take the time to actually solve the problem, so I hadn’t seen that this is the essential key. When I did solve it, I used a common technique of making a square like this and filling it in:

        C       C'      total
K                       2/3
K'                      1/3
total                   1

We could also have used a Venn diagram, though I think this kind of table is easier to work with. The probability that the student knows the answer is 2/3, so that represents the “area” of the Knows circle; the area outside that circle is 1/3. The question now becomes, what does the 1/4 represent?

As we’ll see later, it might be easier, if you use the Venn, to use the set Doesn’t Know, with area 1/3, instead of Knows.

Fida made an attempt, but accidentally put our 1/12 in the wrong spot:

        C       C'      total
K      7/12    1/12     2/3
K'                      1/3
total                   1

Now what do we do? How do I get P(C)

I asked a further question:

You still haven’t answered the main question.

What is the probability that you know the answer but are not correct? Put that where you put 1/12, and move 1/12 to the right place, recalculating the other numbers.

After another wrong attempt, I made a more direct comment:

If you give an answer that you know, then it will be correct! So P(K∩C’) = 0 — you can’t be wrong if you know it.
Try again.

This is an “obvious” but subtle point; the author of the problem didn’t say that someone who knows the answer will necessarily write the correct answer, and in fact it is a social assumption, not a mathematical certainty. Might not someone who knows the right answer “throw” the test for some reason, such as to avoid being put into the “smart” class to stay with friends? Or might we have said that someone “knows” the answer, when he really only had (too much) confidence? So it is entirely forgivable not to have seen this.

Now Fida got the answer quickly:

        C       C'      total
K      2/3      0       2/3
K'     1/12    1/4      1/3
total  3/4     1/4      1

So the answer is 8/9, thank you very much!!

Here is the reasoning: Because if you know, your answer will not be incorrect, we put a 0 under K∩C’. Therefore, P(K∩C) = P(K), which we know is 2/3. Then we can fill in P(K’∩C) = 1/12 from our work before (which does turn out to be useful), and add up to find that P(C) = 2/3 + 1/12 = 3/4.

Finally, P(K | C) = P(K∩C) / P(C) = (2/3) / (3/4) = 8/9.
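For readers who like to check arithmetic by machine, here is a minimal sketch of the same table in Python, using exact fractions. The variable names are my own, and the two conditional probabilities are the assumptions we made in the discussion (knowing implies a correct answer, and a guess is uniform over the 4 options), not something the problem states outright.

```python
from fractions import Fraction

# Given in the problem
p_know = Fraction(2, 3)    # P(K)
n_choices = 4              # number of options on the question

# Assumptions made in the discussion, not stated in the problem itself
p_correct_given_know = Fraction(1)               # P(C | K): knowing implies correct
p_correct_given_guess = Fraction(1, n_choices)   # P(C | K'): uniform random guess

p_guess = 1 - p_know       # P(K')

# Cells of the two-way table: each entry is a joint probability
table = {
    ("K", "C"): p_know * p_correct_given_know,
    ("K", "C'"): p_know * (1 - p_correct_given_know),
    ("K'", "C"): p_guess * p_correct_given_guess,
    ("K'", "C'"): p_guess * (1 - p_correct_given_guess),
}

p_correct = table[("K", "C")] + table[("K'", "C")]     # column total, P(C)
p_know_given_correct = table[("K", "C")] / p_correct   # P(K | C)

print(p_correct)              # 3/4
print(p_know_given_correct)   # 8/9
```

Changing `n_choices` or `p_know` shows how the answer depends on the details we had to ask for.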

Now, after solving a problem, it’s good practice to look back and see what we have learned. Was there a quicker way we could have seen? Are there ideas to remember for the future?

Above, I made a comment about the “probability that the student both knows the answer and is correct”. Have you pondered it? This is P(K∩C), and since knowing implies being correct, it is identical to P(K), which we were told is 2/3. Combining this with our fact that 1/4 of the 1/3 who don’t know will be correct, so that P(K’∩C) = 1/12, we find that P(C) = 2/3 + 1/12 = 3/4, and so the 2/3 who know are 8/9 of the 3/4 who are correct. So we could work this out without the table (though the table helps me see more clearly what we know).
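If you want an independent sanity check on that reasoning, a simulation is one option. The sketch below is only an illustration under the same assumptions as before: a student who knows always answers correctly, and a student who doesn't know picks one of the 4 options uniformly at random. The function name, the number of trials, and the seed are arbitrary choices of mine.

```python
import random

def simulate(trials=200_000, p_know=2/3, n_choices=4, seed=0):
    """Estimate P(K | C) by simulating many students answering one question."""
    rng = random.Random(seed)
    correct = 0            # number of correct answers
    correct_and_knew = 0   # correct answers given by students who knew
    for _ in range(trials):
        knows = rng.random() < p_know
        if knows:
            is_correct = True   # assumption: knowing implies a correct answer
        else:
            # Option 0 stands in for the right answer; a guess hits it 1 time in n_choices
            is_correct = rng.randrange(n_choices) == 0
        if is_correct:
            correct += 1
            if knows:
                correct_and_knew += 1
    return correct_and_knew / correct

estimate = simulate()
print(round(estimate, 3))
```

The estimate should land close to 8/9 ≈ 0.889, with the small discrepancy you would expect from random sampling.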

Finally, let’s fill in that Venn diagram:

We have, in effect, filled this in from left to right.

If we had used K’ (doesn’t know, therefore Guesses) as the left set in the diagram, it would have looked like this, which is perhaps easier to think through:

So, at the end of this, we know the answer. And we are correct. No guessing.

Now, can we make some connections to other problems? The first thing I saw when I read this problem was a reversal of information. We are (implicitly) given a conditional probability in one direction (that the answer is correct, given that the student doesn’t know) and are asked about the conditional probability in the other direction (that the student knows, given that the answer is correct). This is a typical Bayes’ Theorem problem, though that never came up in this discussion. Here are some references to very practical problems of this sort:

Test for Tuberculosis (Doctor Anthony, 1997)

Probability in Virus Testing (Doctor Anthony, 1999)

Lost Town and Finder's Town: Bayesian Probability (Doctor Anthony, 1999)

Two-Headed Coin and Bayesian Probability (Doctor Mitteldorf, 2003)

Doctor Anthony used a table much like mine. Doctor Mitteldorf used a probability tree. One could also just plug numbers into a formula, as you can see here:

Bayes’ Theorem (Wikipedia)
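For our problem, plugging into the formula might look like the following. This is a sketch using the standard two-case form of Bayes’ Theorem, with P(C | K) = 1 taken as the same “obvious” assumption we made above, and P(C | K’) = 1/4 from the four options:

```latex
\[
P(K \mid C)
  = \frac{P(C \mid K)\,P(K)}{P(C \mid K)\,P(K) + P(C \mid K')\,P(K')}
  = \frac{1 \cdot \frac{2}{3}}{1 \cdot \frac{2}{3} + \frac{1}{4} \cdot \frac{1}{3}}
  = \frac{2/3}{3/4}
  = \frac{8}{9}
\]
```

The denominator is just P(C) = 3/4, the column total from the table, so the formula and the table are doing the same work.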
