Dealing with Infinity in Probability Problems

Last week we looked at the probability that one of infinitely many possible triangles is an acute triangle, and ended up thinking about continuous probability distributions in general. I thought it might be good to look at some old questions dealing with similar issues in various ways. One will even come back to that same problem, in a new way!

Infinite but discrete: Probability that an integer is a multiple

First, this is from Sheri in 1996:

Probability that a Random Integer...

I started trying to figure out the probability that an integer chosen randomly would be an integer multiple of a given integer "n" (or perhaps of a prime "p")...and then the probability that it would be a multiple of, say, the integers 2...7, and so on. I think I got stuck on trying to figure out how to formulate the "or" probabilities and add things up properly. I think my idea was to generate a sequence P(n) where P(n) would be the probability a random integer would be a multiple of at least one of the integers from 2 up to n. Can you head me in the right direction?

Doctor Tom answered:

Well, to be precise (and that's what we mathematicians have to be!), I have to know exactly what you mean by "an integer chosen randomly".

If you're talking about the entire infinite set of integers, there is no way to do this without some sort of a distribution function over the integers, and there is no such function that gives an "equal probability" for all integers.

We’ll see more about why as we go along; the basic idea is that if every element of a finite population of N items is equally likely, then the probability of one item being chosen is 1/N. But if N is infinite, that would make each individual probability zero, and the probability of any finite set of items would be zero as well. We can’t work with that.

But there’s a workaround for this problem:

To make things precise, here's what I'll assume you mean:  If M is a very large integer, and we randomly (with equal probability) choose an integer between 0 and M, what's the probability that it is divisible by n?  For any fixed M, you can work this out, and then if we take the limit of these probabilities as M goes to infinity, I think that will be what you want.

So if your question just concerns a single number n, the answer is that the limiting probability is 1/n.  For large M, roughly 1/n of the integers less than M are divisible by n, and the error is smaller and smaller as M gets larger and larger.

So, if n is, say, 6, then every sixth integer is divisible by 6, which will be true no matter how large M is; so the probability of a multiple of 6 is \(\frac{1}{6}\).
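To watch that limit happen, here is a minimal sketch in Python. It assumes "between 0 and M" includes both endpoints and that 0 counts as a multiple of n; the function name `prob_multiple` is just an illustrative choice, not anything from Doctor Tom's answer.

```python
from fractions import Fraction

def prob_multiple(n, M):
    """Probability that an integer chosen uniformly from 0..M (inclusive,
    an assumption) is a multiple of n."""
    multiples = M // n + 1          # 0, n, 2n, ..., the largest multiple <= M
    return Fraction(multiples, M + 1)

# The gap between the finite probability and 1/6 shrinks as M grows.
for M in (10, 100, 1000, 10**6):
    p = prob_multiple(6, M)
    print(M, float(p), float(abs(p - Fraction(1, 6))))
```

For any fixed M the answer is an ordinary finite probability; only the limit is exactly 1/n.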

He went on to answer the rest of the question, which was essentially just to use the LCM of all the numbers given in place of our single n.
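One way to read "use the LCM" (this is my interpretation, not a quotation of the original answer) is that the pattern of divisibility by 2 through n repeats with period equal to their LCM, so the limiting probability is just the fraction of "hits" within one period. A rough sketch, with the hypothetical name `prob_multiple_of_any`:

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def prob_multiple_of_any(n):
    """Limiting probability that an integer is a multiple of at least one
    of 2..n, found by counting within one full period of length LCM(2..n)."""
    divisors = range(2, n + 1)
    period = reduce(lcm, divisors, 1)
    hits = sum(1 for k in range(1, period + 1)
               if any(k % d == 0 for d in divisors))
    return Fraction(hits, period)

for n in range(2, 8):
    print(n, prob_multiple_of_any(n))
```

For the integers 2 through 7 the period is 420, and the result is 27/35. The LCM grows quickly, so this brute-force count is only practical for small n.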

Infinite and continuous: Probability that a point is in a triangle

Next, from 2003:

Probability in the Infinite Plane

Three randomly drawn lines intersect so as to form a triangle on an infinite plane. What is the probability that a randomly selected point will fall inside that triangle?

Should points falling on one of the three lines be considered as a possibility?

Considering just one of the lines, I believe that there are three possibilities: above, on, or below the line. Thus each probability for the interior of the triangle would be 1/3, and the overall probability would be 1/27. My teacher disagrees.

That first problem had an infinite but discrete sample space. Here, it is infinite and continuous, compounding the difficulty.

Jim, though, has several issues, starting with what probability is.

Doctor Wallace answered:

Hello Jim,

This is an interesting problem. It is one that intrigued me to go and do some research to further my own knowledge of such problems.  I would like to share with you what I discovered.

First, your problem seems to fit into the category of "geometric probability," that is, probability that is computed using the principles of geometry and models of area. Here is a sample:

A triangle of base 10 and height 5 is drawn on the coordinate plane.  It is surrounded by a rectangle of area 100. What is the probability that a randomly selected point inside the rectangle lies within the triangle?

We express this probability as the ratio of the area of the triangle to the area of the rectangle. This would be 25/100 or 1/4, which makes sense, since the triangle comprises 1/4 of the area of the rectangle.

Geometric probability is what we used last time, and will explore more below. There are infinitely many points in the rectangle, so we can’t just divide the number of points in the triangle by the number in the rectangle; but we can assume that probability is the ratio of areas.
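If you want to test the ratio-of-areas assumption, a simulation is one option. The sketch below assumes a particular placement that the sample problem does not specify: a 10-by-10 rectangle, with the triangle's base along the bottom edge and its apex at (5, 5). The seed and trial count are arbitrary.

```python
import random

def estimate_triangle_probability(trials=200_000, seed=1):
    """Estimate the chance that a uniform random point in the square
    [0,10] x [0,10] lands in the triangle (0,0), (10,0), (5,5)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0, 10)
        y = rng.uniform(0, 10)
        # Below both slanted sides: y <= x and y <= 10 - x.
        if y <= x and y <= 10 - x:
            hits += 1
    return hits / trials

print(estimate_triangle_probability())
```

The estimate should come out close to 25/100 = 1/4, wherever the triangle is placed inside the rectangle.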

Examining your problem about the lines on an infinite plane, I do not think that the approach of trying to calculate the probability of the random point lying on, above, or below the triangle's sides will yield anything meaningful to the larger problem. 1/27 would be the correct answer to (1/3) cubed, but that assumes that the probabilities that the point lies above, on, or under the line are equal. Are they?

This is a common issue for beginners to probability. Similarly, the fact that a coin could land heads, tails, or on edge does not mean that the probability is 1/3, because those are not equally likely. This is discussed in “How Do You Know That Events Are Equally Likely?”

Start with a finite region

As in the first problem, we can start with a finite region, which still contains infinitely many points, but at least has finite area:

Simplify the problem a little. Suppose you were to draw a line on the wall of your room, cutting the wall in half. Now throw a dart at random at the wall. Would you expect that the probability of the dart landing ON the line to be equal to it landing above or below? Surely not. There is more "space" for the dart to land above and below. There is a small probability that it will land on the line, yes, but it vanishes when considered against the larger spaces of the rest of the wall. This is why an "area model" is useful. If the line is drawn halfway across the wall, we would expect the probability to be about 1/2 for above or below, because the areas are about equal.

We can ignore points on the line itself, because its area is zero (in principle; and very near zero for an actual drawn line). But the probability of hitting above the line is 1/2 only if that area is 1/2 of the wall.

Now back to your triangle. Suppose we forget about the point ON the line. Does the point have an equal chance, 1/2 and 1/2, of landing above or below? Yes. But you now have three lines, and the probability of one point independently landing below all of them would be 1/2 individually, yes. So the probability would be 1/2 cubed, or 1/8 of landing below all of them. But again, independently. When you have the lines form a triangle, this is no longer the same question! The lines are interacting with each other, and the resulting area of the triangle can now vary considerably.

The three lines will not all cut the wall in half; and although they divide it into 7 (not 8!) regions, those regions will not have equal areas, so the probability will be neither 1/8 nor 1/7:

Think back to the wall of your room again. Imagine your three lines crossing it, but sloped in such a way that the triangle formed is a very, very tiny one. Now imagine another wall, again with three lines, but sloped so that the triangle formed is very large - it could even take up most of the area of the wall.

Would you expect a randomly thrown dart to land in each of the two triangles with equal probability? Surely not. Again, the randomly selected point will have a greater chance of landing in the triangle with the larger area. So 1/8 can't be meaningful any more, since we would get 1/8 for the probability of either triangle, or, for that matter, for any triangle we drew. Again, this is because the 1/8 is the answer to a completely different problem.

What about an infinite plane?

Let's return finally to your original problem. You asked about a triangle on an infinite wall (plane). You also gave no specifics about the triangle. With three random lines, it is possible to form a triangle of any area we like. However, the triangle formed will definitely have a finite area. It may be very, very large, but it will definitely be a bounded area. The plane, however, is unbounded. So if we try the area method for probability now, we would get a ratio of finite to infinite. 

Imagine the wall of your room again. The wall is infinitely large. It is limitless. It goes on and on and on...  And somewhere on it, is a finite triangle, formed by your lines. This triangle, no matter how large its finite area, pales in comparison to limitless infinity. The triangle is swallowed up into boundlessness like a tiny drop in a vast ocean.

Now what is the probability that your randomly selected point, somewhere on that vast plane lands in the triangle?  Yes... vanishingly small. Effectively zero. Will it really be zero?  Theoretically no, but practically yes.

Actually … theoretically yes! A zero probability does not mean absolute impossibility in an infinite setting like this.

You can think of the wall with the very tiny triangle, described above, as the same triangle drawn on a larger wall. We can see that as the wall gets larger, the ratio of area taken up by the triangle approaches zero.
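Here is a small numerical illustration of that shrinking ratio. It assumes a square wall and reuses the triangle of area 25 from the earlier sample problem; both choices, and the name `triangle_share`, are mine.

```python
from fractions import Fraction

def triangle_share(triangle_area, wall_side):
    """Fraction of a square wall of the given side covered by a triangle
    of the given area (assumes the triangle fits entirely on the wall)."""
    return Fraction(triangle_area) / Fraction(wall_side) ** 2

# The triangle never changes; only the wall grows.
for side in (10, 100, 1000, 10**6):
    print(side, float(triangle_share(25, side)))
```

Each finite wall gives a positive probability, but the values head toward zero as the side grows without bound.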

This result bothered me, since the whole question is a theoretical one.  We can't really investigate a true plane, since there is no such thing as an actual infinity. We would have to bound the plane somewhere, and then you will have an actual area for the denominator of the ratio, and so you would be able to calculate the probability.  But you would also have to know the area of the triangle formed.

Technically, as mentioned in the first answer, the question itself is meaningless, as there is no way to define a uniform distribution on the entire plane, in order to say we are choosing a point randomly. A proper problem about a random point lying in a random triangle would have to be in a bounded region; but then we’d have to think more about what it means for the lines to be random. We won’t be going there.

Back to the probability of an acute triangle!

The story doesn't stop there, however. I went digging on the Internet and I found a paper published by two researchers [Falk and Samuel-Cahn] at the Hebrew University of Jerusalem. They work at the Center for Rationality there, which studies interactive decision theory. This is an exciting field. The paper I found is in Microsoft Word format, and can be found at this URL: 

   http://www.ma.huji.ac.il/~ranb/DPs/dp235.doc 

In this paper, they discuss a problem that was posed by Lewis Carroll, author of _Alice's Adventures in Wonderland_ and a mathematician and logician of some renown. His problem is not the same as yours, but the two share a very similar characteristic - the idea of the infinite plane.

This is an alternative interpretation of the acute triangle question, which I mentioned in passing last time, when I said, “A third option might be to randomly choose three vertices, but again we would need to restrict the region somehow.”

Carroll posed this:

Three Points are taken at random on an infinite Plane. Find the chance of their being the vertices of an obtuse-angled Triangle.

The two researchers show Carroll's answer to the problem, and they investigate his underlying assumptions. These have a direct bearing on your problem about the point lying in the triangle formed by three lines on an infinite plane. They note that there are fundamental contradictions inherent in assuming things about the infinite plane.  They say that Carroll's answer was the right answer to a different problem, and that seems to be exactly what happened to you.

The article I mentioned last time also dealt with this Lewis Carroll version of the acute triangle problem. Here is part of their explanation:

Incidentally, Charles Dodgson posed the question in this form in his 1893 book of mathematical puzzles, entitled “Pillow Problems Thought Out During Wakeful Hours”. His Problem 58 posits “three points taken at random on an infinite plane”, and asks for the probability that they are the vertices of an acute triangle. (Actually he asked about obtuse triangles, which is obviously just the complement.) However, as plausible as this question might sound, the premise of choosing three points “randomly” in the plane is not valid – at least not without giving a probability distribution for the points. There is no uniform distribution over the plane, so the notion of choosing three points randomly on the plane is inherently ambiguous.

They continue by showing that his method leads to contradictions.

The Falk article mentioned here starts with a similar comment:

The disturbing element in the problem’s text is the random sampling of points on an infinite plane. How could this be? This is a practically and conceptually impossible procedure. Put explicitly, Carroll assumed (1) that the sample-space for his “statistical experiment” is infinite, and (2) that the probability-density function on that space is uniform. However, these are two contradictory assumptions that are sooner or later bound to entail paradoxical results (see Falk & Konold, 1992).

They go on to show a different contradiction, and then use a method similar to our first problem above (restricting the sample space to a large square, and taking the limit as it increases) to obtain an answer, namely 0.7249, in comparison to Carroll’s 0.6394.
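The square version of the problem is easy to simulate. The following is only a sketch of that idea, not the method of the paper: it picks three uniform random points in a square and tests for an obtuse angle by comparing squared side lengths. The names, seed, and trial count are assumptions.

```python
import random

def is_obtuse(p, q, r):
    """True if triangle pqr has an obtuse angle, i.e. the largest squared
    side exceeds the sum of the other two squared sides."""
    def d2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    a, b, c = sorted((d2(p, q), d2(q, r), d2(r, p)))
    return a + b < c

def estimate_obtuse(side=1.0, trials=100_000, seed=1):
    """Estimate P(obtuse) for three uniform random points in a square."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(3)]
        if is_obtuse(*pts):
            hits += 1
    return hits / trials

print(estimate_obtuse())
```

The estimate should land near the 0.72 figure quoted above, rather than near Carroll's 0.64. Since enlarging the square just scales every triangle without changing its angles, the result should not depend on `side`, which is why the limit exists.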

Finite and continuous: Probability that a random sum is no more than 5

We’ll close with a question from 2007, which brings us back to the main ideas from last time:

Probability of a Sum Meeting a Condition

Two real numbers x and y are randomly chosen on a number line between 0 and 12. Find the probability that their sum is less than or equal to 5.

I am not sure how to do this.  My idea below is probably wrong, and I need to know how to do the problem.

0+1=1 0+2=2 0+3=3 ..................... 12+12=24

  x + y <= 5
------------- = probability
 # of trials

This would be a good start, if the problem were a finite one involving only 13 possible numbers. But this is a continuous problem, which makes it more interesting.

Start with a discrete problem

Doctor Greenie answered:

Hi, Darren -

Your thoughts about the problem only use whole numbers.  The problem says you can pick ANY two real numbers between 0 and 12--like 5.189238945 and the square root of 93.  So you will need a different method to attack the problem.

To find an approach to the problem, let's START with your work with whole numbers and then modify that approach to include all numbers between 0 and 12.  We can make a chart showing all the sums of pairs of whole numbers from 0 to 12; we get our familiar addition table, but I'm going to arrange it in an unusual manner.

           0  1  2  3  4  5  6  7  8  9 10 11 12
       -------------------------------------------
   12  |  12 13 14 15 16 17 18 19 20 21 22 23 24 |
   11  |  11 12 13 14 15 16 17 18 19 20 21 22 23 |
   10  |  10 11 12 13 14 15 16 17 18 19 20 21 22 |
    9  |   9 10 11 12 13 14 15 16 17 18 19 20 21 |
    8  |   8  9 10 11 12 13 14 15 16 17 18 19 20 |
    7  |   7  8  9 10 11 12 13 14 15 16 17 18 19 |
    6  |   6  7  8  9 10 11 12 13 14 15 16 17 18 |
    5  |   5  6  7  8  9 10 11 12 13 14 15 16 17 |
    4  |   4  5  6  7  8  9 10 11 12 13 14 15 16 |
    3  |   3  4  5  6  7  8  9 10 11 12 13 14 15 |
    2  |   2  3  4  5  6  7  8  9 10 11 12 13 14 |
    1  |   1  2  3  4  5  6  7  8  9 10 11 12 13 |
    0  |   0  1  2  3  4  5  6  7  8  9 10 11 12 |
       -------------------------------------------

This amounts to what Darren described, listing all possible sums, so we can count those less than or equal to 5; but it does so in a way that will be easier to count. (You don’t actually have to write out the entire table; Doctor Greenie only showed some of the numbers, though I’ve filled them all in to make it neater. You just have to visualize the idea!)

Suppose for now that we were restricting our numbers to whole numbers, and that we wanted the sum of the two numbers to be 5 or less.  Then we could get the answer directly from this table.  There are 13 whole numbers from 0 to 12, so the number of sums in the table is 13*13=169.  And we can count the number of sums that are 5 or less; it is 21.  So using only whole numbers, the probability that our sum is 5 or less is 21/169.

The desired sums form a triangle in the lower left corner of the table, which makes the count easy to see. If we didn’t actually write it all out, we could see in our minds that the count will be \(1+2+3+4+5+6=21\); and if the numbers were larger, we could use knowledge of triangular numbers or of arithmetic series to simplify the work. The probability comes to 0.124, nearly 1/8.
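The count can also be checked by brute force. This short sketch simply loops over the whole addition table; the function name and its parameters are illustrative.

```python
from fractions import Fraction

def discrete_probability(max_value=12, limit=5):
    """Count pairs of whole numbers in 0..max_value whose sum is at most
    limit, and return (count, probability)."""
    values = range(max_value + 1)
    favorable = sum(1 for x in values for y in values if x + y <= limit)
    return favorable, Fraction(favorable, len(values) ** 2)

count, p = discrete_probability()
print(count, p, float(p))
```

It reports 21 favorable pairs out of 169, matching the table.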

Now make it continuous

To modify this approach so that we consider ALL real numbers instead of just whole numbers, we can keep the same basic picture, but instead of having separate, discrete numbers horizontally and vertically, we have a continuous range of numbers.  So we can think of our picture as a square 12 units wide and 12 units high:

       0  1  2  3  4  5  6  7  8  9 10 11 12
    12 -------------------------------------
    11 |                                   |   
    10 |                                   |   
     9 |                                   |   
     8 |                                   |   
     7 |                                   |   
     6 |                                   |   
     5 |                                   |   
     4 |                                   |   
     3 |                                   |   
     2 |                                   |   
     1 |                                   |   
     0 -------------------------------------

The complete set of combinations of two numbers we could select from anywhere in this figure is represented by the AREA of the figure, which is 12*12 = 144.  Our objective is to determine what fraction of that total area represents pairs of numbers whose sum is 5 or less.

Every pair \((x,y)\) of numbers in the interval \((0,12)\) is represented by a point in this square. (The rows were numbered from bottom to top to match the coordinates in the first quadrant.)

We can get an idea of how to do that by looking at the figure we had when we were using whole numbers.  The pairs of numbers whose sum is exactly EQUAL to 5 lie along a diagonal line from (0,5) to (5,0).  So in our figure for the case where we are allowing ANY real numbers, we can draw a boundary for the sums that are 5 or less along that diagonal line:

       0  1  2  3  4  5  6  7  8  9 10 11 12
    12 -------------------------------------
    11 |                                   |   
    10 |                                   |   
     9 |                                   |   
     8 |                                   |   
     7 |                                   |   
     6 |                                   |   
     5 \                                   |   
     4 |  \                                |   
     3 |     \                             |   
     2 |        \                          |   
     1 |           \                       |   
     0 ---------------\---------------------

The set of combinations of two numbers we can choose whose sum is 5 or less is represented by the AREA of the triangle formed by that boundary line.  That triangle is a right triangle with legs of length 5, so its area is (1/2)(5)(5) = 12.5.

The boundary line has equation \(x+y=5\), and has intercepts \((0,5)\) and \((5,0)\).

[Figure: the 12-by-12 square, with the triangle below the line \(x+y=5\) shaded.]

So the total area from which we can pick our two real numbers between 0 and 12 is 144, and the total area for which the sum of those two numbers is 5 or less is 12.5.  Therefore, the probability that the two numbers we pick between 0 and 12 have a sum of 5 or less is

  12.5/144 = 25/288

This probability is 0.0868, about 1/12.
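To see how the whole-number table turns into the area calculation, we can refine the grid. This sketch (my own illustration, not part of Doctor Greenie's answer) allows values spaced 1/k apart from 0 to 12 and repeats the discrete count; `steps_per_unit` is a hypothetical parameter name for k.

```python
from fractions import Fraction

def grid_probability(steps_per_unit, max_value=12, limit=5):
    """Discrete probability that x + y <= limit when x and y are chosen
    from the grid 0, 1/k, 2/k, ..., max_value, with k = steps_per_unit."""
    n = max_value * steps_per_unit     # largest grid index
    cutoff = limit * steps_per_unit    # x + y <= limit  <=>  i + j <= cutoff
    favorable = sum(1 for i in range(n + 1) for j in range(n + 1)
                    if i + j <= cutoff)
    return Fraction(favorable, (n + 1) ** 2)

for k in (1, 2, 5, 20):
    print(k, float(grid_probability(k)))
print("area ratio:", float(Fraction(25, 288)))
```

With k = 1 this is the 21/169 from the table; as k grows, the values drift down toward 25/288, the ratio of areas.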

The method for solving the problem is the same regardless of what the maximum value of our sum is supposed to be.  For the case where the sum is supposed to be at most 12, you can probably see that the "boundary line" will cut the rectangle exactly in half, so the probability will be 1/2.

[Figure: the square cut in half by the line \(x+y=12\).]

For the case where the sum is supposed to be (for example) at most 18, the picture is a bit different, and the calculations might be done a bit differently.  Our picture would be

       0  1  2  3  4  5  6  7  8  9 10 11 12
    12 ------------------\------------------
    11 |                    \              |   
    10 |                       \           |   
     9 |                          \        |   
     8 |                             \     |   
     7 |                                \  |   
     6 |                                   \   
     5 |                                   |   
     4 |                                   |   
     3 |                                   |   
     2 |                                   |   
     1 |                                   |   
     0 -------------------------------------

In this case, the area containing the allowable pairs of numbers is represented by the whole rectangle, MINUS the small triangular area.  The area of the whole rectangle is 144; the area of the small triangle is (1/2)(6)(6) = 18.  So the area of the allowable region is 144-18 = 126.  And the probability that our sum is at most 18 in this case is then 126/144 = 7/8.
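The three cases (5, 12, and 18) can be folded into one small function. This is a sketch under the same assumptions as the answer, namely that x and y are chosen uniformly and independently from 0 to 12; the name `prob_sum_at_most` is hypothetical.

```python
from fractions import Fraction

def prob_sum_at_most(s, side=12):
    """Probability that x + y <= s for x, y uniform on [0, side],
    computed as a ratio of areas."""
    s, side = Fraction(s), Fraction(side)
    total = side * side
    if s <= 0:
        return Fraction(0)
    if s >= 2 * side:
        return Fraction(1)
    if s <= side:
        area = s * s / 2                  # right triangle at the origin
    else:
        corner = 2 * side - s             # leg of the excluded corner triangle
        area = total - corner * corner / 2
    return area / total

for s in (5, 12, 18):
    print(s, prob_sum_at_most(s))
```

For 5, 12, and 18 this gives 25/288, 1/2, and 7/8, agreeing with the areas worked out above.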

This area model of probability is what we used last time; and the issue there was that different models of how we choose a random triangle would give different areas, and different probabilities.
