Having just discussed some pattern or sequence problems that were poorly posed, let’s look at some recent questions about sequences, some of which are quite complicated, while others seem to be just wrong.

Here is the initial question from Zehra:

Please make me understand how I complete these series:

1, 2, 5, 1, 3, 6, 2, 5, 8, 4 ………

10, 2, 4, 40, 8, 10, ……….

These are not, like some we looked at last time, too short to have sufficient information; but they look complex enough to require context. I wrote back:

Perhaps you have seen my posts on questions like this:

Pattern and Sequence Puzzles Revisited

One important point made there is that problems like this are **puzzles** (or riddles): without a context limiting the type of pattern to expect, there is **no certainty** that there will be only one answer, and **no routine method** that will find it. We are really just **guessing** what sort of pattern would have interested the poser.

So, can you tell me the context of the questions? That will include the age or grade of the student for whom it was given, what they have been learning that it might be related to, and anything that was said about the problems.

Also, if you didn’t give the complete statement exactly as given to you, please do; as some of my examples show, it can be very important whether you are asked for a **formula**, or the **next term**, or something else. If there were other similar questions you have solved, it may help to see them for comparison.

I can suggest a way to find one possible answer to the first: Make the sequence of first differences and look for a pattern. The pattern I see is not a mathematical formula, but an alternation between three different patterns in the differences. (Compare the second example in my second post.)

I don’t yet see a pattern in the second, but I see reason to wonder if you might have made a small mistake in typing it. I’ll give it more thought when I hear back from you!

Zehra replied:

Well, it is for a student of Grade 5. Basically there is some pattern that we have to follow and **complete the series** like I am sharing some examples:

15, 18, 21, 24 ….

27, 30, 33, 36 ….

another:

10, 15, 10, 50, 10, 16 …..

10, 60, 10, 17, 10, 70 …..

I think these are two examples, in each case the first line being the **given values**, and the second being the **next few terms** that were asked for. The fact that only terms were requested clarifies that no formula is expected.

These examples are also useful in showing what sort of ideas have been taught, suggesting the types of patterns to look for in the unsolved problems Zehra gave initially.

I answered by examining those patterns:

Okay. You have not given the exact wording of the problem, because “**complete** the series” would mean writing an infinite number of terms (and these are **sequences**, not **series**)!

It looks like you were told perhaps to **fill in a given set of blanks** for the next few terms, not to give a formula; that’s helpful.

The first example is an **arithmetic sequence** with common difference 3, and is very easy.

The second is an **alternation between two sequences**: 10, 10, 10, … and 15, 50, 16, …, where you appear to have guessed (with insufficient information) that the second subsequence alternates between 15, 16, 17 and 50, 60, 70 (or maybe you would describe it differently: the process to obtain the next term alternates between using **10**, **adding** increasing numbers 5, 6, 7, and **multiplying** by the same numbers). That tells me that an alternating rule is likely for the ones you asked about, which confirms my thoughts.

As I said, these are mere puzzles, and should not be graded assignments or otherwise considered to be a test of ability! In fact, better students might do worse at these, as they (like me) will be more concerned about having sufficient evidence and not just guessing.

Have you considered my suggestion for the first sequence?

I now have an idea for the second: Rather than look at the differences between successive terms, look at their **ratios**.

The second example is much like one I discussed last time; the apparent pattern is complicated enough that even with six terms given, a guess has to be made. To clarify what the pattern is, here it is again with the terms that were asked for in bold:

10, 15, 10, 50, 10, 16, **10, 60, 10, 17, 10, 70**

Odd-numbered terms are all 10; even terms alternate between the sequence 15, 16, 17 (increasing by 1) and 50, 60, 70 (increasing by 10). But if only the first six terms were given, we have no actual evidence of increasing by 10! The only number in the 50, 60, 70 sequence that was given is 50.

One possible reason for making that assumption, as I suggested, is that 15 and 16 are respectively 5 and 6 *more* than the 10 they follow, while 50 is 5 *times* the 10 it follows. That is, we have more reason to claim a pattern when we treat it not as four interleaved sequences, but as an alternation of four actions, adding a number, reverting to 10, multiplying by a number, and again reverting. But this is still entirely conjecture. So the author *expects* students to make such guesses, but evidently doesn’t state (in the immediate context) that other valid answers are possible.

As I pointed out, I am troubled when I see a problem that a poor student will have no trouble guessing at (being used to guessing in math), but a careful student who wants to be sure she is right will take a long time with, and maybe give up.

Zehra didn’t exactly try my suggestions; instead, she told me something I didn’t yet know:

Well, I know the answers of both as it is written in back of book

1, 2, 5, 1, 3, 6, 2, 5, 8, 4 ………. 8, 11, 7, 12, 15, 11

10, 2, 4, 40, 8, 10, ……… 100, 20, 22, 220

but I do not understand how we get these

Evidently the typo I had wondered about wasn’t, since she didn’t change anything (I think my guess had been that the 10 should be 16, so the second set of three are 4 times the first set); but now we have more information to judge the problems by. I answered, first confirming my guess for the first problem:

Okay, I can stop trying to guide you to **find** an answer, and just **analyze** the one you’ve been given.

Here are the **differences** I said to find for the first sequence:

1, 2, 5, 1, 3, 6, 2, 5, 8, 4

+1, +3, -4, +2, +3, -4, +3, +3, -4

As the differences show, we alternately add 1, 2, 3, …; add 3; and subtract 4. Continuing, we add 4 to 4 to get 8, then add 3 to get 11, then subtract 4 to get 7, and so on:

1, 2, 5, 1, 3, 6, 2, 5, 8, 4, **8, 11, 7, 12, 15, 11**

+1, +3, -4, +2, +3, -4, +3, +3, -4, +4, +3, -4, +5, +3, -4

There may be a nicer way to describe it, but my answer does agree with theirs.

Here there are enough terms given that, given the expectation of alternating additions, some constant and some changing, we could be reasonably confident that this was what was expected (but not that this is “correct”, or the *only* possible answer!).
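For anyone who wants to check the arithmetic by machine, here is a minimal Python sketch of the rule I guessed. The function name `extend_first` is made up for illustration, and the code encodes only my reading of the pattern (add 1, 2, 3, …; add 3; subtract 4), not anything the book states.

```python
def extend_first(count):
    """Build `count` terms by cycling three steps:
    add k (k = 1, 2, 3, ...), add 3, subtract 4."""
    terms = [1]   # the given first term
    k = 1         # the growing amount used in the first step of each cycle
    while len(terms) < count:
        for step in (k, 3, -4):
            if len(terms) == count:
                break
            terms.append(terms[-1] + step)
        k += 1
    return terms


print(extend_first(16))
```

The first ten terms it produces are the ones Zehra was given, and the next six match the answer in the back of the book.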

Continuing:

How about the second sequence?

Here are the ratios I said to find:

10, 2, 4, 40, 8, 10

1/5, 2, 10, 1/5, 5/4

We see a repeated division by 5; but we can’t be sure of anything else. But having already tried differences, we have this:

10, 2, 4, 40, 8, 10

-8, +2, +36, -32, +2

So it looks like we alternately divide by 5, then add 2, then … what? Maybe **multiply by 10**; but with only one example, we can’t be sure it isn’t **add 36**, or something else entirely. But if we guess the multiplication by 10 (just because it feels nicer), we get this:

10, 2, 4, 40, 8, 10, **100, 20, 22, 220**

÷5, +2, ×10, ÷5, +2, ×10, ÷5, +2, ×10

That gives their answer, so we are “correct”.

Here that multiplication by 10 step had no evidential support, because there were not enough terms given to support three alternating rules. Clearly, the more complicated the rule, the more data are needed to guess it confidently.
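The guessed rule is easy to replay in code. This is only a sketch of my conjecture (divide by 5, add 2, multiply by 10, repeat); the name `extend_second` is hypothetical, and the multiply-by-10 step is exactly the part the data did not support.

```python
from itertools import cycle


def extend_second(count):
    """Build `count` terms starting at 10 by cycling ÷5, +2, ×10."""
    # Integer division is safe here: every ÷5 step is applied either to the
    # starting 10 or to a term that was just multiplied by 10.
    ops = cycle([lambda x: x // 5, lambda x: x + 2, lambda x: x * 10])
    terms = [10]
    while len(terms) < count:
        terms.append(next(ops)(terms[-1]))
    return terms


print(extend_second(10))
```

Swapping the third lambda for `lambda x: x + 36` still reproduces the six given terms, which is a quick way to see how little those terms pin down.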

I concluded:

But I very much dislike this sort of problem: it is entirely wrong to give a “correct” answer without at least saying “other answers are possible”, and without giving an explanation for the pattern they followed, since there really is not enough evidence to make this conclusion. They are not teaching mathematics, but either (a) how to jump to a conclusion, or (b) how to feel stupid because you didn’t see the answer they wanted. But this sort of thing seems to be common (and as a mere game, it can be fun).

The important thing, again, is not to be anxious when you can’t “solve” it. It took me a while, and I have a lot of experience with these.

That was the end of this thread, but the next day Zehra wrote another question, closely related to the first:

Using any proper pattern (e.g., ×2, +2 , ÷2, -1, ×3, +3, ÷3, -1, ×4), write a series in following questions:

×2, +2, ÷2, ×2, ……………

×10, ÷2, ×9, ÷2, ……………

Apparently what the textbook means by “proper pattern” is one like those we have been examining, with a notation much like what I used in my last response: a sequence of operations that either repeat or change in a simple way. These questions ask the student to recognize the pattern they intend in these operations, so they are sort of a second-level pattern question. But we have the same difficulties as before. I answered:

I’m not entirely sure what they are asking for; if I were helping you in person, I would be looking at the book for an example of this kind of question, or checking the back for their answers to similar exercises, to make sure what kind of answer they want.

They may be just asking you to turn these into **repetitive patterns of operations** (much like those I described for the other problems):

×2, +2, ÷2, ×2, **+2, ÷2, ×2, +2, ÷2, …**

But their example suggests that (like the previous problems) some or all might be meant to change each time, like this (changing the addition):

×2, +2, ÷2, ×2, **+3, ÷2, ×2, +4, ÷2, …**

And since they said explicitly “Using **any** proper pattern”, it appears that anything like this would be a valid answer, and they would not give a single “correct” answer in the back.

Or maybe that’s *not* what they mean:

But since they said, “write a **series**”, not “write a **pattern**”, it’s also possible that they are asking for an actual sequence **using** this pattern; but then they would probably have given a first term so you’d have something definite to go by. For example, if the first term were 1, my first answer above would generate this sequence:

1, 2, 4, 2, 4, 6, 3, 6, 8, 4

×2, +2, ÷2, ×2, +2, ÷2, ×2, +2, ÷2, ...

while following my second pattern would yield this:

1, 2, 4, 2, 4, 7, 3.5, 7, 11, 5.5

×2, +2, ÷2, ×2, +3, ÷2, ×2, +4, ÷2, ...

You can try the second example.
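Turning a list of operations into an actual sequence is mechanical once a first term is chosen. Here is a small sketch, assuming (as in the example above) a first term of 1, which the book does not supply; `apply_pattern` and the two step lists are names invented for this illustration.

```python
def apply_pattern(start, steps):
    """Apply (operator, number) steps in order and return every term produced."""
    terms = [start]
    for op, k in steps:
        x = terms[-1]
        if op == "×":
            x = x * k
        elif op == "+":
            x = x + k
        elif op == "÷":
            x = x / k   # true division, so halves such as 3.5 can appear
        elif op == "-":
            x = x - k
        else:
            raise ValueError(f"unknown operator: {op}")
        terms.append(x)
    return terms


# The plainly repeating pattern, and the one where the addition grows.
repeating = [("×", 2), ("+", 2), ("÷", 2)] * 3
growing = [("×", 2), ("+", 2), ("÷", 2),
           ("×", 2), ("+", 3), ("÷", 2),
           ("×", 2), ("+", 4), ("÷", 2)]

print(apply_pattern(1, repeating))
print(apply_pattern(1, growing))
```

Because of the division, some terms print as floats (for example `2.0`), but the values agree with the two sequences written out above.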

Zehra wrote back:

Plz also tell me about this pattern too, that how we complete series of this one:

×10, ÷2, ×9, ÷2, ……..

I suggested a couple possibilities:

I might just repeat all four terms over and over, with optional modification each time. That is,

×10, ÷2, ×9, ÷2, **×10, ÷2, ×9, ÷2, ×10, ÷2, ×9, ÷2, …**

Since the ÷2 appears twice, we probably wouldn’t want to modify that each time (though it wouldn’t be wrong to do so, ÷2, …, ÷2, … ÷3, …, ÷3, … ÷4, …, ÷4, …). But you might choose to see the multiplier decreasing each time, as

×10, ÷2, ×9, ÷2, **×8, ÷2, ×7, ÷2, ×6, ÷2, ×5, ÷2, …**

Again, I’m not very happy with the way this problem is stated, because I can’t be sure what is actually expected, and it is **teaching guesswork rather than mathematical thinking**; but that is what seems to be intended.

The bottom line seems to be that, if you are being taught material like this, you need to accept the fact that guessing is allowed. This can be described as inductive thinking rather than deductive thinking, and though the former doesn’t guarantee a right answer, and is not properly a part of mathematics, it is a part of science (and life); and within math, it can be used to make hypotheses (that is, fancy guesses) that you can then try to prove.

First, from 1997:

Alternating Sequence

Dear Dr. Math,

I need help. I have a problem that neither my friends nor I can get. I thought you could help. We have to find a pattern and **find the next three numbers** for this: 0, 8, 27, _, _, _. I hope you can find the answer.

Sincerely, Shoushou

Three terms are not enough to determine a sequence, unless something *very* obvious is going on. If this had been **1, 8, 27**, it would have been obvious: Even though there are many options, ranging from the repeating sequence

**1, 8, 27**, 1, 8, 27, … ,

to the quadratic sequence \(a_n = 6n^2 - 11n + 6\), which produces

**1, 8, 27**, 58, 101, 156, … ,

any reasonable person would guess that the intended sequence is \(a_n = n^3\).

But what of the sequence that was given? It isn’t at all obvious, and therefore is really impossible to know. Doctor Wallace answered, suggesting one idea to start with:

For this series, the first thing to notice is that **all of the numbers are perfect cubes**: 0 = 0^3, 8 = 2^3, and 27 = 3^3. If we list the numbers that are the cube roots of these numbers, we get this series:

0, 2, 3...

Now, all you have to do is **find a pattern among these three**. The one I see is an alternating series. There may be others. Let me know how your work goes on this problem.

Since the cubes are so obvious, his idea is that the sequence might be the cubes of this sequence. But what is this sequence? He suggests what may be the simplest idea, an alternating sequence (that is, alternating terms follow different rules). We could instead use \(\displaystyle a_n = -\frac{n^2-7n+6}{2}\), which generates

**0, 2, 3**, 3, 2, 0, …,

so that the cubes would be

**0, 8, 27**, 27, 8, 0, …;

but that certainly isn’t the first thing that comes to mind.

Shoushou wrote back:

Thanks for giving me that great hint. I'm not sure if I have it right, but I think it might be +2, +1, +2, +1. This doesn't quite satisfy me, but it's the best I can come up with for now.

His suggestion is that the sequence is formed by starting with 0 and alternately adding 2, then 1, then repeating, so that the sequence would continue

**0, 2, 3**, 5, 6, 8, … ,

and the sequence of cubes would be

**0, 8, 27**, 125, 216, 512, … .

Doctor Wallace agreed:

I think you're right. That's what I came up with, too. And I feel the same way - I wasn't quite satisfied. Somehow, it didn't feel just right. I think it was **because there were only 3 numbers given**, and it's hard to come to just one pattern with only three numbers. But it seems that, since the number 1 was missing, +2 +1 +2 +1 ... is the only thing that makes sense. Then the next two numbers in the pattern would be 5^3 and 6^3.

What did your teacher have in mind for the answer? If you can come up with another pattern that fits the series, please write and let me know.

We never heard what the teacher intended; I certainly hope that any answer that made some sort of sense was accepted.

I’ve mentioned a couple other possibilities. If we find a quadratic function that gives the required values, it’s \(\displaystyle a_n = \frac{11n^2-17n+6}{2}\), and continues

**0, 8, 27**, 57, 98, 150, … .

Or, we could continue decreasing the amount added, so that we next add 0, then -1. The base sequence is then 0, 2, 3, 3, 2, 0 — which is my quadratic sequence above!
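The quadratics I have quoted can be checked without solving any equations by hand. The sketch below uses Lagrange interpolation, which is my choice of tool here and not something from the original exchange; `lagrange_value` is a made-up name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def lagrange_value(points, x):
    """Value at x of the lowest-degree polynomial through the (x_i, y_i) points."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


given = [(1, 0), (2, 8), (3, 27)]      # the sequence as posed
cubes = [(1, 1), (2, 8), (3, 27)]      # the sequence if the 0 were a 1

# These particular values are whole numbers, so int() loses nothing.
print([int(lagrange_value(given, n)) for n in range(1, 7)])
print([int(lagrange_value(cubes, n)) for n in range(1, 7)])
```

Three points always determine some quadratic (or simpler) formula, which is one more reason three terms cannot settle what a sequence "is".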

To find other possibilities, we could go to OEIS, the Online Encyclopedia of Integer Sequences, which catalogs sequences that people have found interesting. The site recommends entering about 6 terms, which will filter out many alternatives. Entering “0, 8, 27” gives me sequences like “Sum of cubes of primes dividing n,” or “Numbers that are the sum of cubes of distinct primes,” to list only the most comprehensible of those that start with the 0. And it doesn’t include any of those I’ve mentioned.

But my guess is that the zero was a typo.

Here’s another from 2001:

Finding a Pattern

My daughter received this in a homework assignment, and **I don't believe there is enough specific information** to logically give the **next four numbers in the sequence**: 2, 8, 7, 28.

Four terms might be enough to support a very simple sequence; for example, if the difference from one term to the next was the same, we’d have three identical numbers and a reasonably strong case for an arithmetic progression. But the differences here are 6, -1, 21, which reveals nothing.

I agreed with Ray, but I had the same idea Doctor Wallace had above, and with a little more justification:

I agree, there really is not enough information here. I can guess what they probably want, however; most likely they have had other examples where they **alternated two simple operations** to get successive terms, and you are expected to assume that this pattern is similar. If so, then we are first multiplying by 4 (2*4 = 8), then subtracting 1 (8-1 = 7), then multiplying by 4 again (7*4 = 28), so you would continue in the same way: 27, 108, 107, 428.

The evidence we have is that multiplication by 4 appears twice, and subtraction of 1 in between is a simple thing to do. So I felt justified in using the word “probably”. My sequence, then, is

**2, 8, 7, 28**, 27, 108, 107, 428, … .

But there isn’t a lot of evidence for the subtraction step, as we see only one instance of it; and some other alternation is quite reasonable:

But another perfectly valid pattern would be "for odd terms, add 5 each time; for even terms, add 20." That would give 12, 48, 17, 68.

That is, the sequence could arise from merging the arithmetic progressions 2, 7, 12, 17, … (common difference 5) and 8, 28, 48, 68, … (common difference 20). This is a different kind of alternating sequence:

**2, 8, 7, 28**, 12, 48, 17, 68, …

On the other hand, if we’re merging sequences, we have only two terms of each, so they could be just about anything. What if the even terms were a *geometric* progression where we multiply each term by 28/8 = 3.5 to get the next: 8, 28, 98, 343, … ? And, interestingly, the odd terms would have the same common ratio, 7/2 = 3.5, and we’d get 2, 7, 24.5, 85.75, … . So our answer would be

**2, 8, 7, 28**, 24.5, 98, 85.75, 343, … .

Not very pretty, but just as logical!

Who is to say whether any of these is “correct”? All yield the same first four terms, so a teacher would have to give credit to all of them — unless there is some information we haven’t been given, such as that the class has only learned about merging arithmetic progressions, or all the examples given had alternated addition and multiplication.
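To see the three rivals side by side, here is a short sketch that generates each one. The function names are invented for this illustration; each function simply encodes one of the three readings described above.

```python
def alternate_ops(count):
    """Multiply by 4, subtract 1, multiply by 4, subtract 1, ..."""
    terms = [2]
    while len(terms) < count:
        prev = terms[-1]
        # After an odd number of terms the next step is ×4; otherwise it is -1.
        terms.append(prev * 4 if len(terms) % 2 == 1 else prev - 1)
    return terms


def merge_arithmetic(count):
    """Interleave 2, 7, 12, ... (difference 5) with 8, 28, 48, ... (difference 20)."""
    return [2 + 5 * (i // 2) if i % 2 == 0 else 8 + 20 * (i // 2)
            for i in range(count)]


def merge_geometric(count):
    """Interleave 2, 7, 24.5, ... with 8, 28, 98, ... (both with ratio 3.5)."""
    return [2 * 3.5 ** (i // 2) if i % 2 == 0 else 8 * 3.5 ** (i // 2)
            for i in range(count)]


for rule in (alternate_ops, merge_arithmetic, merge_geometric):
    print(rule.__name__, rule(8))
```

All three agree on the first four terms and disagree on everything after, which is the whole difficulty.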

I continued:

If a problem merely says "give the next four numbers" or "find the pattern in this sequence," **there are infinitely many possible answers,** since the word "pattern" has no precise definition; it's really a matter of guessing what pattern they had in mind, which is not math but psychology or ESP.

To make this a valid problem, they should say something at least as clear as "This sequence was formed by a pattern similar to those you saw in this chapter. Make a reasonable guess as to what the pattern is, and show how it continues." Or, I suppose, they could say "Find a pattern in this sequence, explain how it works, and use that pattern to predict the next four numbers. There may be more than one correct answer." But to imply that students can determine _the_ correct answer by looking at four numbers is **a misleading lesson in what math is all about.** It's not a guessing game.

Here is another with four terms, from 2002:

The Perils of Predicting Patterns

What is the next number in the pattern 1,3,6,10 ___? If the pattern is to add 2, 3, 4, and then 2, 3, 4 again and again, it should be 12. But if the pattern is to add 2, 3, 4, 5, and so on, then it should be 15. **Which is correct?**

To my mind, it’s “obvious” what they intended, because this is a well-known sequence called the triangular numbers (1, 1+2, 1+2+3, 1+2+3+4, …). If so, then Su’s second guess is “correct”. But Su has the makings of a mathematician, and sees that there are other possibilities, one being a mere repetition of the pattern of differences. Doctor Greenie had an excellent answer:

There can't be a single "correct answer" for any question like this, since **no matter what list of numbers you give me, I can find a formula that will fit them to any following number**. Usually what is desired is the "simplest" answer, and unfortunately, different people's definition of what it means to be "simple" varies.

Obviously, adding 2, then 3, then 4, then 2, then 3, then 4, then 2, and so on, is one way to extend the sequence; and adding 2, then 3, then 4, then 5, then 6 is another. To my mind, the second is slightly simpler but only because the pattern presented so far does not give us any reason to think that we should go back to adding 2 to obtain the next number.

In other words, in line with my thoughts above, in the absence of evidence that a change is to be made (such as going back and repeating), it is more likely that the author intended just to continue the existing pattern. But that doesn’t make either answer more correct than the other.

He continued:

These sorts of patterns are used in intelligence tests, and the "correct" answer is "whatever very intelligent people think the correct answer is". That's not much help, is it? I remember a wonderful example shown to me once that illustrated how silly this sort of question is. Here it is: What comes next in this sequence? 33, 23, 14, 9, ___ The answer is "Christopher Street". The reason is that the numbers are the exits of the 6th Avenue subway in New York City.

(Looking at a current subway map, it looks like things have changed a little, but the idea is clear!)

A 1996 question provided a chance for a different approach to the issue:

Predicting the Next Number

When given a series of numbers and asked to **predict the next number**, what is the formula for doing so? Example: **2,5,12,23, ?** This question appears on psychological exams, federal employment exams and many others. Is there a mathematical way of determining the next number in this series?

As we’ve been saying, there is no truly mathematical way to do it; but Doctor Jerry provided a very mathematical way to show that it can’t really be done. He supposed that we knew a polynomial formula for the sequence, and showed that we can find another polynomial that will give the very same four terms, AND whatever fifth term we choose! This is the idea Doctor Greenie mentioned in his first sentence above.

First, if the first several terms of a sequence are given, **there is no method for determining the general term**. Suppose I'm given the numbers a, b, c, d and asked to determine the fifth and sixth terms of the sequence. I'll show that any number of different solutions is possible.

I start by determining a polynomial p(x) = Ax^3 + Bx^2 + Cx + D such that p(1) = a, p(2) = b, p(3) = c, and p(4) = d. Then consider

f(n) = p(n) + (n-1)(n-2)(n-3)(n-4)

or

g(n) = p(n) + (n-1)(n-2)(n-3)(n-4)(n-5).

Notice that both f and g determine sequences whose first four terms are a, b, c, and d. Remaining terms are wildly different. This idea could be elaborated.

His function *f* is identical to *p* for *n* = 1, 2, 3, 4, because the added product is zero in all four cases. But for *n* = 5, he is adding 24 to the value of *p*(5), giving a new polynomial with a different fifth term; and we could adjust that to make it any number we want. And his function *g* will match *p* for the first 5 terms, but differ in the sixth.

So even if we required a really “mathematical” answer in the form of a polynomial, there are infinitely many “correct” answers. You’d have to specify that the function must have the lowest possible degree, to make only one answer correct.

But, of course, that is not what problems like this are looking for:

You can, however, try to **guess what was most likely in the mind** of the person who made up the question. For the sequences

2,4,6,8,...

1,4,9,16,...

I suppose most persons would say 10,12 or 25,36.

The two examples he gives are “skip counting” (an arithmetic progression) and a list of perfect squares, both of which most of us would recognize and suppose that anyone writing a test would be more likely to think of than of other possibilities.

For the sequence you gave, 2, 5, 12, 23, I noticed that 5 - 2 = 3, 12 - 5 = 7, and 23 - 12 = 11. Since 3, 7, and 11 can be viewed as the odd numbers, leaving every other one out, one could argue that the next two terms are 23 + 15 and 38 + 19. Other, more or less natural answers are possible. However, one has no choice but to accept whatever the text makers decree is the correct answer!

Here, he observes that the differences between successive terms are 3, 7, 11, which increase by 4 each time; so the natural thing to do is to continue with 15, 19.

The problem (like all the others in this post) didn’t ask for a formula, just for the next term, so that is all that is needed. The differences imply (as we’ll see when we get to the Method of Finite Differences) that the sequence has a second-degree polynomial (quadratic) formula, which turns out to be \(2n^2 - 3n + 3\), which generates the terms

**2, 5, 12, 23**, 38, 57, … ,

just as Doctor Jerry found by adding 15 and 19.
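Doctor Jerry's continuation amounts to assuming that the second differences stay constant. A minimal sketch of that assumption follows; the helper names are hypothetical, and the constant second difference is the guess being illustrated, not a fact about the sequence.

```python
def differences(terms):
    """Differences between successive terms."""
    return [b - a for a, b in zip(terms, terms[1:])]


def extend_by_differences(terms, extra):
    """Extend `terms` by `extra` terms, ASSUMING constant second differences."""
    first = differences(terms)
    second = differences(first)
    if len(set(second)) != 1:
        raise ValueError("second differences are not constant")
    step = second[0]
    d = first[-1]
    result = list(terms)
    for _ in range(extra):
        d += step                      # next first difference
        result.append(result[-1] + d)  # next term
    return result


print(differences([2, 5, 12, 23]))            # first differences
print(extend_by_differences([2, 5, 12, 23], 2))
```

For 2, 5, 12, 23 the first differences are 3, 7, 11 and the second differences are both 4, so the sketch adds 15 and then 19.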

If we take that as \(p(n)\), then the \(f(n)\) above is $$\begin{aligned} f(n) &= p(n) + (n-1)(n-2)(n-3)(n-4) \\ &= 2n^2 - 3n + 3 + n^4-10n^3+35n^2-50n+24 \\ &= n^4-10n^3+37n^2-53n+27. \end{aligned}$$ The first six terms of this are

**2, 5, 12, 23**, 62, 177.

As intended, the first four terms agree, but the next terms are different.
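The remark that we could adjust the added product to reach any fifth term we like can be made concrete by scaling it. This is my own elaboration of Doctor Jerry's idea, with invented names (`bump`, `rule_with_fifth_term`); taking the scale factor as 1 recovers his \(f\).

```python
from fractions import Fraction


def p(n):
    """The quadratic that fits 2, 5, 12, 23."""
    return 2 * n * n - 3 * n + 3


def bump(n):
    """Zero at n = 1, 2, 3, 4, so adding any multiple of it leaves those terms alone."""
    return (n - 1) * (n - 2) * (n - 3) * (n - 4)


def rule_with_fifth_term(target):
    """Return a polynomial rule whose first four terms are 2, 5, 12, 23
    and whose fifth term is `target`."""
    scale = Fraction(target - p(5), bump(5))   # bump(5) is 24
    return lambda n: p(n) + scale * bump(n)


f = rule_with_fifth_term(62)       # scale factor 1: Doctor Jerry's f
print([int(f(n)) for n in range(1, 7)])

g = rule_with_fifth_term(1000)     # an arbitrary fifth term
print([int(g(n)) for n in range(1, 6)])
```

Any target works, including fractions or negative numbers, since the scale factor is computed exactly.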

We can start with this question from 2002:

How Many Hidden Faces?

I have to do an investigation at school on hidden faces. When you look at a cube you cannot see one face - this face is hidden. So if you have one cube then two cubes all in a straight line up to eight you will work out a pattern

1-1, 2-4, 3-7, 4-10, 5-13, 6-16 ...

Can you please find a formula to find out the number of hidden faces, eg. 3n-2? I also need proof. Can you please help me? I've spent 5 hours on it already!

What Amy means by “hidden” faces is faces of cubes that are not visible from any direction, because they are “glued” to another cube, or because they are on the table:

Amy had correct data, and a correct formula, as it turns out:

- 1 cube: 1 hidden face (bottom), 5 visible
- 2 cubes: 4 hidden faces (2 on bottom, 2 between cubes), 8 visible
- 3 cubes: 7 hidden faces (3 on bottom, 4 between cubes), 11 visible
- 4 cubes: 10 hidden faces (4 on bottom, 6 between cubes), 14 visible
- 5 cubes: 13 hidden faces (5 on bottom, 8 between cubes), 17 visible
- 6 cubes: 16 hidden faces (6 on bottom, 10 between cubes), 20 visible

What she needed was to be able to convince someone (perhaps herself) that she was right: a **proof**. I replied, first clarifying the statement of the problem:

Actually, if I look at a cube I can see only three of its faces at one time; I presume you mean that it is sitting on a table and you are allowed to look all around; only the face in contact with the table (or with another cube) is "hidden".

One way to approach this kind of problem is to **think about how things change from one step to the next**. If I have a row of cubes, say your three in a row with 7 hidden, and add another to the end, **how many new hidden faces are there?** There's the one on the end of the row you had, which is now covered and becomes hidden; and there are two on the new cube that will be hidden, one on the bottom and one that touches the row. So each time you add a cube, you add three more hidden faces. Sounds like an **arithmetic sequence** to me.

Another approach is to think about one whole row and **break the hidden faces down into groups**. There are the faces on the bottom (one per cube) and the faces between cubes (two wherever a pair of cubes meet). That easily gives you a formula.

The second approach is what I used in doing my counting above. We can see that with *n* cubes, there are *n* bottom faces, and \(2(n-1)\) interior hidden faces, giving us the formula \(H = n + 2(n-1) = 3n - 2\). This is reminiscent of what we talked about in the post Counting Faces, Edges, and Vertices, finding an organized way to count.

The first approach tells us that we start with 1 for *n* = 1, and add 3 for each of \(n-1\) cubes we add, giving the formula \(H = 1 + 3(n-1) = 3n - 2\). It’s always encouraging when we get the same result two different ways!

What if we had been asked for the number of **visible** faces? We could use either of these ways to find it directly; or, having found the number of hidden faces, we could subtract that from the total number of faces on *n* cubes: \(V = 6n - (3n - 2) = 3n + 2\). (Interesting – there are always 4 more visible faces than hidden faces!) You can check this against the numbers I listed above.
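A formula found by reasoning can also be tested by brute force. The sketch below places unit cubes at integer coordinates, with the table under the layer z = 0, and counts a face as hidden when it faces the table or another cube. The coordinate model and the names `hidden_faces` and `row` are my own assumptions for illustration.

```python
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def hidden_faces(cubes):
    """Count faces that touch the table (bottom of a cube at z == 0) or another cube."""
    cubes = set(cubes)
    hidden = 0
    for (x, y, z) in cubes:
        for dx, dy, dz in DIRECTIONS:
            touches_cube = (x + dx, y + dy, z + dz) in cubes
            touches_table = (dz == -1 and z == 0)
            if touches_cube or touches_table:
                hidden += 1
    return hidden


def row(n):
    """n cubes in a straight line on the table."""
    return [(i, 0, 0) for i in range(n)]


for n in range(1, 7):
    h = hidden_faces(row(n))
    print(n, h, 6 * n - h)   # cubes, hidden faces, visible faces
```

A check like this confirms particular cases only; it is the counting argument, not the program, that proves the formula for every *n*.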

For another example of the same sort of problem, see

Visible/Hidden Sides on Stacked Cubes

Here is a more complicated question of the same type, from 2001:

Hidden Faces of Cubes

My daughter has to produce an equation to show the number of hidden faces when three rows of cubes are placed together on a flat surface. The data look like this

| Number of Cubes (n) | Hidden Faces (h) |
|---|---|
| 3 | 7 |
| 6 | 20 |
| 9 | 33 |
| 12 | 46 |
| 15 | 59 |
| 18 | 72 |
| 21 | 85 |
| 24 | 98 |
| 27 | 111 |
| 30 | 124 |

I've worked out the difference between n squared and h, and the differences between these results. The difference between the differences I calculate as 18 constantly, so the next set of differences will obviously be 0. I can't see how to get a formula from this, though, and am not sure if I'm heading in the right direction. I haven't done algebra for 20 years and used to hate it, but I must say I'm quite enjoying this problem!

Yes, math is more fun when you do it because you find it interesting!

I think a better description would be, “when *rows of three cubes* are placed together”, though I can also see it Nigel’s way. It looks something like this:

Here again, a “hidden” face is one that is either on the table or against another cube. For a single row of three cubes (*n* = 3), there are 3 faces on the table (the bottom) and two pairs glued together, for a total of 7 hidden (and 11 visible).

Nigel has done some of the things discussed in the previous posts, Pattern and Sequence Puzzles and Pattern and Sequence Puzzles Revisited. He compared each term in his list (which must have taken a lot of work to compile!) with the square, and he also used another technique (the method of finite differences, which will be the subject of some posts here eventually) to determine that the sequence has a quadratic formula. Unfortunately, he’s wrong about that, apparently because he started with the differences between two related sequences, rather than between successive terms of *one* sequence.

I began by commenting on what he had done, and what he might do to correct it:

Before I get started, I should ask whether you are doing this, or your daughter. I hope you are giving her opportunities to learn from this as you work together; learning together is great! You haven't told me how old she is; I'll assume she is learning algebra, so what you are doing is relevant to her.

The approach you are taking, analyzing the data after the fact to find a formula that fits it, uses a method called **finite differences**, which is explained in our archives; here is one such page:

Method of Finite Differences
http://mathforum.org/dr.math/problems/gillett.10.12.00.html

I recommended not starting with the numbers:

But I don't like this method for a problem like yours. Why? Because when you get a formula, **all you will know is that it fits the particular data you used**. It doesn't tell you whether you made a **mistake in your numbers**, or whether the **pattern will continue** when the numbers get larger. And in math, I like to KNOW, not just ASSUME.

If she was told to use this method (and was told how), then it is fine to use it; and you will certainly enjoy learning the method anyway. But here's how I would prefer to do the problem: **rather than looking at the data you gathered, I would look at HOW you gathered it, and find the pattern BEHIND the numbers**. So what causes hidden faces, and how do they grow?

Since we always have a multiple of 3 cubes, I started by simplifying the variable, and then using the first method of the answer above:

I'm going to start out talking not about the number of cubes, but the number of columns of three, so that all natural numbers will work, not just multiples of three. With one column

c = 1

X
X
X

there are three hidden faces on the bottom (one per cube), and four between cubes (two per pair of adjacent cubes). This gives seven in all.

When I add a second column

c = 2

X X
X X
X X

I have added another seven under and between the new cubes; I have also added six more, between the old and new columns (two per pair of adjacent cubes). So my new total is 7+7+6 = 20.

This could be something like what Nigel did to get his table.

Now each time I add another column, the same thing will happen, and I will add 7+6 more hidden faces. (In other similar problems, it will be a little more complicated.) So for c = 1, I have 7, and for each increase of 1, I add 13. This is a linear equation:

H = 7 + 13(c-1)

since with c columns I have added c-1 columns to the first. Since my c is 1/3 of your n, your formula (for n a multiple of 3) will be

H = 13c - 6 = 13n/3 - 6

Go ahead and try the other method, and verify that it works. Then try some more complicated patterns, such as making layers of cubes in a three-dimensional block.

The method I used here to write the equation is the point-slope form for an equation of a line, \(y - y_1 = m(x - x_1)\). When *x* is 1, *y* is 7, so the line passes through \((x_1, y_1) = (1, 7)\); and the slope (increase in *y* per unit increase in *x*) is 13. You can also think of this as starting with 7, and then adding 13 for each of *c* – 1 steps to get from 1 to *c*.

We can also check this formula against Nigel’s data. When *n* = 30, the last row of his table, we get \(H = \frac{13(30)}{3} - 6 = 124\), just what he found.
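To see the "pattern behind the numbers" confirmed independently, here is a minimal Python sketch that counts hidden faces directly rather than from a formula. It assumes, as in the description above, that the cubes sit in a single layer on a table, three per column, and that a face is hidden when it touches the table or another cube; the function name `hidden_faces` is just an illustrative choice.

```python
def hidden_faces(columns: int) -> int:
    """Count hidden faces for `columns` columns of three cubes lying flat on a table."""
    # Each cube is identified by (column, position within the column).
    cubes = {(col, row) for col in range(columns) for row in range(3)}
    hidden = len(cubes)  # one bottom face per cube rests on the table
    for (col, row) in cubes:
        # A neighbour in the next column, or next in the same column,
        # means two touching faces (one on each cube) are hidden.
        for neighbour in ((col + 1, row), (col, row + 1)):
            if neighbour in cubes:
                hidden += 2
    return hidden


for c in (1, 2, 3, 10):
    # Compare the direct count with the formula H = 13c - 6.
    print(c, hidden_faces(c), 13 * c - 6)
```

The direct count and the formula can be compared for as many columns as you like, which is a reasonable check that the pattern really does continue.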

For a similar count (of cubes put together to make a large cube), see

Hidden Faces in a Set of Cubes

Let’s look at one more from 2002, this time not looking for *hidden faces*, but for the *number of cubes added* at each step. This one needs a little background: Kathy had initially just given us a sequence and her attempt at a formula, asking why the formula didn’t work for the first term; what follows is her restatement of the question after we asked for the context of the question:

Finding the nth Term

This is the investigation. This was the starting point: A colony of bees is making a honeycomb consisting of hexagonal cells. They start with the center cell. The next day they add the six cells round the center cell, and on the third day they add the remaining ring of cells in the picture. Every day they add a new ring of cells right round the honeycomb. The investigation posed two questions: 1. How many cells will there be at the end of the sixth day? 2. On which day will the number of cells pass the 1000 mark? I managed to find the formula for this. I extended this by using 3D shapes instead of 2D shapes. I used cubes, with day 1 starting with 1 cube; on day 2 I surrounded the first cube with more cubes, both the sides and edges, completely surrounding it. I.e., making a 3x3x3 cube on day 2. I continued this for each day: 3x3x3, 5x5x5, 7x7x7, etc. I wanted to find the relation between the number of days and the number of NEW ADDITIONAL cubes. These are the results I got:

Number of days(n)   New additional days(y)
1                   0
2                   26
3                   98
4                   218
5                   386

The formula I got was: Y = (2n-1)^3-(2n-3)^3

This works for all except the first day! I would be grateful for any help you give me.

Kathy is my kind of student — taking an interesting question and extending it in two directions (no pun intended): making it 3-dimensional, and asking a different question. Though she didn’t ask about it, let’s take a quick look at the honeycomb question first. We have something like this, where each color represents a new day’s cells:

Our sequence representing the total number of cells starts off 1, 7, 19, … , which turns out to follow the formula \(a_n = 3n^2 - 3n + 1\). (See Centered Hexagonal Number.) If the problem had been to count the number in each layer, we would have the sequence 1, 6, 12, …, which adds 6 each time except at the first step; it would be the arithmetic sequence 0, 6, 12, … if the first term were not 1. So this problem would have raised the same issue Kathy raised in her own problem. It was the choice to count the new cubes (or hexagons) that caused her issue.
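For anyone who wants to check the honeycomb numbers, here is a small Python sketch that builds the totals ring by ring and compares them with the formula. It assumes the ring sizes 1, 6, 12, … described above (six more cells in each new ring); the name `honeycomb_totals` is only for illustration.

```python
def honeycomb_totals(days: int) -> list[int]:
    """Total number of cells at the end of each day, built one ring at a time."""
    totals, total = [], 0
    for day in range(1, days + 1):
        # Day 1 is the single center cell; after that each ring has 6 more cells
        # than the one before (6, 12, 18, ...).
        ring = 1 if day == 1 else 6 * (day - 1)
        total += ring
        totals.append(total)
    return totals


totals = honeycomb_totals(25)
# First day on which the total exceeds 1000 (days are numbered from 1).
first_day_over_1000 = next(day for day, t in enumerate(totals, start=1) if t > 1000)
print("end of day 6:", totals[5])
print("first day over 1000:", first_day_over_1000)
```

Building the totals from the rings, rather than from the formula, gives an independent check on \(3n^2 - 3n + 1\).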

I responded to the cube problem:

I would have said that on day 1 you have added 1 block, rather than zero, since you probably had none before beginning. Further, since on each day the total is a complete cube, as on the first and second days, the total on day N is (2N-1)^3, so the number added on day N would be (2N-1)^3 - (2(N-1)-1)^3 which is just what you got. (I see now that you did not get this formula from the numbers, but directly from the problem, which is good!) This can be simplified to [8N^3 - 12N^2 + 6N - 1] - [8N^3 - 36N^2 + 54N - 27] = 24N^2 - 48N + 26

Does this formula agree with the data (after correcting the first line from 0 to 1)?

I would check this by making a table including the total each day:

day   total       new   formula
0     0^3 =   0   -     -
1     1^3 =   1   1     2
2     3^3 =  27   26    26
3     5^3 = 125   98    98
4     7^3 = 343   218   218

Hmmm … no, the formula is still off.

Now, we can see why day 1 would be anomalous: on the other days, we started with a cube, and added two to each dimension, going to the next odd cube. But on the first day, we didn't start with a cube -1 unit on a side; we started with nothing, and added a single cube. That is, the number on the first day did not arise by following the same rule of surrounding an existing cube, so it is not surprising that your formula doesn't apply. Notice that my formula for the _total_ applies to day 1, but not to my fictional day 0; so that your formula for the number of blocks added applies only _after_ day 1, since it depends on the previous day's total.

Is this a problem?

It is not unusual in mathematics to require a starting condition apart from the formula that applies elsewhere. You just have to say that the formula holds for N > 1. In some cases it makes some sort of sense to continue the pattern back in time before you started, but here that doesn't work, so we don't need to be troubled by the formula not being extensible to N < 2.

Real life formulas commonly have a “piecewise” aspect to them; and that, I think, is the main answer to Kathy’s question.
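Here is one way that piecewise idea might look in Python: the number of cubes added is computed directly from the totals, and also from the formula with day 1 handled as a separate starting condition. This is only a sketch; the function names are mine, and treating "day 0" as an empty table is the assumption discussed above.

```python
def total_cubes(day: int) -> int:
    """Total cubes after `day` days: a (2*day - 1)-cube, or nothing before day 1."""
    return (2 * day - 1) ** 3 if day >= 1 else 0


def added_directly(day: int) -> int:
    """New cubes on a given day, found by subtracting consecutive totals."""
    return total_cubes(day) - total_cubes(day - 1)


def added_by_formula(day: int) -> int:
    """Piecewise rule: a starting condition for day 1, the polynomial for day > 1."""
    if day == 1:
        return 1  # starting condition: we began with nothing and added one cube
    return 24 * day**2 - 48 * day + 26


for day in range(1, 6):
    print(day, total_cubes(day), added_directly(day), added_by_formula(day))
```

Without the special case, the polynomial gives 2 for day 1, which is the mismatch seen in the table.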

One additional thought:

Incidentally, I don't think your description of the process by which you are building larger blocks is quite right. You are not just adding to the faces and edges, but also to the corners. If you added only cubes that touch on a face, you would have a considerably more interesting problem, since the shapes you are building would not be mere cubes. But the question you raised turns out to be very interesting, and deserves the thought we've given it! It's all too easy to ignore starting conditions and imagine that all formulas work for all cases, and checking against reality is a very good practice to avoid this.

Adding to faces only would look like this (for *n*=2 and *n*=3):

Have fun working out a formula …
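If you would like some data to test a conjectured formula against (without my giving the formula away), here is a hedged Python sketch that simulates the face-only process. It assumes that each day a new cube is placed on every exposed face of the existing shape, so that the new cubes are exactly those sharing a face with a cube already present; the function name is hypothetical.

```python
def face_only_totals(days: int) -> list[int]:
    """Total cubes each day when new cubes are added only face-to-face."""
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    cubes = {(0, 0, 0)}  # day 1: a single cube
    totals = [len(cubes)]
    for _ in range(days - 1):
        # Every empty position sharing a face with an existing cube gets a new cube.
        new = {(x + dx, y + dy, z + dz)
               for (x, y, z) in cubes
               for (dx, dy, dz) in steps} - cubes
        cubes |= new
        totals.append(len(cubes))
    return totals


totals = face_only_totals(6)
print("totals:", totals)
print("added each day:", [b - a for a, b in zip([0] + totals, totals)])
```

As in Kathy's problem, notice what happens to the "added each day" list on day 1.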

Having just looked at L’Hôpital’s Rule, we can conclude with a look at a recent question about it, to illustrate the reality of struggling to apply it (and the process we go through to help a student find an error).

Subhajit wrote this a couple months ago:

I am unable to solve the below problem. I will be highly obliged if you kindly solve this problem.

\(\displaystyle\lim_{x\rightarrow 0}\frac{\sinh x - x}{\sin x - x \cos x}\)

This is a classic example calling for L’Hôpital’s Rule, with the form 0/0.

If you are not familiar with “sinh”, the hyperbolic sine, see:

Hyperbolic Functions

They are similar to trigonometric functions, but based on a hyperbola. The value of \(\sinh 0\) is 0. The derivative of \(\sinh x\) is \(\cosh x\), and the derivative of \(\cosh x\) is \(\sinh x\).

After trying it out to see what difficulties he might face, I replied:

Have you tried L’Hopital’s Rule? You will have to apply it three times, until you no longer get the form 0/0.

If you need more help, please do as we ask on the submission page, and show your work (including why you are stuck), so we can know what kind of help you need.

I was supposing that the need to apply the rule repeatedly might be his issue, and hoping to see from his work if something else was going wrong.

He wrote back, only *describing* what he did, and giving his answer:

I apply L’hospital’s rule five times and the result comes out. Ans is -1/2.

Thanks for your support.

Both the number of repetitions, and the answer, were wrong; so I needed more in order to help. I said,

I applied the rule 3 times, and got +1/2, which agrees with the graph of the function:

I wonder if you might have failed to notice after three times that the expression no longer satisfied the requirements for L’Hopital, and should just be substituted. But if I do that, I get -1/4, not your -1/2. Perhaps you just made an arithmetic error somewhere.

This time, he responded by showing his work, using an image of handwritten work, as students often do (which was very difficult under the old system):

Sir, after applying three time L’Hospital’s rule I get the below form. If I put the limit it gives 1/0 which is undefined.

This is very well written, so I could easily see everything he did. He is close; he has apparently corrected one error (continuing too far – a very easy mistake to make), but has left another uncorrected. So we’ve made good progress.

I replied:

There is a sign error on your next to last line. Of course, since 1/0 is not a form suitable to L’Hopital’s rule, if you were correct, the limit would be undefined as you say.

That was the help he needed; he answered,

Thanks sir, I got it.

Mission accomplished; but let’s carry it to completion, for your sake! Here is the entire work, for the three steps:

$$\displaystyle\lim_{x\rightarrow 0}\frac{\sinh x - x}{\sin x - x \cos x} =\\ \lim_{x\rightarrow 0}\frac{\cosh x - 1}{\cos x - \cos x + x \sin x} = \lim_{x\rightarrow 0}\frac{\cosh x - 1}{x \sin x} \text{ [= 0/0]} =\\ \lim_{x\rightarrow 0}\frac{\sinh x}{\sin x + x \cos x} \text{ [= 0/0]} =\\ \lim_{x\rightarrow 0}\frac{\cosh x}{\cos x + \cos x - x \sin x} = \lim_{x\rightarrow 0}\frac{\cosh x}{2\cos x - x \sin x} = \frac{1}{2}$$

And we’re finished.
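A quick numerical check is a good habit when a sign error is suspected. The following Python sketch simply evaluates the original quotient at a few small values of *x*; it is not a proof, and very tiny values of *x* are avoided because both numerator and denominator are differences of nearly equal numbers, which loses precision.

```python
import math


def quotient(x: float) -> float:
    """The original expression (sinh x - x) / (sin x - x cos x)."""
    return (math.sinh(x) - x) / (math.sin(x) - x * math.cos(x))


for x in (0.5, 0.1, 0.01, -0.01):
    # The values should settle near +1/2, not -1/2 or -1/4.
    print(x, quotient(x))
```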

Now, if he had not noticed that his incorrect third step was 1/0, and continued, he would have done what he originally said, taking two more steps and reaching -1/2. (Try it.) So it is very likely that his error was two-fold: the sign error, and not stopping when it was no longer 0/0. Those are both easy mistakes to make, once you have momentum! So this is a good lesson in what to watch out for, and also good practice in differentiating carefully.

The first example last time showed one approach to limits of the form \(\infty^0\), by rewriting it as, in effect, \(e^{\infty \cdot 0} = e^{\frac{\infty}{\infty}}\). Here, we’ll do something equivalent, using logarithms. The question comes from 1996:

L'Hopital's Rule

Hi! I have a problem with a limit that can be solved with L'Hopital's Rule because it is part of the first partial, but for which I don't see how to apply L'Hopital's Rule:

Limit x^(1/x)
x->Inf

I am thankful for any help.

Doctor Paul answered (I’ve made a couple edits to correct errors and change notation):

Let's assume it is this instead:

Limit 1^(1/x)
x->Inf

Note that as x goes to infinity, 1/x goes to zero, right? so 1 to the zero power is just 1. I think it's pretty safe to assume that x^(1/x) also goes to 1 as x goes to infinity. Let's prove it:

He is saying that \(1^0\) is 1, so maybe \(\infty^0\) is also 1. But assumptions or guesses are dangerous; this is actually an indeterminate form, and we have to take it carefully. The proof is not optional!

But the function does not have the form appropriate for L’Hôpital, so we have to transform it. We can take the log of the function first:

You have:

Limit x^(1/x)
x->Inf

Let's say that's equal to some number 'n'. We want to solve for n...

Limit x^(1/x) = n
x->Inf

Take the natural log of both sides:

Limit (1/x)*ln(x) = ln(n)
x->Inf

We can do this because the log is a continuous function, so that logs of limits are equal to limits of logs, and then the log inside the limit can be simplified.

As written above, the function has the form \(0 \cdot \infty\); but it can be easily written as the quotient \(\displaystyle \frac{\ln(x)}{x}\), which has the form \(\infty / \infty\), and the derivatives are easy:

Now we have to evaluate the left side. It's a limit that evaluates to infinity/infinity. Now we can use L'Hopital's Rule on the left-hand side. Take the derivative and re-evaluate the limit:

Limit (1/x) / 1 = ln(n)
x->Inf

Now we can see that the left-hand side evaluates to zero.

0 = ln(n)

Exponentiate both sides:

e^0 = n

so n = 1

There's proof that the limit evaluates to one.
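Since guesses about indeterminate forms are dangerous, it is reassuring to look at numbers as well. This small Python sketch (the function names are mine) tabulates both \(x^{1/x}\) and its logarithm \(\ln(x)/x\) for increasing *x*; it illustrates the result but does not replace the proof.

```python
import math


def power_form(x: float) -> float:
    """x^(1/x), the original expression."""
    return x ** (1 / x)


def log_form(x: float) -> float:
    """ln(x)/x, the logarithm of the original expression."""
    return math.log(x) / x


for x in (10.0, 1e3, 1e6, 1e9):
    # log_form should shrink toward 0 while power_form approaches 1.
    print(x, log_form(x), power_form(x))
```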

For another explanation of the same method for the same limit, see:

Limit Evaluation

Another indeterminate form involving an exponent is \(1^\infty\), which appears in this problem, from 1997. The problem is from probability, but the question asked is about a limit, and we’ll focus on that:

Intriguing Limit

I am curious to see what chance you have of winning a contest if you have a 1 in 10 chance of winning each time you play and you play 10 times. I computed this as follows: chance = 1-.9^10 = .65... What would it be for a 1 in 20 chance if you play 20 times? How about a 1 in 100 chance when you play 100 times? And so on. The function then looks like y = 1-(1-1/x)^x, where x is the chance you have of winning and the number of times you play. So if x = 100 you have a 1 in 100 chance of winning each time you play and you play 100 times. y is then the chance that you will win at least once. What is the limit of this as x->infinity? When I graph it, it looks asymptotic at about .632..., but the function looks to me like it has a limit of 0. (1/x goes to 0 and 1 to any power is one.) I asked a professor about this, and he told me that (1-1/x)^x has a special limit, namely 1/e. This then gives the expected result. My question is: Why is lim (1-1/x)^x = 1/e?

Looking at the probability part for a moment, one might naively think that if you have a 1/10 probability of winning each time, then when you play 10 times, you will win 1 time out of those ten, so the probability of at least one win would be 1. Of course, that isn’t true, because wins are random: You may get no wins, or more than one, out of the ten tries. The actual probability is, as Oliver said, the probability of *not always losing*: \(1-\left(1-\frac{1}{10}\right)^{10} = 0.65132\ldots\).

Moving to the limit, as we replace 10 with larger and larger numbers, Oliver thinks that \(1^\infty = 1\); but again, this is indeterminate, because the base is only *approaching* 1, not *always* 1.

Doctor Jerry took the question:

Hi Oliver, There are several different reasons "why" this limit has this particular value. I'll give one answer below, based on a calculation using l'Hopital's Rule, from calculus. If you're asking why is this limit the number 2.71828..., the base of the natural logarithms, the answer is that this limit comes up in working with exponentials and logarithms and was long ago given the name e, after Euler. Usually, it comes up in the form (1+1/n)^n, which approaches e as n->oo.

That last line can be taken as the definition of the number *e*. Notice how similar it is to our limit.

Again, we take the log, make it a quotient, and apply L’Hôpital:

Let y = (1-1/x)^x. Take natural logs of both sides. ln(y) = x*ln(1-1/x)

As x->oo, this has the form of oo*0 and so must be rearranged. We have: ln(y) = ln(1-1/x)/(1/x) Now the form is 0/0 and so l'Hopital's Rule is applicable. After differentiating numerator and denominator separately (according to l'Hopital's Rule) and simplifying (I'll use lim to mean limit as x->oo), we have: lim ln(y) = lim (-1)/(1-1/x) = -1. So ln(y) -> -1. This means that y->e^(-1).
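Oliver's graph can be imitated with a few lines of Python. This sketch evaluates his function \(y = 1-(1-1/x)^x\) for growing *x* and compares it with \(1 - 1/e\); the function name is only illustrative.

```python
import math


def win_at_least_once(x: int) -> float:
    """Chance of at least one win in x plays, each with a 1-in-x chance."""
    return 1 - (1 - 1 / x) ** x


target = 1 - math.exp(-1)  # the claimed limit, 1 - 1/e
for x in (10, 20, 100, 10_000, 1_000_000):
    print(x, win_at_least_once(x), win_at_least_once(x) - target)
```

The differences in the last column shrink toward zero, in agreement with the limit found above.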

Moving away from exponents, another kind of indeterminate form involves subtraction, \(\infty - \infty\). Here is a question from 1997:

Limits - Indeterminate Forms

I cannot do the problem

lim ((1/x) - (cot x))
x->0

I realize that this needs to be converted into the form 0/0 and then I must use L'Hopital's Rule; however I have tried and I cannot do this. I am having a similar problem with the question

lim (1/x)(ln (7x+8)/(4x+8))
x->0

Any help you can give me would be greatly appreciated.

Doctor Anthony took this, rewriting the first problem as a quotient by combining fractions:

Write this as

1/x - cos(x)/sin(x) = [sin(x) - x.cos(x)] / [x.sin(x)]

From here, he uses series expansion, as he did in his proof of L’Hôpital’s Rule that we saw last time. For continuity, I will solve it using the rule. (Feel free to read the original, and see which method you like more.) Having verified that the numerator and denominator both approach 0, we take the derivatives:

$$\displaystyle \lim_{x\rightarrow 0}\frac{\sin x - x \cos x}{x \sin x} = \lim_{x\rightarrow 0}\frac{\cos x - \cos x + x \sin x}{\sin x + x \cos x} = \lim_{x\rightarrow 0}\frac{x \sin x}{\sin x + x \cos x}$$

Hmmm … this still has the form \(0/0\), so what do we do? We can repeat the process:

$$\displaystyle = \lim_{x\rightarrow 0}\frac{\sin x + x \cos x}{\cos x + \cos x - x \sin x} = \lim_{x\rightarrow 0}\frac{\sin x + x \cos x}{2\cos x - x \sin x} = \frac{0}{2} = 0$$

The second problem has the form \(\infty \cdot 0\), which easily turns into \(0/0\):

$$\displaystyle \lim_{x\rightarrow 0}\frac{\ln\frac{7x+8}{4x+8}}{x}$$

This is not easy to differentiate, but when we do it, we get $$\displaystyle \lim_{x\rightarrow 0}\frac{\frac{4x+8}{7x+8}\cdot\frac{7(4x+8)-4(7x+8)}{(4x+8)^2}}{1} = \lim_{x\rightarrow 0}\frac{24}{(4x+8)(7x+8)} = \frac{3}{8},$$ which is what Doctor Anthony got by series.
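Both answers can be sanity-checked numerically. Here is a minimal Python sketch (with function names of my own choosing) that evaluates the two original expressions near 0; the sample points are small but not so small that rounding error dominates.

```python
import math


def first(x: float) -> float:
    """1/x - cot x, the infinity-minus-infinity example."""
    return 1 / x - math.cos(x) / math.sin(x)


def second(x: float) -> float:
    """(1/x) * ln((7x+8)/(4x+8)), the infinity-times-zero example."""
    return math.log((7 * x + 8) / (4 * x + 8)) / x


for x in (0.1, 0.01, 0.001):
    # first(x) should approach 0 and second(x) should approach 3/8 = 0.375.
    print(x, first(x), second(x))
```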

Now let’s try a more difficult example of \(1^\infty\), from 1998, where we’ll see an excellent explanation of the process of using the log, as well as repeated application of L’Hôpital’s Rule:

Finding Limits Using Natural Logs and L'Hopital's Rule

Use L'Hopital's Rule to find the following limits: lim (x goes to 0) [(cos(2x))^(3/(x^2))] I know that if I use the chain rule, the equation gets very messy. I think that I need to use the e function or bring a ln (the natural logarithm function) into the picture, but I am not quite sure how. Please help me.

Doctor Sam started with a thorough justification of taking the log:

You are quite correct. L'Hopital's Rule only applies to functions that are quotients that approach the indeterminate form 0/0 or infinity/infinity. This function approaches the indeterminate form 1^infinity.

Here is the idea. Either lim f(x)^g(x) approaches a limit or it does not. If it does, then lim f(x)^g(x) = L. In this case, if we take natural logarithms of both sides of the equation, we get:

ln [lim f(x)^g(x)] = ln (L)

Now, the natural logarithm function is continuous. One of the properties of continuous functions is that you can "bring them inside a limit." For example:

[lim sqrt(x)]^2 as x -> 3 = lim [sqrt(x)]^2 = lim x = 3.

Since "squaring" is continuous, we can bring it inside the limit sign. So in general, when you have a limit, like lim f(x)^g(x) = L, we can "take logs" to get:

ln [lim f(x)^g(x)] = ln (L)
lim ln [f(x)^g(x)] = ln (L)
lim g(x) ln [f(x)] = ln (L)

using a property of logarithms to simplify the expression, namely that ln(a^b) = b ln(a). Notice that the result of this gives ln(L) -- the logarithm of the original limit.

Now we can turn our limit into a quotient:

In your problem, if we take logarithms, we get:

lim [(cos (2x))^(3/(x^2))] as x -> 0
ln [lim [(cos (2x))^(3/(x^2))]] as x -> 0
lim [ln [(cos (2x))^(3/(x^2))]] as x -> 0
lim (3/x^2) ln (cos (2x)) as x -> 0

Finally, we have a quotient:

lim [3 ln (cos (2x))] / x^2 as x -> 0

Note that cos(2x) -> 1 and ln(1) = 0, so the numerator approaches zero. The denominator also approaches zero. These two conditions mean that L'Hopital's Rule applies.

Now, take derivatives:

L'Hopital's Rule states that this limit, if it exists, is the same as the limit of the ratio of the derivatives of the numerator and denominator. So:

lim [3*ln (cos (2x))] / x^2 = lim [-6sin(2x)/cos(2x)] / (2x)

I used the Chain Rule to find the derivative of 3*ln(cos(2x)).

Like the last example, we need to repeat the process: Rearrange to make the work simpler, check that it still has an appropriate form, and differentiate:

Since it is awkward to keep a fraction in the numerator of another fraction, I am going to simplify this to:

lim [-6sin(2x)] / [2xcos(2x)]

As x->0, the numerator approaches 0 and the denominator still approaches 0. That means that we can apply L'Hopital's Rule a second time:

lim [-6sin(2x)] / [2xcos(2x)] = lim [-12cos(2x)] / [-4xsin(2x) + 2cos(2x)]

I needed the product rule to find the derivative of 2xcos(2x). Now what happens as x->0? The numerator approaches -12 and the denominator approaches 2. The limit is, therefore, -12/2 = -6.

One more thing left:

But we are not quite done. Remember that we took the logarithm of your original function. We said that if we could find its limit that this would be ln(L) -- the logarithm of the original limit. So now we have to solve the equation: ln(L) = -6 Use both sides of the equation as exponents of the exponential function e: e^[ln L] = e^-6 One of the basic properties of exponential functions and logarithm functions is that they undo each other, that is, they are inverses. So e^ln(L) = L. Therefore, L = e^-6 or L = 1/e^6.
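The two stages of Doctor Sam's answer, finding the limit of the logarithm and then exponentiating, can both be watched numerically. This Python sketch is only an illustration under the obvious assumptions (moderately small *x*, ordinary double-precision arithmetic); the function names are hypothetical.

```python
import math


def log_of_power(x: float) -> float:
    """3 ln(cos 2x) / x^2, the logarithm of the original function."""
    return 3 * math.log(math.cos(2 * x)) / x**2


def power(x: float) -> float:
    """(cos 2x)^(3/x^2), the original function."""
    return math.cos(2 * x) ** (3 / x**2)


for x in (0.1, 0.01, 0.001):
    # log_of_power should approach -6, and power should approach e^-6.
    print(x, log_of_power(x), power(x), math.exp(-6))
```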

Here is another example where repetition is needed:

Evaluate the Limit

In Calculus today we were trying to evaluate the limit of

(1-cos x)
---------
   x^2

as x approaches zero. I hope that looks right on your screen. The book said the answer was 1/2 but I graphed it on my TI-82 and after zoom-boxing quite a few times (until the x values were expressed as N * 10^A with A being -3 or -4, I found that it seemed to oscillate infinitely and there was no limit. I also tried to calculate the value of y when x was 1*10^-7 and other very small positive and negative numbers. Anything that small came out as zero. I wasn't sure if the book was right and I'm not seeing something, or if it doesn't exist, or if it is zero. I assume the zero result when very extremely small numbers are used is because the calculator rounds and makes it zero.

Doctor Pete, who answered, speculated that the reported oscillation in the graph was due to rounding error in the calculator, and used that as a reason to discuss the danger of relying on calculator answers for limits; but upon graphing this, I realized what Justin must have done. Here are the graphs he’d get by typing (1 - cos(x))/x^2, the correct function (in blue), and 1 - cos(x/x^2), which he would get on many calculators if he failed to use parentheses (in red):

The red graph fits Justin’s description; the blue graph has the correct limit, 1/2.

But here is the answer to the limit question:

Well, here's how you would evaluate the limit using mathematics. I will write it as Limit[(1-Cos[x])/x^2, x->0]. Now, notice that (1-Cos[x]) -> 0 as x -> 0, and similarly, x^2 -> 0 as x -> 0. So we have an indeterminate form of type 0/0. We apply L'Hopital's rule, which states that if f[x] -> 0, and g[x] -> 0 as x -> a for some differentiable functions f[x], g[x], then Limit[f[x]/g[x], x -> a] = Limit[f'[x]/g'[x], x->a]; hence with f[x] = 1-Cos[x], g[x] = x^2, we find Limit[(1-Cos[x])/x^2, x->0] = Limit[Sin[x]/(2x), x->0], which again, is an indeterminate form of type 0/0. So we apply L'Hopital's rule again, which gives Limit[Sin[x]/(2x), x->0] = Limit[Cos[x]/2, x->0] = 1/2. Therefore the limit is 1/2.

Again, applying the rule twice yielded the answer.
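Both of Justin's observations (the zeros for tiny *x*, and the oscillation) can be imitated on a computer. The Python sketch below is an illustration, not a reconstruction of what his TI-82 did: `naive` is the function as written, `stable` uses the identity \(1 - \cos x = 2\sin^2(x/2)\) to avoid subtracting nearly equal numbers, and `misparsed` is the expression he may have entered without parentheses. All three names are mine.

```python
import math


def naive(x: float) -> float:
    """(1 - cos x)/x^2 as written; loses precision when cos x rounds to 1."""
    return (1 - math.cos(x)) / x**2


def stable(x: float) -> float:
    """Same function rewritten with 1 - cos x = 2 sin^2(x/2)."""
    return 2 * math.sin(x / 2) ** 2 / x**2


def misparsed(x: float) -> float:
    """1 - cos(x/x^2), i.e. 1 - cos(1/x), which oscillates near 0."""
    return 1 - math.cos(x / x**2)


for x in (0.1, 1e-3, 1e-5, 1e-9):
    # For the smallest x, naive() is typically unreliable in floating point,
    # while stable() stays near 1/2 and misparsed() jumps around in [0, 2].
    print(x, naive(x), stable(x), misparsed(x))
```

This supports both Doctor Pete's warning about rounding and the missing-parentheses explanation of the oscillation.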

Here is a particularly tricky example, from 2003:

Escaping L'Hopital's Loop

Given

F(x) = e^(-1/x^2) when x does not = 0
     = 0 when x = 0

use L'Hopital's rule to show that F'(0) = 0. The problem is that when I find the derivative of the function I get

F'(x) = -[e^(-1/x^2)] / x^2

which gives 0/0 when you put in x=0. So I use L'Hopital's rule here.

But when I do that I end up with F'(x)= [-2 e^(-1/x^2)] / x^3 which is still 0/0. I am not sure how to proceed, because each time I use the rule I keep getting 0/0. The exponent in the denominator gets bigger, but that doesn't change anything. Any help would be greatly appreciated!

Here again, the limit is part of a bigger problem, which we have to understand in order to see what is being asked. To find the derivative at 0, Derek presumably wrote the limit of the difference quotient $$\displaystyle F'(0) = \lim_{x\rightarrow 0} \frac{F(0+x) - F(0)}{x} = \lim_{x\rightarrow 0} \frac{e^{-\frac{1}{x^2}}}{x},$$ where we are using *x* where one might normally use *h* or Δ*x*. I think he copied incorrectly, in addition to calling the quotient \(F'(x)\), which it is not.

My version of the limit, which we can write as $$\displaystyle \lim_{x\rightarrow 0} \frac{e^{-x^{-2}}}{x},$$ has the form \(0/0\) as stated; applying L’Hôpital’s Rule directly results in $$\displaystyle \lim_{x\rightarrow 0} \frac{2x^{-3}e^{-x^{-2}}}{1} = \lim_{x\rightarrow 0} \frac{2e^{-x^{-2}}}{x^3},$$ which (apart from the sign) is just what he said he got, and is still \(0/0\) but no simpler. In fact, it is more complicated. What can we do?

Doctor Luis answered, just focusing on the requested limit. I will slightly modify what he wrote so it applies to the limit as I wrote it above:

The good thing about L'Hopital is that it doesn't have to be used on the indeterminate form 0/0. It can also be used on +oo/+oo. We can make use of that by rewriting the fraction a/b as (1/b)/(1/a). Inspired by this, we write e^(-1/x^2) / x = (1/x) / e^(1/x^2) where I've used the fact that 1/e^(-1/x^2) = e^(+1/x^2). Now you can happily apply L'Hopital's rule to find your limit.

Applying the Rule to \(\displaystyle \lim_{x\rightarrow 0} \frac{x^{-1}}{e^{x^{-2}}}\), we get \(\displaystyle \lim_{x\rightarrow 0} \frac{-x^{-2}}{-2x^{-3}e^{x^{-2}}} = \lim_{x\rightarrow 0} \frac{x}{2e^{x^{-2}}},\) which is 0 as required.
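To get a feel for how quickly this difference quotient collapses to 0, here is a short Python sketch. It assumes the piecewise definition of \(F\) given in the question; the function names are illustrative.

```python
import math


def F(x: float) -> float:
    """F(x) = e^(-1/x^2) for x != 0, and 0 at x = 0."""
    return math.exp(-1 / x**2) if x != 0 else 0.0


def difference_quotient(x: float) -> float:
    """(F(0 + x) - F(0)) / x, whose limit as x -> 0 is F'(0)."""
    return (F(0 + x) - F(0)) / x


for x in (1.0, 0.5, 0.2, 0.1, -0.1):
    # The exponential factor shrinks far faster than 1/x grows.
    print(x, difference_quotient(x))
```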

The next few posts will look at a powerful technique for finding limits in calculus, called L’Hôpital’s Rule. Here, we’ll introduce what it is, and why it works. In the next post we’ll examine some harder cases.

The method we will be discussing is used to find limits that have an **indeterminate form**. We touched on this in the post Zero Divided By Zero: Undefined and Indeterminate, where we saw that 0/0 is indeterminate, because although one might think that anything divided by itself is 1, if the numerator and denominator approach 0 in different ways, you might approach any number as a limit.

Here is an introductory question from 1998:

Why Are 1^infinity, infinity^0, and 0^0 Indeterminate Forms?

I am a senior in Advanced Placement Calculus, and my class is having a hard time understanding some indeterminate forms. We know that these are indeterminate forms:

0/0
infinity/infinity
infinity - infinity

But why are these indeterminate forms?

1^infinity
infinity^0
0^0

We feel that 1^infinity = 1, infinity^0 = 1, and 0^0 = 1. Part of these conclusions come from the fact that 0^infinity = 0 and 0^(-infinity) = infinity. Could you please explain these determinate and indeterminate forms?

Doctor Rob explained:

These forms are called indeterminate because if you replace 1, 0, and infinity by functions the limits of which are 1, 0, and infinity as x -> 0, then the limit of the compound function does not exist, in the sense that the limit depends on which functions you choose.

For an example of this discrepancy for 1^infinity, on the one hand, take f(x) = 1 and g(x) = 1/x. Then:

lim f(x)^g(x) = lim 1^(1/x) = lim 1 = 1

On the other hand, if we take f(x) = 1 + x and g(x) = 1/x, then:

lim f(x)^g(x) = lim (1 + x)^(1/x) = e = 2.718281828459... > 1

For an example of 0^0, on the one hand, take f(x) = 0, g(x) = x. Then:

lim f(x)^g(x) = lim 0^x = lim 0 = 0

On the other hand, if we take f(x) = x, and g(x) = 0, then:

lim f(x)^g(x) = lim x^0 = lim 1 = 1 > 0

All of the seven indeterminate forms are the same:

0/0 [(k*x)/x -> k, for any real k]
0*infinity [(k*x)*(1/x) -> k, for any real k]
infinity/infinity [(k/x)/(1/x) -> k for any real k]
infinity - infinity [(k+1/x)-(1/x) -> k for any real k]
1^infinity [(1+x)^(ln[k]/x) -> k for any positive real k]
infinity^0 [(1/x)^(ln[k]*x/[1-x]) -> k for any positive real k]
0^0 [x^(ln[k]/ln[x]) -> k for any positive real k]

It is a useful exercise to prove all these statements.
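Before proving statements like these, it can help to see a few of them numerically. The Python sketch below tries three of Doctor Rob's examples (for 0/0, 1^infinity, and 0^0) with several target values *k*, using a small positive *x*; the function names are mine, and this is an illustration rather than a proof.

```python
import math


def zero_over_zero(x: float, k: float) -> float:
    """(k*x)/x: numerator and denominator both approach 0."""
    return (k * x) / x


def one_to_infinity(x: float, k: float) -> float:
    """(1+x)^(ln k / x): base approaches 1, exponent is unbounded."""
    return (1 + x) ** (math.log(k) / x)


def zero_to_zero(x: float, k: float) -> float:
    """x^(ln k / ln x): base and exponent both approach 0 as x -> 0+."""
    return x ** (math.log(k) / math.log(x))


x = 1e-6
for k in (0.5, 2.0, 5.0):
    # Each form can be made to approach whatever positive k we choose.
    print(k, zero_over_zero(x, k), one_to_infinity(x, k), zero_to_zero(x, k))
```

The same "form" produces a different limit for each *k*, which is exactly what makes it indeterminate.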

Often there are special ways to determine the limit in such cases; but there is a single powerful technique that can be applied to all of them.

Now we’ll start with a question from a student who was curious how some limits can be proved:

Limit Proofs with L'Hopital's Rule

I have been just introduced to calculus. In limits, we have the following identities as results without proof:

1) lim (x + 1/x)^x = e

x-->0

2) lim (a^x - 1)/x = ln(a)

x-->0

3) lim (ln(1+x))/x = 1

x-->0

Certain limits (e.g. \(\displaystyle\frac{\sin{x}}{x}\)) are commonly presented without proof, because they are needed early in calculus, but the proof would be beyond the students. These are not among those I would expect to see in that context; perhaps they were just examples. Since Kumarpal clearly hasn’t been taught specific methods for proving these, Doctor Rob used it as an opportunity to introduce one method to prove all of them:

Later on in your study of calculus you will learn something called L'Ho^pital's Rule. This will allow you to compute and prove these limits. It says:

Theorem:

If as x -> a (where a is any real number or infinity),

lim f(x) = 0 and lim g(x) = 0, then:

lim [f(x)/g(x)] = lim [f'(x)/g'(x)],

x->a x->a

provided either limit exists (in which case both do).

Corollary:

If lim f(x) = infinity and lim g(x) = infinity, then:

lim [f(x)/g(x)] = lim [f'(x)/g'(x)],

x->a x->a

provided either limit exists (in which case both do).

The name in modern French is L’Hôpital, often spelled for English keyboard convenience L’Hopital, or sometimes L’Hospital (the older French spelling). We vary in how we choose to write it.

The main theorem provides a way to find a limit of the form **0/0** (that is, the limit of a fraction whose numerator and denominator both approach 0). The corollary (a theorem that follows easily from the main theorem) says that you can apply the same method to the form **∞/∞**.

We’ll look at a couple proofs of the theorem below; there was no need to prove it for this student, but Doctor Rob chose to show how the corollary follows from the main theorem:

Proof of Corollary:

We start with:

lim f(x)/g(x) = lim [1/g(x)]/[1/f(x)]

x->a x->a

and we can apply the Theorem to this quotient:

lim f(x)/g(x) = lim [-g'(x)/g(x)^2]/[-f'(x)/f(x)^2]

x->a x->a

= {lim [g'(x)/f'(x)]}*{lim [f(x)/g(x)]}^2

x->a x->a

1/lim [g'(x)/f'(x)] = lim [f(x)/g(x)]

x->a x->a

lim [f(x)/g(x)] = lim [f'(x)/g'(x)] Q.E.D.

x->a x->a

This idea of rewriting a function to make the rule apply to it is common in applying either form of the rule. If you are having trouble following this, the next to last line comes from dividing both sides of the previous line by {lim [f(x)/g(x)]}^2 and then taking reciprocals of both sides.

Now, Doctor Rob applied the theorem to each of the three examples Kumarpal had asked about. The first is the most complicated, because it is not in the form required by the theorem, and has to be transformed into that form first. You can skip this example if you aren’t ready for it; we’ll be digging into this kind next time:

If you accept this theorem without proof, the above three limits can be computed using it:

1) lim (x + 1/x)^x = lim e^(x * ln[x + 1/x])
   x->0              x->0

   = e^lim ln[x + 1/x]/(1/x)
       x->0

Now the limit has the form lim f(x)/g(x), where f(x) = ln(x + 1/x) and g(x) = 1/x, and lim f(x) = infinity, and lim g(x) = infinity as x -> 0. Apply the corollary to L'Ho^pital's Rule:

f'(x) = (x^2 - 1)/[x * (x^2 + 1)]

g'(x) = -1/x^2

f'(x)/g'(x) = -x * (x^2 - 1)/(x^2 + 1)

The limit of this as x -> 0 is 0. Thus:

lim (x + 1/x)^x = e^lim -x*(x^2-1)/(x^2+1)
x->0                x->0

= e^0 = 1

What Doctor Rob did was to first use the fact that \(a^b = \left(e^{\ln a}\right)^b = e^{b \ln a}\), and then rewrite the exponent as a fraction rather than a product, so that the theorem applies. He also used the fact that the exponential function is continuous, so that the limit of an exponential is the exponential of the limit. Next time we’ll see examples of a different way to do the same thing.

The next problem is more straightforward:

2) \(\displaystyle \lim_{x\to 0}\frac{a^x - 1}{x}\)

This is already in the correct form for L'Hôpital's Rule, with \(f(x) = a^x - 1\) and \(g(x) = x\). The hard part here is computing \(f'(x)\), but \(a = e^{\ln a}\), so \(a^x = e^{x \ln a}\), and:

\(\displaystyle f'(x) = \ln(a)\, e^{x \ln a} = a^x \ln a\)

\(\displaystyle g'(x) = 1\)

Putting this together:

\(\displaystyle \lim_{x\to 0}\frac{a^x - 1}{x} = \lim_{x\to 0}\frac{a^x \ln a}{1} = \ln a\)

Some textbooks teach the derivative of \(a^x\) as a formula to memorize, so you wouldn’t have to do any extra work for \(f'\).
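A quick way to gain confidence in that result is to evaluate the quotient for a very small x and compare it with ln(a). This is only a sketch; the helper name `difference_quotient` and the sample bases are my own choices, not anything from the original answer.

```python
import math

def difference_quotient(a, x):
    """(a^x - 1)/x, the 0/0 form whose limit as x -> 0 should be ln(a)."""
    return (a ** x - 1.0) / x

for a in (2.0, 10.0, 0.5):
    approx = difference_quotient(a, 1e-6)
    # The two printed numbers should agree to several decimal places.
    print(f"a={a:g}  quotient={approx:.8f}  ln(a)={math.log(a):.8f}")
```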

Finally,

3) \(\displaystyle \lim_{x\to 0}\frac{\ln(1 + x)}{x} = \lim_{x\to 0}\frac{1/(x + 1)}{1} = 1\)

That one was easy.

Now, suppose you aren’t willing to accept a “rule” on a teacher’s authority, but want to see why it works. You’re in luck; we have the same attitude! And we have been asked several times why the rule works.

We can start with a relatively relaxed approach, focused on seeing somewhat intuitively why it works, rather than a formal proof. Take this 2001 question:

Explanation of L'Hopital's Rule

In certain cases, L'Hopital's Rule connects the limit of a quotient (f/g) to the limit of the quotient of the derivatives (f'/g'). This is true when f and g go to 0 or infinity at the point where the limit is taken. I understand how to use this rule, and I somewhat understand the proof, but I still do not understand why this happens. Can you help? Please also try to describe explicitly how to think of the roles of the limit, the derivative, and the quotient.

Doctor Fenton answered, focusing on the graph and dealing only with the easiest case:

Thanks for writing to Dr. Math. You've posed a very good question.

One way you can think of this is to use the idea of derivative: a function f(x) is differentiable at x=a if f(x) is very close to its tangent line \(y = f'(a)(x-a) + f(a)\) near x = a. Specifically,

\(\displaystyle f(x) = f(a) + f'(a)(x-a) + E_1(x)\)

where \(E_1(x)\) is an error term which goes to 0 as x goes to a. In fact, \(E_1(x)\) must approach 0 so fast that

\(\displaystyle \lim_{x\to a}\frac{E_1(x)}{x-a} = 0\)

because

\(\displaystyle \frac{E_1(x)}{x-a} = \frac{f(x)-f(a)}{x-a} - f'(a)\)

and we know from the definition of derivative that this quantity has the limit 0 at a.

Here is an example showing what \(E_1\) is, here called \(\epsilon\):

The important thing is that tangency causes \(\epsilon\) to decrease rapidly, so the line is a very good approximation close to *a* (which is 0 in the example).
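To make "decreases rapidly" concrete, here is a small sketch that tabulates the error term divided by (x - a). The choice of f(x) = 2^x - 1 with a = 0 is an assumption made for illustration (it is the function used in a later example), not necessarily the function in the figure.

```python
import math

A = 0.0  # point of tangency; assumed, matching a = 0 in the text

def f(x):
    return 2.0 ** x - 1.0

def f_prime(x):
    return math.log(2.0) * 2.0 ** x

def error_term(x):
    """E1(x) = f(x) - [f(a) + f'(a)(x - a)], the gap between curve and tangent."""
    return f(x) - (f(A) + f_prime(A) * (x - A))

def error_ratio(x):
    """E1(x)/(x - a), which should tend to 0 as x -> a."""
    return error_term(x) / (x - A)

for x in (0.5, 0.1, 0.01, 0.001):
    print(f"x={x:g}  E1={error_term(x):.3e}  E1/(x-a)={error_ratio(x):.3e}")
```

Both columns shrink, but the point is the second one: even after dividing by the small quantity x - a, the error still goes to zero.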

Similarly, if g is differentiable at x = a,

\(\displaystyle g(x) = g(a) + g'(a)(x-a) + E_2(x)\)

where \(E_2(x)\) is another error term which goes to 0 as x->a.

If you're computing the limit of f(x)/g(x) as x->a and if g(a) is not equal to 0, then as x->a, the numerator becomes indistinguishable from f(a) and the denominator from g(a), so the limit is

\(\displaystyle \lim_{x\to a}\frac{f(x)}{g(x)} = \frac{f(a)}{g(a)}\)

It would be easy if things *always* worked that way; that works when *f* and *g* are both continuous, and the denominator is not zero. But here, the denominator *is* zero, so we use the approximations:

If both f(a) and g(a) are 0, then we must use the tangent approximations to say that

\(\displaystyle \frac{f(x)}{g(x)} = \frac{f(a) + f'(a)(x-a) + E_1(x)}{g(a) + g'(a)(x-a) + E_2(x)}\)

\(\displaystyle = \frac{f'(a)(x-a) + E_1(x)}{g'(a)(x-a) + E_2(x)}\)

\(\displaystyle = \frac{f'(a) + [E_1(x)/(x-a)]}{g'(a) + [E_2(x)/(x-a)]}\)

and we have seen that the second term becomes negligible as x->a. In other words, when both function values approach 0 as x->a, the ratio of the function values just reduces to the ratio of the slopes of the tangents, because both functions are very close to their tangent lines.

In the last line, we divided the numerator and denominator by \((x-a)\), which is legal because while we are taking the limit, *x* is never equal to *a*.

One thing worth noticing is the importance of both functions approaching zero; a common mistake (which we’ll see in a later example) is to apply the rule when this is not true.
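To see why that condition matters, here is a minimal sketch with a pair of functions of my own choosing, f(x) = x + 1 and g(x) = x + 2, neither of which approaches zero as x approaches 0. Blindly taking the ratio of derivatives gives a different number from the actual limit.

```python
def f(x):
    return x + 1.0

def g(x):
    return x + 2.0

def f_prime(x):
    return 1.0

def g_prime(x):
    return 1.0

x = 1e-8  # a point very close to 0
true_ratio = f(x) / g(x)                    # close to f(0)/g(0) = 1/2
derivative_ratio = f_prime(x) / g_prime(x)  # what a misapplied rule would give

print("f/g near 0:  ", true_ratio)
print("f'/g' near 0:", derivative_ratio)
```

The quotient itself is close to 1/2, while the ratio of derivatives is 1, so the rule simply does not apply here.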

Here is an example, based on the second example in Kumarpal’s question above:

Here, \(f(x) = 2^x - 1\) (blue) and \(g(x) = x\) (red); their quotient is \(\displaystyle h(x) = \frac{f(x)}{g(x)} = \frac{2^x - 1}{x}\) (purple). The tangents to the two curves are shown as dotted lines (the tangent to *g* being *g* itself). The ratio of the tangent lines is just the ratio of the derivatives at \(x=0\), which is \(\ln 2\); and that is the limit of \(h(x)\).

That was convincing, but not quite a formal proof, and did not cover all cases. Now let’s look at this question, from later in 1998, where we find two fuller proofs:

Proof of L'Hopital's Rule

Can you show me a proof of L'Hopital's Rule?

Doctor Anthony answered first, with a short explanation based on the Taylor expansion, which as used here is the same idea as the tangent line with an error term:

To prove L'Hopital's Rule (sometimes spelled L'Hospital's Rule), we use the Taylor expansion:

\(f(a+h) = f(a) + h f'(a) + \text{terms in } h^2 \text{ and higher}\)

\(g(a+h) = g(a) + h g'(a) + \text{terms in } h^2 \text{ and higher}\)

So:

\(\displaystyle \lim_{h\to 0}\frac{f(a+h)}{g(a+h)} \to \frac{f(a)+h f'(a)}{g(a)+h g'(a)}\)

so with f(a) = g(a) = 0 we get:

\(\displaystyle \lim_{h\to 0}\frac{f(a+h)}{g(a+h)} \to \frac{h f'(a)}{h g'(a)} \to \frac{f'(a)}{g'(a)}\)

We can use l'Hopital's also if f(a) -> infinity and g(a) -> infinity:

\(\displaystyle \frac{f(a)}{g(a)} \to \frac{\infty}{\infty} \quad\text{so}\quad \frac{1/g(a)}{1/f(a)} \to \frac{0}{0}\)

and applying l'Hopital's to this latter expression, we get:

\(\displaystyle \frac{f(a)}{g(a)} \to \frac{-g'(a)/[g(a)]^2}{-f'(a)/[f(a)]^2} \to \frac{g'(a)[f(a)]^2}{f'(a)[g(a)]^2}\)

and cross-multiplying:

\(\displaystyle \frac{f'(a)}{g'(a)} \to \frac{f(a)}{g(a)}\)

Therefore whether we have 0/0 or infinity/infinity we can use l'Hopital's rule.

You may recognize the first part as equivalent to Doctor Fenton’s explanation, and the second as Doctor Rob’s proof of the corollary above.

Next, Doctor Rob gave a careful proof, for those who want more. He started by stating the theorem carefully:

This is a rather complicated business. First of all, there are several forms of the Rule. I will state and prove one of them, and try to indicate how other forms are corollaries of this one.

Theorem (L'Hôpital's Rule): Let f(x) and g(x) be differentiable on the interval \(a \le x < b\), with g'(x) nonzero throughout. If:

(i) \(\displaystyle \lim_{x\to b^-} f(x) = 0\) and \(\displaystyle \lim_{x\to b^-} g(x) = 0\)

or if

(ii) \(\displaystyle \lim_{x\to b^-} f(x) = \infty\) and \(\displaystyle \lim_{x\to b^-} g(x) = \infty\)

and if:

\(\displaystyle \lim_{x\to b^-}\frac{f'(x)}{g'(x)} = L\)

then:

\(\displaystyle \lim_{x\to b^-}\frac{f(x)}{g(x)} = L\)

Note: b may be any real number or infinity, and L may be any real number or infinity.

These two cases are the 0/0 and ∞/∞ forms we saw above. Although the Note allows *b* and *L* to be infinite, the proof that follows first assumes that *x* approaches a finite value and the ratio of derivatives approaches a finite limit, and the theorem requires only a one-sided limit for now. We’ll see the other (infinite and two-sided) cases below.

He first proved the 0/0 finite case:

Proof: We restrict our attention to the case where b and L are real numbers.

Assume first hypothesis (i) is satisfied. We may set f(b) = 0 and g(b) = 0, and then f(x) and g(x) are continuous on the closed interval \(a \le x \le b\). Applying Cauchy's Mean Value Theorem (see below), we see that for any x with \(a < x < b\), there is a point \(x_0\) with \(x < x_0 < b\) such that:

\(\displaystyle \frac{f'(x_0)}{g'(x_0)} = \frac{f(b)-f(x)}{g(b)-g(x)} = \frac{f(x)}{g(x)}\)

Since:

\(\displaystyle \lim_{x_0\to b^-}\frac{f'(x_0)}{g'(x_0)} = L\)

given any \(\epsilon > 0\), we may choose \(\delta > 0\) so that:

\(\displaystyle \left|\frac{f(x)}{g(x)} - L\right| = \left|\frac{f'(x_0)}{g'(x_0)} - L\right| < \epsilon\)

whenever \(b - \delta < x < b\). This proves that:

\(\displaystyle \lim_{x\to b^-}\frac{f(x)}{g(x)} = L\)

This is the desired conclusion for hypothesis (i).

Notice that where Doctor Fenton quietly assumed that *f* and *g* were continuous, so that the values at *a* equal their limits, Doctor Rob, in this more formal context, explicitly makes that true by defining *f*(*b*) and *g*(*b*) to equal the limits (his limit point is called *b* rather than *a*), filling in a removable discontinuity if one exists. Also, his proof uses Cauchy’s Mean Value Theorem (which he proves below) where the others used facts about approximations. In effect, this theorem approximates *b* rather than *f*(*b*)!

Next, he proves the ∞/∞ case directly rather than as a corollary:

Now assume hypothesis (ii). Given \(\epsilon > 0\), we first choose \(\delta > 0\) so that, whenever \(b - \delta < x_1 < b\), then:

\(\displaystyle \left|\frac{f'(x_1)}{g'(x_1)} - L\right| < \epsilon\)

Set \(x_0 = b - \delta\), and take any points x with \(x_0 < x < b\). By Cauchy's Mean Value Theorem (see below), there is a choice of \(x_1\) with \(x_0 < x_1 < b\), such that:

\(\displaystyle \frac{f(x)-f(x_0)}{g(x)-g(x_0)} = \frac{f'(x_1)}{g'(x_1)}\)

Thus:

\(\displaystyle \left|\frac{f(x)-f(x_0)}{g(x)-g(x_0)} - L\right| < \epsilon\)

Now, defining:

\(\displaystyle h(x) = \frac{1-f(x_0)/f(x)}{1-g(x_0)/g(x)}\)

we can rewrite this as:

\(\displaystyle \left|h(x)\frac{f(x)}{g(x)} - L\right| < \epsilon\)

which is valid for all x with \(x_0 < x < b\). Now by using hypothesis (ii), it follows that:

\(\displaystyle \lim_{x\to b^-} h(x) = 1\)

Choose \(x_1\) with \(x_0 < x_1 < b\) so that \(|h(x)-1| < \epsilon\) and \(h(x) > 1/2\), whenever \(x_1 < x < b\). For such values of x, we have:

\(\displaystyle \left|h(x)\left[\frac{f(x)}{g(x)}-L\right]\right| = \left|h(x)\frac{f(x)}{g(x)} - L\,h(x)\right| \le \left|h(x)\frac{f(x)}{g(x)} - L\right| + \left|L\,[1-h(x)]\right| < \epsilon + |L|\,\epsilon\)

and:

\(\displaystyle \left|\frac{f(x)}{g(x)} - L\right| < \frac{(1+|L|)\,\epsilon}{h(x)} < 2(1+|L|)\,\epsilon\)

This proves that:

\(\displaystyle \lim_{x\to b^-}\frac{f(x)}{g(x)} = L\) Q.E.D.

He then proves Cauchy’s Mean Value Theorem, which he used within the proofs above:

Now the proof above depends in two places on the following.

Cauchy's Mean Value Theorem: Suppose that two functions f(x) and g(x) are continuous in the closed interval \(a \le x \le b\) and differentiable in the open interval \(a < x < b\). Suppose further that g'(x) is nonzero for x in \(a < x < b\). Then there exists at least one number c with \(a < c < b\) such that:

\(\displaystyle \frac{f(b)-f(a)}{g(b)-g(a)} = \frac{f'(c)}{g'(c)}\)

Proof: Let:

\(h(x) = [g(b)-g(a)]\,[f(x)-f(a)] - [f(b)-f(a)]\,[g(x)-g(a)]\)

Then clearly h(a) = h(b) = 0. Since h is continuous in [a,b] and differentiable in (a,b), we can apply Rolle's Theorem, which tells us that there exists at least one number c in (a,b) such that h'(c) = 0. Then:

\(h'(c) = [g(b)-g(a)]\,f'(c) - [f(b)-f(a)]\,g'(c) = 0\)

Rearranging this equation, and using the fact that g'(c) is nonzero, we get:

\(\displaystyle \frac{f(b)-f(a)}{g(b)-g(a)} = \frac{f'(c)}{g'(c)}\) Q.E.D.
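The proof can be followed numerically. In the sketch below I take f(x) = x^2 and g(x) = x^3 on the interval [1, 2] (an example of my own, not Doctor Rob's), form the derivative of the auxiliary function h from the proof, and locate the point c where h'(c) = 0 by simple bisection.

```python
def f(x): return x ** 2
def g(x): return x ** 3
def f_prime(x): return 2 * x
def g_prime(x): return 3 * x ** 2

a, b = 1.0, 2.0

def h_prime(x):
    """Derivative of h(x) = [g(b)-g(a)][f(x)-f(a)] - [f(b)-f(a)][g(x)-g(a)]."""
    return (g(b) - g(a)) * f_prime(x) - (f(b) - f(a)) * g_prime(x)

def bisect(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming a sign change on the interval."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2

c = bisect(h_prime, a, b)
print("c =", c)
print("secant ratio:    ", (f(b) - f(a)) / (g(b) - g(a)))
print("derivative ratio:", f_prime(c) / g_prime(c))
```

For this pair the point can also be found by hand: h'(x) = 14x - 9x^2, so c = 14/9, and both ratios equal 3/7.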

Now, he discusses modifying the proof for variants where the limit is infinite, and for limits on the other side or both sides:

Now let's talk about variations on L'Hôpital's Rule.

The first is to see that if L = +infinity or -infinity, the same ideas work to give a proof, although we have to modify the parts dealing with epsilons and deltas to accommodate the definition of an infinite limit. Instead of |F(x) - L| < epsilon, you need |F(x)| > 1/epsilon, and so on.

The second is to see that if we replace the limits with x->b- with ones with x->b+, and replace the condition a <= x < b with b < x <= a, the conclusion still holds. This is a corollary of the stated theorem, obtained by substituting y = 2*b - x, and applying the theorem.

The third is to see that if we replace the one-sided limits by two-sided ones, the conclusion still holds, as a corollary of the theorem and the immediately preceding paragraph.

Finally, he covers the case where the limit is taken at infinity rather than a finite value of *x*:

The fourth is to see that if we take b = +infinity or -infinity, the conclusion of the theorem still holds. We prove this by substituting x = 1/y, and then as x -> +infinity, y -> 0+, and as x -> -infinity, y -> 0-. Also, dx/dy = -1/y^2, and by the chain rule:

\(\displaystyle f'(x) = \frac{d}{dy}\left[f(1/y)\right] = f'(1/y)\cdot\frac{d}{dy}\left[\frac{1}{y}\right] = f'(1/y)\cdot\left(-\frac{1}{y^2}\right)\)

and similarly for g'(x). Thus:

\(\displaystyle \lim_{x\to+\infty}\frac{f'(x)}{g'(x)} = \lim_{y\to 0^+}\frac{f'(1/y)\,(-1/y^2)}{g'(1/y)\,(-1/y^2)} = \lim_{y\to 0^+}\frac{f'(1/y)}{g'(1/y)} = \lim_{y\to 0^+}\frac{f(1/y)}{g(1/y)}\)

by the theorem, so:

\(\displaystyle \lim_{x\to+\infty}\frac{f'(x)}{g'(x)} = \lim_{x\to+\infty}\frac{f(x)}{g(x)}\)

which was to be shown. The case x -> -infinity is handled in the same way.

This is a good example of a thorough proof. Whereas for most purposes the earlier demonstrations that the rule makes sense are satisfying, a mathematician wants to make sure there are no loopholes. In doing so, he’s reminded us of all the different ways in which this theorem can be applied.
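One of those ways, the limit at infinity, can be illustrated with a small sketch. The functions f(x) = ln x and g(x) = x are my own choice of an ∞/∞ example; the loop uses the same substitution x = 1/y as the proof, letting y shrink toward 0 from the right.

```python
import math

def f(x): return math.log(x)
def g(x): return x
def f_prime(x): return 1.0 / x
def g_prime(x): return 1.0

for y in (1e-1, 1e-3, 1e-6):
    x = 1.0 / y  # as y -> 0+, x -> +infinity
    print(f"y={y:g}  x={x:g}  f/g={f(x) / g(x):.3e}  f'/g'={f_prime(x) / g_prime(x):.3e}")
```

Both columns head toward 0, consistent with the rule, though the two ratios are not equal at any particular x; only their limits agree.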

Next time, we’ll look at some relatively complicated applications of the rule, where we have to do some preliminary work before applying the theorem, or have to apply it more than once.

We can start with a question about the basics, from 2002:

Implications in Logic

I don't understand the first four rules to do with implications: Modus Ponens, Modus Tollens, Hypothetical Syllogism, and Disjunctive Syllogism. Can you explain them step by step, please?

These are four kinds of argument (deductive reasoning) that have been discussed since ancient times. (The first two names are Latin, the last two Greek.) Doctor Achilles started by clarifying how he would be typing logical symbols using the keyboard; I’m going to insert in brackets the special characters we use when they are available:

First, just so we're on the same page, here are the symbols I use:

(P -> Q) [P→Q] Means: "If P, then Q." This sentence is true unless P is true and Q is false.

(P ^ Q) [P∧Q] Means: "P and Q." This sentence is true if and only if P and Q are both true.

(P v Q) [P∨Q] Means: "P or Q." This sentence is true if P is true, and it is also true if Q is true. The only way it can be false is if both P and Q are false.

~P [¬P] Means: "Not P." This is true if P is false.

Now he states and explains each of these “laws”:

Modus Ponens: The rule for this is:

If you have: (P -> Q)

And you also have: P

Then you can conclude: Q

This follows directly from the definition of (P -> Q), which is "If P is true, then Q is true." So, if I know that if P is true, then Q must be true, AND I know that P is true, then I can validly conclude that Q is true also.

More compactly, Modus Ponens (Latin for “method of affirming”) looks like this:

```
p→q
p
∴q
```

It is also called “Implication Elimination”, for obvious reasons; “Affirming the Antecedent”; or the “Law of Detachment”.

An example in English:

If it is raining, the ground is wet. It is raining. Therefore, the ground is wet.

Next,

Modus Tollens: This is a little trickier. The rule here is:

If you have: (P -> Q)

And you have: ~Q

Then you can conclude: ~P

Here's why: We know first of all that "If P is true, then Q is true." And we know that Q is false. With Q false, is it possible for P to be true? No! Because if P were true, then Q would have to be true. So if Q is false, then P has to be false also.

More compactly, Modus Tollens (Latin for “method of denying”) looks like this:

```
p→q
¬q
∴¬p
```

It is also called “Denying the Consequent”, because it involves saying that the conclusion of a conditional statement is false. You may notice that it can be thought of as applying Modus Ponens to the contrapositive of the conditional statement.

An example in English:

If it is raining, the ground is wet. The ground is not wet. Therefore, it is not raining.

Next,

Hypothetical Syllogism: The rule here is:

If you have: (P -> Q)

And you have: (Q -> R)

Then you can conclude: (P -> R)

Here's why: We know that "if P is true, then Q is true." And we know that "if Q is true, then R is true." But we don't know ANYTHING about whether any of the letters are actually true or not.

Let's assume (or hypothesize) for a second that P is true. Then, by modus ponens, Q is true. And then by modus ponens again, R is true. So: IF we assume P is true, THEN we conclude R is true.

Since we didn't KNOW P was true, we cannot take R home with us, but we can say that "If P was true, then R would be true." This is equivalent to saying "If P, then R" or (P -> R).

More compactly, Hypothetical Syllogism (Greek for “suppositional reasoning”, meaning a syllogism in which each premise contains “if”) looks like this:

```
p→q
q→r
∴p→r
```

An example in English:

If it is raining, the ground is wet. If the ground is wet, cars skid. Therefore, if it is raining, cars skid.

Finally,

Disjunctive Syllogism: The rule here is:

If you have: (P v Q)

And you have: ~P

Then you can conclude: Q

[This also works if you have (P v Q) and ~Q, you can conclude P.]

Here's why: We know first of all that "P or Q is true." We also know that P is false. If P or Q is true, and P is false, then Q has no choice but to be true. So we can conclude that Q is true.

More compactly, Disjunctive Syllogism (Greek for “pulling-apart reasoning”, meaning a syllogism in which one premise contains “or”) looks like this:

```
p∨q
¬p
∴q
```

An example in English:

It is raining, or the sprinkler is on. It is not raining. Therefore, the sprinkler is on.
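All four forms can be checked mechanically with truth tables: an argument form is valid if no assignment of true and false makes every premise true and the conclusion false. The sketch below does this by brute force; the helper names `implies` and `is_valid` are my own.

```python
from itertools import product

def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q

def is_valid(premises, conclusion, n_vars):
    """Valid means: no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False
    return True

forms = {
    "modus ponens": is_valid(
        [lambda p, q: implies(p, q), lambda p, q: p],
        lambda p, q: q, 2),
    "modus tollens": is_valid(
        [lambda p, q: implies(p, q), lambda p, q: not q],
        lambda p, q: not p, 2),
    "hypothetical syllogism": is_valid(
        [lambda p, q, r: implies(p, q), lambda p, q, r: implies(q, r)],
        lambda p, q, r: implies(p, r), 3),
    "disjunctive syllogism": is_valid(
        [lambda p, q: p or q, lambda p, q: not p],
        lambda p, q: q, 2),
}

for name, valid in forms.items():
    print(f"{name}: {'valid' if valid else 'INVALID'}")
```

Each of the four comes out valid, which is just the truth-table version of Doctor Achilles' verbal explanations.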

A 1998 question had asked about this last one:

Disjunctive Syllogism

I am enrolled in an Elementary Logic and Set Theory class, and we have an assignment to find out definitions of the words: Modus Ponens, Modus Tollens, and Disjunctive Syllogism. I have already found the first two, but am stuck on the last one. So my question is, what does it mean?

Doctor Mike answered:

The disjunctive syllogism is (P v Q) and notQ --> P.

In words: "If one or the other is true, and one of them is false, the other must be true."

Here is an example:

- Given: Either Congress meets in Washington D.C. OR pigs can fly.
- Given: Scientific evidence shows conclusively that pigs cannot fly.
- Conclusion: Congress meets in Washington D.C.

I hope this helps.

"Thanks for writing." OR "Pigs can fly."

Students can be given the premises of an argument and asked to draw a conclusion, if possible. This requires a little extra knowledge of how forms work, as shown in this question from 2011:

If It Doesn't Follow a Form, Don't Draw a Conclusion

Sometimes with the law of syllogism, the p's and q's don't line up with the formula. What are some examples where there is no conclusion to the laws of syllogism? What about for the laws of detachment? I've tried books and Internet sites and found nothing to help me with my problem. I can't even find any examples online.

This was the start of an extended conversation that was archived in two parts. I replied, just asking for clarification:

I'm not sure I understand. These laws always DO have a conclusion, and they are always correct. It sounds like you are asking about something very different: situations in which it may LOOK like these laws apply, but they really don't. The answer there would depend on what YOU think is close enough to the real thing as to confuse you.

Is this an assignment you were given? How is it actually worded?

It turned out that Crystal was referring to an *argument* with no valid conclusion, not the *law* itself. She nicely stated and illustrated the laws she was referring to (Hypothetical Syllogism and what I have called Modus Ponens above), showing what kind of problems she was asking about:

I know what the law of syllogism is, and what the law of detachment is. I know how to draw conclusions from syllogisms like this:

If a quadrilateral is a square, then it contains four right angles.

If a quadrilateral contains four right angles, then it is a rectangle.

It would be: If a quadrilateral is a square, then it is a rectangle.

And for the law of detachment, when it is in this form, I know how to solve it:

p->q is a true statement, if p is true then q is true

These are easy because they follow the form.

But she had some problems that either didn’t fit any form, or were hard to identify:

However, there are some examples that I am not sure about. I get really confused and have no idea how to proceed when the statements are mixed up -- when they are not in this form:

p->q and q->r are true then p->q is a true statement

For example,

Conditional: If a road is icy, then driving conditions are hazardous.

Statement: Driving conditions are hazardous.

Since this is with the law of detachment, wouldn't you conclude "it is icy outside"? My teacher said that because there are two conclusions, you can't make up another conclusion. I don't get it.

Here's another, which deals with the law of syllogism:

Conditional 1: If you spend money on it, then it is a business.

Conditional 2: If you spend money on it, then it is fun.

My teacher said that this has no conclusion, either. I asked about it, but she didn't really explain it to me well. And my math book also doesn't help me at all. Why does this one have no conclusion?

I just don't understand these different forms. I've labeled these examples as p and q and stuff, but no matter how much I try to figure it out, I can't understand why they would have no conclusion. It makes absolutely no sense!

Now I had a really good idea of her confusion, and could answer:

First, the Modus Ponens (Detachment) example:

I suppose what your teacher meant is that the second statement affirms the conclusion of the first statement rather than its condition. The law of detachment has the form

p->q, and p, therefore q

That is, if you know that q happens whenever p happens, and you also know that p did happen, then q must happen.

The example about icy roads and driving conditions is NOT like that; it has the form

p->q, and q

You can't apply this law; in fact, there is no law that you can apply, so you have no way to make a conclusion.

Look at the details of the example, which illustrates the issue nicely. (Not all examples are even true, much less showing why the logic makes sense.)

The conditional statement says that if the road is icy, then driving becomes hazardous. That makes sense. What it does NOT say is that ice is the ONLY thing that can make driving hazardous! There are other reasons to be careful when you drive -- flooding, sun glare, bars having just let out, or whatever. So if someone tells you, "Look out, the driving is hazardous right now," you can't know what the cause is. It might be hazardous for any number of reasons. You CAN'T conclude with any confidence that the road is icy.

Does that make sense?

In fact, attempting to conclude “*p*→*q*, and *q*, therefore *p*” is such a common mistake that it has a name: the **Fallacy of the Converse**, or the error of **Affirming the Consequent**. That is, it involves confusing the statement *p*→*q* with its converse, *q*→*p*, from which *p* could be concluded — the arrow goes the wrong way.
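A truth table makes the failure visible. The sketch below searches for assignments in which both premises are true but the attempted conclusion is false; the helper names are my own, and p and q stand for the icy-road statements from the example.

```python
from itertools import product

def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q

def counterexamples(premises, conclusion, n_vars):
    """List every assignment with all premises true but the conclusion false."""
    return [values
            for values in product([True, False], repeat=n_vars)
            if all(premise(*values) for premise in premises)
            and not conclusion(*values)]

# p = "the road is icy", q = "driving conditions are hazardous"
affirming_consequent = counterexamples(
    [lambda p, q: implies(p, q), lambda p, q: q],  # premises: p -> q, and q
    lambda p, q: p,                                # attempted conclusion: p
    2)

for p, q in affirming_consequent:
    print(f"road icy: {p}, driving hazardous: {q}")
```

The search turns up exactly one row, with p false and q true: driving is hazardous although the road is not icy. A single such row is enough to show that the form is not valid.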

Next, the syllogism example:

The law of syllogism says that ... if p->q and q->r, ... then you can conclude that ... p->r.

It's like a pipeline of reasoning: if p is true, then q is true, and therefore so is r. The conclusion of one statement has to be the condition of the next, so the reasoning flows in the right direction.

The example about how you spend money has the form

p->q, and p->r

The pipes don't connect the right way to be able to say that

q->r or r->q

This is a less clear example; here's a situation of the same form that may be easier to follow:

If it rains, I will be wet.

If it rains, the road will be slippery.

Can I conclude that if I am wet, the road will be slippery? No, maybe I'm wet because I just took a shower.

Can I conclude that if the road is slippery, I am wet? No, maybe there was an oil spill on the road.

So we can't make any definite conclusion; we don't know enough to say more than we have been told.

Summarizing:

The idea here is that if you can't put the argument in some form from which you can make a conclusion, then you can't make a conclusion. There are other forms you don't know yet, so you can't always be sure that no conclusion can possibly be drawn; but when statements do not conform to any of the forms you know, that is all you can say. YOU can't draw a conclusion, simply because none of the rules you know applies.

Put another way: when given logical statements that do not follow the forms you know, conclude "No conclusion is possible (for me)"!

The next day, Crystal wrote back in a new thread, about a slightly different issue:

Disorderly Deduction

I know what the law of syllogism is, and honestly I've written to Dr. Math about it a lot and I'm starting to understand it better (thank you!). But here is a problem that I don't get because it doesn't follow the "if p->q and q->r then p->r" law of syllogism form.

If the sum of the angles of a polygon is 720 degrees, then it has six sides.

If the polygon is a hexagon, then the sum of the angles is 720 degrees.

My teacher said that the conclusion would be

If a polygon is a hexagon, then it has 6 sides.

That doesn't follow (to use an analogy introduced to me by a math doctor recently) the "pipes" of the law of syllogism. I just keep getting more of these weird questions from my teacher, but when I look online for help, the only syllogisms I find are ones I already know how to do.

I responded again:

Hi, Crystal.

Let's translate this into symbols. First, what are the simple statements here?

If the sum of the angles of a polygon is 720 degrees, then it has six sides.

If the polygon is a hexagon, then the sum of the angles is 720 degrees.

I see the following, just taking them in order as they come:

p = the sum of the angles of a polygon is 720 degrees

q = it [the polygon] has six sides

r = the polygon is a hexagon

Using those symbols, we have

if p then q

if r then p

Or, in fully symbolic logic terms,

p->q

r->p

Clearly we don't have EXACTLY "p->q, q->r," but that's just because we didn't define p, q, and r in the most helpful order. Since you have absorbed my comparison to plumbing, let's see whether we can arrange these "pipes" so they line up correctly:

r->p

p->q

Can you see how this becomes this?

r->q

The pattern doesn’t care about order:

The key here is that the rules are not all about the particular letters you use, but about relationships. (A plumber doesn't get confused if he pulls out two pipes from his truck in the wrong order!) If you have two conditional statements such that the conclusion of one is the condition of the other, then the law of syllogism says you can connect them together to make one new conditional statement.

Translating our conclusion, r->q, back into words using our definitions, we have

if r, then q

If the polygon is a hexagon, then it [the polygon] has six sides.

Or, smoothing out the readability a bit,

If a polygon is a hexagon, then it has six sides.
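The "pipes" picture can be sketched in code by treating each conditional as a (condition, conclusion) pair and joining pairs wherever one pair's conclusion matches another's condition, in whatever order they were listed. This is only an illustration of the law of syllogism under that simple representation; the function name and the short labels are hypothetical, and the sketch knows nothing about other rules such as contrapositives.

```python
def chain_conclusions(conditionals):
    """Return the new (condition, conclusion) pairs obtainable by connecting
    conditionals end to end, regardless of the order in which they are given."""
    known = set(conditionals)
    derived = set(known)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(derived):
            for (c, d) in list(derived):
                # The pipes connect only when one conclusion is the other's condition.
                if b == c and a != d and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived - known

hexagon = [
    ("angle sum is 720", "has six sides"),   # p -> q
    ("is a hexagon", "angle sum is 720"),    # r -> p
]
money = [
    ("you spend money on it", "it is a business"),  # p -> q
    ("you spend money on it", "it is fun"),         # p -> r
]

print(chain_conclusions(hexagon))  # one new conditional: hexagon -> six sides
print(chain_conclusions(money))    # empty: the pipes never connect
```

The hexagon statements yield one new conditional even though they were listed "out of order", while the spending-money statements yield none, matching the two cases discussed above.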

Crystal had more questions:

Okay, so this is another question on the law of syllogism! I understand the hexagon question now (thank you!); but is it because these statements ... p->q r->p ... are basically biconditionals that you can rewrite them like this? r->p p->q I'm just trying to understand it more, and my teacher isn't doing a good job explaining it to me. But I've emailed about this question before and it's getting a lot easier for me to understand!

I answered:

No, nothing here is a biconditional. In fact, for this particular example, everything would have been true if the conditionals were replaced with biconditionals -- which is why this example is not a good example to illustrate the issues. The conclusion would still be valid if we replaced everything with different statements -- even total nonsense:

If a borogove is mimsy, then mome raths outgribe.

If toves are slithy, then a borogove is mimsy.

We can conclude that

If toves are slithy, then mome raths outgribe.

That's because if toves are slithy, then a borogove is mimsy, and since a borogove is mimsy, mome raths outgribe.

Do you see why?

(If you are a Lewis Carroll fan, you will recognize my use of terms from the nonsense poem *Jabberwocky*; the author was a logician.)

Crystal again:

So in your nonsense example, you can draw that conclusion because some of the p's and q's "cross out"? and because the statements are assumed true, you can just write the remaining hypothesis and conclusion as an if-then statement? I just want to make this clear. I've emailed a lot to Dr. Math and it's helped a LOT!!!

I concluded:

Yes, you can think of it as canceling p's or q's in the sense that it doesn't matter what the letters are, only whether they match up. But they have to match up in the right way: one a conclusion; the other a condition.

In the form "p->q, q->r implies p->r," we have

p->q

q->r

∴ p->r

In my example, "p->q, r->p" becomes this:

p->q

r->p

∴ r->q

Here, the p's connect the two conditional statements. The order is different, but the relationship is the same.

Usually when we discuss converses (and inverses and contrapositives) we use clear, idealized examples. But statements in real life — even in real math — are not quite so straightforward. The difficulty is not merely in the language, but in the complexity of our statements. A question in the beginning of 2017 brought out some interesting issues.

Before I go to the question itself, I have to show the context, so some comments in the exchange will make more sense. Navneet first asked a question that didn’t get archived:

My book proves this statement- "If the midpoint of a chord is joined to the centre of the circle, then the line passes through the midpoint of the corresponding minor arc." I understood the proof. But I have a confusion regarding its converse. What should be the converse of this theorem? The converse of- If p, then q, where p and q are sentences, is If q, then p. So, I think the converse of this should be- "If a line passes through the midpoint of the minor arc, then the line should join the midpoint of the corresponding chord to the centre of the circle"!!!

Please visit the link- http://www.themathpage.com/abooki/logic.htm . Under the converse category you will find this sentence- "If a statement has two hypotheses -- If a and b, then c -- then a partial converse is: If a and c, then b." Do you think that the theorem stated in the beginning has two hypotheses?

I was obligated to answer this, because it was on a thread I “owned”; but I was not in a position to say much. I replied briefly:

I am on vacation and do not have good equipment, so I can't give long answers, but I will do what I can.

You are right that the converse of this theorem as stated is not reasonable, and of course is not true. The converse of a theorem is not in general true, so this is not a problem. If you were assigned to state the converse and whether it is true, you could state it more or less as you did.

But if your goal is simply to find a converse that makes sense and try to prove it, then a "partial converse" like what you stated is appropriate. In effect, you are restating the theorem as something like, "Given that a line passes through the center of a circle, if it passes through the midpoint of a chord, then it passes through the midpoint of the arc."

The point here is that we can’t just mechanically swap two halves of a sentence to make the converse. The statement here consists of more than just “if *p* then *q*”. Navneet’s initial attempt took *p* as “the midpoint of a chord is joined to the centre of the circle”, and *q* as “the line passes through the midpoint of the corresponding minor arc”. Merely swapping these produces nonsense. My suggestion was to rewrite the original in three parts: “Given *c*, if *p* then *q*”, so that the converse would be “Given *c*, if *q* then *p*”. Specifically, “Given that *a line passes through the center of a circle*, if *it passes through the midpoint of a chord*, then *it passes through the midpoint of the arc*.”

The converse, then, taken mechanically, would be, “Given that *a line passes through the center of a circle*, if *it passes through the midpoint of the arc*, then *it passes through the midpoint of a chord*.” This makes sense, and it does sound like the “partial converse” idea he had discovered — that was a good insight. But I don’t find that to be a common concept.

Now, before I made that response, Navneet had already asked another very similar question, on a new thread; but this time his textbook had stated a converse. This one was archived, because it got a fuller answer:

Converses in Construction

According to my text, the midpoint theorem states: "If the midpoints of any two sides of a triangle are joined, then the line is parallel to the third side."

My book goes on to phrase the converse of this theorem like this: "The line drawn through the midpoint of one side of a triangle and parallel to another side bisects the third side."

Do you feel that this is the converse of the midpoint theorem?

We know that the converse of "if p, then q" is "if q, then p." So I think that the converse of the midpoint theorem should be this: "If a line is parallel to a side of a triangle, then the line should join the midpoints of the other two sides."

Now clearly, this converse is false. But am I wrong somewhere? If yes, please correct me and explain in detail.

Note that the book’s converse is not written in if-then form, and also that it does not have the same meaning as Navneet’s.

Doctor Rick replied (a couple hours after my reply), expanding on what I had said:

Hi, Navneet. Let me borrow an idea from Doctor Peterson, who responded to another, related question of yours separately just now: "If your goal is simply to find a converse that makes sense and try to prove it, then a 'partial converse' like what you stated is appropriate." There is some latitude in what we call the converse of a theorem. We could take your statement of what a converse is, and add a bit to it to express how the concept is really applied to theorems: Theorem: In context C, if p, then q. Converse of the theorem: In context C, if q, then p. There is always a context, and this context is assumed to be the same for both the theorem and its converse. The latitude I spoke of consists in what else we reasonably choose to include as part of the context.

We can divide a statement into context, hypothesis, and conclusion somewhat arbitrarily, so things may not be as clear as the usual explanation of converses suggests. Combining this with the fact that, as in the book’s converse, not all theorems are stated explicitly as “if … then …”, things can look pretty confusing.

At minimum, a theorem in Euclidean geometry brings with it the entire context of Euclidean geometry; that is, the postulates and definitions of Euclidean geometry. So, in this case, we could restate the theorem as: Context: In Euclidean geometry, given a triangle and a line passing through the midpoint of one side and a point on another side of the triangle, ... Condition: If the second point is the midpoint of its side ... Conclusion: Then the line is parallel to the third side of the triangle. And the converse of the theorem in this form is: Context: In Euclidean geometry, given a triangle and a line passing through the midpoint of one side and a point on another side of the triangle, ... Condition: If the line is parallel to the third side of the triangle ... Conclusion: Then the second point is the midpoint of its side. That's what your book did. Why? Because the goal was to write a "converse theorem" that is TRUE! There is no point in writing a converse that is not a valid theorem, because we can't do anything with it. Therefore, the author chose, quite reasonably, to include enough in the "context" to make the converse a valid theorem.

And the author then chose to reword it in a way that looks quite different from the original, but is still a valid converse: “The line drawn through the midpoint of one side of a triangle and parallel to another side bisects the third side.”

Navneet wrote back,

Thanks! Your explanation was so easy and precise. Now, I want a little more help. (1) I want to understand what you meant to convey in these lines, with stress to the meaning of "latitude": "The latitude I spoke of consists in what else we reasonably choose to include as part of the context." (2) Why did Doctor Peterson say this? "If your goal is simply to find a converse that makes sense and try to prove it, then a 'partial converse' like what you stated is appropriate." I think that a "partial converse" is valid for a theorem only if it has two hypotheses. If in the beginning only he said that the given theorem does not have two hypotheses, then why does he go on to say that a "partial converse" is appropriate for this theorem? Is the meaning of "partial converse" different here?

Doctor Rick responded first to the question about “latitude”, which is not a familiar word to many students:

To your first question, Dictionary.com offers this definition, which is the sense of "latitude" that I had in mind: freedom from narrow restrictions; freedom of action, opinion, etc.: "He allowed his children a fair amount of latitude." So I was saying that we have some freedom to decide what we call the converse of a theorem, because we have some freedom to choose what we consider to be part of the context and what is part of the conditions of the theorem.

Then he explained (correctly — that’s what twin brothers are for) what I meant by the quoted line:

To your second question, I have to go back and look at the context of what my brother Doctor Peterson said.... I see that you introduced the concept of a partial converse with him; and that he told you we'd prefer not to use that idea -- it isn't something we've seen. But in terms of the statement you quoted, If a statement has two hypotheses -- "If a and b, then c" -- then a partial converse is: "If a and c, then b." We need to do the same sort of thing I did in my last response: REWRITE the theorem so that it *does* have two hypotheses. I rewrote it this way: Context: In Euclidean geometry, given a triangle and a line passing through the midpoint of one side and a point on another side of the triangle, ... Condition: If the second point is the midpoint of its side ... Conclusion: Then the line is parallel to the third side of the triangle. We could just as well say this: Hypothesis 1: In Euclidean geometry, given a triangle and a line passing through the midpoint of one side and a point on another side of the triangle, ... Hypothesis 2: If the second point is the midpoint of its side ... Conclusion: Then the line is parallel to the third side of the triangle. Now the "partial converse" would be: "If [Hypothesis 1] and [Conclusion], then [Hypothesis 2]." That's the same as my "If [Context] and [Conclusion] then [Condition]."

The two hypotheses in *themathpages*‘ concept of a partial converse are identical to the “context” and “condition” (hypothesis) in Doctor Rick’s explanation.

We did not have a context for your question in the beginning. (Context seems to be coming up a lot, doesn't it?) It seemed possible that you were asking about the converse because you were ASKED to state a converse for this theorem, without regard as to whether it would be a valid theorem. In that context, your converse could be considered correct. By taking the theorem exactly as stated, we produce a converse that is NOT a theorem -- which would be OK in this context. But Doctor Peterson also said, "if your goal is simply to find a converse that makes sense and try to prove it, then a 'partial converse' like what you stated is appropriate." Now that you have told us that your *book* gives the converse, we see that the goal was to write a *valid* converse theorem. We needed the extra "latitude" to do that, but it's appropriate in this context -- and much more useful.

In other words, you do better math when you are trying to accomplish something interesting and useful, not just following rules. And textbook questions that ask you to “write the converse” of a statement may be missing the point of writing converses.

Navneet, being a good and thoughtful student, wanted to practice these ideas, so he tried applying it to a simpler example:

Thanks! Given this theorem: "Angles opposite to equal sides of a triangle are equal." Then, as we know, its converse should be this: "Sides opposite to equal angles of a triangle are equal." Should we separate the theorem here into context, condition, and conclusion for finding its converse? I *GUESS* it should be separated like this: Context: In Euclidean geometry, given a triangle.... Condition: If any two angles are equal.... Conclusion: Then the sides opposite to the equal angles are equal. If I have not done right, then please correct me. Lastly, wouldn't you agree that if we have the freedom to choose what we put in the part of context and conclusion, then we can create more than one converse to any given theorem?

Doctor Rick answered:

This theorem is simpler than the others; I think the converse is reasonably clear, so you don't need to belabor it. If you want to do this ... then yes, that is a reasonable separation -- though when we switch the condition and conclusion here, we'd need to change the wording somewhat. In proving such a theorem, you would give specific names to the entities mentioned in the statement of the theorem. The situation is clearer once we've done that. For this theorem, we'd have Context: Euclidean geometry; a given triangle ABC. Condition: Angles A and B of triangle ABC are equal. Conclusion: Sides BC and AC of triangle ABC are equal. Now, switching the condition and conclusion yields a perfectly understandable converse (which happens to be a valid theorem): Context: Euclidean geometry; a given triangle ABC. Condition: Sides BC and AC of triangle ABC are equal. Conclusion: Angles A and B of triangle ABC are equal. To your last question, about the multiplicity of converses for any given theorem: yes, that's the point Doctor Peterson and I have been making. In general, we'd only be interested in a converse that can be proved to be a valid theorem. I suppose it's possible that one theorem could have more than one valid converse theorem, but I'm not going to start hunting for one. It just isn't important to me whether that ever happens or not.

We’ll start with a question from 1999 that introduces the concepts:

Math Logic - Determining Truth A number divisible by 2 is divisible by 4. I'm suppose to figure out the hypothesis, the conclusion, and a converse statement, say whether the converse statement is true or false, and if it is false give a counterexample. I don't understand.

Ricky has been asked to break down the statement, “A number divisible by 2 is divisible by 4,” into its component parts, and then rearrange them to find the converse of the statement. I took the question:

You're asking about the terminology of logic, which is important in math to help us talk about proofs and how we know something is true. Words such as "converse" allow us to talk about our reasoning and see whether we are really making sense. A statement such as "any number divisible by 2 is divisible by 4" (I've changed "a" to "any" to clarify the statement a little) can be rewritten as IF a number N is divisible by 2, THEN the number N is divisible by 4 The hypothesis, or premise, is what is given or supposed, the "if": a number N is divisible by 2 The conclusion is what is concluded from that, the "then": the number N is divisible by 4

We commonly write such a statement symbolically as “\(p\rightarrow q\)”, where the hypothesis is *p* and the conclusion is *q*. I rewrote each part slightly to allow it to exist outside of the sentence, naming the number *N* to avoid needing pronouns. What was important was to rewrite the statement in if/then form.

The converse of this statement swaps the hypothesis and conclusion, making “\(q\rightarrow p\)”:

The converse of the statement "IF a THEN b" is "IF b THEN a", turning the statement around so that the conclusion becomes the hypothesis and the hypothesis becomes the conclusion. In this case, the converse is IF a number N is divisible by 4, THEN the number N is divisible by 2

Ricky was asked to decide whether the converse is true or not, and then prove it, whichever way it goes. This part goes beyond mere logic and enters the realm of “number theory”; but commonly this sort of question is first asked in cases where the proof is not too hard, which is the case here.

Now we have to consider whether either statement is true. A statement and its converse may be either both true, or both false, or one true and the other false; knowing whether one is true says nothing about whether the other is true. In this case, the original statement is false. (This makes me wonder if you copied the problem wrong; it doesn't sound like this possibility was considered in the question.) How do I know it's false? Because I can give a counterexample: a number N for which the hypothesis is true but the conclusion is false. Can you see what I can use for N, which is even but not divisible by 4?

To show that a statement is not **always** true, we only need to find an example for which it is false. In this case, an easy example is 2, or we could use 6, or 102, or whatever we like.

But the question was about the converse:

However, the converse is true. See if you can see why. You might just try listing lots of numbers that are divisible by 4, and see whether they are all even. If all your examples are even, you haven't proven anything; but the list may suggest to you a reason why you will never be able to find a counterexample. That reason would be the basis of a proof.

I didn’t give a proof, in part because Ricky needed to think about that for himself, but also because I didn’t know what level of proof Ricky is expected to handle. One approach is to see that any multiple of 4 can be written as 4*k* for some integer *k*; but that can be written as 2(2*k*), which is clearly a multiple of 2.
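The hunt for a counterexample can be sketched in a few lines of Python. This is only an illustration: the function names and the range 1 to 100 are my own choices, and an empty result over a finite range is evidence, not a proof.

```python
def divisible_by(n, d):
    """True when n is a multiple of d."""
    return n % d == 0

def counterexamples(hypothesis, conclusion, candidates):
    """Values for which the hypothesis holds but the conclusion fails."""
    return [n for n in candidates if hypothesis(n) and not conclusion(n)]

numbers = range(1, 101)  # arbitrary search range, chosen for illustration

# Original: IF N is divisible by 2, THEN N is divisible by 4.
original_fails = counterexamples(lambda n: divisible_by(n, 2),
                                 lambda n: divisible_by(n, 4), numbers)

# Converse: IF N is divisible by 4, THEN N is divisible by 2.
converse_fails = counterexamples(lambda n: divisible_by(n, 4),
                                 lambda n: divisible_by(n, 2), numbers)

print(original_fails[:3])  # [2, 6, 10]
print(converse_fails)      # []
```

One counterexample is enough to sink the original statement; the empty list for the converse only suggests what the 4*k* = 2(2*k*) argument actually proves.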

Now we can review the meanings of all three terms, in this 1999 question, which again uses an example from basic number theory:

Contrapositive, Converse, Inverse Let m and n be whole numbers, and consider the statement p implies q given by "if m + n is even, then m and n are even." A) Express the contrapositive, the converse and the inverse of the given conditional. B) For the statements that are true, give a proof. C) For the statements that are false, give a counterexample. I have part A (I think) but I'm having trouble deciding which statements are true and which are false, and I'm completely lost on the proofs.

Doctor Kate could have asked Hollye for her answers to part A, to make sure she understands that part; but she chose to provide them:

I'll give you what I got for the first part, to see if it's the same as what you got. First, though, here's what my "p" and "q" are: p is "m + n is even" q is "m and n are even" ~p is "m + n is odd" ("~p" means "NOT p") ~q is "either m or n is odd"

It’s important to identify the parts of a conditional statement (if *p* then *q*); and since two of the new statements require negations, that also might as well be done early. Notice that the negation of “is even” could have been written as “is not even”, but since every number (integer) is either odd or even, writing “is odd” is cleaner. Also, the negation of “both are even” is “at least one is not even”; this is an application of De Morgan’s law, or can be seen by considering that if it is not true that both are even, then there must be one that is not even. These ideas were discussed last time.

Now here are the new statements:

A. Contrapositive (if ~q then ~p): "If either m or n is odd, then m + n is odd." B. Converse (if q then p): "If m and n are even, then m + n is even." C. Inverse (if ~p then ~q): "If m + n is odd, then either m or n is odd."

We saw the **converse** above; there we just swap *p* and *q*. The **inverse** keeps each part in place, but negates it. The **contrapositive** both swaps and negates the parts.

To check out which of these are true, it's best to experiment a little. Try some numbers. Let's look at A and pick some numbers where m or n is odd: 2 and 3; 3 and 7; 1 and 8. Notice that I tried to pick a variety of numbers - sometimes both odd, sometimes only one. That is because the opposite of "m and n are even" is "at least one of m or n is odd, and maybe both are." You can figure that out by imagining all sorts of things that don't satisfy "m and n are even." It could be really false (both m and n are not even) or just a bit false (only n is not even or only m is not even). Anyway, let's take a look at these numbers. Is 2 + 3 odd? Yes. Is 3 + 7 odd? That's 10... no, it's not. Wait, statement A says 3 + 7 WOULD be odd. This is a counterexample.

So now we know that the contrapositive, “If either m or n is odd, then m + n is odd,” is false, because there is at least one case, 3 and 7, where the hypothesis is true but the conclusion is false.

Remember that a statement like "<BLAH> is always true" can be proven false by just one example of when <BLAH> could be false. If I claim all dogs are black, all you have to do is bring me a Dalmatian, and I am wrong, even if a lot of dogs are black. Statement A is claiming that ALL the time, if one or both of n or m is odd, n+m is ALWAYS odd. But look, we found an example where it isn't. So statement A is false.

That’s the essence of a counterexample.
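A brute-force search over small whole numbers can look for counterexamples to each of the four statements at once. The sketch below is my own illustration, not part of Doctor Kate's answer; the helper names and the range 0 to 9 are assumptions, and finding no counterexample in a finite range does not prove a statement.

```python
from itertools import product

def is_even(k):
    return k % 2 == 0

def p(m, n):
    """Hypothesis of the original: m + n is even."""
    return is_even(m + n)

def q(m, n):
    """Conclusion of the original: m and n are both even."""
    return is_even(m) and is_even(n)

def counterexamples(hyp, concl, pairs):
    """Pairs for which the hypothesis holds but the conclusion fails."""
    return [(m, n) for m, n in pairs if hyp(m, n) and not concl(m, n)]

pairs = list(product(range(10), repeat=2))  # arbitrary small search range

statements = {
    "original (p -> q)": (p, q),
    "converse (q -> p)": (q, p),
    "inverse (~p -> ~q)": (lambda m, n: not p(m, n), lambda m, n: not q(m, n)),
    "contrapositive (~q -> ~p)": (lambda m, n: not q(m, n), lambda m, n: not p(m, n)),
}

found = {name: counterexamples(h, c, pairs) for name, (h, c) in statements.items()}
for name, examples in found.items():
    print(name, "->", examples[:3] if examples else "no counterexample in range")
```

The pair (3, 7) turns up for the contrapositive, and the original statement fails on exactly the same pairs, namely those where both numbers are odd.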

Doctor Kate continued, showing a way to prove that B and C (the converse and inverse) are both true. You can read that on your own, since my goal here is just to look at the logic. (We’ll have a series on proofs some time in the future.)

Continuing, here is a similar question, where statements must first be written in conditional form:

Converse, Inverse, Contrapositive For the directions it says "Write the converse, inverse, and contrapositive of each conditional. Determine if the converse, inverse, and contrapositive are true or false. If false, give a counterexample." I can't seem to do these: All squares are quadrilaterals. If a ray bisects an angle, then the two angles formed are congruent. Vertical angles are congruent. Thank you!

The second statement is straightforward, but the others need thought. Doctor Achilles first defined the three forms, as we’ve already seen, and then dealt with the first case:

The problem with your questions are that they don't neatly fit into the "if p, then q" format, so you need to first find EQUIVALENT sentences that are "if p, then q." Your first example says "all squares are quadrilaterals." That is the same as saying "if x is a square, then x is a quadrilateral."

Thus, “all” (the universal quantifier) translates directly to a conditional. The answer, left for Hana to do, will be:

- Converse: “If *x* is a quadrilateral, then *x* is a square”; i.e. “Any quadrilateral is a square.”
- Inverse: “If *x* is not a square, then *x* is not a quadrilateral”; i.e. “Anything that is not a square is not a quadrilateral.”
- Contrapositive: “If *x* is not a quadrilateral, then *x* is not a square”; i.e. “Anything that is not a quadrilateral is not a square.”

The original statement, and the contrapositive, are true, because a square is a kind of quadrilateral; the converse and inverse are false, and a counterexample would be an oblong rectangle, which is not a square but is a quadrilateral.

The questions so far, where they dealt with truth at all, only asked about specific examples. Our last two questions will look more broadly at when these statements are equivalent.

Consider this question, from 2002:

Contrapositive I have a logic proof that I'm trying to solve. I'm up to the point after I've written down all my givens. One of the givens is p-->q. I want to say ~p-->~q, with my reason being inverse. Am I allowed to do this?

If we know a statement is true, can we conclude that the inverse is true? Doctor TWE answered with a counterexample:

No. Although the statement ~p --> ~q is called the inverse of p --> q, it does not necessarily follow. Let's look at an example. Suppose that: p = "X is 2" and q = "X is an even number" Clearly, p --> q is true ("If X is 2 then X is an even number."). But is the inverse, ~p --> ~q, also true? This statement reads, "If X is NOT 2 then X is NOT an even number." Suppose X = 4. Then the "if" part, X is NOT 2, is true, but the "then" part, X is NOT an even number, is false. So the statement as a whole is false.

Here we are using logic to talk about logic: The statement “For all *p* and *q*, \((p\rightarrow q)\rightarrow(\lnot p\rightarrow\lnot q)\)” is false! *Sometimes* both original and inverse are true, but we can’t conclude the latter from the former.
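To see exactly where that claim breaks, we can test the formula on all four combinations of truth values. This is a minimal sketch of my own; `implies` is a helper I am defining here for the material conditional, not a built-in.

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

# Rows (p, q) on which "(p -> q) -> (~p -> ~q)" comes out false.
failing_rows = [
    (p, q)
    for p, q in product([True, False], repeat=2)
    if not implies(implies(p, q), implies(not p, not q))
]

print(failing_rows)  # [(False, True)]
```

The single failing row, *p* false and *q* true, is the same situation as X = 4 in the example: “X is 2” is false while “X is an even number” is true.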

What you *are* allowed to use in a logic proof is the contrapositive. The contrapositive of p --> q is ~q --> ~p. It turns out that any conditional proposition ("if-then" statement) and its contrapositive are logically equivalent. In our example, the contrapositive of "If X is 2 then X is an even number" would read, "If X is NOT an even number then X is NOT 2." We can see that this is also true.

Giving one example where the contrapositive is true does not prove that it is always equivalent; we’ll prove it below.

A third possible "switching" of the statement p --> q is q --> p. This is called the converse, but like the inverse, it does not follow logically from the original statement. The converse of our original statement would read, "If X is an even number then X is 2." Clearly, not all even numbers are 2. So the converse statement is false. (It turns out that the inverse and converse statements are logically equivalent to each other - but not logically equivalent to the original statement.) To summarize, given the statement p --> q: The inverse is ~p --> ~q, NOT equivalent to p --> q The converse is q --> p, NOT equivalent to p --> q The contrapositive is ~q --> ~p, IS equivalent to p --> q

In fact, the converse and inverse turn out to be equivalent to one another, though not to the original.

Let’s look at one more, from 2003:

Truth of the Contrapositive The inverse of a statement's converse is the statement's contrapositive. True, but why? I don't know how to explain it. I tried an example: p: I like cats. q: I have cats. Converse If I have cats, then I like cats. Inverse If I don't like cats, then I don't have cats. Contrapositive If I don't have cats, then I don't like cats. I still can't explain the answer "true" that I came up with. Maybe it is wrong.

The opening statement describes the contrapositive as the inverse of the converse. What that means is this: Suppose we start with “\(p\rightarrow q\)”. Its converse is “\(q\rightarrow p\)” (swapping the order), and the inverse of that is “\(\lnot q\rightarrow\lnot p\)” (negating each part). This is the contrapositive. In the example, the converse of “If I like cats, then I have cats” is “If I have cats, then I like cats”, and the inverse of that is “If I don’t have cats, then I don’t like cats”, which is the contrapositive.

Doctor Achilles, perhaps misreading the question, answered the bigger question: Which of these are true?

The contrapositive is true if and only if the original statement is true. It is false if and only if the original statement is false. So it is logically equivalent to the original statement. Let's say you have a conditional statement: "if I like cats, then I have cats." What does this mean? When is it true? When is it false? Well, for starters, if you like cats and you have cats, then the conditional will come out true. That is, (P -> Q) is true when P and Q are both true. Also, if you don't like cats and you don't have cats, then the conditional will come out true. That is, (P -> Q) is true when P and Q are both false. Also, if you don't like cats and you have cats, then the conditional still comes out true. Remember, it says that if you like cats, then you will have them; it makes NO claim at all about what will happen if you don't like cats. So, (P -> Q) is true when P is false and Q is true. However, if you like cats and you don't have cats, then the conditional will come out false. It says that if you like cats, then you will have them. So it is proven wrong if you like cats, but you still don't have any. So, (P -> Q) is false when P is true and Q is false.

In effect, he has made a truth table:

| P | Q | P->Q |
|---|---|------|
| T | T | T |
| F | F | T |
| F | T | T |
| T | F | F |

If you are unconvinced by any of the reasoning, see Why, in Logic, Does False Imply Anything?.

So to review, (P -> Q) is true under any of these conditions: P is true and Q is true P is false and Q is true P is false and Q is false And it is only false under this one condition: P is true and Q is false You can say that another way, using ~P and ~Q (not-P and not-Q). (P -> Q) is true under any of these conditions: ~P is false and ~Q is false ~P is true and ~Q is false ~P is true and ~Q is true And it is only false when: ~P is false and ~Q is true Is there another sentence that uses P and Q that is only false when ~P is false and ~Q is true? Yes, the sentence is: (~Q -> ~P) You can go through the same analysis of this sentence as I did for (P -> Q) and you'll find that it has the same truth conditions.

So the truth table for the contrapositive is the same as for the original; this is what we mean when we say that two statements are **logically equivalent**.
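Since there are only four rows to check, the comparison of truth tables is easy to automate. The following is a sketch under my own naming (`implies`, `table`, and the four variable names are not from the original answers); it builds the column of truth values for each form and compares them.

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

# Row order: (T, T), (T, F), (F, T), (F, F)
rows = list(product([True, False], repeat=2))

def table(form):
    """Column of truth values of a two-variable form, one per row."""
    return [form(p, q) for p, q in rows]

original       = table(lambda p, q: implies(p, q))
converse       = table(lambda p, q: implies(q, p))
inverse        = table(lambda p, q: implies(not p, not q))
contrapositive = table(lambda p, q: implies(not q, not p))

print(original == contrapositive)  # True
print(converse == inverse)         # True
print(original == converse)        # False
```

Equal columns are exactly what logical equivalence means here: the original matches its contrapositive, the converse matches the inverse, and the two pairs differ from each other.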

We can instead just think through the example:

You can also understand this more intuitively: The sentence: "If I like cats, then I have cats." says that as long as the first part, "I like cats," is true, the second part, "I have cats," will definitely be true. In this case, what does "I don't have cats" mean? The only way "I don't have cats" can happen is if "I like cats" is false. That is, the only way "I don't have cats" can be true is if "I don't like cats" is true. Therefore, "If I don't have cats, then I don't like cats."

Which is more convincing? That depends upon you.

As usual, I’ll start with a fairly basic question to set the stage, this one from 1998:

Negating Statements What is negation? My math teacher gave me some problems on it: "4 + 3 * 5 = 35" and "Violins are members of the string family." I've asked my parents about it and they don't know.

Heather has been asked to “negate” these statements; what does that mean? Doctor Teeple started with the concept of a statement:

To negate a statement, you write the opposite of what the statement says. But before we talk about the opposite of a statement, let's talk about the statements themselves. A statement is pretty much what it sounds like it should be. It's an equation or sentence or a declaration of some sort. It doesn't matter whether the statement is true or false; we still consider it to be a statement. For example, I could say, "The sky is purple" or "The earth is flat." Both of those are statements. I could say, "The U.S. is in North America" or "Giraffes are not short." Those are also statements. We can negate each of these statements by writing the opposite of what it says. So for example, the negation of "The sky is purple" is "The sky is not purple." The negation of "Giraffes are not short" is "Giraffes are short."

So, the negation of a statement is a statement that says the first one is not true. If the original was a true statement, the new one will be false, and vice versa. (Do you see why Doctor Teeple started out by emphasizing that a statement doesn’t have to be true?)

We make statements and negate them without judging whether they are true or false. That is another issue. But once we are given that a statement is true or false, we can note what happens to the statement when we negate it. For example, suppose we know the following: "The sky is purple." False "Giraffes are not short." True We negated these and got the following: "The sky is not purple." True "Giraffes are short." False Notice what happened. Negation turns a true statement into a false statement and a false statement into a true statement.

So one way to check whether your answer is correct is to determine whether each of the pair is true (either in reality, or under some assumed reality); if they have the same truth value, then the second can’t be the negation of the first.

Now, all of the statements we have been working with are sentences. We can also do this with math equations. Here are some statements: 6 * 3 = 18 4 + 3*2 > 15 15/7 = 12 12 + 1 <= 13 (where <= means less than or equal to) Some of them are true and some are false, but that is a side issue; we can negate them either way. So here are the negations of the above statements: 6 * 3 =/ 18 (where =/ means not equal to) 4 + 3*2 <= 15 15/7 =/ 12 12 + 1 > 13 That's all there is to negating statements.

In English, we typically add the word “not”, or some equivalent; in symbols, we replace “=” with “\(\ne\)”, “>” with “\(\le\)”, “\(\ge\)” with “<”, and so on.
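The truth-value check suggested earlier can be run directly on Doctor Teeple's four arithmetic statements. This is a small sketch of my own; the dictionary layout is just one convenient way to pair each statement with its negation.

```python
# Each entry pairs the truth value of a statement with that of its negation,
# using the symbol swaps described above (= with !=, > with <=, <= with >).
statements = {
    "6 * 3 = 18":   (6 * 3 == 18,    6 * 3 != 18),
    "4 + 3*2 > 15": (4 + 3 * 2 > 15, 4 + 3 * 2 <= 15),
    "15/7 = 12":    (15 / 7 == 12,   15 / 7 != 12),
    "12 + 1 <= 13": (12 + 1 <= 13,   12 + 1 > 13),
}

for text, (value, negated_value) in statements.items():
    print(f"{text}: statement is {value}, negation is {negated_value}")

# A correct negation always has the opposite truth value.
all_opposite = all(value != negated for value, negated in statements.values())
print(all_opposite)  # True
```

Two of the statements are true and two are false, but in every case the negation has the opposite truth value, which is the test for having negated correctly.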

This should be enough for Heather to answer her own questions. Since enough time has passed, it’s okay to give the answers here:

- The negation of “\(4 + 3 \times 5 = 35\)” is “\(4 + 3 \times 5 \ne 35\)”.
- The negation of “Violins are members of the string family” is “Violins are not members of the string family”.

I want to warn you to be on the watch for statements that contain words like "for every," "for all," or "there exists." Negations of these types of statements can be tricky. Here's an entry in the Dr. Math archives that might help:

This leads us to the next question, from 1996:

Negation in Logic What is the negation of the sentence "In every village, there is a person who knows everybody else in that village."? My guess is "In at least one village, there is at least one person who knows at least one person in that village." Am I close at all?

Words like “every”, “at least one”, “some”, “there exists a”, or “none”, are called “quantifiers”, because they tell how many of something there are. They are particularly tricky to negate.

Doctor Mike saw that Thomas needs to start simpler, so he turned the latter part of the statement temporarily into a single phrase, which can be a very helpful technique:

Close, but not quite there. Think of it this way. If you use "special person" to mean a person who knows everyone in his/her village, then the original sentence becomes: "Every village has a special person." The negation would then be: "Some village does NOT have a special person."

Notice how we can’t just stick in a “not” (unless you take the easy way out and say “**It is not true** that every village has a special person”). If this is not true of **every** village, then there must be **at least one** village for which it is **not** true. “Not every” means “at least one does not”, or equivalently, “some do not”. Thomas got this part right.

Now we need to expand the “special person” part, recalling that we temporarily defined it as “a person who knows everyone”:

So what does it mean for a village not to have a special person? It means that every person in that village could be matched up with some person that he/she does not already know. The negation should be: "In at least one village, each person does not know everyone else." or perhaps something like: "In at least one village, and for every person in that village, that person does not know everybody." or alternatively: "There exists a village, such that for every person in that village, there is another person that the first person does not know."

Each of these is a little awkward; you have to read them carefully. In the first, we aren’t just saying that “not everyone knows everyone else,” but more strongly, “each person has someone they don’t know.” The second says this explicitly, but not quite in everyday language (it reads more like a logician wrote it). The third is even more “logicianese,” with phrases like “there exists” and “such that”. But the goal here is to write precisely, because that’s what we need in logic.

In plain English, the clearest wording might be, “In **some** village, **no one** knows everyone else.” Here I just negated the inner phrase by replacing, in effect, “there is” with “there is no.”

So, what is the general procedure for doing this?

In general, for language exercises like this, there are two basic rules. The negation of a sentence like "For every ... there exists a ... such that X is true." is a sentence like "There exists a ... such that for all ... X is false." (The other basic rule is the reverse of this.) It is also often easier to see what is happening if some symbols are used. See if you follow the symbolic version below for the sentence and the negation. "For all V, there is a P in V, such that for all Q in V, P knows Q." "There is a V, such that for every P in V, there is a Q in V such that P does not know Q."

More generally, perhaps, we can say this:

- The negation of “For **all** A, p” is “For **some** A, **not** p”.
- The negation of “For **some** A, p” is “For **no** A, p”, or “For **all** A, **not** p”.
- The negation of “For **no** A, p” is “For **some** A, p”.

We applied the first rule to the first part of the statement, which required then negating the inner part, for which we can apply the second rule. I applied the first version of the second rule; Doctor Mike applied the second version, probably because he was assuming that the ultimate goal is a symbolic statement, where only “some” and “all” (existence and universality) are allowed, with no special symbol for “none”.
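Python's `all` and `any` behave like “for all” and “for some”, so the village sentence and its negation can be tried out on a toy model. Everything below is invented for illustration: the village names, the people, and who knows whom are hypothetical data, and the function names are mine.

```python
# Hypothetical data: each village maps a person to the set of people they know.
villages = {
    "Ashford": {"Ann": {"Bob", "Cy"}, "Bob": {"Ann"}, "Cy": {"Ann"}},
    "Brook":   {"Dee": {"Eli"}, "Eli": {"Dee"}, "Fay": set()},
}

def original(villages):
    """For all V, there is a P in V, such that for all Q in V, P knows Q."""
    return all(
        any(all(q in known for q in people if q != p)
            for p, known in people.items())
        for people in villages.values()
    )

def negation(villages):
    """There is a V, such that for every P in V, there is a Q P does not know."""
    return any(
        all(any(q not in known for q in people if q != p)
            for p, known in people.items())
        for people in villages.values()
    )

print(original(villages))  # False: nobody in Brook knows everyone else
print(negation(villages))  # True
```

Notice how each `all` in the original becomes an `any` in the negation and vice versa, with the innermost condition negated, just as in the symbolic version. Ann is a “special person” in Ashford, but Brook has none, so the original sentence is false for this data and the negation is true.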

Another common difficulty arises when we want to negate a compound sentence using “and” or “or”. Here is a question about the former, from 2003:

Distributing 'Not' over a Conjunction

If p = today is sunny and q = tomorrow is Friday, then which of the following means "It is not true that today is sunny and tomorrow is Friday"? 1. not (p^q) or 2. not p ^ q. I'm not sure whether "it is not true" in front of a conjunction applies to both conjuncts or only the first one.

This is mostly a question of English grammar. Doctor Achilles focused on that:

Translations are always tricky. However, I would say that "it is not true that ..." means that the negation applies to everything that follows. Contrast "It is not true that today is sunny and tomorrow is Friday" with "Today is not sunny and tomorrow is Friday." In the sentence I just made up, the 'not' is only in the first clause, and so only applies to 'p'. In your sentence, the 'not' governs the whole thing, so should apply to 'p' and 'q'. So the correct translation is not (p^q). I hope this helps. If you have other questions or you'd like to talk about this some more, please write back.

Note that if the statement was, “It is not true that today is sunny, and tomorrow is Friday,” we would translate this as “\((\lnot p) \wedge q\);” the comma serves as a parenthesis in English.

We could also have brought up the fact that \(\lnot(p \wedge q)\) doesn’t mean the same thing as \(\lnot p \wedge \lnot q\); if you did want to “distribute” the negation, you would have to follow De Morgan’s Law and change the *and* to an *or*: \(\lnot p \vee \lnot q\) (“Today is not sunny, or tomorrow is not Friday”). But I suspect that Sarah already knew that.
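A truth table makes the comparison concrete. This is a small sketch in Python (the column names are mine) that lists all four combinations of p and q and evaluates the four expressions discussed above, using `and`, `or`, and `not` for \(\wedge\), \(\vee\), and \(\lnot\).

```python
from itertools import product

rows = []
print("p      q      ~(p^q)  (~p)^q  ~p^~q  ~pv~q")
for p, q in product([True, False], repeat=2):
    whole = not (p and q)           # "It is not true that (p and q)"
    first_only = (not p) and q      # the 'not' applies only to p
    both_and = (not p) and (not q)  # wrongly "distributed" negation
    de_morgan = (not p) or (not q)  # De Morgan: 'and' becomes 'or'
    rows.append((p, q, whole, first_only, both_and, de_morgan))
    print(f"{p!s:<7}{q!s:<7}{whole!s:<8}{first_only!s:<8}{both_and!s:<7}{de_morgan!s}")

# Only the De Morgan column agrees with not (p and q) in every row.
matches_de_morgan = all(r[2] == r[5] for r in rows)
matches_first_only = all(r[2] == r[3] for r in rows)
matches_both_and = all(r[2] == r[4] for r in rows)
```

Here `matches_de_morgan` comes out true, while the other two flags come out false: for instance, when p is true and q is false, \(\lnot(p \wedge q)\) is true but both \((\lnot p) \wedge q\) and \(\lnot p \wedge \lnot q\) are false.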

Let’s look at one more question, from 2010, that touches on our topic for next time, contrapositives:

Negating a Quantifier

I have to write the converse, inverse, and contrapositive of this conditional statement: If a triangle is isosceles, then it has at least two congruent sides.

I know that

- p: a triangle is isosceles
- q: the triangle has at least two congruent sides
- statement: if p then q
- converse: if q then p
- inverse: if not p then not q
- contrapositive: if not q then not p

So Converse: If a triangle has at least two congruent sides, then the triangle is isosceles.

But what is the negation of "at least two"? Is it "none"? or "at most two," as in

- Inverse: If a triangle is not isosceles, then it has at most two congruent sides.
- Contrapositive: If a triangle has at most two congruent sides, then it's not an isosceles.

In order to write the inverse and the contrapositive, Marianne has to negate the two statements, “a triangle is isosceles,” and “it has at least two congruent sides”. The first is easy; the second, though not quite the same as the quantifiers in an earlier question, is very similar: What is the negation of “at least two”, considering that the negation of “at least one” is “none”?

I answered, focusing on the meaning of the specific statement, keeping in mind that a triangle has only three sides, and that it makes no sense to talk about having “one congruent side”:

Just consider the cases. It either has none, or two, or three:

| Number of congruent sides | At least 2? | At most 2? | None? |
|---|---|---|---|
| 0 | F | T | T |
| 2 | T | T | F |
| 3 | T | F | F |

Which column is the negation of "at least 2"? The one that switches every falsehood to a truth and every truth to a falsehood -- the one titled "None?"

So, the negation of “at least two congruent sides” is “no congruent sides”. If we chose “at most two”, both the original and the purported negation would be true when there are two congruent sides.

Rather than making a table, we could do this:

There are other ways to come up with this fact. You might write it as an inequality: "N is at least 2" means N >= 2. The negation of that is N < 2, which in turn means 0 or 1. Since a triangle can't have "one congruent side," that really means none.

The negation of “greater than or equal” is “less than.” It would be valid to say, “less than two congruent sides,” but that might confuse readers.

Or, by the same reasoning as the last sentence, if a triangle does NOT have at least two congruent sides, then it has none. Saying it has two congruent sides means it has a PAIR of congruent sides, and if it doesn't have at least one pair, then it has none. So your inverse and contrapositive are wrong. (You could also directly check them: the contrapositive should be true, but if you make a triangle that has at most two congruent sides, it might have two, and WILL be isosceles.)
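The same checks can be carried out mechanically. The following is a rough sketch in Python, under my own assumptions: it only looks at triangles with whole-number sides from 1 to 7, and the helper `congruent_sides` counts the sides that share their length with another side (so it can only return 0, 2, or 3).

```python
from collections import Counter
from itertools import combinations_with_replacement

def congruent_sides(a, b, c):
    # Number of sides that share their length with another side: 0, 2, or 3.
    counts = Counter((a, b, c))
    return sum(n for n in counts.values() if n > 1)

def is_triangle(a, b, c):
    # Assumes a <= b <= c, which combinations_with_replacement guarantees here.
    return a + b > c

triangles = [
    t for t in combinations_with_replacement(range(1, 8), 3) if is_triangle(*t)
]
possible_counts = sorted({congruent_sides(*t) for t in triangles})

# Is "none" the negation of "at least 2"? Is "at most 2"?
none_is_negation = all(
    (not (congruent_sides(*t) >= 2)) == (congruent_sides(*t) == 0)
    for t in triangles
)
at_most_two_is_negation = all(
    (not (congruent_sides(*t) >= 2)) == (congruent_sides(*t) <= 2)
    for t in triangles
)

# Triangles that break the proposed contrapositive: they have at most two
# congruent sides, yet they are isosceles.
counterexamples = [t for t in triangles if congruent_sides(*t) == 2]

print(possible_counts)                            # [0, 2, 3]
print(none_is_negation, at_most_two_is_negation)  # True False
```

Any triangle in `counterexamples`, such as one with sides 2, 2, and 3, has "at most two" congruent sides and is still isosceles, which is exactly why the proposed contrapositive fails.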

Next time, we’ll look at the converse, inverse, and contrapositive themselves, which is the reason we had to look at negation.
