We’ve been looking at examples of extended discussions with students about various kinds of problems. Here, we have one (not from a student) that led to some good thinking about combinatorics – the techniques of counting the ways something can happen.

Here’s the question, from 2004 (not a paid product placement):

How Many Possible Pizza Topping Combinations? I work at Domino's Pizza and would like to tell customers how many topping combinations we have. It's tricky because you can get single-layers, double-layers, or triple-layers of ANY topping, and mix single, double, and triple-layered toppings on each pizza. There are 13 different toppings. You can get a pizza with 1 or up to 39 toppings (triple everything.) The fact that you can have a pizza with double pepperoni, triple sausage, and normal amount of green peppers makes it much harder. I took (39^13 + 26^13 + 13^13 and got 485,362,204,315,790,908,548)... that seems like a lot though, so I thought I had better ask an expert.

Questions about the number of ways to top a pizza (or make a sundae, or whatever) abound; this one is a little more challenging than many. For some other examples from our archive, see

- Combinations of Toppings when Ordering a Pizza
- Gift Wrap Combinations
- Pascal's Triangle and the ice cream cones
- Shirts and Pants
- Fast Food Combinations

Ryan’s thinking shows some understanding of appropriate methods; using 13 as an exponent makes sense, since we have the same kind of choice to make about each of 13 items. His bases, however, are not appropriate. It appears he is trying to count separately the ways to have single, double, and triple toppings, but he has overcounted considerably.

Doctor Edwin took the question:

I love this question. You're right, it's a lot more interesting because of the possibility of single, double, or triple doses of each topping. There are a couple of ways to think about the problem. I'll start with the one that I think is the simplest, and then we can talk about the one I think is the coolest.

Problems in combinatorics typically can be solved in multiple ways; in fact, several of us have said that we never quite trust our work on such a problem until we get the same answer in more than one way. So offering two methods is common.

He starts the way we often start with a big problem like this, “playing” with smaller numbers to get a feel for how it will work with bigger numbers, while also checking our understanding of the problem itself:

Suppose you had one topping, let's say mushrooms. How many different possible pizzas would there be? If I understand your setup right, there would be 4: no topping, single mushrooms, double mushrooms, or triple mushrooms. Now imagine that there are two toppings, mushrooms and sausage. How many possible combinations are there? Well, if a customer doesn't want mushrooms, there are 4, right? And if the customer wants 1x mushrooms, that's 4 more, and so on for 2x and 3x mushrooms. In other words, there are 4 x 4 = 16 different combinations with two items. We could continue this for a long time. If you have only sausage, mushroom, and onion, that's 16 combinations with no onion, 16 with 1x onion, and so on, for a total of 4 x 4 x 4 = 64 combinations. So it looks like the number of combinations is: (# of choices for item 1) X (# of choices for item 2) X ....etc. Now we have the same number of choices for each item, so that simplifies things a bit. Can you take it from there? Write back and tell me what you get.

Because the choice for each topping is independent, we multiply the number of ways to make each choice.

Let’s finish this up. For each of the 13 items on the menu, we have four choices (0, 1, 2, or 3 of them). So the answer is $$4\times 4\times 4\times 4\times 4\times 4\times 4\times 4\times 4\times 4\times 4\times 4\times 4 = 4^{13} = 67,108,864.$$ (Note that this includes a pizza with no toppings, which makes sense to me, but seems to have been excluded in the problem statement.)
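As a quick sanity check on the multiplication principle, here is a minimal Python sketch (the names `LEVELS` and `count_pizzas` are made up for illustration). It lists every assignment of a level to each topping for a few small menus, and then evaluates the power for the full 13-topping menu rather than listing all of them.

```python
from itertools import product

LEVELS = (0, 1, 2, 3)  # none, single, double, triple

def count_pizzas(num_toppings):
    """Count pizzas by listing every assignment of a level to each topping."""
    return sum(1 for _ in product(LEVELS, repeat=num_toppings))

# Brute-force counts for small menus agree with 4 ** n.
small_counts = [count_pizzas(n) for n in range(1, 6)]
print(small_counts)   # [4, 16, 64, 256, 1024]

total = 4 ** 13
print(total)          # 67108864
print(total - 1)      # 67108863, if the plain pizza is excluded
```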

Not as hard as it seemed, really. By approaching the decisions one topping at a time, we found a very straightforward method. One way to envision this process is to imagine an order sheet that lists all 13 toppings, with a box next to each to either leave blank, or fill in a number from 1 to 3. The key to the solution is the realization that 0 is also a number, so there are 4 choices for each.

The second method is really the same basic formula, but approached in terms of a different representation:

Now on to the cool part. Let's represent a pizza choice numerically. I can represent each topping with a number zero through three representing no topping through triple topping. If you're taking orders and calling them back to the cook, and there is only one topping (sausage) available and someone wants triple sausage, you could yell back, "Gimme a 3." Now suppose again that there are sausage and mushroom available, and someone wants double sausage and single mushrooms. You could work out a system with the cook where you always list the ingredients in the same order. So then you could yell, "Gimme a 21." So you could represent every possible pizza as a 13-digit number, where every digit was 0, 1, 2, or 3. I don't know how much you remember from math class (some, obviously, judging from your question), but you may have learned at some point to do math in different bases. Every pizza can be represented by a 13-digit number in base 4, and every possible number in base 4 represents a pizza. So the question then becomes, how high can you count in base 4 with only 13 digits?

0000000000000
0000000000001
0000000000002
0000000000003
0000000000010
0000000000011
0000000000012
0000000000013
0000000000020
0000000000021
... and so on

If that's unfamiliar or confusing, go back to thinking of them as toppings:

0x sausage, 0x mushrooms
0x sausage, 1x mushrooms
0x sausage, 2x mushrooms
0x sausage, 3x mushrooms
1x sausage, 0x mushrooms
1x sausage, 1x mushrooms
1x sausage, 2x mushrooms
1x sausage, 3x mushrooms
2x sausage, 0x mushrooms
2x sausage, 1x mushrooms

So, how high can you count in base 4 with 13 digits? Let me know what you come up with.

The idea of representing a problem in a different way is extremely useful in combinatorics. It’s especially useful if the new representation is more familiar to you than the original. In this case, if you’ve worked with bases, you know that just as in counting from 000 through 999 you have 1000 numbers (which is \(10^3\)), here we are counting from 0,0000,0000,0000 through 3,3333,3333,3333 (the largest digit in base 4 is 3), and the next number is \(10,0000,0000,0000 = 4^{13}\). So that’s the answer this way. (I wrote the numbers with a comma every four places to help keep track of digits, because someone who found base 4 natural would probably do that … .)

That’s really the same sort of thinking as the first way, but using a different representation. Our 13 blank boxes have just become 13 digits.
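The "yell it to the cook" code can be tried out directly. The following is only a sketch, with hypothetical helper names `order_to_code` and `code_to_number`; it relies on Python's built-in `int(text, 4)` to read a digit string in base 4.

```python
def order_to_code(levels):
    """Write an order (one level 0-3 per topping, in fixed menu order) as a digit string."""
    return "".join(str(level) for level in levels)

def code_to_number(code):
    """Read a base-4 digit string as an ordinary integer."""
    return int(code, 4)

print(order_to_code([2, 1]))         # 21  (double sausage, single mushrooms)

largest = "3" * 13                   # triple everything
print(code_to_number(largest))       # 67108863
print(code_to_number(largest) + 1)   # 67108864, counting the all-zeros code too
```

Because the codes run from 0 up to the largest value with nothing skipped, the number of codes is one more than the largest value.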

Ryan gave it a try, but didn’t quite understand yet – that’s why we like conversations more than just answers. He wrote back:

Hey Doctor Edwin! Thanks for the help (and quick response!) I may have found the answer, but I'm not sure. I found the number of combinations for a 1-topping (39) and then the number of combinations for a 2-topping (741) and 3-topping (9139) and so on up to a 39-topping pizza (1 combination, [all the toppings]). I then added all the final numbers together and got "549,755,813,879". Is that correct? If so, I can happily tell people we have over half a trillion topping combinations, haha!

Ryan is thinking of a one-topping pizza as choosing one of the 13 toppings, then choosing 1, 2, or 3 of it, for a total of 39 pizzas; and so on. Since these are mutually exclusive, we can add up the 13 possible numbers to get the total. But it isn’t clear how he is counting multiple toppings, since he didn’t give details for his calculations.

Doctor Edwin replied:

Hi, Ryan. I think it's going to be a lot less than half a trillion. I think you and I are using a couple of words differently. When I said imagine you could only have one topping, I meant, imagine there was only one topping to choose from. You meant that you could only pick one topping, but you could pick any of the ones that are available in real life. The problem with your approach (and it can be solved that way, it's just more complicated) is that for my first choice I've got 40 possibilities (including plain). For my second choice I can't choose from 40 possibilities, because I've already used one of them. That is, suppose I choose sausage for my first topping. I can't choose 1x sausage or 2x sausage or 3x sausage for my second choice, can I?

Let’s try finishing Ryan’s method, and see how hard it will be.

For one topping, we saw above that there are \(13\times 3 = 39\) possibilities.

For two, we have to pick 2 of the 13 toppings, which we can do in “13 choose 2” ways: \({13\choose 2} = \frac{13!}{2!11!} = 78\). Then we have to pick a number of each of them, 1, 2, or 3. So the total is \({13\choose 2}\times 3 \times 3 = 702\). Ryan had said 741; that’s 39 too many. Possibly his numbers are partial sums instead of addends.

For three toppings, the same thinking leads to \({13\choose 3}\times 3^3 = 286\times 27 = 7722\). This is very different from Ryan’s 9139, so I don’t know just what he was doing.
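One possible reading of Ryan’s figures, offered purely as a guess since he did not show his work: they match the number of ways to choose k items from 39 separate "portions" (13 toppings times 3 layers each). The sketch below compares that hypothetical count with the per-topping count used here.

```python
from math import comb

# Ryan's reported counts for 1-, 2-, and 3-topping pizzas.
ryan = [39, 741, 9139]

# Hypothetical reading: choose k of 39 separate portions (13 toppings x 3 layers).
portions = [comb(39, k) for k in (1, 2, 3)]

# The count used in the text: choose k toppings, then a level 1-3 for each.
by_topping = [comb(13, k) * 3 ** k for k in (1, 2, 3)]

print(portions)      # [39, 741, 9139]
print(by_topping)    # [39, 702, 7722]
print(2 ** 39 - 1)   # 549755813887, the total over k = 1..39 under that reading
```

If that is what he did, it overcounts, because picking "the first portion of sausage" and picking "the second portion of sausage" describe the same pizza. The implied total is close to, but not exactly, the 549,755,813,879 he reported, so this remains only a guess. Either way, the count by toppings, \({13\choose k}\times 3^k\), is the one that avoids the overcount.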

Carrying this further, we get a sum of 13 terms: $$\sum_{i=1}^{13}{13\choose i}3^i.$$ If we just calculate these and add them up (I used Excel) we get the same answer as before – except that this excludes a plain pizza. My numbers are:

39 + 702 + 7,722 + 57,915 + 312,741 + 1,250,964 + 3,752,892 + 8,444,007 + 14,073,345 + 16,888,014 + 13,817,466 + 6,908,733 + 1,594,323 = 67,108,863

It appears from this that we should be able to prove a theorem that $$\sum_{i=0}^{n}{n\choose i}3^i = 4^n.$$ Without such a theorem, this method is not easy, though it was successful!

(There is, of course, a combinatorial proof of this summation: Each term of the sum represents the number of base-4 numbers with *i* non-zero digits, so the sum is, as we’ve seen, the number of all base-4 numbers. An algebraic proof is probably a little harder.)
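For readers without a spreadsheet handy, the thirteen terms and the conjectured identity can be checked with a few lines of Python. This is a spot check for small n, not a proof; `math.comb` needs Python 3.8 or later.

```python
from math import comb

# The 13 terms: choose i toppings, then one of 3 levels for each.
terms = [comb(13, i) * 3 ** i for i in range(1, 14)]
print(terms[:3])    # [39, 702, 7722]
print(sum(terms))   # 67108863, which is 4 ** 13 - 1

# Spot-check the identity sum C(n, i) * 3^i = 4^n for n = 0..20.
identity_holds = all(
    sum(comb(n, i) * 3 ** i for i in range(n + 1)) == 4 ** n
    for n in range(21)
)
print(identity_holds)   # True
```

The identity is also what the binomial theorem gives when \((1+3)^n\) is expanded, which is one route to an algebraic proof.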

So let’s move on; often a student needs a second perspective in order to understand an answer (which is part of the reason we are not bothered when two of us answer a question, each with a slightly different method, or a different way to say the same thing):

So instead of thinking of it that way (how many could I get if I pick only one, how many if I pick only one or two, etc.), let's think of it the other way (how many could I get if there is only one topping to choose from, how many if there are only two toppings to choose from?). In that case, for each of my 13 toppings, I've got 4 possibilities: zero, 1x, 2x, or 3x. With only one topping to choose from, I've got 4 pizzas I could order. With two toppings to choose from, I've got 4 choices for my first topping. For each of those, I've got 4 choices for my second topping, for a total of 4 x 4 = 16 choices. Let's try that out and count them. We'll use M for mushrooms and S for sausage: 0M 0S 0M 1S 0M 2S 0M 3S 1M 0S 1M 1S 1M 2S 1M 3S 2M 0S 2M 1S 2M 2S 2M 3S 3M 0S 3M 1S 3M 2S 3M 3S That's every combination we can make if we only have two ingredients to play with. Let's say we add onions to the menu. If someone wants no onions, they've got 16 choices. If they want 1x onions, that's another 16 possible pizzas they could order. 16 more for 2x onions, and 16 more for 3x onions. In other words, with three toppings on the menu, there are 4 x 16 = 4 x 4 x 4 = 64 possible toppings to choose from. So a general formula would be (number of ways to use a single topping) ^ (number of toppings) where the number of ways to use a single topping includes not using any. So for your example, the total number of possible pizzas to choose from is 4 ^ 13 Not anywhere near a half-trillion, but it's a heck of a big number, isn't it? I'm surprised I ever manage to order a pizza at all.

Here, rather than calculating each number and adding, we are generalizing from the small examples to obtain our simple final answer.

Today I want to look at a recent question that led into both geometrical and trigonometrical solutions, and particularly a useful perspective on the Law of Sines.

Here is the question, from March:

A quadrilateral ABCD is put inside a circle with radius 1.

AB = √3, ADC = 75°, BCD = 120°

Find angles ADB, DAB, DAC.

I was given two hints as below:

- The shape (as attached below)
- The formula: 2 x radius = AB / sin ADB

My questions are:

- If I was not given a hint, how can I know the proper shape? (Because I think if I draw it wrong I will not get the answer)
- I never heard about this formula. How can we use this formula?

The figure, appropriately, adds nothing to the problem; the problem explicitly stated everything that matters. But the student was unsure whether it would be possible to draw the correct diagram without being given it. We’ll have to discuss the reasons for that question.

He also is unfamiliar with the formula, which, as I will point out, is actually part of the Law of Sines. Finally, he doesn’t see how to apply it. So we have several things to discuss.

I replied, first discussing how I solved the problem myself, using only geometry:

I solved it using no trigonometry except for one little right triangle. The main tool I used to solve this was the fact that an angle inscribed in a circle equals half the subtended arc:

First use elementary trigonometry to find half of angle AOB, where O is the center of the circle, by forming two congruent right triangles.

Then you know arc AB.

Use angles ADC and BCD to find arcs ABC and BAD.

Then you can find all the individual arcs, and from them find the requested angles.

Since we never quite got back to this method in our discussion, let’s complete this now. I’ll redraw the figure, including the center O and a few extra lines:

(This is drawn to scale; notice that the original diagram was almost exactly correct, though there was no promise that it was.)

We know two sides of triangle AOE, and recognize it as a 30-60-90 triangle, half of an equilateral triangle; so $$\angle AOB = 2\angle AOE = 120°.$$ (This is as close as we will come to trigonometry in this method.)

Then, by the theorem that a central angle (e.g. ∠AOC) is twice an inscribed angle (e.g. ∠ADC), we have $$\text{arc } ABC = \angle AOC = 2\angle ADC = 150^{\circ},$$ and $$\text{arc } BAD = \angle BOD = 2\angle BCD = 240^{\circ}.$$ (I name the arcs with three points to make it clear which direction each goes; this is also a reason not to just use central angles.)

To find **∠ADB**, we use the first arc we calculated: $$\angle ADB = \frac{1}{2}\angle AOB = 60^{\circ}.$$

For **∠DAB**, we first find $$\text{arc } BCD = 360^{\circ} - \text{arc } BAD = 360^{\circ} - 240^{\circ} = 120^{\circ}.$$ We get $$\angle DAB = \frac{1}{2}\angle DOB = \frac{1}{2}(120^{\circ}) = 60^{\circ}.$$ Hmmm … we have an equilateral triangle, don’t we?

For **∠DAC**, we find that $$\text{arc } BC = \text{arc } ABC - \text{arc } AB = 150^{\circ} - 120^{\circ} = 30^{\circ};$$ then $$\text{arc } CD = \text{arc } BCD - \text{arc } BC = 120^{\circ} - 30^{\circ} = 90^{\circ}.$$ Finally, $$\angle DAC = \frac{1}{2}\angle DOC = \frac{1}{2}(90^{\circ}) = 45^{\circ}.$$
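As an independent check, we can place the four vertices on a unit circle using the arcs just found (AB = 120°, BC = 30°, CD = 90°) and measure everything from coordinates. This is a numerical sketch only; putting A at 0° is an arbitrary choice, and the helper names are made up for the example.

```python
import math

def point(deg):
    """Point on the unit circle at the given angle in degrees."""
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))

def angle(p, vertex, q):
    """Angle p-vertex-q in degrees, from the dot product."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cosine = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(cosine))

# Arcs AB = 120, BC = 30, CD = 90 (so DA = 120), starting with A at 0 degrees.
A, B, C, D = point(0), point(120), point(150), point(240)

print(round(math.dist(A, B), 6))   # 1.732051, i.e. sqrt(3) (given)
print(round(angle(A, D, C), 6))    # 75.0 (given)
print(round(angle(B, C, D), 6))    # 120.0 (given)
print(round(angle(A, D, B), 6))    # 60.0
print(round(angle(D, A, B), 6))    # 60.0
print(round(angle(D, A, C), 6))    # 45.0
```

The first three lines reproduce the given data, which confirms that this placement really is the quadrilateral in the problem.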

Next, I answered the first question, about drawing the figure if we weren’t given one:

As for the shape, you are told that the quadrilateral is inscribed in the circle, I presume, not just that it is inside the circle. That is, the vertices lie on the circle, not just inside it. You can then assume that a quadrilateral ABCD is non-intersecting, so that the vertices are in that order around the circle. Nothing else is needed to make an adequate figure to show the relationships. No angles have to be correct in order to solve the problem. (My own figure, drawn before looking at yours, happened to look like yours flipped upside-down, but would not affect any work.)

The ideas here reflect what I discussed in the post What Role Should a Figure Play in a Proof?

All the work could have been done just as well with a picture like this (which might be what I would have drawn myself), in which the angles and lengths are reasonable but not precise, and I drew the quadrilateral clockwise:

Next, I answered the second question, about the provided formula:

As for the formula 2r = AB / sin ADB, that is a version of the Law of Sines, which says that in any triangle, the ratio of each side to the sine of the opposite angle is the same, namely the diameter of the circumscribed circle. This last part is often omitted! That fact could be used to find angle ADB, and then you could find arc AB, and so on following my ideas. Presumably the context of the question makes this an appropriate method for you. See:

I’ll look at this page later.

The student responded with three questions, which I will break up here in order to follow each immediately with its answer. First, about the figure:

For the image I sent, how can angle ACD be half of angle BCD? Is there any rule regarding this?

My answer:

The image is not necessarily to scale; the angles are close, but may not be exactly right. This is typical: a picture does not have to be exact in order to communicate the facts of a problem, and it is common to explicitly state this in a problem.

But it does look approximately correct, doesn’t it?

I’m not sure what the question meant; having so far only drawn a rough sketch, I assumed he meant that in the figure ACD and ACB looked equal, but were not; but in fact they turn out to be, and the picture was quite accurate. Possibly what he meant was that in his own work, he *assumed* that CA bisects angle BCD, and wasn’t able to justify that assumption. If so, then that is an appropriate caution, and part of the reason to draw an intentionally inexact figure and consciously avoid making assumptions from what we see.

Second question, about the process of solving:

I tried to find AOB, and I found AOB = 120° and ADB = 60°, and arc AB = 2π/3.

Then I found AOC = 150°, and arc ABC = 5π/6.

But I don’t understand how to find arc BAD (I could not find angle BOD).

My answer:

The central angle BOD of arc BAD is twice the inscribed angle BCD. (I actually found the degree measures [central angles] of the arcs, not their lengths, but there is not much difference.)

Let’s carry out the whole solution by this method, which is much like my own above:

- ∠AOB = 120°, using the 30-60-90 triangle as I did above.
- **∠ADB** = 60°, because an inscribed angle is 1/2 of the central angle of the arc it subtends.
- Arc AB = 2π/3, by converting the 120° angle to radians and multiplying by the radius, which is 1. [As I mentioned, I just used central angles to measure arcs, and didn’t need arc *lengths*.]
- ∠AOC = 150°, because the central angle is twice the inscribed angle ∠ADC.
- Arc ABC = 5π/6, by converting the 150° angle to radians.
- Here’s where I took over: the reflex angle ∠BOD = 240°, because it is twice ∠BCD.
- Arc BAD = 4π/3.
- **∠DAB** = 60°, either from arc BCD = 2π – 4π/3 = 2π/3, or from nonreflex angle ∠BOD = 120°.
- Arc BC = Arc ABC – arc AB = 5π/6 – 2π/3 = π/6.
- Arc CD = arc BCD – arc BC = 2π/3 – π/6 = π/2.
- ∠COD = 90°.
- **∠DAC** = 1/2 of 90° = 45°.
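The arc bookkeeping in this list can be replayed exactly with fractions. In the sketch below every arc is stored as a multiple of π (so `Fraction(2, 3)` stands for 2π/3); since the radius is 1, arc length and central angle in radians coincide. The variable names are chosen just for this example.

```python
from fractions import Fraction

# Arcs as multiples of pi (radius 1, so arc length = central angle in radians).
arc_AB = Fraction(2, 3)             # central angle AOB = 120 degrees
arc_ABC = 2 * Fraction(75, 180)     # twice inscribed angle ADC = 75 degrees
arc_BAD = 2 * Fraction(120, 180)    # twice inscribed angle BCD = 120 degrees
arc_BCD = 2 - arc_BAD               # rest of the full circle (2 pi)
arc_BC = arc_ABC - arc_AB
arc_CD = arc_BCD - arc_BC

def inscribed_degrees(arc):
    """Inscribed angle in degrees subtended by an arc given as a multiple of pi."""
    return arc * 180 / 2

print(arc_ABC, arc_BAD, arc_BCD, arc_BC, arc_CD)   # 5/6 4/3 2/3 1/6 1/2
print(inscribed_degrees(arc_AB),    # angle ADB
      inscribed_degrees(arc_BCD),   # angle DAB
      inscribed_degrees(arc_CD))    # angle DAC
# 60 60 45
```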

Third question:

And is (sin A) / A is the same as A / (sin A)?

(The website you have given to me showed (sin A) / A)

Now we can get back to that page I referred to, which included an incorrect statement in the question:

I hadn’t noticed that the question is wrong, and the correction is not emphasized in the answer. The question asks, “Prove why (sin A)/a equals the diameter of the circle,” which is incorrect; the answer says, “so a/sin(A) = D, where D is the diameter of the circumscribed circle,” silently correcting the error. But then it states the Law of Sines in the former way: “This gives a proof of the Law of Sines, which states sin(A)/a = sin(B)/b = sin(C)/c.” The best form is what we state in our FAQ:

http://mathforum.org/dr.math/faq/formulas/faq.trig.html#reloblitri

The Law of Sines: a/sin(A) = b/sin(B) = c/sin(C) = 2R.

This is not uncommon: A question is asked incorrectly, we write back without calling the student’s attention to his mistake, and the archived answer consequently can be misleading. The archivist, in editing the exchange for publishing, had some difficult choices to make (as I am discovering as I try to do something similar in this blog).

Here is the page I referred to (from 1996):

Sine of an Angle and Opposite Side

Hello. I am a 10th grader in high school and I am currently in Algebra II. Our math teacher gave us a problem for bonus, but she didn't know the complete answer. Do you think you could help? Given triangle ABC inscribed in a circle. Prove why (sin A)/a equals the diameter of the circle.

Doctor Pete answered:

Let the center of the circumscribed circle of triangle ABC be O. Draw OA to intersect at D, so AD is a diameter. Then angle BOD = 2BAD. This is because AO = BO, so ABO is isosceles, and therefore angle ABO = BAO; since BOD = ABO+BAO, BOD = 2BAD. Similarly, COD = 2CAD, so BOC = BOD+COD = 2(BAD+CAD) = 2BAC.

Then the angle bisector of BOC also bisects BC, since triangle BOC is isosceles; let the midpoint of BC be E. Hence sin(A)/a = sin(BAC)/BC = sin(BOC/2)/(2BE) = sin(BOE)/(2BE) = (BE/BO)/(2BE) = 1/(2BO). But BO is the radius R of O, so a/sin(A) = D, where D is the diameter of the circumscribed circle. Note that sin(A)/a cannot give the diameter (think of the units); if one has two similar triangles ABC and A'B'C', sin(A) = sin(A'), so you would expect the radii to be *directly* proportional to the length of a side, not *inversely* proportional. Also, you can see this gives a proof of the Law of Sines, which states sin(A)/a = sin(B)/b = sin(C)/c.

The next to last paragraph then explains why the expression provided by the student has the wrong dimensions to have been equal to the diameter; this provides an explicit correction.

The first two paragraphs of the proof merely demonstrate the fact we’ve used several times, that the central angle of an arc is twice any inscribed angle subtended by the arc:

The core of the proof uses this picture:

Let me restate this in a way that gets directly to the goal, rather than starting from the student’s incorrect expression:

Because inscribed angle ∠BAC is subtended by arc BC, we know that ∠BOC = 2∠BAC. We construct the angle bisector OE, so that ∠BOE = ∠BOC/2 = ∠BAC = α. In right triangle BOE, we see that $$\sin(\alpha) = \frac{BE}{BO} = \frac{\frac{1}{2}a}{R} = \frac{a}{2R}.$$ Therefore, $$\frac{a}{\sin(\alpha)} =2R.$$

This is the fact that our student was given as a hint; and since we could have done the same with any of the angles, we have the Law of Sines in its full form, $$\frac{a}{\sin(A)} = \frac{b}{\sin(B)} = \frac{c}{\sin(C)} = 2R.$$
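A numerical illustration may make the "= 2R" part more concrete. The sketch below picks an arbitrary circle (radius 2.5) and three arbitrary points on it, both chosen only for this example, then measures the sides and angles from coordinates and forms the three ratios.

```python
import math

R = 2.5                       # arbitrary circumradius for the check
A, B, C = (
    (R * math.cos(math.radians(t)), R * math.sin(math.radians(t)))
    for t in (20, 140, 275)   # three arbitrary positions on the circle
)

def angle_at(vertex, p, q):
    """Interior angle at `vertex` between the rays to p and q, in radians."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.acos(dot / (math.hypot(*v1) * math.hypot(*v2)))

# Side a is opposite vertex A, and so on.
a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
ratios = [
    a / math.sin(angle_at(A, B, C)),
    b / math.sin(angle_at(B, C, A)),
    c / math.sin(angle_at(C, A, B)),
]
print(ratios)   # each should equal 2R = 5.0, up to floating-point rounding
```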

There are several other ways to prove this theorem when we omit reference to the circumcircle, but this beautiful little proof is the simplest and most comprehensive.

Now we have finished our discussion, because our student was satisfied with the proof he wrote. But I’m not! We haven’t yet done what the problem asked, to use the hint (the Law of Sines) to find a solution to the problem. Let’s give that a try.

First, they stated the Law of Sines specifically for ∠ADB, whose opposite side is given, so let’s use the hint: $$2R = \frac{AB}{\sin(\angle ADB)}.$$ We can directly apply this to find ∠ADB: $$\sin(\angle ADB) = \frac{AB}{2R} = \frac{\sqrt{3}}{2},$$ so ∠ADB = 60°.

Next, we can use the given angles to find the opposite sides of the associated inscribed triangles (that is, the diagonals of the quadrilateral): $$2R = \frac{AC}{\sin(\angle ADC)},$$ so $$AC = 2R\sin(\angle ADC) = 2\sin(75^{\circ}),$$ which we could work out exactly if we needed it; and $$2R = \frac{BD}{\sin(\angle BCD)},$$ so $$BD = 2R\sin(\angle BCD) = 2\sin(120^{\circ}) = \sqrt{3}.$$

Aha! That shows that triangle ABD is isosceles (at least), so we know ∠DAB = ∠ADB = 60°.

Now we just need ∠DAC. For this, we’d need to know CD to use the Law of Sines; but it’s quicker just to use the fact that ∠DAC = ∠DBC since arc CD subtends both. Since we now know that triangle ABD is equilateral, this is just $$\angle DAC = \angle ABC - \angle ABD = (180^{\circ} - \angle ADC) - \angle ABD = (180^{\circ} - 75^{\circ}) - 60^{\circ} = 45^{\circ}.$$
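The same steps can be followed numerically. This is a sketch of the calculation above, not a general solver; the variable names are simply the angle and segment names from the problem.

```python
import math

R = 1.0
AB = math.sqrt(3)
ADC, BCD = 75.0, 120.0   # given angles, in degrees

# Hint: 2R = AB / sin(ADB). asin returns the acute solution (60, not 120);
# that is the right one here because angle ADB lies inside angle ADC = 75.
ADB = math.degrees(math.asin(AB / (2 * R)))

# Diagonal BD from 2R = BD / sin(BCD).
BD = 2 * R * math.sin(math.radians(BCD))

# BD equals AB, so triangle ABD is isosceles with angle DAB = angle ADB.
assert math.isclose(BD, AB)
DAB = ADB
ABD = 180.0 - ADB - DAB

# Opposite angles of a cyclic quadrilateral sum to 180, and DAC = DBC.
ABC = 180.0 - ADC
DAC = ABC - ABD

print(round(ADB, 6), round(DAB, 6), round(DAC, 6))   # 60.0 60.0 45.0
```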

Those are all the same answers we got before – wonderful!

This is a nice approach; but the others were hardly more difficult, if at all. Sometimes figuring out how a hint is supposed to be used (that is, trying to read the mind of the problem writer) is harder than just solving without the hint. But the lesson here is a good one!

Students often struggle with solving an equation with several variables, for one of those variables. This is also called “solving a formula”, or a “literal equation”; or “making one variable the subject”. Learning to use variables instead of just numbers (as we looked at last week) is the first step into algebra; working with *nothing but* variables feels like a huge leap. But it isn’t! Let’s observe a discussion with such a student.

Here is the question, from Emma in 2010:

Solving for One Unknown among Many: How and Why

I have no problem solving for unknowns in equations that look like this:

5(6 + x) = 9 + 1x

But my math book gave me problems to solve for x that look like this:

cdx + e = rd
xy - b = q

The book says that all the rules that I have learned work no matter what symbols are used. But how can you solve for x when none of the numbers is known? I have checked my work once I was done doing these problems, but I found them incorrect and didn't understand it. For instance, on the second problem, here are my attempts to solve it (they were incorrect):

xy - b = q
-y - ...?

I can't subtract y from the other side because there is no y. So that's as far as I got. Thank you for your help!

There are a couple issues revealed in Emma’s work; but the main one seems to be that she is distracted by the lack of numbers.

Doctor Ian responded, taking a surprising approach by temporarily using *nothing but numbers* to make a point:

Hi Emma, It's true that all the rules that you have learned work no matter what symbols are used. They even work when no symbols are used at all. This is because a symbol like 'x' just represents a quantity, like any other quantity. To see what I mean, consider something like

(3 + 5)/2 * 7 = 28

You can evaluate the left side, and see that this is true. Suppose I'd like to 'solve for 3.' I can do the same sorts of transformations you've been learning in algebra class. For example, I can divide both sides by 7 to get

(3 + 5)/2 = 28/7

I can multiply both sides by 2 to get

3 + 5 = 2(28/7)

And I can add -5 to both sides to get

3 = 2(28/7) + -5

You can verify that each intermediate step is true, and so is the final equation.

Emma will later ask what it means to “solve for 3”; if that’s your question, be patient! But all he is doing here is **rearranging the equation**, just as you do when you solve for *x*, isolating one quantity, in this case the 3 rather than an *x*. The point is that all the things Emma has learned for moving numbers around also work here. Each step resulted in an equation that was still true. And if the 3 had been replaced with an *x*, the same work would have found the value of *x*.

He did this to emphasize one idea that becomes important with variables:

Now, what did I do here? More importantly, what did I NOT do? I didn't evaluate every expression just because I could. Instead of simplifying 28/7 to 4, for example, I kept it around. I left everything implied. In this case, I did it to make a point. But when you do it with variables, you do it because you don't know yet what values you have, so you really have no choice. But it's EXACTLY the same idea, carried out exactly the same way. That is, suppose I started out with an equation like my example above, but this time with only one known quantity and the rest unknown:

(3 + a)/b * c = d

To 'solve for 3,' I could do the same steps:

(3 + a)/b * c = d
(3 + a)/b = d/c
3 + a = b(d/c)
3 = b(d/c) + -a

If you substitute the original values for a, b, c, and d, you find that it all still works.

All of this is just practice in manipulating either variables or numbers in the very same way.
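Doctor Ian’s suggestion to substitute the original values back in is easy to try. The sketch below uses the values from his all-numbers example (a = 5, b = 2, c = 7, d = 28) and exact fractions, and checks that each step of "solving for 3" is still a true statement.

```python
from fractions import Fraction

a, b, c, d = 5, 2, 7, 28   # the values from the all-numbers example

# One entry per step of "solving for 3"; every step should remain true.
steps = [
    Fraction(3 + a, b) * c == d,            # (3 + a)/b * c = d
    Fraction(3 + a, b) == Fraction(d, c),   # divide both sides by c
    3 + a == b * Fraction(d, c),            # multiply both sides by b
    3 == b * Fraction(d, c) + -a,           # add -a to both sides
]
print(steps)   # [True, True, True, True]
```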

Now, I don't have to solve for 3. I could solve for the variable a instead:

(3 + a)/b * c = d
(3 + a)/b = d/c
3 + a = b(d/c)
a = b(d/c) + -3

Or I could solve for b:

(3 + a)/b * c = d
(3 + a)/b = d/c
(3 + a)/(d/c) = b

Or I could solve for c, or for d. I'll leave those for you to try. Remember to check your result by substituting the original values, and seeing that you get a true statement. The key idea, though, is that you learn to look at anything -- whether it's a number like '3,' or a variable like 'a,' or an expression like (x - 2)^2, or a function like f(x) -- as 'some quantity.'

This is essentially the same thing he emphasized last week. As long as you make only certain kinds of changes, the equation remains true:

And then you can apply the rules you know, which keep an equation balanced, namely: You can add 'some quantity' to both sides of an equation. You can multiply both sides of an equation by 'some quantity.' You don't need special rules for variables, or parameters, or expressions, or functions. They're the same as the rules for numbers. Because they ARE numbers -- they're just numbers the values of which we don't yet know, or haven't yet chosen.

The big change with variables is that the result of solving has a different look:

So when you go on to wonder 'how can you solve for x,' you point up that we actually use the word 'solve' in a lot of different ways. At first, you learn that 'solve' means 'find a particular value,' like

x + 4 = 7
x = 3

But what you're learning now is that you can 'solve for' one variable 'in terms of' others, like

x + a = b
x = b - a

Here, we've 'solved for x in terms of a and b.' In this form, it means we can CHOOSE values for a and b, and once those choices are made, the value of x is determined. For example, if a is 3 and b is 5, then x must be 2. Or, if a is 4 and b is 2, then x must be -2.

What we’ve done here is to solve many different equations all at once: not just *x* + 4 = 7, but *x* + _ = _. We can solve once, and then plug in whatever numbers we want. So solving here means not getting a **number**, but an **expression**.

Solving yet other kinds of equations produces different kinds of results:

If we were in a situation where we'd normally know x and b, and want to find a, we could 'solve for a in terms of x and b':

x + a = b
a = b - x

This doesn't contain any new information. It just puts it in a form that's easier to use. But -- and here's the part that can cause confusion -- often we just say we are 'solving for x,' leaving the rest implied. And we use 'solve' in other ways, too. For example, given an equation like ... x^2 = 4 ... there are TWO values that make this equation true. It's true when x is 2, and when x is -2. So we say that there is a SET of solutions, or a 'solution set,' which is

{2, -2}

In fact, an equation like ... x + 4 = 7 ... also has a solution set; that set happens to have one element:

{3}

So a solution can mean an **expression**, or a **number**, or a **set of numbers**, or (as he goes on to show in a part I’ll skip) **nothing**, or a **whole line**.

Returning to the main point, he concludes:

Don't worry too much about understanding all this in detail right now. The main point I want to make here is that you need to pay attention to how the word 'solve' is being used, because the meaning can change on you as you go through your math classes, and often no one will alert you that its meaning has been expanded. Anyway, does the idea of solving your equations make more sense now? If so, try some of your equations, and let me know what you come up with. If you're still stuck, try assigning numbers to the variables you DON'T want to solve for, like ...

xy - 3 = 4

... becomes ...

x*2 - 3 = 4        y = 2, b = 3, q = 4

... and solve for that -- again, leaving operations implied ...

x*2 - 3 = 4
x*2 = 4 + 3
x = (4 + 3)/2

... so you can undo the substitutions later:

x = (q + b)/y

It helps if each variable gets its own number! So you don't want to use, for example, 2 to stand in for both b and q, because you want to be able to tell them apart when you're done. I hope this helps! Write back if you'd like to talk more about this, or anything else.

I frequently use this idea of replacing the extra variables in an equation with numbers, to make it look more familiar, and solving that as a “dry run” to get used to the operations you’ll need to use on the actual equation.
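
As a rough illustration of this "dry run" idea, here is a minimal Python sketch. It assumes the relation behind the final formula is x*y - b = q, uses the stand-in numbers y = 2, b = 3, q = 4 from the example, and relies on exact fractions so the comparison is not disturbed by rounding. The variable names are only for illustration.

```python
from fractions import Fraction

# Stand-in numbers from the example: y = 2, b = 3, q = 4.
y, b, q = Fraction(2), Fraction(3), Fraction(4)

# Dry run with numbers: x*2 - 3 = 4  ->  x*2 = 4 + 3  ->  x = (4 + 3)/2
x_numeric = (Fraction(4) + Fraction(3)) / Fraction(2)

# The same steps with the letters put back: x = (q + b)/y
x_formula = (q + b) / y

print(x_numeric, x_formula)      # both are 7/2
print(x_formula * y - b == q)    # the assumed relation x*y - b = q holds
```

Because each letter got its own number, it is easy to see which number in the dry run turns back into which letter.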

Emma replied with the natural question:

As I was reading through the advice you gave me, I encountered a side question I need to clarify before I can continue. What do you mean by 'solve for 3'? Three's value is already known -- it's 3! Thanks!

As long as you think of “solving” as finding a value, Doctor Ian’s idea makes no sense – and that’s the point. We need to think of solving more generally. He answered:

Good question! This gets to the heart of what it means to 'solve for [one quantity] in terms of [other quantities].' All that really means is to rearrange an equation so that one of the quantities appears by itself -- isolated -- on one side and nowhere on the other side. So when we start with an equation like ... 3x + 4 = 5x - 6 ... and we 'solve for x,' ... 3x + 4 = 5x - 6 4 = 2x - 6 10 = 2x 5 = x ... we're solving for x 'in terms of' some constant -- and we want to find out what that constant is.

This is the meaning of solve when you first learn algebra.

But if we start with an equation like ... ax + 4 = bx - 6 ... then when we 'solve for x' ... ax + 4 = bx - 6 4 = (b - a)x - 6 10 = (b - a)x 10/(b - a) = x ... we don't get a unique answer. What we get is an equation that will let us choose values for a and b, and then evaluate an expression to get a corresponding value for x. This is what we mean when we say we're solving FOR x IN TERMS OF a and b.

Again, “solving” doesn’t always mean finding a number. I suspect this is why, in some cultures, “solve” is replaced by “make *x* the subject” (that is, write an equation that says, “*x* is …”, where *x* is the subject of the sentence): to avoid using the word “solve”, which will be misunderstood. Sooner or later, though, it is necessary to learn standard adult usage.
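
To see what "solving for x in terms of a and b" buys us, here is a small Python sketch of the result x = 10/(b - a). The function name `solve_for_x` is hypothetical, and the guard for a = b is my own addition (in that case the equation reduces to 4 = -6, which is never true).

```python
from fractions import Fraction

def solve_for_x(a, b):
    """Return x = 10/(b - a), the solution of a*x + 4 = b*x - 6 (requires a != b)."""
    if a == b:
        raise ValueError("no solution when a == b: the equation reduces to 4 = -6")
    return Fraction(10) / (Fraction(b) - Fraction(a))

# With a = 3 and b = 5 this is the earlier equation 3x + 4 = 5x - 6.
x = solve_for_x(3, 5)
print(x)  # 5
```

The solving was done once, with letters; after that, each new pair of values for a and b only needs to be substituted.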

Here is where this concept becomes useful:

Think about a formula for something like the volume of a cylinder: V = pi * r^2 * h Here, V is the volume, r is the radius of either end, and h is the height. This is already in a good form if we have measurements for the radius and height, and want to calculate the volume. For example, if the radius is 10 cm, and the height is 8 cm, we can just substitute: V = pi * (10 cm)^2 * (8 cm) That is, the formula is already 'solved for V in terms of r and h.' But suppose we're told that the volume is 1200 cm^3, and that the radius is 10 cm. What we don't know is the height. We could still substitute those values to get an equation like ... 1200 cm^3 = pi * (10 cm)^2 * h ... but it's not very convenient for finding h. What we can do is 'solve for h in terms of V and r': V = pi * r^2 * h V/pi = r^2 * h V/(pi * r^2) = h Now we can substitute our values for V and r, and calculate h directly. In the same way, if we expect to know V and h, we can 'solve for r in terms of V and h.' I'll leave that for you to do, if you're interested.

I sometimes describe this process as “predigesting” the formula: Rather than have to solve an equation each time we want to find the height, we do all the solving once.
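
A "predigested" formula is essentially a function you can reuse. The sketch below (the function name is hypothetical) applies h = V/(pi * r^2) to the numbers in the example, V = 1200 cm^3 and r = 10 cm.

```python
import math

def height_from_volume(volume, radius):
    """h = V / (pi * r^2): the cylinder formula solved for h."""
    return volume / (math.pi * radius ** 2)

h = height_from_volume(1200, 10)   # V = 1200 cm^3, r = 10 cm
print(round(h, 2))                 # about 3.82 (cm)
```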

Back to the question:

Anyway, it IS a little strange to 'solve for 3' in an equation, but I was trying to make the point that the operations you do are the same, whether you're working with numbers or variables or any other kinds of quantities. The value of 'solving for 3,' though, is that you can, at each step, evaluate both sides to convince yourself that the equation is still correct. (3 + 5)/2 = 4 True 3 + 5 = 2(4) Still true 3 = 2(4) - 5 Still true That's not always so easy to see when working with variables: (x + 5)/2 = 4 True for some value(s) of x x + 5 = 2(4) Still true for the same values x = 2(4) - 5 Still true for the same values Does this answer your question?

Emma replied:

Yes! Thank you. I just can't think about it too hard, because I'll start wondering WHY it works, WHY I have to do it, etc. I'll read the rest of your notes and get back to you if I have any other questions. Thanks so much, again.

But “thinking hard” is our goal, so this called for a little more explanation:

Hi Emma, These are exactly the questions you SHOULD be thinking about. The basic ideas are ones you've been learning about as far back as you can remember, although often they're not spelled out as explicitly as they should be. For example, you learned long ago that you can multiply 3 and 4 to get 12, and you can divide 12 by 3 to get 4. But in that second case, what you REALLY have is a multiplication ... 3 * ? = 12 ... that's not written in a convenient form. If we write it like this instead, ... ? = 12 / 3 ... then we can use division to 'solve for the unknown factor' in a multiplication where we know the factor and a product, instead of the two factors.

These ideas are taught as part of arithmetic, but are the foundations of algebra: We can turn a fact inside out.

Answering the “why I have to do it” question:

In trying to use mathematics to model situations in the world, we often find that there are certain relationships, like force = mass * acceleration or period = 2 * pi * sqrt(length of pendulum / gravitational acceleration) As a rule, if there are N quantities that might change, and you know N - 1 of them, that will force a particular value for the remaining quantity. Sometimes, as I showed in the example of the cylinder, we know all the values on one side of the equation, and we can just substitute and evaluate to find the value on the other side. But sometimes we have a missing quantity 'in the middle' of one side, and we'd like to 'solve for' that quantity in order to calculate it easily. So that's why we do this sort of thing.

How about the other question?

Why does it work? The whole idea of 'solving' equations is this. Write down any equation with variables, like these three: 2x + 4 = 10 x^2 - 1 = 8 y = 3x + 4 In each case, we have a set (called the 'solution set') of values, or collections of values, that can be assigned to the variables in order to give a true statement. In the first equation above, there is just one value that works: x = 3. So we can say that the solution set is the set {3} In the second equation, the equation is true when x is either 3 or -3. So now we have two elements in our solution set: {3, -3} That is, we can grab any element of this set, and substitute it for x, and we will get a true statement. If we try substituting anything NOT in this set, we will get a false statement. In the third equation, we need matched pairs of values to make the equation true. For example, when x is 1, y must be 7. So the pair (x = 1, y = 7) or just (1,7) is in the solution set. {(1,7), ...} So is the pair (2,10): {(1,7), (2,10), ...} In this case, there are infinitely many solutions in that set, so we can't write them all down. We can, however, make a graph, which is a sort of 'picture' of all the solutions at once.

These, again, are different kinds of solutions we might get.
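
One way to make "solution set" concrete is to test candidate values directly. The sketch below does that for the three equations in the quotation. The search range (integers from -10 to 10) is an assumption chosen only for illustration, so it can find only integer solutions in that range.

```python
candidates = range(-10, 11)  # assumed search range: integers from -10 to 10

# 2x + 4 = 10
set1 = {x for x in candidates if 2 * x + 4 == 10}
# x^2 - 1 = 8
set2 = {x for x in candidates if x ** 2 - 1 == 8}
# y = 3x + 4: solutions are pairs (x, y); list those with x from 1 to 3
pairs = [(x, 3 * x + 4) for x in range(1, 4)]

print(set1)   # {3}
print(set2)   # the two solutions, 3 and -3
print(pairs)  # [(1, 7), (2, 10), (3, 13)]
```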

But here is the important point. When you do certain transformations on equations, like adding the same thing to each side, you are guaranteed that the solution set of the old equation is the same as the solution set of the new equation. In other words, you can change the appearance of an equation, without changing the conditions under which it's true. So we can start with something complicated, like ... (x + 4)/(x - 2) + 5x = 18 ... and we can keep transforming it, without changing the solution set, until we get something simple like this: x = 5 And now it's trivial to see what the solution set is! It's just {5} So that must be the solution set of the original equation, too. That means the main thing you're learning in algebra is WHICH transformations will let you do this kind of thing -- and which other ones can also be used, with certain caveats. And then you spend a lot of time practicing combining them -- much of which has to do with the larger issue of learning to solve problems: How to Motivate Students to Learn Math? http://mathforum.org/library/drmath/view/71952.html I hope this helps! Let me know if you'd like to talk more about this, or anything else.

At this point Emma revealed an aspect of her original question that showed a different misunderstanding. I’ll let you read that yourself, since this is getting long.

Last week I mentioned “non-routine problems” in connection with the idea of “guessing” at a method. Let’s look at a recent discussion in which the same issues came up. How do you approach a problem when you have no idea where to start? We’ll consider some interesting implications for problem solving in general, with an emphasis on George Polya’s outline.

This came to us in March, from a student who identified him/herself as “J”:

Hi,

Recently I had to solve a problem

If (a + md) / (a + nd) = (a + nd) / (a + rd) and

(1 / n) – (1 / m) = (1 / r) – (1/n) , then

(d / a) = -(2 / n)

i.e. Given the two expressions above I need to prove the last equality.

I don’t understand problems like these. Basic Algebra books talk about problems like equation solving or word problems, but those are easy because there’s always some method you can use. For example regarding equation solving you move x’s to the left, numbers to the right; word problems can be solved using equalities like distance = rate * time. But a problem like the one above it seems has no method; it seems like you’re supposed to just manipulate the symbols until you get the answer. For example I tried to solve it like this: Regarding the first expression, after multiplying numerator of the first fraction by the denominator of the second I get

(d / a) = ((m + r – 2n) / (n^2 – mr))

and 2mr = nm – nr then substitute for mr in the first expression.

I reached the solution by luck; I just manipulated the symbols and it took me a lot of time. So is there a more efficient way to solve problems like these? How to think about these problems? Am I supposed to just mindlessly manipulate the symbols until I get lucky? Finally are there any books that deal with problems like these? Because like I mentioned it seems like most precalculus books talk about equation solving etc., problems which have a clear method. Thanks.

Before we deal with the question, let’s look more closely at his solution.

We are given two equations:

$$\displaystyle\frac{a + md}{a + nd} = \frac{a + nd}{a + rd}$$

$$\displaystyle\frac{1}{n} - \frac{1}{m} = \frac{1}{r} - \frac{1}{n}$$

We need to conclude that

$$\displaystyle\frac{d}{a} = -\frac{2}{n}.$$

J gave only a brief outline of what he did; can we fill in the gaps?

My version is to first “cross-multiply” in each equation to eliminate fractions, and do a little simplification:

The first becomes $$(a + md)(a + rd) = (a + nd) (a + nd),$$ which expands to $$a^2 + rda + mda + mrd^2 = a^2 + 2nda + n^2d^2,$$ then $$rda + mda - 2nda = n^2d^2 - mrd^2,$$ which factors to yield $$(r + m - 2n)da = (n^2 - mr)d^2.$$ Dividing, we get $$\displaystyle\frac{d}{a} = \frac{r + m - 2n}{n^2 - mr}.$$

(You may notice here that in dividing both sides by *d*, we obscured the fact that the line before is true whenever *d* = 0. I’ll be mentioning this below.)

The second equation, multiplied by \(mnr\), becomes $$mr - nr = nm - mr,$$ which easily becomes $$2mr = nm + nr.$$ (J had a sign error here.)

Now, replacing \(mr\) with \(\displaystyle\frac{nm + nr}{2}\), we get $$\displaystyle\frac{d}{a} = \frac{r + m - 2n}{n^2 - \frac{nm + nr}{2}} = \frac{2(r + m - 2n)}{2n^2 - nm - nr} = \frac{2(r + m - 2n)}{-n(r + m - 2n)} = -\frac{2}{n}.$$
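
A numeric spot check can catch sign slips in a derivation like this one. The sketch below is only a sanity check on one set of values, not a proof: the sample values m = 3, r = 6, a = 2 are arbitrary assumptions, n is chosen to satisfy 2mr = nm + nr, and exact fractions avoid rounding issues.

```python
from fractions import Fraction

m, r = Fraction(3), Fraction(6)       # arbitrary sample values (assumption)
n = 2 * m * r / (m + r)               # chosen so that 2mr = nm + nr
a = Fraction(2)                       # arbitrary nonzero a (assumption)

# Second given equation: 1/n - 1/m = 1/r - 1/n
second_ok = (1 / n - 1 / m == 1 / r - 1 / n)

# Intermediate result from the first equation: d/a = (r + m - 2n)/(n^2 - mr)
ratio = (r + m - 2 * n) / (n ** 2 - m * r)
d = ratio * a

# First given equation, and the conclusion d/a = -2/n
first_ok = ((a + m * d) / (a + n * d) == (a + n * d) / (a + r * d))
conclusion_ok = (d / a == -2 / n)

print(second_ok, first_ok, conclusion_ok)
```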

Taking the question myself, I replied:

I tried the problem without looking at your work, and ended up doing almost exactly the same things. That took me just a few minutes. So probably it is not your method itself, but your way of finding it, that needs improvement. In my case, I did the “obvious” things (clearing fractions, expanding, factoring) to both given equations, keeping my eyes open for points at which they might be linked together, and found one. It may be mostly experience that allowed me to find it quickly. That is, I didn’t “mindlessly manipulate”, but “mindfully manipulated”. And the more ideas there are in your mind, the more easily that can happen. So maybe just doing a lot of (different) problems is the main key.

I added a few more thoughts about strategies:

There may be a better method for solving this, but finding it would take me a longer time than what I did. So perseverance at trying things is necessary, regardless. Solutions to hard problems don’t just jump out at you (unless they are already in your mind from past experience); you have to explore. The ideas I describe for working out a proof apply here as well:

I like to think of a proof as a bridge, or maybe a path through a forest: you have to start with some facts you are given, and find a way to your destination. You have to start out by looking over the territory, getting a feel for where you are and where you have to go – what direction you have to head, what landmarks you might find on the way, how you’ll know when you’re getting close.

(By the way, in my work I also found that d/a = 0 gives a solution, so that if d=0 (and a ≠ 0), the conclusion is not necessarily true. Did you omit a condition that all variables are nonzero?)

You are probably right that too many textbooks and courses focus on routine methods, and don’t give enough training in non-routine problem solving. They may include some “challenge problems” or “critical thinking exercises”, but don’t really teach that. One source of this sort of training is in books or websites (such as artofproblemsolving.com) that are aimed at preparation for contests. Books like Polya’s *How to Solve It* (and newer books with similar titles) are also helpful.

Here are a few pages I found in our archives that have at least some relevance:

What Is Mathematical Thinking?

Others of us may have ideas to add.

Some of these have been mentioned in previous posts such as How to Write a Proof: The Big Picture and Studying Math: Want a Challenge?.

The next day, J wrote in with another problem, having already followed up on my suggestions:

Hi. I posted here recently asking about problem solving and algebra and I was recommended a book called “How to solve it” by Pólya. I bought that book and now I am trying to solve some algebra exercises using it. Today I came across this problem:

If bz + cy = cx + az = ay + bx and (x + y + z)^2 = 0, then a +/- b +/- c.

(The sign +/- was a bit confusing to me since it’s not brought up anywhere in the book besides this problem, but Wikipedia says that a +/- b = 0 is a + b =0 or a – b = 0.)

In the book “How to solve it” Pólya says that first it’s important to understand the problem and restate it. So my interpretation of a problem is this:

If numbers x, y, z are such that (x + y + z)^2 = 0 and bz + cy = cx + az and bz + cy = ay + bx, then the numbers a, b, c are such that a + b + c = 0 or a – b – c =0

Next Pólya says to devise a plan. To do that he says you need to look at a hypothesis and conclusion and think of a similar problem or a theorem. The best I could think of is an elimination problem, i.e. when you’re given a certain set of equations and you can find a relationship between constants. Can you think of any other similar problems which could help me solve this problem?

I first responded to the last question:

Hi again, J.

I would say that the last question you asked was “similar” to this, so the same general approach will help. That’s essentially what you said in your last paragraph, I think. I know that isn’t very helpful, but it’s all I can think of myself. You’d like to have seen a problem that is more specifically like this one, such as having (x + y + z)² = 0 in it, perhaps, so you could get more specific ideas.

I only know that I have seen a lot of problems like this involving symmetrical equations (where each variable is used in the same ways), and I suspect those problems can be solved by similar methods. But I don’t know one method that would work for this one.

I’ll get back to that question. But let’s focus first on Polya.

Here is what Polya says (p. 5) when he introduces his famous four steps of problem solving:

In order to group conveniently the questions and suggestions of our list, we shall distinguish four phases of the work. First, we have to *understand* the problem; we have to see clearly what is required. Second, we have to see how the various items are connected, how the unknown is linked to the data, in order to obtain the idea of the solution, to make a *plan*. Third, we *carry out* our plan. Fourth, we *look back* at the completed solution, we review and discuss it.

This process is then explained in more detail, and used as an organizing principle in the rest of the book. It can be amazing to see how many students jump into a problem before they *understand* what it is asking, or do calculations without having made any *plans*. On the other hand, it would be wrong to think of these four steps as a *routine* to be followed exactly; often you don’t fully *understand* a problem until you have started *doing something*, perhaps carrying out a half-formed plan and then realizing that you had a wrong impression of some part.

And J has here a good example of a misunderstanding. This problem uses the plus-or-minus symbol (±) in a rare way, which in this case requires *asking* (not explicitly one of Polya’s recommendations, but valuable!).

The problem says this:

$$\text{If } bz + cy = cx + az = ay + bx \text{ and } (x + y + z)^2 = 0 \text{, then } a \pm b \pm c.$$

(No, that doesn’t quite make sense! We’ll be fixing that shortly.)

What does it mean when there are two of the same symbol? The Wikipedia page J found says, “In mathematical formulas, the ± symbol may be used to indicate a symbol that may be replaced by either the + or − symbols, allowing the formula to represent two values or two equations.” They give an example (the quadratic formula), where *either sign* yields a valid answer; then an example with two of the same sign (the addition/subtraction identity for sines) in which both must be replaced with the *same sign*; and a third example (a Taylor series) where the reader has to *determine which sign* is appropriate for a given term. Later they introduce the minus-or-plus sign (\(\mp\)), which explicitly indicates the *opposite sign* from an already-used ±.

But here, we have two ±’s with no clear reason why they should be the same, or should be different. Is this a special case? J has assumed they are the same, so that it means “\(a + b + c = 0\) or \(a - b - c = 0\)”. This is the first issue I had to deal with:

First, though, did you mean to say that the conclusion is a ± b ± c = 0? That wouldn’t quite mean what you said about it, because the two signs need not be the same. Rather, it means that either a + b + c = 0, or a + b – c = 0, or a – b + c = 0, or a – b – c = 0: any possible combination of the signs.
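
The "any possible combination" reading can be spelled out mechanically. This small Python sketch simply lists every way of replacing the two ± signs independently; it illustrates the interpretation and says nothing about whether the statement is true.

```python
from itertools import product

# Every way to replace the two ± signs independently
readings = [f"a {s1} b {s2} c = 0" for s1, s2 in product("+-", repeat=2)]
for line in readings:
    print(line)
```

Under J's reading, only the first and last of these four lines would be allowed.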

Now, how did I *know* that, when it goes against what Wikipedia seems to be saying? I’m not sure! There is actually some ambiguity; really, we just shouldn’t rule out this *possibility*. But I saw from the start that if the two signs are the same, then the problem has an odd *asymmetry*, requiring *b* and *c* to have the same sign in this equation, but not *a*. That simply seems unlikely, considering the symmetry elsewhere.

Sometimes we discover, as we proceed through the solving process, that we *have* to interpret the statement one way or another in order for it to be true – an example of my comment that understanding can come *after* doing some work. (That was actually the case here. But the problem really should have been written to make this clear!)

What this means is that we don’t know the signs of the numbers. One thing that suggests is that we might be able to show some fact about a², b², and c², so that we would have to take square roots, requiring us to use ± before each of a, b, and c.

It’s also interesting that they said that (x + y + z)² = 0, which means nothing more than x + y + z = 0. That also makes me curious, and at the least puts squares into my mind for a second reason.

Here I am just letting my mind wander around the problem, pondering what the givens suggest. This is part of both the understanding phase, and the “looking for connections” Polya talked about.

Not even being sure of the conclusion, I just tried manipulating the equations any way I could, just to make their meanings more visible; and then I solved x + y + z = 0 for z and put that into my derived equations, eliminating z. That took me eventually to a very simple equation that involved a, b, x², and y². And that gave a route to the ± I’d had in mind.

We could say that my initial plan is, as I suggested at the top, to **explore**! We can refine the plan as we see more connections. (As I said, Polya has to be followed flexibly.)

There’s a lot of detail I’ve omitted, in part because much of my work was undirected, so you may well find a better way. But the key was to have some thoughts in mind before I did a lot of work, in hope of recognizing a useful form when I ran across it. The other key was perseverance, because things got very complicated before they became simple again! (I suspect that as I go through this again, I’ll see some better choices to make, knowing better where I’m headed.)

I don’t think you told us where these problems came from; they seem like contest-type problems, which you can expect to be highly non-routine. As I said last time, until you’ve done a lot of these, you just need to keep your eyes open so that you are learning things that will be useful in future problems! I am not a contest expert, as a couple of us are, so I hope they will add some input.

Since we never got back to the details of this problem, let’s finish it now. Frankly, I had to look in my stack of scrap paper to find what I did in March, because I wasn’t making any progress when I tried it again just now. Clearly I could have given a better hint! I was hoping that just the encouragement that it could be done would lead to J finding a nicer approach than mine.

But here’s what I find in my incomplete notes from then. First, I rewrote the equality of three expressions as two equations, and eliminated c; I’ll use a different pair of equations than I did then, with that goal in mind: $$cx + az = ay + bx\; \rightarrow\; c = \frac{ay-az+bx}{x}$$ $$ay + bx = bz + cy\; \rightarrow\; c = \frac{ay-bz+bx}{y}$$

Setting these equal to eliminate c, $$\frac{ay-az+bx}{x} = \frac{ay-bz+bx}{y}$$

Cross-multiplying, $$ay^2-ayz+bxy = axy-bxz+bx^2$$

Solving \(x + y + z = 0\) for *z* and substituting, $$ay^2-ay(-x-y)+bxy = axy-bx(-x-y)+bx^2$$

Expanding, $$ay^2 + axy + ay^2 + bxy = axy + bx^2 + bxy + bx^2$$

Canceling like terms on both sides, $$2ay^2 = 2bx^2$$

Therefore, $$\frac{x^2}{a} = \frac{y^2}{b}$$
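
Since "things got very complicated before they became simple again", a mechanical check of the expand-and-cancel step may be reassuring. The Python sketch below compares the cross-multiplied equation (with z replaced by -x - y) against 2ay² - 2bx² on a small grid of integers. The grid and the function name are assumptions for illustration, and agreement on a grid is evidence, not a proof.

```python
from itertools import product

def cross_multiplied_gap(a, b, x, y):
    """Left side minus right side of the cross-multiplied equation, with z = -x - y."""
    z = -x - y
    left = a * y**2 - a * y * z + b * x * y
    right = a * x * y - b * x * z + b * x**2
    return left - right

# Compare against 2ay^2 - 2bx^2 on a small grid of integers
values = range(-3, 4)
ok = all(
    cross_multiplied_gap(a, b, x, y) == 2 * a * y**2 - 2 * b * x**2
    for a, b, x, y in product(values, repeat=4)
)
print(ok)  # True
```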

We could do the same thing with different variables and find that this is also equal to \(\frac{z^2}{c}\). So we have $$\frac{x^2}{a} = \frac{y^2}{b} = \frac{z^2}{c} = k$$

Now we’re at the place I foresaw, where we can take square roots: $$x = \pm\sqrt{ak}$$ $$y = \pm\sqrt{bk}$$ $$z = \pm\sqrt{ck}$$

Therefore, since \(x+y+z=0\), we know that $$\pm\sqrt{ak}\pm\sqrt{bk}\pm\sqrt{ck}=0$$

and, dividing by \(\sqrt{k}\), we have $$\pm\sqrt{a}\pm\sqrt{b}\pm\sqrt{c}=0$$

In March, it turns out, I stopped short of the answer, thinking I saw it coming. But in fact, I didn’t attain the goal! **I hoped that a, b, and c would be squared** before we have to take the roots. We seem, however, to have proved that it is the square roots of *a*, *b*, and *c*, with suitable signs, that add up to 0.

I’m wondering if the problem, which was never quite actually stated, might have been different from what I assumed. In fact, armed with this suspicion, I tried to find an example or a counterexample, and found that $$\begin{pmatrix}a & b & c\\ x & y & z\end{pmatrix}= \begin{pmatrix}1 & 4 & 1\\ 1 & -2 & 1\end{pmatrix}$$ satisfies the conditions, with $$bz + cy = cx + az = ay + bx = 2,$$ but no combination of signed *a*, *b*, and *c* adds up to 0. So the real problem must have been something else …
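
The counterexample is easy to verify by machine. This Python sketch checks the hypotheses for a, b, c = 1, 4, 1 and x, y, z = 1, -2, 1, then tries every choice of signs, first on a, b, c themselves and then on their square roots (the relation the derivation actually reached). Using `isqrt` is safe here only because 1 and 4 are perfect squares.

```python
from itertools import product
from math import isqrt

a, b, c = 1, 4, 1
x, y, z = 1, -2, 1

# The hypotheses
three = (b * z + c * y, c * x + a * z, a * y + b * x)
print(three)                 # (2, 2, 2)
print((x + y + z) ** 2)      # 0

# Signed sums of a, b, c: none is zero
signed_sums = [s1 * a + s2 * b + s3 * c for s1, s2, s3 in product((1, -1), repeat=3)]
print(0 in signed_sums)      # False

# Signed sums of the square roots: at least one is zero
roots = (isqrt(a), isqrt(b), isqrt(c))   # exact, since 1 and 4 are perfect squares
root_sums = [s1 * roots[0] + s2 * roots[1] + s3 * roots[2]
             for s1, s2, s3 in product((1, -1), repeat=3)]
print(0 in root_sums)        # True
```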

At this point J abandoned that path, and closed with a side issue:

Hi Doctor. I have one more question about problem solving.

I spent some more time on the problem we discussed then I skipped it and decided to focus on other problems instead. I managed to solve a few of them but then I took a long break; when I came back I couldn’t remember the solutions without looking at my work.

I don’t know if you read *How to solve it* by Pólya. I ask since at the beginning of that book Pólya gives an example of a mathematical problem. The problem in question is this: Find the diagonal of a rectangular parallelepiped if the length, width, and height are known.

He asks the reader to consider the auxiliary problem of finding the diagonal of the right triangle using Pythagoras theorem. I am telling you this because the solution to this problem is very clear; I can recall it even long after I finished reading. I do not feel the same about algebra problems. I solve them, do the obvious things, and then I almost immediately forget. Does that happen to you? If not how do you remember the solution? I just want to know if you find these algebra problems as unintuitive as I do.

My memory is as bad as anyone’s! I replied,

I wouldn’t say that I remember every solution I’ve done, or every solution I’ve read. The example you give is a classic that stands out, particularly the overall strategy. Others are more ad-hoc and don’t feel universal (in the sense of being applicable to a large class of problems), so they don’t stick in the memory.

I don’t have my copy of Polya with me (I’ve been meaning to look for it), but I recall that one of his principles is to take time after solving a problem to focus on what you did and think about how it might be of use for other problems. This is something like looking around before I leave my car in a parking lot to be sure I will recognize where I left it when I come back from another direction. I want to fix the good idea in my mind and be able to recognize future times when it will fit.

But even though I do have that habit, there are some problem types that I recognize over and over, but keep forgetting what the trick is. (Maybe sometimes it’s because I’ve seen two different tricks, and they get mixed up in my mind.)

So you’re not alone. For me, though, it’s not so much being unintuitive, as just not being memorable, or being too complex for me to have focused on them enough to remember.

So Polya recognized the likelihood of forgetting (failing to learn from what you have done), and the need to make a deliberate effort there!

We’re looking at extended discussions of a single topic, which illustrate how we try to guide a student to a deeper understanding. Here, a student asks how to solve an equation, and Doctor Ian takes him through the whole process, clarifying what it means to solve an equation, and what you do to get there.

It came in 2003, from Vinnie:

Variables, Explained I have problems trying to understand how to answer and work out these types of math problems: 13.7b - 6.5 = -2.3b + 8.3

Doctor Ian started by observing that there are many possible reasons for being unable to solve such an equation, so he asked a diagnostic question:

Hi Vinnie, Is it the variables that are troubling you, or the use of decimals? For example, would you be able to solve this? 14b - 6 = 2b + 1 Give that a try, and let me know what you come up with. (Show me your work, too, since that will help me figure out how we can get to the next step together.)

This is a good first step to find out what a student needs: Try a simpler problem. This will reveal whether Vinnie makes little mistakes, or big ones. Here’s the response:

First I'd like to thank you for taking time on reading and replying to my message. Lets see... 14b - 6 = 2b + 1 8b = 2b + 1 8b = 3b 24b? I am not sure I understand very well how to answer these types of problems. What would be the first step?

Aha! There are some fundamental errors here; it looks superficially like what we expect to see, but every single detail is wrong. Ian adeptly avoids saying that, though!

He starts at the beginning, with what a variable means, and then what an equation means:

Hi Vinnie, Thanks for getting back to me, and showing your work. That really helps. It looks like the first thing you're having trouble with is what it means when we write something like 14b - 6 The b is a variable. That is, it stands for some number whose value we're trying to find. When we write a number and a variable, or a number of variables, together without any operator, multiplication is implied. So '14b' means 14 times (whatever b is) If the value of b turns out to be 3, the value of '14b' is 14*3 = 42 If the value of b turns out to be 5, the value of '14b' is 14*5 = 70 Does that make sense? So when we have something like 14b - 6 we can't really do the subtraction, because we don't know what the value of b is.

Thinking of a variable as standing for a specific number is important; and putting specific numbers in place of it goes a long way to demonstrate that idea. We’re also getting at Vinnie’s first big error, namely combining unlike terms (the 14*b* and the 6). Students often hear about “combining like terms” and may learn the rules, but do they always understand *why* they can’t combine *unlike* terms?

So let's look at a very simple equation, like 3b + 2 = 11 This says: There is some number whose value we don't know yet. Call that value 'b'. If we multiply b by 3, and add 2 to the product, we get 11.

So the equation is a **sentence**, saying that this number is equal to that number. We want to find what value *b* must have, if the sentence tells the truth.

Usually the goal in a case like this is to find the value of b that makes the equation true. We could try to find it by guessing: Does b = 0? 3*0 + 2 = 11 0 + 2 = 11 2 = 11 This is a false statement, so the value of b is _not_ zero. Does b = 1? 3*1 + 2 = 11 3 + 2 = 11 5 = 11 This is also a false statement. Is there something we can do that's quicker than trying possible values? (What if the correct value turns out to be 1058? That will take us a long time to find!)

And if the answer turned out to be -2.357, we might *never* get to guessing that! But by “guessing and checking” (trying out the equation with specific values), we are again focusing on what the equation *means*, which is essential.
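
Guessing and checking is easy to hand to a computer, which also shows its limits. In this Python sketch the function name and the search limit of 100 are assumptions for illustration. It tries only whole numbers from 0 upward, so it could never find an answer like -2.357.

```python
def guess_and_check(limit=100):
    """Try b = 0, 1, 2, ... and return the first value with 3*b + 2 == 11."""
    for b in range(limit + 1):
        if 3 * b + 2 == 11:
            return b
    return None  # nothing in the tested range worked

print(guess_and_check())  # 3
```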

Now (still not following *rules* someone has taught us, but just *thinking*), we can find a solution by working backward:

We can reason this way. Looking at 3b + 2 = 11 we can think of it as something + 2 = 11 which means that something = 11 - 2 = 9 Does that make sense? Then since our 'something' is just 3b, we know that 3b = 9 Now, we can think of _this_ as 3*something = 9 which means that something = 9/3 = 3 But our 'something' is just b. So we know that b=3.

So, we ask, “What plus 2 is 11?” and answer, “It’s 11 – 2, which is 9.” Then we ask, “What times 3 is 9?” and answer, “It’s 9 divided by 3, which is 3.”

There’s one more really important thing about what it means to solve an equation: Since solving means finding the value of the variable that makes the equation true, we can **check** our answer by seeing if it makes the equation true:

Let's check that by substituting 3 for b in the original equation: 3b + 2 = 11 3*3 + 2 = 11 11 = 11 And this is true. So we've found the value of b that we were looking for.

So, 3 × 3 is 9, and 9 + 2 is 11, which is just what we wanted.

I’m often amazed, both as a Math Doctor and in face-to-face tutoring, how many students don’t know how to check their answers – which means they’ve missed the whole point of solving!
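
Working backward and then checking can be sketched together in a few lines of Python. The function `work_backward` and its parameter names are hypothetical. It handles only equations of the form (multiplier × b) + added = total, such as 3b + 2 = 11.

```python
from fractions import Fraction

def work_backward(multiplier, added, total):
    """Solve multiplier*b + added = total by undoing the steps in reverse order."""
    something = Fraction(total) - added      # "what plus `added` is `total`?"
    return something / multiplier            # "what times `multiplier` is that?"

b = work_backward(3, 2, 11)
print(b)                   # 3
print(3 * b + 2 == 11)     # the check: substitute back into the original equation
```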

Now it’s time to see how Vinnie is doing:

Were you able to follow all this? If so, then try to find the values that make the following equations true: 5x - 4 = 1 4x + 7 = 23 3x + 5 = 2 Let me know what you get, and then we can go to the next level of complexity, okay?

Vinnie’s response shows that he understands at least what it means to solve an equation, and to check it; but he’s not quite there yet:

I think I'm following. You are trying to find the missing number that makes the things in the equation equal, right? 5x - 4 = 1 Here x is nothing because 5-4 is already 1. So the x confused me here, unless x = 1. Then I would understand because 5(1)-4 = 1, so x equals 1, right? 4x + 7 = 23 4(4) + 7 = 23 ----> x = 4 because 4*4 + 7 = 23 Did I get it? 3x + 5 = 2 In this one I don't understand how you can get it equal to 2. But so far so good. I understand what you mean.

The idea that *x* is “nothing” is interesting; he seems to mean that if you just drop the *x*, then 5 – 4 = 1 is true. And, in fact, setting *x* to 1 does exactly that, as 5*x* is just 5.

Doctor Ian continued with the concept of variables, to make sure a good foundation had been laid for the next step. (Keep in mind that this is 2003.)

Hi Vinnie, You're right, we _are_ looking for the number that makes each equation true. It's a little like what we could do with sentences. Suppose we make a sentence with a blank in it: _______ is President of the United States. There are lots of things we could put in the blank that would make the sentence false: Bob Dole is President of the United States. Homer Simpson is President of the United States. Jennifer Lopez is President of the United States. But there is at least one thing we could put in to make it true: George Bush is President of the United States. A variable is like that. You can think of it as a 'hole' or a 'blank' in a sentence involving numbers. So when we write 5x - 4 = 1 we mean 5*___ - 4 = 1 and we want to find what goes in the blank to make the sentence true.

Blanks could be all we need, *if* we never had more than one unknown:

Why use letters instead of blanks? Mostly because there are times when we need to fill in more than one blank, e.g., ___ is President of the United States, and ___ is Vice-President. and we want to be able to tell the different blanks apart, so we give them names. And being lazy, we make the names as short as possible - usually a single letter: X is President of the United States, and Y is Vice-President. Y is older than X. Note that this only works if we have X = George Bush Y = Dick Cheney It doesn't work if we assign them the other way. If it helps, you can rewrite single letters as words, if that will make it easier to remember what role they're supposed to be playing. That is, when someone writes 5x - 4 = 1 you could immediately rewrite that as 5*something - 4 = 1 Eventually, you'll get used to thinking of letters as representing unknown quantities, but there's no law of nature that says an unknown quantity has to be represented by a single letter, or that if you use a single letter it has to be x.

In computer programming, variables are usually whole words, or even phrases, because there are so many of them to keep track of.

Doctor Ian next confirmed the correct answers to the first two problems, then dealt with the last, 3*x* + 5 = 2, using the same “something” approach as before:

This one was tricky. In this case, the value has to be a negative number. Here's one way you might think of it: 3x + 5 = 2 something + 5 = 2 -3 + 5 = 2 so something = -3 3x = -3 3 * something = -3 so something = -1 x = -1 If we put the value back in the equation, we get 3(-1) + 5 = 2 -3 + 5 = 2 2 = 2 which checks.

That was not really harder technically; but the negative number gave it a different feel. Often students are troubled when the answer to a problem is not as nice as the answers to previous examples, and they think they must be wrong.

I put the negative case in there to make a point. In algebra, you start out with equations where we can often find the right value by making a few guesses. But quickly we get to equations where there might be zillions of possible guesses - for example, where the value we're looking for could be large or small, positive or negative, an integer or a fraction or even an irrational number.

Never expect things to be nice! We learn algebraic techniques initially in cases where the algebra seems hardly necessary; but we are preparing for the real world, which is not so nice. When a teacher always gives nice examples, it is a disservice to the students, not a kindness in the long run.

It’s time now to start heading toward standard methods, in order to make this solving process more routine. For this, Doctor Ian moves on to somewhat harder problems, with the variable on both sides:

For those equations, it's helpful to have some rules that we can follow to help us proceed from something that looks like 16b - 6 = 2b + 1 to something that looks like b = 1/2 And for the most part, there are really only two rules. The first is that we can never, ever, for any reason, divide by zero. The second is that whatever we do to one side of an equation, we can do to the other side without changing the truth of the equation. For example, if we have an equation like 16b - 6 = 2b + 1 this tells us that, for some value of b, the quantity on the left is equal to the quantity on the right. Does that make sense? Well, if we have two things that are equal, and we add 4 to each of them, they'll still be equal, right? So let's do that: 16b - 6 + 4 = 2b + 1 + 4 If we multiply both of them by 11, they'll still be equal, right? 11(16b - 6 + 4) = 11(2b + 1 + 4)

A key idea here, which many students miss at first, is that the “something” that we do to both sides must be an operation we perform on **the value of each entire side**. The two examples here are *adding a number to each side*, and *multiplying each side by a number*. We can’t do something like *multiplying one term of each side* by the same number.

This tells us **what we are allowed to do** – what “tools” we have available. But the examples above didn’t actually accomplish anything, did they? The other thing we need to know is **how to decide what to do** – that is, what “tool” is useful at a certain point in the process.

Now, it turns out that most things like this that we might try aren't all that helpful. They just make things more complicated. But in any given situation, there will usually be a few adjustments that will help us turn what we have into something simpler. For example, if I start with 16b - 6 = 2b + 1 and add 6 to both sides, I end up with something simpler: 16b - 6 + 6 = 2b + 1 + 6 16b + 0 = 2b + 7 16b = 2b + 7 That's simpler than what we started with.

Since the final goal (an equation saying what the value of the variable is: *b* = 1/2) is simple, anything that **makes the equation simpler** (but still true) is useful. It’s like trying to find water in the woods by always hiking downhill.

And what made this particular choice (adding 6) useful? It reduced the number of terms in the equation. Note that we aren’t following a set routine yet; this isn’t necessarily “the right” thing to do, or even “the best” – it’s just a useful thing to do, and that’s enough. You don’t need to memorize a routine (though you probably will, eventually, just by doing the same thing many times). The emphasis here is on doing something helpful.

We can do something similar by subtracting 2b from each side. (We don't know what b is, but it's some number, and so 2b is also a number, and we can subtract the same number from each side.) 16b - 2b = 2b - 2b + 7 Now, on the right side, whatever b is, 2b minus 2b is zero. So the right side simplifies to 16b - 2b = 0 + 7 16b - 2b = 7

What was it that made this useful? We still have the same number of terms, so in a sense it isn’t simpler; but we now have the variable on only one side, which is another characteristic of our goal, which will have only *b* on the left side.

In addition, it sets us up to use another important tool:

Now, what about the left side? To simplify the left side, you need to use the distributive property of multiplication over addition, which sounds kind of complicated, but it's actually much simpler than its name would imply; and in algebra, it's one of the best friends you can have. If you're not familiar with the distributive property, take a moment to read this: Distributive Property, Illustrated http://mathforum.org/library/drmath/view/52842.html Let's try to use the distributive property on the left side of our equation: 16b - 2b = 7 First, we find a factor that they have in common. How about b? b(16 - 2) = 7 Make sure you understand what I did here, because it's one of the keys to solving just about any algebra problem. Anyway, now we can do the subtraction: b(14) = 7 14b = 7 Now, eventually you'll learn to look at things like 16b - 2b and immediately simplify that to 14b but it's good to know _why_ you can do that, i.e., that you're really just applying the distributive property.

The **distributive property** says that a multiplication with a sum, \(a(b + c)\), can be rewritten as a sum of multiplications, \(ab + ac\). We just multiply each term in the sum by the multiplier, “distributing” the multiplication. Here, we did the reverse: combine a sum by removing a common factor from each term and multiplying the resulting sum by it: \(ab + ac = a(b + c)\). In the particular situation we have here, where the common factor is the variable, we call the process **combining like terms**: \(ax + bx = (a + b)x\).

I like to think of what we just did this way: If you have 16 “bees” and take away 2 “bees”, there are 14 “bees”. So it’s just common sense. But it also has those fancy names.

Back to Doctor Ian:

So now we have something pretty simple: 14b = 7 At this point, we might just guess the answer. But we don't have to. We can divide both sides of the equation by 14. And that gives us 14b/14 = 7/14, so b = 1/2 I know that this has been a lot for you to plow through. I'm sorry about that, but this is _all_ stuff that you need to know if algebra is going to make any sense.

I find that a lot of students, if they decide just to guess, tend to say that *b* is 2 instead of 1/2. They see the 14 and the 7, and they jump to a conclusion. That’s why it’s useful to explicitly go through this division step, undoing the multiplication by dividing both sides, to make sure everything ends up in the right place. That’s also why you need to check your answer – and checking is all the more important when you are most confident (so that you may have rushed), or when you least want to check (because the answer is a fraction, and you want to just write it down and run).
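The whole sequence (clear the constant from one side, collect the variable on the other, combine like terms, divide) can be sketched as one small routine. This assumes an equation of the form a·x + b = c·x + d; the name `solve_linear` is made up for this illustration, and it is one reasonable ordering of the steps rather than the only one.

```python
from fractions import Fraction


def solve_linear(a, b, c, d):
    """Solve a*x + b = c*x + d by doing the same thing to both sides."""
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    # Subtract b from both sides:        a*x = c*x + (d - b)
    constant = d - b
    # Subtract c*x from both sides and combine like terms:
    #                                    (a - c)*x = d - b
    coefficient = a - c
    if coefficient == 0:
        # Rule one: never divide by zero.
        raise ValueError("the variable cancels; there is no unique solution")
    # Divide both sides by the coefficient.
    return constant / coefficient


b = solve_linear(16, -6, 2, 1)
print(b)  # 1/2

# Check by substituting back into the original equation.
assert 16 * b - 6 == 2 * b + 1
```

Because the division is done explicitly, the routine lands on 1/2 rather than on the tempting wrong guess of 2.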

Doctor Ian summarized what he’d said, and then did a final check:

Now take another crack at your original problem, 13.7b - 6.5 = -2.3b + 8.3 and show me the steps you take. This is very similar to the problem we just went through. Only the numbers are different.

Vinnie successfully solved the original problem, explaining the work much as Doctor Ian did, but displaying the work as teachers often do:

So it is like this:

1) 13.7b - 6.5 = -2.3b + 8.3 ---> You add 6.5 to both sides. This cancels on one side, and gets added to 8.3 on the other: 13.7b = -2.3b + 14.8

2) 13.7b = -2.3b + 14.8 ---> You add 2.3b to both sides. This cancels on one side, and gets added to 13.7b on the other: 16b = 14.8

3) 16b/16 = 14.8/16 ---> You divide both sides by 16. This cancels on one side, and on the other you get 0.925.

4) b = 0.925 ---> Nothing left to do.

And, of course, to finish up, we should check the answer:

Left-hand side: \(13.7b – 6.5 = 13.7(0.925) – 6.5 = 6.1725\)

Right-hand side: \(-2.3b + 8.3 = -2.3(0.925) + 8.3 = 6.1725\)

Since these are equal, the solution is correct.
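The same check can be done by machine. The sketch below uses Python's `decimal` module rather than ordinary floating-point numbers, on the assumption that we want the decimal arithmetic to be exact; the variable names are just for illustration.

```python
from decimal import Decimal

b = Decimal("0.925")

left = Decimal("13.7") * b - Decimal("6.5")    # left-hand side of the equation
right = Decimal("-2.3") * b + Decimal("8.3")   # right-hand side of the equation

print(left, right)    # 6.1725 6.1725
print(left == right)  # True
```

With plain floats the two sides could differ in the last binary digit, so an exact comparison like this is safer with `Decimal` or `Fraction`.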

Seeing a student solve his own problem with no direct help is our goal in tutoring. We seek not to give answers, but understanding. And this whole interaction is typical of the best we do.

Last week we looked at a question about a triangle inscribed in a semicircle. Not long after that question, the same student, Kurisada, asked a question about a triangle inscribed in a circle, which had some connections to the other. As we enjoy doing, we led the student through several possible approaches to a solution. It also illustrates a situation where different methods can lead to what appear to be entirely different answers, yet they may be identical.

Here is the new problem, from the very end of last December:

A circle O is circumscribed around a triangle ABC, and its radius is r. The angles of the triangle are CAB = a, ABC = b, BCA = c. When a = 75°, b = 60°, c = 45° and r = 1, the length of sides AB, BC, and CA are calculated as ____, ____, ____ without using trigonometric functions.

Here is a picture showing all the information we have:

Using trigonometry, we could find the sides if we knew one of them; but the only length we have is the circumradius (the radius of the circumscribed circle). With no formula for this radius, and no trigonometry, how are we to do this?

Since all we were given was the problem, Doctor Rick responded with just a hint, and the usual request to see work:

Hi, Kurisada. This is another interesting problem!

As a start, I suggest constructing the radii OA, OB, and OC, and determining the interior angles of the triangles AOB, BOC, and COA. I can think of several ways to do this.

Many of the angles you will now find in these three triangles will be familiar angles that you know how to work with. The most challenging may bring to mind one of the problems we have discussed with you before. See what you can do now.

Drawing in the radii, as I already did above, is a standard first step, as they must be involved in the solution. Doctor Rick’s work, as suggested, involved a triangle similar to one from last week’s problem, but that is not the only way.

Kurisada replied:

I found that AOB is 90° and thus, AB is √2.

I’ve also found another angle but I wasn’t able to find AC and BC without using trigonometry ratio.

Triangle AOC has the angles 120°, 30°, and 30°.

And triangle BOC has the angles 150°, 15°, and 15°.

Did I miss something?

Here is what we have now:

As Doctor Rick said, there are several ways to have found these angles; one is to use the fact that a central angle is twice the inscribed angle, so that for instance ∠AOB = 2∠ACB = 90°. Since the triangle is isosceles, the other angles are both 45°.

Doctor Rick replied, having only started work on actually solving the problem himself, but adding more hints on the harder two triangles:

You’ve done well so far. You’ve got the easiest side, AB.

For side AC, consider that triangle AOC is isosceles, and construct the altitude to AC.

Side BC is the most challenging part that I mentioned. Notice that when you construct the altitude to BC, you’ll have the same right triangle that turned out to be the answer in the triangle-in-a-semicircle problem: 15-75-90. Applying things we learned there can help us find the area of triangle BOC pretty easily, but I’m not sure how much that helps. Let’s both work on this!

Here is the figure with those two altitudes added; the first yields 30-60-90 triangles, which are easily solved, and the second gives the triangles we saw in the other problem:

I had another idea, and jumped in briefly:

Here is an alternative: Having found AB, construct the altitude from A to BC. Several things work out nicely.

Doctor Rick by now had finished his work, and added:

I found a fairly simple way to complete the work I started … it involves extending BO to the other side of the circle and constructing the perpendicular from C to this line.

Then, recall our work on the triangle in a semicircle, and construct the radius OC as well, which makes another 30-60-90 triangle.

However, my solution has nested square roots, whereas Doctor Peterson’s solution has a sum of square roots. It can be shown that the two solutions are equal, but his is “nicer” — we don’t really like nested roots.

From here on, the actual interaction mingled work on the two approaches in a way that is very hard to follow, so I am going to break with tradition and untangle these into two separate threads. (It was not easy, especially because there were also several typos and consequent confusion to edit out.) First, we’ll follow the discussion of Doctor Rick’s idea.

Kurisada now showed his work:

Focusing on the doctor’s statement about 30-60-90, then I thought that there is a fixed ratio of the sides of 30-60-90 triangle.

I searched it and I found the ratio 1 : √3 : 2.

I wrote the perpendicular point from C to line BO after extended as Y (sorry for my bad English in this, but I attached the picture below).

And I take the triangle COY with angles 30-60-90.

Since OC = 1, then OY = (√3)/2, and CY = 1/2.

Then I take triangle BCY.

CY = 1/2 and BY = 1 + (√3)/2.

Then using Pythagoras Theorem,

I got BC = √(2 + √3).

The key answer shows that BC = (√6 + √2)/2. I wonder if I did anything wrong.

I also wonder if what doctor wanted to tell me is as above or not.

I also tried to apply about my previous problem (triangle inside a semicircle), but I can’t find something to apply to this problem especially the non-trigonometry one. Can doctor give me a little more clue?

Nothing is wrong. Kurisada has done well, and as mentioned earlier, the answers are equivalent.

Doctor Rick replied (using a picture I’ve replaced with one of my own to correct an error):

Here is my figure for this solution method:

There are several ways to prove that angle COY is 30°.

“Focusing on the doctor’s statement about 30-60-90, then I thought that there is a fixed ratio of the sides of 30-60-90 triangle.

I searched it and I found the ratio 1 : √3 : 2.”

I had assumed you were already familiar with this fact, as we used it in discussing the previous problem with you. It is easily derived by starting with an equilateral triangle and constructing an altitude (which is also a perpendicular bisector and an angle bisector). This forms two 30-60-90 triangles. The side opposite the 30° angle is half of a side of the equilateral triangle, and hence half of the hypotenuse of the 30-60-90 triangle. The length of the remaining side follows via the Pythagorean Theorem.

“And I take the triangle COY with angles 30-60-90.

Since OC = 1, then OY = (√3)/2, and CY = 1/2.

Then I take triangle BCY.

CY = 1/2 and BY = 1 + (√3)/2.

Then using Pythagoras Theorem, I got BC = √(2 + √3).

The key answer shows that BC = (√6 + √2)/2.

I wonder if I did anything wrong.”

No, you haven’t done anything wrong. As I said last time, this method results in an answer with a nested square root — exactly what you found, √(2 + √3) — while Doctor Peterson’s method gives a sum of roots — as your answer key does, (√6 + √2)/2. And I said that these can be proved to be equal, but this is far from obvious at first! I suppose, therefore, that the answer in the key was obtained by something more like Doctor Peterson’s method.

“I also wonder if what doctor wanted to tell me is as above or not.”

You did fine using this method. If you finish the work by Doctor Peterson’s method, you should obtain the book’s answer.

We’ll get to the direct route to the answer \(\frac{\sqrt{6}+\sqrt{2}}{2}\); but in order to see that the two answers are equal, that is, that $$\sqrt{2 + \sqrt{3}} = \frac{\sqrt{6}+\sqrt{2}}{2},$$ we can just square both sides (having observed that both sides are positive, so that squaring does not lose information): On the left, $$\left(\sqrt{2 + \sqrt{3}}\right)^2 = 2 + \sqrt{3},$$ while on the right, $$\left(\frac{\sqrt{6}+\sqrt{2}}{2}\right)^2 = \frac{6 + 2\sqrt{6}\sqrt{2} + 2}{4} = \frac{8 + 2\sqrt{12}}{4} = 2 + \sqrt{3}.$$ So the two sides are in fact equal.
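A quick numerical comparison is a useful sanity check before (or after) doing the algebra. This is only a floating-point sketch, so it supports the equality rather than proving it; the squaring argument above is the actual proof.

```python
import math

nested = math.sqrt(2 + math.sqrt(3))          # Doctor Rick's form of BC
summed = (math.sqrt(6) + math.sqrt(2)) / 2    # the answer key's form of BC

# Compare within floating-point tolerance rather than with ==.
print(math.isclose(nested, summed))  # True
```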

Continuing,

“I also tried to apply about my previous problem (triangle inside a semicircle), but I can’t find something to apply to this problem especially the non-trigonometry one. Can doctor give me a little more clue?”

In my non-trig solution to that other problem, I constructed the radius equivalent to OC in this problem. If, in figure (b), we give the name F to the other intersection of BO extended with the circle, and construct FC, then triangle FCB is just the triangle inscribed in the semicircle of the other problem. It is a 15-75-90 triangle; its altitude OE is half the radius of the circle, as we discussed in that problem (as this makes the area of FCB half the maximal area of an inscribed triangle). Thus this new problem is nearly the reverse of the previous problem: there we needed to determine the angle FBC knowing the base and altitude of the triangle, whereas now we know the angles and need to determine the side lengths.

So Doctor Rick’s method gives a correct answer, and ties into what we looked at last week.

Now let’s look at the discussion of my method, which was interlaced with that. Kurisada said:

I drew the altitude AD, and found that AD = DC since ADC is 90°, 45°, 45°.

But, I also did : BD x CD = AD^2, resulting BD = AD which I think is impossible as the angles are 90°, 60°, 30°.

I also tried to do AC ÷ AB = DC ÷ AD, but it resulted AC = AB which I think is also impossible due to the same reason as above.

Rick replied:

“I drew the altitude AD, and found that AD = DC since ADC is 90°, 45°, 45°.”

Correct.

“But, I also did : BD x CD = AD^2, resulting BD = AD which I think is impossible as the angles are 90°, 60°, 30°.”

The geometric mean property we discussed earlier [in the semicircle problem] applies only to a right triangle; ABC is not a right triangle. Or am I misunderstanding what you did here? The angles you cite are for triangle ABD.

But that, in fact, is exactly what Doctor Peterson was getting at (in part) — you can use the side ratios for a 30-60-90 triangle to determine your BD, and the side ratios of a 45-45-90 triangle to determine your CD. (This is after you’ve determined AC and AB as you indicated earlier.)

“I also tried to do AC ÷ AB = DC ÷ AD, but it resulted AC = AB which I think is also impossible due to the same reason as above.”

Presumably you are still talking about the theorem about a right triangle, in which there are three similar right triangles. That doesn’t apply here.

Kurisada answered:

I didn’t realise about the fact that the geometric mean is only applicable to right angle so what I did is wrong.

Rick answered (again, I had to replace his picture with one that is labeled correctly):

Doctor Peterson gave you a link to Wikipedia which calls the theorem the “right triangle altitude theorem or geometric mean theorem”. It’s important to be aware of the givens when you seek to apply a theorem!

Here is the figure for this solution:

Here, D is the foot of the perpendicular from A to BC, as Doctor Peterson had in mind. It should be obvious that triangle ABD is a 30-60-90 triangle, since angle ABD = ABC is given as 60° and ADB is a right angle; and also obvious that triangle ACD is a 45-45-90 (right isosceles) triangle since angle ACB = ACD is given as 45°. Now, early on, we discussed finding the lengths of AB and AC, so you should know those — do you? You said AB = √2, which is correct; perhaps you never finished finding AC.

Here’s what I said in my second message about that: “For side AC, consider that triangle AOC is isosceles, and construct the altitude to AC.” What do you find? I hope you’ll recognize two more of those 30-60-90 triangles that I had assumed you already understood.

Then CD = AC/√2, and BD = AB/2, by the side ratios for the two “special triangles”. Add these and you’ll get the length of BC, which is what we’re looking for.

Let’s finish the work. Here is a picture with that altitude to AC, OE:

From triangle CEO, we see that \(CE = \frac{\sqrt{3}}{2}\), so $$AC = \sqrt{3}.$$ Then, going back to the previous picture, from triangles CAD and BAD we have \(CD = \frac{AC}{\sqrt{2}} = \frac{\sqrt{3}}{\sqrt{2}} = \frac{\sqrt{6}}{2}\), and \(BD = \frac{AB}{2} = \frac{\sqrt{2}}{2}\), so $$BC = BD + CD = \frac{\sqrt{6}+\sqrt{2}}{2}$$ as before. Those are our final answers.
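As an independent check on all three sides, we can place the vertices on a unit circle and measure the distances numerically. The placement below (A at 0°, B at 90°, C at 240°) is an assumption chosen to reproduce the central angles found earlier (AOB = 90°, AOC = 120°, BOC = 150°). It does use sine and cosine, so it is a check on the answers, not a replacement for the non-trigonometric solution.

```python
import math


def point_on_unit_circle(degrees):
    """Coordinates of the point on the unit circle at the given angle."""
    angle = math.radians(degrees)
    return (math.cos(angle), math.sin(angle))


A = point_on_unit_circle(0)
B = point_on_unit_circle(90)    # central angle AOB = 90 degrees
C = point_on_unit_circle(240)   # central angles AOC = 120, BOC = 150 degrees

AB = math.dist(A, B)
CA = math.dist(C, A)
BC = math.dist(B, C)

print(math.isclose(AB, math.sqrt(2)))                       # True
print(math.isclose(CA, math.sqrt(3)))                       # True
print(math.isclose(BC, (math.sqrt(6) + math.sqrt(2)) / 2))  # True
```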

Now Kurisada was satisfied:

Thank You very much Doctor Rick!

Now I understand about this problem

And sorry because it took too long time

The discussion concluded on January 1.

We do not mind taking time over a problem; we like going deeper to make sure a student understands the concepts fully. Thanks for sticking with this, and have a happy New Year!

…

The question is from 2017:

More Methodical Than Guessing It seems like guessing is involved in many operations that we do in maths. For example, in finding square roots, we make guesses as to what they could be. In factorisation, we think of numbers that can multiply to be another, in a guessing kind of way. Another example is in solving the general cubic equation. One important step is to simply rearrange an algebraic expression to another, more complex one -- just because it works. (How do we know we should re-arrange it that way and not in any other way?) While I only have a high school level of mathematical experience, I love learning about more advanced concepts. But all this guessing is difficult for me to master, especially in complex problems. In middle school, I was very good in geometry, but bad in algebra where, for example, we factorised expressions in a guessing kind of way. All of which leads me to two questions. Are there better options for these mathematical operations? Can we classify mathematical operations into the categories of "guess it first" on the one hand; and on the other hand, "Do steps 1, 2, 3, ... and then you will get the solution directly"?

Interesting question! Are there “guess operations” and “systematic operations”? I answered:

Hi, Egan. There are a number of different things that can be said about this. My first thought is that we need to distinguish between an "operation" (in the sense of what we want to accomplish; for example, find the square root) and an "algorithm" or process or method to perform the operation or otherwise reach the goal (do this, then that, ...). There are usually multiple ways to accomplish a task. Some may involve guessing, while others may not; some may be more efficient than others, while others may be easier to explain. Sometimes guessing turns out to be the most efficient method, even though non-guessing methods are available. Guessing is not inherent to the operation, itself -- just a feature of a particular method you may choose to use. So we can't sort "operations" according to whether they involve guessing.

So we may choose to solve a problem by guessing even though we don’t have to; on the other hand, some things can’t (reasonably) be solved by guessing. We’ll see some examples later.

Second, "guessing" can take several very different forms. You are familiar with algorithms of the "guess and check" variety, where you make a guess, see whether it works, and then make a new guess based on the outcome. Sometimes the "guess" will almost always work, and the "check" is mostly a matter of continuing on with your work. (I am thinking of long division; longhand square roots are similar, once you have the necessary experience.)

In these examples, we don’t merely make a wild guess, but rather estimate intelligently, and the check is inherent in the work. (If we get too large a product, we know exactly what to do to fix it.)

Other processes, such as some factoring techniques, can be done by a quick "guess" (insight) that turns out to be correct, or by systematically trying all possible answers (e.g., all pairs of factors of a given number), with the conclusion being either the answer or the knowledge that there is no answer. (Pure "guess and check" can never determine whether you have just missed an answer.) Such a procedure may be done blindly (just plodding through the list) or intelligently (using clues to know what to skip).

I’ll be demonstrating this.

Still other "guess-like" processes are not really guesses at all, but successive approximations to the correct answer, which may not be possible to find exactly. (Here I am thinking of the "divide and average" method for finding a square root.)

Here we are actually following a routine procedure, starting with a reasonable “guess”, that promises to improve the “guess” at every step.
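Here is a minimal sketch of "divide and average" in Python. The function name, the default of six rounds, and the assumption that both the number and the starting guess are positive are choices made for this illustration.

```python
def divide_and_average(n, guess, steps=6):
    """Approximate the square root of n, starting from a positive guess."""
    for _ in range(steps):
        # Divide n by the current guess, then average the two numbers.
        guess = (guess + n / guess) / 2
    return guess


approx = divide_and_average(2, 1)
print(abs(approx * approx - 2) < 1e-12)  # True
```

Only the starting value is a guess; every later step is a fixed routine that moves the estimate closer to the true root.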

Let's consider some examples. I've mentioned two different methods for finding a square root, and there are a couple others. This illustrates my first point. One method ("longhand") involves guessing for each digit (which should rarely be far off). Another ("divide and average") starts with a "guess," but goes on from there to repeatedly improve the guess, with no more guessing needed: Square/cube roots without a calculator http://mathforum.org/dr.math/faq/faq.sqrt.by.hand.html

I showed these techniques and others in my post Evaluating Square Roots by Hand.

Long division as normally done involves guessing (I'd rather call it estimating) quotient digits. But you could do it with no guessing at all: Just make a list of multiples of the divisor, and you can look up the closest multiple to the dividend. (We guess to save time, not because it is needed!) Or, you could do what a computer or calculator does: convert to binary, divide in binary (which requires no guessing), and convert back. The method with guessing is easier and faster for humans, but not the only way. See this page: Long Division, Egyptian Division, Guessing http://mathforum.org/library/drmath/view/58858.html

Egyptian division is equivalent to dividing in binary; there, each quotient digit is either 0 or 1, and it is always obvious which to use.

And when I first taught my oldest son to do long division, he made a table of multiples of the divisor, as I suggested here, so that he never had to guess.
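The table-of-multiples idea can be sketched as follows. The function name is hypothetical, and the sketch assumes a non-negative whole-number dividend and a positive whole-number divisor.

```python
def long_division_with_table(dividend, divisor):
    """Long division in which each quotient digit is looked up, not guessed."""
    multiples = [divisor * d for d in range(10)]  # table of 0x through 9x
    quotient_digits = []
    remainder = 0
    for digit in str(dividend):
        remainder = remainder * 10 + int(digit)   # bring down the next digit
        # Largest table entry that does not exceed the current remainder.
        q = max(d for d in range(10) if multiples[d] <= remainder)
        quotient_digits.append(str(q))
        remainder -= multiples[q]
    return int("".join(quotient_digits)), remainder


print(long_division_with_table(7489, 23))  # (325, 14)
```

The lookup replaces the estimate at each step; the rest of the layout is ordinary long division.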

Factoring can be done using methods that require various amounts of guessing. Looking in our archives for examples that discuss guessing, I found one that shows a traditional trial and error method with lots of guessing: Factoring Trinomials: 9x^2 - 42x + 49 http://mathforum.org/library/drmath/view/61570.html There, Doctor Ian refers to another page in which he shows how to factor by completing the square, or using the quadratic formula, each of which entirely bypasses guessing (and, unlike the other method, even works when the factors do not involve integers): Factoring Quadratics Without Guessing http://mathforum.org/library/drmath/view/60700.html There is also a commonly-taught method of "ac-grouping," which uses a simpler kind of guessing: Factorization by Decomposition and the Distributive http://mathforum.org/library/drmath/view/77809.html But when I saw the title of the first of these, I immediately saw that 9x^2 - 42x + 49 could be factored using only a single "guess": recognizing that the first and last terms are both squares, we can guess that it MIGHT be a perfect square, of the form (a - b)^2 = a^2 - 2ab + b^2. In that case, the factorization has to be (3x - 7)^2. Then all I have to do is check that by multiplication, and I can determine that it's correct. (If the guess had been wrong, I'd have switched to one of the other methods.)

This last example involves “guessing that” it is a perfect square, not “guessing a number” — a very different kind of guess.

Do you see my point here? There are MANY ways to factor a quadratic, which can range from no guesses, to one guess, to potentially hundreds of guesses. The optimal strategy is to first "guess" what might be the most efficient method for a particular problem, and then do it. The methods with fewer guesses involve various levels of complexity and risk, as a trade-off to the challenge of guessing.

My standard advice for solving a quadratic equation is to stare at it for ten seconds and decide whether you want to go to the trouble of the usual method (that is, whether guessing pairs of numbers will be efficient), and if not, “Use the formula, Luke!”
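To make "systematically trying all possible answers" concrete, here is a sketch of the search behind the ac-grouping method: look for two integers whose product is a·c and whose sum is b. The function name is made up, and the sketch assumes integer coefficients with a·c not equal to zero.

```python
def find_ac_pair(a, b, c):
    """Search all integer pairs (m, n) with m*n == a*c and m + n == b."""
    product = a * c
    for m in range(-abs(product), abs(product) + 1):
        if m != 0 and product % m == 0:   # m is a divisor of a*c
            n = product // m
            if m + n == b:
                return m, n
    return None  # the list is exhausted, so no integer pair exists


print(find_ac_pair(9, -42, 49))  # (-21, -21)
print(find_ac_pair(1, 1, 1))     # None
```

For 9x^2 - 42x + 49 the pair (-21, -21) splits the middle term as -21x - 21x, which groups into (3x - 7)^2. A result of `None` is the "knowledge that there is no answer" mentioned earlier, something that unsystematic guessing can never deliver.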

Your other example was the cubic equation. I suppose you have read something like this: Cubic and quartic equations http://mathforum.org/dr.math/faq/faq.cubic.equations.html There is actually a formula for this that requires no guessing. Far beyond the scope of high-school algebra, it takes up half a page, at least. I have never tried to master it. Alternatively, you can follow the procedure discussed on our page, which does not involve guessing, but is an orderly process.

On the other hand, we often solve cubic equations by the Rational Root Theorem, which involves listing possible rational numbers that could work; this is yet another sort of guessing, where we first “guess” (hope) that there is a rational root, then “guess” each of a restricted number of possibilities, and then, if that didn’t work, use an approximation method.
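To make the "restricted number of possibilities" concrete, here is a small Python sketch of the Rational Root Theorem search. The cubic 2x^3 - 3x^2 - 11x + 6 is a made-up example (it is not from the discussion), and the function names are illustrative only.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients highest degree first) by Horner's rule."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def rational_roots(coeffs):
    """List the rational roots of a polynomial with integer coefficients.

    Assumes the leading and constant coefficients are both nonzero.
    Candidates are +/- p/q, with p dividing the constant term and
    q dividing the leading coefficient.
    """
    candidates = set()
    for p in divisors(coeffs[-1]):
        for q in divisors(coeffs[0]):
            candidates.add(Fraction(p, q))
            candidates.add(Fraction(-p, q))
    return sorted(c for c in candidates if evaluate(coeffs, c) == 0)

# Hypothetical example: 2x^3 - 3x^2 - 11x + 6 = (x - 3)(x + 2)(2x - 1)
print(rational_roots([2, -3, -11, 6]))
```

An empty result (as for x^3 - 2) corresponds to the case where the hoped-for rational root does not exist and an approximation method is needed.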

Still referring to the cubic,

You also allude to the role of guessing in not just the process, itself, but rather in how that process was devised. You may be right that this involved some guessing (somewhat like my perfect square guess above, where I tried something that seemed useful, and it worked). That's a very different kind of guessing, which arises in any non-routine problem-solving. I compare this sort of thinking to finding your way to a goal through a forest you have never seen before. You have to develop an intuition about what might work, based on experience using the mathematical "tools" you have learned. There is often no way around that; that's why we call such a problem "non-routine." And that kind of math may be the most useful in real life: when you learn how to solve problems you have never seen before, you are ready for the real world, where nothing is ever quite what you have been taught! Willingness to try and fail and try something else -- to persevere -- is the most important thing you can learn.

Here, with experience we develop a sense of what may work to solve a problem, perhaps by trying to rework a problem into a form to which we know we can apply familiar methods. In other words, we just try to make the problem look more familiar.

As far as learning to do the guessing part, all I can say (apart from the fact that you can often work around it) is that as you gain experience in any field, you develop a feel for how things work; and that sense helps you make the right guesses much of the time. How that is done depends very much on the particular field you are trying to master. Perhaps we could discuss that in detail for a particular problem, or kind of problem. Show us one that interests you and how you currently try to solve it, and we can suggest ways to do it better.

Egan wrote back with an encouraging compliment:

Thank you very much, sir, for your detailed answer! I have been using this site for about eight years, and I always recommend it to my friends (especially those with more of a mathematical background), as this is so much better than Quora or Mathematics Stack Exchange. :)

We are quite different from those sites, but I like to think we offer something better on occasion.

Like many questions we get, this one can be solved in many ways. We like to guide a student to whatever solution will fit what they have learned; along the way, we may find various additional methods, and side trips into other topics of interest.

This comes from Kurisada, who has sent us many good problems from his (her?) review of many areas of math, at the end of last December:

Hello Dr. Math

May I know the way of the problem 2(2) below?

Incidentally, I find it interesting how the language used in English-language problems from around the world can vary; this helps me not let “errors” in students’ English bother me. For what it’s worth, here is how I would have worded the problem:

Consider a semicircle with diameter AB = 4, with point C on the circular arc.

(1) The maximum possible area of triangle ABC is ___.

(2) If the area of triangle ABC is half of the maximum, and point C is nearer to point A than point B, then angle CAB is ___.

Here is a picture of what the question is asking:

Point \(C_1\) is the position of C that yields the greatest possible area for triangle ABC, which we are first asked for (part 1); the position shown for \(C_2\) yields half of that area. We want to find angle α (part 2). There is a very quick way to find each of these; but our goal is to help the student discover that he can solve a problem himself, not just to give the easy answer.

As we commonly do for questions like this, I asked for Kurisada’s ideas, while giving hints for possible methods:

There are many ways you might work on this problem; that is one of the reasons we ask you to show your own work. Do you want to use coordinate geometry, perhaps, or trigonometry, or calculus? Some or all of these may be helpful, in addition to ordinary geometry; but you have given no hint as to which of them you know.

You imply that you have the answer to part (1). What is your answer? How did you get it? That will play a role in the next part.

The second part probably requires trigonometry. Do you know whether an exact or approximate decimal answer is required?

From part (1) you can find the height of this triangle ABC. Again, there are several ways I can see to find required parts of this triangle in order to determine the angle. Please make a start, so I can see what methods you are comfortable with, and avoid suggesting ways that might be too hard for you; and also tell me what topics you are learning.

Because my focus was on getting information, I hadn’t given a lot of thought to the actual solution, which doesn’t really require trig. Doctor Rick supplemented my ideas:

Since you have labeled this as a geometry problem (and many geometry courses do not include trigonometry, though some do), let me point out that I see a way to solve it without trig. However, you do need to be familiar with the ratios of sides of some “special triangles”, so that you will recognize such a triangle from its ratios.

So, what do you know in this regard, and what have you done so far?

In fact, the solution to the problem (which asks only for an angle, not sides) is even easier than that. But all of the methods discussed will work, and the best way is always the one you see, not the one you will later wish you had seen!

Kurisada responded with the information we wanted, showing work, and also the given answer (which we often need to know in case it is wrong):

- For the first question, my answer is 4.

I did it by Triangle trigonometry ratio (4/sin 90 = x/sin 45).

Then I got x = 2√2.

Then I found the height = 2.

And area = 4.

- In the second question, I’ve tried the same way but I couldn’t get the answer.

I also divided the triangle into two and get sin x = 0.5 {2√(2 – √3) + √(6 – 3√3)} and I stuck there.

I wonder if there is any special ratio in this triangle.

P.S.: the key answer shows that the answer is (5π)/12 or 75°.

I prefer either trigonometry or coordinate geometry to deepen my understanding in these topics.

My trigonometry and geometry ability is still basic to intermediate.

The first solution uses the Law of Sines applied to triangle \(AOC_1\). The same result could be obtained by right angle trigonometry, or by merely recognizing OC as a radius.

The second, presumably using the Law of Sines again, needs further explanation.

I answered:

Thanks for letting us know that trigonometry is available for your solution.

You haven’t defined your x’s or stated where you put C in part 2, so it’s hard to follow your work.

For (1), I just see that the maximum possible height equals the radius, 2, so the area is bh/2 = 4*2/2 = 4. Your long way is valid.

For (2), I’m not sure exactly what you’ve done; but it was probably similar to my first attempt, which was not pretty.

The quick way (probably what Dr. Rick has in mind) is to draw the radius OC (O being the midpoint of AB) and think about triangles AOC and COB. Some simple trigonometry, or mere recognition of a special triangle as in (1), will tell you what ∠AOC is. No more trigonometry than that is needed.

I haven’t tried a coordinate geometry solution; that would probably be similar to my first attempt using trig.

My work for part (1):

Kurisada replied:

I just tried to draw the line OC.

I got AC = 4 cos x from:

AB / Sin C = AC / sin B

4 / sin 90 = AC / sin (90 – x)

And 4 = sin 2x from:

OC / sin A = AC / sin O (the angle O is the same as 180 – 2x )

2 / sin x = 4 cos x / sin (180 – 2x)

And after several similar calculations, I still couldn’t find the answer .

Did I do something wrong?

I answered:

“I just tried to draw the line OC.

I got AC = 4 cos x from:

AB / Sin C = AC / sin B

4 / sin 90 = AC / sin (90 – x)”

You could just use the right-triangle definition of cosine, rather than apply the Law of Sines to right triangle ABC; but this is correct.

“And 4 = sin 2x from:

OC / sin x = AC / sin O (the angle O is the same as 180 – 2x)

2 / sin x = 4 cos x / sin (180 – 2x) “

Here you applied the Law of Sines to isosceles triangle OAC. But this last equation is a tautology, since sin(180 – 2x) = sin(2x) = 2 sin(x)cos(x), so it doesn’t contribute anything. [In effect, you have proved the double-angle identity!]

I think you are relying on the Law of Sines too much. (“To a man with a hammer, everything is a nail!”) I would just use right-triangle definitions of trig functions, and geometry. Focus not on what tools you have, but on what you know about the problem.

You haven’t used the essential fact about the area of triangle ABC. In fact, you haven’t told me where C is, though I think you probably know. Use it!

From that, you can determine what ∠AOC is. (If necessary, draw the perpendicular from C to AB, and use the resulting right triangle.) What is that angle?

You have seen the other main fact you need to know, namely that ∠BOC = 2∠OAC. Good. From that, you can find ∠OAC, which is what you need.

I accidentally omitted the first key step, which locates point C. This can involve drawing an altitude from C, and noting that it must be half the altitude of \(C_1\). Only then would I look at triangle AOC.

Kurisada was using the Law of Sines everywhere. My comment about “a man with a hammer” is a familiar saying: In solving a problem, you need to focus on the needs of the problem, rather than on what methods you have been learning most recently. And yet, the “wrong” tool can still get the job done.

My quicker way to relate AC to ∠A (Kurisada’s *x*), as well as my way to locate point C, involve dropping a perpendicular, so let’s do that now:

Because this triangle has the same base as the first, but half the area, its altitude must be half the radius. Triangle ABC shows that \(b/4 = \cos(\alpha)\), so \(b = 4\cos(\alpha)\); my new little triangle shows that \(1/b = \sin(\alpha)\), so that \(b = 1/\sin(\alpha)\). But neither of these is necessary for the solution.

Hi Doctor Peterson

I found the way and the answer is match with the key answer.

But I’m not sure whether it is the same as what you have expected me to do.

Here is my way:

x = ∠OAC

Since ∠COB is 2x, then ∠AOC is 180 – 2x

I calculate the area of AOC as below:

A = 1/2 . AO . OC . sin (180 – 2x )

A = 1/2 . 2 . 2 . Sin (2x )

A = 2 sin 2x

Then I calculate the area of BOC and found that it has the same area with AOC.

Total area = area BOC + area AOC and the total area is 2.

Hence,

2 = 4 sin 2x

Sin 2x = 1/2

Sin 2x = sin 30

2x = 30 + 360k or 150 + 360k

x = 15 + 180k or 75 + 180k

If k = 0

x = 15 or x = 75

If k= 1

x = 195 or x = 285

The possible values for x are 15° or 75°.

If x = 15, then ∠AOC = 150° and this is wrong because ∠AOC is <90°.

If x = 75, then ∠AOC = 30 and this is possible.

Thus the answer is 75°

My questions:

- Is my way in choosing whether x is 15 or 75 correct or there is actually another way to do it in a more proper way?
- Is what you wanted to tell me the same as I did?
Thank you

Some of these steps are longer than necessary, but that is to be expected. I replied:

Well done. You have explained everything very thoroughly.

- Does my way in choosing whether x is 15 or 75 correct or there is actually another way to do it in a more proper way?
You could just use the fact that angle A > angle B, and they are complements, so you have to choose the one that is an acute angle greater than 45°. But your way is entirely proper.

- Does what you wanted to tell me the same as I did?
Here is my quick way (which may or may not be the same as Doctor Rick’s):

Constructing the perpendicular CD to AB, we know that CD = 1 (because it is the altitude of a triangle with base 4 and area 2); so triangle OCD has leg CD = 1 and hypotenuse OC = 2, so that ∠AOC = 30°. Therefore ∠BOC = 150°, and ∠BAC is half of that, namely 75°.

I used no trigonometry here, which fits your categorization of the question under Geometry rather than Trigonometry. Yours is a very nice trigonometrical method.
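Kurisada had also expressed interest in coordinate geometry, so here is a rough numerical check of the same result in Python. The coordinate placement (O at the origin, AB along the x-axis) and the helper names are assumptions made for this sketch, not something taken from the exchange.

```python
import math

# Assumed coordinates: O is the origin and the diameter AB lies on the x-axis.
A = (-2.0, 0.0)
B = (2.0, 0.0)
O = (0.0, 0.0)

def triangle_area(p, q, r):
    """Area of triangle pqr by the shoelace (cross product) formula."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) / 2

def angle_at(vertex, p, q):
    """Angle p-vertex-q in degrees (adequate here, where it stays below 180)."""
    a1 = math.atan2(p[1] - vertex[1], p[0] - vertex[0])
    a2 = math.atan2(q[1] - vertex[1], q[0] - vertex[0])
    return abs(math.degrees(a1 - a2))

# C has height 1 (half the radius) and lies on the circle x^2 + y^2 = 4,
# on the side nearer A, so its x-coordinate is -sqrt(3).
C = (-math.sqrt(3), 1.0)

print(triangle_area(A, B, C))   # area of triangle ABC
print(angle_at(O, A, C))        # angle AOC, in degrees
print(angle_at(A, C, B))        # angle CAB, in degrees
```

Up to floating-point rounding, the three printed values should agree with the area 2, the 30° central angle, and the 75° answer found above.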

Doctor Rick joined us again:

Doctor Peterson’s “quick way” is indeed exactly the non-trig method I found. It is not the first method I used, however. In order to emphasize that we can’t talk about “the way” to solve the problem (as if there were a single correct method), let me show you my first method, even though it is not as nice as yours.

I started as in the “quick way”, constructing the altitude CD to AB and noting that it must have length 1. Then I used the geometrical theorem that the altitude to the hypotenuse of a right triangle is the geometric mean of the segments into which it divides the hypotenuse:

\(AD \times DB = (CD)^2 = 1\)

\(x(4 - x) = 1\) (where x = AD)

\(x^2 - 4x + 1 = 0\)

\(x = 2 \pm \sqrt{3}\)

Note that this gives us the lengths of both segments into which CD divides AB. The lesser of these is AD, while the greater is DB (because we’re told that C is nearer to A than to B). Thus, using trigonometry on triangle ADC,

\(\tan A = \dfrac{1}{2 - \sqrt{3}} = 2 + \sqrt{3}\)

My calculator then told me that A = 75°. With a little more work we could confirm that the tangent of 75° is exactly 2 + √3. However, at this point I figured that with such a nice answer, there was probably a nicer solution! That’s when I looked for, and found, the non-trig solution.
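The “little more work” might look like the following. This derivation is an added illustration (it was not part of the exchange) and assumes the tangent addition formula with 75° = 45° + 30°:

\[
\tan 75^\circ = \frac{\tan 45^\circ + \tan 30^\circ}{1 - \tan 45^\circ \tan 30^\circ}
= \frac{1 + \frac{1}{\sqrt{3}}}{1 - \frac{1}{\sqrt{3}}}
= \frac{\sqrt{3} + 1}{\sqrt{3} - 1}
= \frac{(\sqrt{3} + 1)^2}{(\sqrt{3} - 1)(\sqrt{3} + 1)}
= \frac{4 + 2\sqrt{3}}{2}
= 2 + \sqrt{3}
\]

Since 75° is the only acute angle with this tangent, the calculator's answer is exact.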

As always, we have multiple solutions, some easier and others more … interesting!

Kurisada had a little more to say:

I tried the non-trig method before but I realized that I was too focus in finding the AC which made me stuck. Thank you doctor for the explanations

And about the tan A = 2 + √3, I barely used tan before and that made me not to think about using tan. Thank you doctor for the explanations

I learned very much from this question and answers

One more time let me express my gratitude to Doctor Rick and Doctor Peterson ^^

One more thing:

May I know what is \(AD \times DB = (CD)^2\) formula called?

I would like to search more about geometry rules.

I explained:

Doctor Rick used the term “geometric mean”; searching for that plus “hypotenuse” gave me a good reference for the theorem he mentioned:

https://en.wikipedia.org/wiki/Geometric_mean_theorem

It is also called the “mean proportional”, which is described here:

That’s yet another nice side-trip from this problem. It turned out that we didn’t need to calculate any lengths except the 1; but there is a lot more to find along the way.

Next week we’ll look at a somewhat related problem.

I’m looking for past questions that led to deep discussions. This week, we have a case where a student realized he was doing algebra by rote, not thinking about what variables really mean. This realization was triggered by a step that many students stumble over, where parameters change their role. I have seen many calculus students struggle with this, so it deserves close examination.

Here is the question, a long one from 2010:

Un-learning Unknowns Today, we started a unit on linear programming, but we first had a review of systems of equations, specifically for the upcoming SAT. We were looking at this problem: Each time Sue rents a bicycle, she pays a fixed base cost plus an hourly rate for the time the bicycle is rented. Last Saturday she paid $12.00 to rent a bicycle for 6 hours, and yesterday she paid $9.50 to rent a bicycle for 4 hours. Which of the following equations shows the total cost C, in dollars, for Sue to rent a bicycle for n hours? a) C = 1.25n b) C = 1.25n + 4.50 c) C = 2.00n + 2.50 d) C = 2.50n + 2.00 e) C = 4.50n + 1.25 First, I said ... 12.00 = m(6) + b ... where m is the slope of the line, or rate; and b is the fixed base price, or the y-intercept. Then, I said 9.50 = m(4) + b. Six and 4 are the number of hours. For the first equation, I set ... b = 12 - 6m ... and I substituted in this value into the second equation, so I got 9.50 = 4m + (12 - 6m). I then combined like terms and solved for m, arriving at m = 1.25. This made b = 4.50. Therefore, answer b) must be right.

The work is correct; Sam solved the system by substitution, and did it well. But then he stopped and thought about it, which is great:

But then I stepped back and thought for a moment. I have been doing systems of equations for so long that I have simply been going through the motions. As I looked over my work, I became more and more confused. Please correct me if I am wrong, but I have always thought of systems of two-variable linear equations as lines to plot on a graph. The point where the lines intersect is the pair of coordinates that works for both graphs, and this is the solution to the system. For example, in y = 5x and y = -3x + 2, you are assuming that y and x are the same value, and this allows you to then substitute or eliminate. (It would be very helpful if you could comment or help me with this statement I just made, for I never fully understood how you can substitute y or x values if the equations are different.) To solve this problem, I set up two equations. One was 12.00 = 6m + b and the other was 9.50 = 4m + b. My main confusion comes from the fact that a system of equations is the point where the two lines intersect. But how can you plot the lines here and see where they intersect? There is no y or x value, and you do not know m or b. My second question came up when I wondered about plotting the lines. I visualized some lines, but I thought they were the same. What I am trying to say is that the two equations ... 12.00 = 6m + b 9.50 = 4m + b ... already have the same y-intercept and the same slope. They have different x-values, but the y-values correspond in a way that is proportional. If the two lines are the same, then how can you have an intersection point? Essentially, the y-intercept and the slope of a line are the defining qualities of a line. (Right? Please help me with this.) If that is true, and we have two equations with the same slope and y-intercept, then how come they are not the same line? I mean ... 126 = y + 4x 96 = y + x ... are different because of the slope and y-intercept, right?
I am a slow learner, but I understand stuff when it is detailed, so please thoroughly explain the answer. THANK YOU SO MUCH because I am really confused right now.

Sam is overthinking; but the solution is not necessarily to back off and underthink! This is a wonderful question, because he wants to understand more deeply, and be able to think accurately.

I responded, starting with the fact that his work was correct:

Let me start by reassuring you that you did get the right answer. Now, on to your questions. Thinking about the meaning of what you're doing is a good thing!

First, we have to deal with the question,

“For example, in *y* = 5*x* and *y* = -3*x* + 2, you are assuming that *y* and *x* are the same value, and this allows you to then substitute or eliminate. (It would be very helpful if you could comment or help me with this statement I just made, for **I never fully understood how you can substitute y or x values if the equations are different**.)”

I answered, pointing out that his explanation was just right:

That's exactly it: you're assuming both equations are true (since you're assuming x and y are the solutions), so you can replace either side of one equation with the other side. That's what "equal" means -- they're the same value. So whichever you use, you get the same result.

His big problem starts when he says,

“But **how can you plot the lines here** and see where they intersect? There is no *y* or *x* value, and you do not know *m* or *b*.”

“What I am trying to say is that the two equations

12.00 = 6*m* + *b*

9.50 = 4*m* + *b*

already have the same *y*-intercept and the same slope. They have different *x*-values, but the *y*-values correspond in a way that is proportional.”

He is focusing on the names of the variables; here, *m* and *b* are a slope and a *y*-intercept, but *not* of these lines!

I continued,

We tend to use x and y so much that we forget that you can use ANY variables for the unknowns! In your case, the unknowns that you are looking for are not x and y, but m and b. So if you were to plot the lines, the axes would be labeled m and b, not x and y. Since m and b are now the unknowns, they are not the slope and intercept of the lines; they are the variables! The lines are ... 6m + b = 12.00 4m + b = 9.50 ... which you can think of as if they were 6x + y = 12.00 4x + y = 9.50 Each has its own slope (-6 and -4, respectively), and its own y-intercept (12 and 9.5, respectively) in terms of the variables m and b.

At this point in the problem, we are just solving a system of equations for variables *m* and *b*. We would not normally graph the equations, but would just solve in the abstract. Sam’s problem is in confusing this with one representation of the system, as a pair of lines. To see it that way, we have to set aside the meaning of the variables and just graph with one variable (I chose *m*) on the horizontal axis, and the other (*b*) on the vertical axis. Here is the result:

We can see that *m* = 1.25 and *b* = 4.5 at the intersection; so these values of *m* and *b* satisfy both equations. So the answer to the question is \(C = 1.25n + 4.50\), which is an entirely different line, where *m* and *b* have changed meanings. Here is that graph:

(Note, by the way, that what the problem is asking for can be thought of as finding the equation of the line through the points A(4, 9.50) and B(6, 12); this could have been done in several other ways, such as finding the slope as \(\displaystyle m = \frac{12 - 9.5}{6 - 4} = \frac{2.5}{2} = 1.25\), and then using this in the point-slope form, \(y - y_1 = m(x - x_1)\). I put the two points on the graph to demonstrate that the solution is correct.)
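The two-point route can be checked with a short Python sketch. It is offered only as an illustration: the variable names and the dictionary of answer choices are set up for this example, and exact fractions are used so that the comparisons involve no rounding.

```python
from fractions import Fraction

# Sue's two rentals as (hours, cost) points on the line C = m*n + b.
n1, c1 = Fraction(4), Fraction("9.50")
n2, c2 = Fraction(6), Fraction(12)

# Slope from the two points, then the intercept from either point.
m = (c2 - c1) / (n2 - n1)
b = c1 - m * n1

# The same pair (m, b) must satisfy both of Sam's equations.
assert 6 * m + b == 12
assert 4 * m + b == Fraction("9.50")

# The answer choices from the problem, as (hourly rate, base cost) pairs.
choices = {
    "a": (Fraction("1.25"), Fraction(0)),
    "b": (Fraction("1.25"), Fraction("4.50")),
    "c": (Fraction("2.00"), Fraction("2.50")),
    "d": (Fraction("2.50"), Fraction("2.00")),
    "e": (Fraction("4.50"), Fraction("1.25")),
}
matching = [label for label, pair in choices.items() if pair == (m, b)]
print(float(m), float(b), matching)
```

Note that m and b are found here without ever graphing anything; the graphs are just one way of seeing why the pair is unique.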

This idea of taking the slope and the intercept from the original problem and turning them into the variables in a new problem (the system of equations) trips up a lot of people. Let algebra help you see abstractly: it doesn't matter what you call the variables in order to get this completely. Some textbooks, in fact, use a lot of unknowns other than x and y precisely so that students don't get too used to recognizing a kind of equation by the letters, but instead learn to understand them by the relationships they involve.

So Sam needs to avoid thinking of *m* and *b* as always meaning slope and *y*-intercept.

Sam wrote back:

Dr. Peterson, Thank you for your answer, but I need a bit of clarification. First, I am still a bit wary of the idea of changing the variables and equations. Do you think you could give me some examples? If so, that would be great. Along with the problem I gave, when you say ... 6m + b = 12.00 4m + b = 9.50 ... do you mean that the x-axis is the slope value and the y-axis is the y-intercept value? That would mean that a change in slope will cause a change in the y-intercept. I do not think this is correct, for how could you add the 6 times the slope if you already have a slope on the x-axis? Also, with your example, how can you change the x-value that was given to you as the slope? In addition, why are the lines ... 12.00 = 6m + b 9.50 = 4m + b ... not the same line? They both have the same slope and y-intercept. I'm still confused, but I see where you might be going, and I thank you for your patience and help. Thank you for all you do at Dr. Math!!

I replied, offering a different example in the hope of breaking the connection to *m* and *b*:

As I said, this is a big hurdle for students, so I'm not at all surprised that you still have problems with it! Let's try a different approach. Here is a system of equations; can you solve it? 5a + 3b = 9 3a - 2b = 13 Of course you can. And when you do, you do not care at all whether I used a and b, or x and y, or some other pair of variables. The names don't matter; an equation is all about the relationship between the variables, regardless of their names or meanings. That is, it is abstract, and not tied to what the variables are. Let me know if you're okay so far, and I'll continue answering your specific questions after that.
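The point that the names don't matter can be illustrated with a small Python sketch. The solver below knows only the six coefficients; it uses Cramer's rule, which is one convenient choice and not a method named in the discussion, and the function name `solve_2x2` and the unknowns `u`, `v` are placeholders.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*u + b1*v = c1 and a2*u + b2*v = c2 by Cramer's rule.

    The unknowns are called u and v here, but nothing depends on the names.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("no unique solution")
    u = Fraction(c1 * b2 - c2 * b1) / det
    v = Fraction(a1 * c2 - a2 * c1) / det
    return u, v

# 5a + 3b = 9, 3a - 2b = 13 (the unknowns happen to be called a and b)
print(solve_2x2(5, 3, 9, 3, -2, 13))

# 6m + b = 12, 4m + b = 9.5 (the unknowns happen to be called m and b)
print(solve_2x2(6, 1, 12, 4, 1, Fraction("9.5")))
```

The same function handles both systems, because only the relationships between the numbers are passed in.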

Sam replied:

I think I am OK with that so far, but would you have to denote which variable would go on the x axis and which variable would go on the y axis?

Good question! I answered:

You COULD choose to graph the equations if you wish -- but the problem doesn't necessarily involve graphing. That is just one way to represent it. The problem itself is just about finding a pair of values that satisfies both equations. And if you did graph them, you could choose to put either variable on either axis. There is nothing inherent in the problem that makes one a better choice for the "independent variable" over the other. In other words, graphing is one TOOL you can use to interpret the problem -- namely, as the intersection of two lines. But the problem itself doesn't require that tool, either for solving it or for understanding it.

This is an essential point. A mathematical concept typically can be modeled in a number of different ways. We must not confuse the concept with one model. This is another aspect of the abstractness of mathematics.

I continued:

Now let's look back at your problem. You are trying to answer this question: Each time Sue rents a bicycle, she pays a fixed base cost plus an hourly rate for the time the bicycle is rented. Last Saturday she paid $12.00 to rent a bicycle for 6 hours, and yesterday she paid $9.50 to rent a bicycle for 4 hours. Which of the following equations shows the total cost C, in dollars, for Sue to rent a bicycle for n hours? That problem is not about a graph, but about rental costs. Your approach to solving it was to represent it as a problem about finding the equation of a line by looking for the slope and intercept. (There are other ways to model the problem, but you chose this because it is familiar. A standard method of problem solving is to model a new problem as one that is familiar, so you can use tools you already have.)

So Sam has shown that he can choose a representation for a problem. Good!

You also recognized that "a fixed base cost plus an hourly rate" means that the cost is a linear function. Suppose that function is C = mn + b or, using the variables you are more familiar with, y = mx + b Again, you think of it this way just because you want to turn the problem into one you know how to solve. All these variable names -- x, y, m, and b -- are things you chose to introduce, along with the idea of graphing. (By the way, the variables m and b here are parameters, which means they are considered fixed for the sake of the problem, namely graphing a line, whereas x and y are considered to actually vary. But really they are all just variables -- letters representing numbers you don't know. They just play different roles.)

Sam also recognized that the letters *n* and *C* played the roles commonly taken by *x* and *y*. The concept of a parameter may be the hardest here.

Now your problem becomes one of FINDING the parameters m and b, on the basis of two data points, namely that (x, y) = (6, 12) and (4, 9.5) are known to be points on the line y = mx + b. So you can write two equations by replacing x and y with those pairs: 12 = 6m + b 9.5 = 4m + b This changes the models. Where a moment ago we had m and b as parameters describing a line (with x and y the variables), now they are unknown quantities you want to find! So you temporarily forget the original meaning of m and b, and just think of this as a new problem of solving a pair of equations. It would look more typical, from your experience, if the equations were ... 6x + y = 12 4x + y = 9.5 ... or ... x + 6y = 12 x + 4y = 9.5 ... depending on which of m and b you think of as "x." But it really makes no difference. The point is that m and b are now just unknown numbers we are trying to find. THEY ARE NOT A SLOPE AND AN INTERCEPT at this point; in particular, they are not the slope and intercept of these two lines! In fact, the slope of the line ... 6x + y = 12, or y = -6x + 12 ... is -6. But that doesn't matter, because we don't even have to ask that question unless we choose to solve this system of equations by graphing. We just follow the usual methods to solve the equation and find what m and b are (namely 1.25, and 4.5). THEN we go back to the equation where we first used them, and put these values in their places: C = mn + b becomes C = 1.25n + 4.5

This takes us all the way through Sam’s work.

Now let's answer the questions you asked last time: Along with the problem I gave, when you say ... 6m + b = 12.00 4m + b = 9.50 ... do you mean that the x-axis is the slope value and the y-axis is the y-intercept value? That would mean that a change in slope will cause a change in the y-intercept. I do not think this is correct, for how could you add the 6 times the slope if you already have a slope on the x-axis?

If you chose to graph the equation 6m + b = 12, putting m on the horizontal axis and b on the vertical axis, then the graph would indeed show how the slope OF THE ORIGINAL EQUATION y = mx + b would force the y-intercept to change in order for the line to pass through (6, 12). That's because this equation represents the constraint that the line must pass through this point; if you imagine the line as a straight stick pinned to the plane at that point, and rotated that stick (changing the slope), it would change where it crossed the y-axis. For example, if m were 0, then the line would be y = 0x + 12 (so that y = 0*6 + 12 = 12 when x = 6), and so on. But in solving the problem, you never stop to think about this level of meaning (and it doesn't matter) because at the point where you are working with the equation 6m + b = 12, m and b are nothing more than two numbers you are trying to find. The problem has been abstracted from its original context. This is the power of algebra: we can turn any sort of problem into a problem that we can solve without having to think about the meaning of the quantities involved.

Here are graphs of several lines passing through (6, 12) at different slopes, showing how the y-intercept changes. If I increase the slope by 1, the intercept decreases by 6.
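The same relationship can be tabulated with a few lines of Python. This is only a sketch: the function name `intercept_through_point` and the sample slopes are chosen for illustration.

```python
def intercept_through_point(m, x0=6, y0=12):
    """y-intercept b of the line y = m*x + b that passes through (x0, y0)."""
    return y0 - m * x0

# Rotating the "stick" pinned at (6, 12): each slope forces its own intercept.
for m in [0, 1, 1.25, 2, 3]:
    print(m, intercept_through_point(m))
```

The row for m = 1.25 gives b = 4.5, the one pair that also passes through (4, 9.5).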

You also asked: In addition, why are the lines ... 12.00 = 6m + b 9.50 = 4m + b ... not the same line? They both have the same slope and y-intercept.

As I've said this time, m and b are NOT the slope and y-intercept of either of these linear equations; they WERE the slope and y-intercept of a different line, y = mx + b or C = mn + b. But for the purposes of solving for m and b, they are just the two variables in this system of equations. We don't think of them as slope and intercept at this point, because what they were originally is irrelevant to the work we are doing. They are the slope and intercept of a line we are not currently paying any attention to. Does that help at all?

It did. He answered:

Dear Dr. Peterson, I want to say thank you for all of your helpful responses. They really helped me understand the material and I appreciate the patience and depth of knowledge you have, along with all the other volunteers at Dr. Math!

That’s what we like to hear.

]]>This week’s question, asked in January on the new site, will take us through some tricky areas of calculus, and also give a glimpse both of the value of quoting the entire problem you are working on when you ask for help, and of the interesting side discussions we can get into when that rule is violated! Several of us got involved in this discussion.

Kurisada, who has had a lot of interesting questions in a variety of fields, asked this:

I was asked to find the area of y = 5x^4 – x^5 with the range of x: 0 – 5. And I found the answer: 3125/6.

Then I was asked to find the area of the same function with the range of x: 0 – 6.

But the answer is 0.

While when the range of x: 0 – 7, the answer is not 0.

May I know why the answer can be 0? Doesn’t it must be >= 3125/6?

How can the area of a larger region be smaller? And how can an area in fact be zero?

Doctor Fenton answered:

Hi Kurisada,

Remember that when using integrals to compute the area between a curve and the x-axis, all area below the x-axis will be computed as a negative number. When the area above the x-axis is numerically equal to the area below the x-axis, the total area will be 0 because the two areas have opposite signs.

Note the subtle restatement of the problem: Doctor Fenton assumes that rather than “find the area of the equation” (which is not mathematically meaningful), it must have been “find the area *between* the curve and the *x*-axis”, which is the typical form of such a question. Also, in such a question, areas below the axis are taken as negative, because that is what integration inherently does. A definite integral in effect adds areas of little rectangles, each a width (Δ*x*) times a height (*y*). If *y* is negative, that area is negative.

And this is how you can get a total area of zero, when the positive and negative areas cancel one another out.
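To see the cancellation in numbers, here is a minimal sketch that evaluates the definite integral exactly. It assumes the antiderivative x^5 - x^6/6 of 5x^4 - x^5; the helper names are chosen only for this example.

```python
from fractions import Fraction

def antiderivative(x):
    """F(x) = x^5 - x^6/6, an antiderivative of 5x^4 - x^5."""
    x = Fraction(x)
    return x**5 - x**6 / 6

def signed_integral(a, b):
    """Definite integral of 5x^4 - x^5 from a to b (signed area)."""
    return antiderivative(b) - antiderivative(a)

for upper in (5, 6, 7):
    print(f"integral from 0 to {upper}: {signed_integral(0, upper)}")
```

The three results are 3125/6, 0, and -16807/6: once the upper limit passes x = 5, where the curve crosses the axis, the negative part starts eating into the positive part, and at x = 6 it has cancelled it exactly.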

But more was needed. Kurisada wrote back, asking about a different, similar problem:

Hi Doctor.

I am still not understand the mechanism.

I saw another function: y = (1/x^3) – 8 and I found two of the areas, that are from interval 1/4 to 1/2 and 1/2 to 1. I got 4 and -2.5.

And when I try to find the total, I got 6.5 (it is 4 + 2.5).

In this question, may I use the same way? Or maybe my way above is wrong? (I checked the key answer and my answer is true.)

Checking this example, we find these regions above and below the axis:

The integrals (signed areas) are 4 and -2.5; in this problem, they are evidently not asking for the **total signed area**, but for the **total absolute area** (counting both regions as positive).

So now we have **two different kinds of questions** being asked about similar functions.
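The two readings can be compared side by side for this second function. The sketch below assumes the antiderivative -1/(2x^2) - 8x of 1/x^3 - 8 and uses exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

def F(x):
    """F(x) = -1/(2x^2) - 8x, an antiderivative of 1/x^3 - 8."""
    x = Fraction(x)
    return -1 / (2 * x**2) - 8 * x

left = F(Fraction(1, 2)) - F(Fraction(1, 4))   # region above the axis
right = F(1) - F(Fraction(1, 2))               # region below the axis

net_signed_area = left + right
total_absolute_area = abs(left) + abs(right)

print(left, right)             # the two signed pieces
print(net_signed_area)         # what a single integral from 1/4 to 1 gives
print(total_absolute_area)     # what the answer key evidently wanted
```

The pieces come out as 4 and -5/2, so the net signed area is 3/2 while the total absolute area is 13/2, matching the 6.5 in the answer key.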

As Doctor Fenton didn’t respond quickly, I jumped in with some thoughts about wording:

Hi, Kurisada.

It may be a difference in how the question is phrased.

Please quote exactly how your problem was worded. Different authors may also mean slightly different things. According to some, the area UNDER a curve is found by merely integrating, and takes sign into account, so that the answer of 0 is appropriate for your area from 0 to 6. Some authors might ask for the area of the region BETWEEN (or BOUNDED BY) the curve and the x-axis, intending an UNSIGNED area, so that you would add the absolute values of the separate areas found by integration, as in your second example.

Here is an example of a site that seems to be inconsistent about this, giving a negative answer to one problem, but taking absolute values for another:

https://revisionmaths.com/advanced-level-maths-revision/pure-maths/calculus/area-under-curve

Some use various ways to explicitly indicate what they want, such as “net area” here:

http://www.mathwords.com/a/area_under_a_curve.htm

I would check how your own book states these problems, and see if they tell you how to distinguish the two types of question.

The first of these, in its first example, asks, “What is the **area** **between** the curve \(y = x^2 – 4\) and the *x* axis?”. After giving the answer -10.67, they add, “Note: the area is negative because it is below the *x*-axis. Areas above the *x*-axis, on the other hand, give positive results.”

In their second example, they ask, “Find the **total area** between the curve \(y = x^3\) and the *x*-axis between *x* = -3 and *x* = 2.” This time, they add absolute values, getting the answer 8. Presumably the word “total” is the cue to the difference.

The second page uses the term “**net area**” for the signed area, which turns out to be zero, and doesn’t give an example of “total area”.

So both wording, and the usage of a particular author, can make a huge difference.

Kurisada replied:

Hi Doctor

Does it mean that, if the area is not bounded by anything, the area we interpret is only when the function is positive (And we subtract it with the negative one)?

And the reason we subtract it is because we need to find the net positive area?

I was thinking if there is no bound, the area should be infinite.

My book only explained about the area between the curve(s) (and x or y axis).

The question shows the curve and the shaded area (that is between the curve and x axis).

Then it first ask me to find the area from x = 0 to x = 4.

Then the question ask me to evaluate if the range is from x = 0 to x = 6.

I’m not sure what was meant by “not bounded by anything”; obviously an “area” without a boundary is infinite. My guess is that there is some language ambiguity about the use of “bounded” in these problems.

Doctor Fenton picked up the question again, discussing the topic in general:

The general description of the problem is to find the area between two curves over a bounded interval. Usually, these are two graphs of the form y=f(x) and y=g(x) over the interval [a,b] (with a < b), or graphs of the form x=h(y) and x=k(y) for y in the interval [c,d].

When the graphs are y=f(x) and y=g(x), then one curve will be above the other over some interval. If f(x) > g(x) over [a,b], then the area bounded by the two curves will be

\[\int_a^b \bigl(f(x)-g(x)\bigr)\,dx.\]

This will be a positive number, since f(x) > g(x) on the interval. However, if the two graphs intersect, then they may change places as to which is the upper curve and which is the lower curve. If f(x) ≥ g(x) for a ≤ x ≤ c but f(x) ≤ g(x) over c ≤ x ≤ b, then y=f(x) is the upper curve on [a,c], and the lower curve on [c,b]. In this case, the total area is

\[\int_a^b \bigl|f(x)-g(x)\bigr|\,dx = \int_a^c \bigl(f(x)-g(x)\bigr)\,dx + \int_c^b \bigl(g(x)-f(x)\bigr)\,dx.\]

To find total area, you always subtract the lower curve from the upper curve when integrating.

If you just integrate the difference f(x)-g(x), you are getting the net area, since the integral will be negative on intervals where g(x) is the upper curve.

If you are given only one function on an interval, then you need to remember that the x-axis is a curve, namely y=0 (so g(x)=0 on the interval).

This might make more intuitive sense if you think of the position/velocity interpretation of the integral: if t is time and f(t) is the velocity at time t, where the particle is moving to the right when f(t) > 0 and to the left when f(t) < 0, then

\[\int_a^b f(t)\,dt\]

gives you the final position at time b if the particle started at time a. If the integral is positive, it finishes to the right of where it started; if the integral is negative, it finishes to the left of the starting position; and if the integral is 0, it finishes where it started. You don’t have to keep track of where the motion is to the right or left – the integral does that automatically, because the integral is positive on intervals where the integrand is positive, and negative on the intervals where the integrand is negative.

If you want the total distance traveled, however, you need to compute

\[\int_a^b |f(t)|\,dt.\]

Then you need to determine where the particle changes direction and compute integrals over the intervals where it is always moving in the same direction separately. That is, you compute

\[\int_c^d f(t)\,dt\]

on intervals [c,d] where f(t) ≥ 0, but compute

\[\int_c^d -f(t)\,dt\]

on intervals [c,d] where f(t) ≤ 0. Then you add the separate results.

Does this help?
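The recipe Doctor Fenton describes (split at the direction changes, integrate each piece, add the magnitudes) can be sketched in a few lines. The velocity v(t) = t^2 - 4t + 3 = (t - 1)(t - 3) on [0, 4] is a hypothetical example chosen for this illustration, not one from the discussion, and the function names are likewise made up.

```python
from fractions import Fraction

def F(t):
    """Antiderivative of the hypothetical velocity v(t) = t^2 - 4t + 3."""
    t = Fraction(t)
    return t**3 / 3 - 2 * t**2 + 3 * t

def net_change(F, a, b):
    """Signed integral: where the particle ends up relative to its start."""
    return F(b) - F(a)

def total_distance(F, breakpoints):
    """Add |F(q) - F(p)| over consecutive breakpoints.

    breakpoints must be sorted and contain both endpoints plus every
    time at which the velocity changes sign.
    """
    return sum(abs(F(q) - F(p)) for p, q in zip(breakpoints, breakpoints[1:]))

# v(t) changes sign at t = 1 and t = 3, so those are the split points.
print(net_change(F, 0, 4))               # net displacement
print(total_distance(F, [0, 1, 3, 4]))   # total distance traveled
```

Here the net change is 4/3 while the total distance is 4: the particle moves 4/3 right, 4/3 back left, then 4/3 right again. Leaving out a sign change from the breakpoints would let positive and negative pieces cancel, which is exactly the difference between net and total area.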

Kurisada replied:

I understand about Doctor’s explanation.

But I still don’t get about the area = 0 (especially after seeing the graph).

The graph is as attached below:

If I need to find the area from x = 0 to x = 6, which area should I search (in the graph above)?

Can Doctor please mark the area? (I began to realise that what makes me didn’t get it is the lack of understanding in the graph).

I’ve held off on showing the graph of the function until now, because this will be important here. Doctor Rick jumped in at this point:

Hi, Kurisada.

Here is a graph of y = 5x^4 – x^5, the first function you talked about. Is that what your graph shows? It’s hard to be sure, because you chose scales for the axes that don’t show it very clearly.

On my graph you can see one region above the x axis, between x = 0 and x = 5, and another region below the x axis, between x = 5 and x = 6. The areas of these two regions are equal; you can’t see this exactly from the graph, but you can see that it is reasonable. The integral of the function from 0 to 6 equals 0 because the integral from 0 to 5 is that area, while the integral from 5 to 6 is the negative of that area.

If you still have questions about this particular problem, we’ll need further information. Doctor Peterson asked you for the exact wording, and from what you’ve said, evidently it included a figure. Could you show us that figure? We can’t be sure what it meant by “the area”, especially for the second part, without seeing the figure. (We still might not be sure, but at least this should help.)

Although graphs are not required to solve problems like this, an accurate graph can be very helpful (and an inaccurate graph can be very misleading).

Finally, Kurisada answered the request for the exact problem:

Here is the question:

The curve now makes sense to me after Doctor sent it (I use phone app and I cannot change the scale) (and the straight line confused me).

So does it mean that if I found the total area, it will be 7776 + 7776 = 15552?

The graph is not precise, and doesn’t attempt to cover up to *x* = 6; but the wording is the important thing to observe. Doctor Rick wrote back:

Showing us the actual problem changes a lot! You were never asked to “find the area of y = 5x^4 – x^5 with the range of x: 0 – 5 … [and then] to find the area of the same function with the range of x: 0 – 6” as you said at first.

Rather, you were asked first to calculate “the area of the shaded region enclosed by the curve and the x axis”, which is quite clear, given the figure. It would be meaningless to “find the area of y = 5x^4 – x^5”; we can only find the area of a region, not of a function.

Since the region shown is above the x axis and below the graph of the function, it is found simply as the integral from 0 to 5. However, I do not get 7776 for this integral, as your last message implies. I get 3125/6 = 520 5/6, as you imply at the end of your first message.

Part (iii) of the problem does not mention area at all; therefore we don’t need to get into the variations among textbooks that Doctor Peterson mentioned. It only asks you to carry out the integral from 0 to 6, and “comment on your result.” You found the result of that integral to be zero; and we have discussed how that can be.

Using the correct values for the integral from 0 to 5 (3125/6) and for the integral from 5 to 6 (-3125/6), the correct result for the total area of the two regions in my figure is the sum of the absolute values of those integrals: 3125/6 + 3125/6 = 3125/3 = 1041 2/3.
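For reference, the arithmetic behind those values can be written out using the antiderivative \(x^5 - x^6/6\). This is a worked check added here, not part of the original exchange. It also shows that each term of the antiderivative equals 7776 at *x* = 6, which may be where the 7776 in the question came from, though that is only a guess.

```latex
\begin{align*}
\int_0^5 (5x^4 - x^5)\,dx
  &= \left[x^5 - \frac{x^6}{6}\right]_0^5
   = 3125 - \frac{15625}{6}
   = \frac{3125}{6},\\[4pt]
\int_5^6 (5x^4 - x^5)\,dx
  &= \left(7776 - \frac{46656}{6}\right) - \frac{3125}{6}
   = 0 - \frac{3125}{6}
   = -\frac{3125}{6},\\[4pt]
\text{total area}
  &= \left|\frac{3125}{6}\right| + \left|-\frac{3125}{6}\right|
   = \frac{3125}{3}
   = 1041\tfrac{2}{3}.
\end{align*}
```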

Kurisada closed out the discussion:

I’m really sorry Doctors!

I’ve always been thinking that it means the area.

But now I understand if I was asked to find the area.

Thank you very much Doctor Rick, Doctor Fenton, and Doctor Peterson for the explanations!

As I said at the start, the discussion demonstrates the value of showing the exact wording of a problem from the start; but on the other hand, not doing so led us into a deeper view of what is going on.

]]>