We’ve been looking at some issues involving frequency distributions and the classes used in them. Let’s look at a related concept with some similar issues, namely the **cumulative distribution function (CDF)**, also called an **ogive** (more on that name at the end of the post!).

Here is the question, from 2014:

Ogive, More or Less

I'm doing some exercises on ogive which start with a graph showing information in a cumulative frequency distribution table (also known as cumulative frequency curve). I think I've found an interesting question.

Here's some information about the time it takes 10 runners to complete a track:

    Time          | Upper class boundary | cumulative frequency
    -------------------------------------------------------------
    7-9 seconds   |  9.5                 |  0
    10-12 seconds | 12.5                 |  3
    13-15 seconds | 15.5                 |  7
    16-18 seconds | 18.5                 |  9
    19-21 seconds | 21.5                 | 10

One of my exercises asks me to find x, where 60% of runners took more than x seconds to complete the track.

I reasoned that 60% of 10 runners is 6 runners. And if 6 runners took more than x seconds, that means 10 - 6 = 4 runners took less than x seconds.

I didn't draw out the ogive, but let's say from the ogive I found that x is 13.5s.

Now, the interesting question is 13.5s should also mean 6 runners took 13.5s or more AND 4 runners took 13.5s or less

But the value of 13.5 seconds appears twice, as the boundary for both classes. How weird is that?! How can we say that 4 runners took less than 13.5s but not 13.5s or less?

When we draw out an ogive, the value of any information that we find from the graph seems to have more than one meaning: equal, equal or less than, equal or more than, less than, or more than the value we've found. Is there any way to differentiate them?

Could you please help? Thanks!

Before we dig into the question itself, let’s pay attention to the terminology, which not everyone will be familiar with.

The word “ogive” (pronounced “oh-jive”), in this context, is another word for a cumulative frequency graph, a visual representation of a cumulative frequency distribution. In this case, we might have started with a frequency distribution like this:

    Time          | frequency
    ---------------------------
    7-9 seconds   | 0
    10-12 seconds | 3
    13-15 seconds | 4
    16-18 seconds | 2
    19-21 seconds | 1

This is written in such a way that time is taken as a discrete quantity measured in whole seconds: no runners took less than 10 seconds; 3 took 10, 11, or 12 seconds; and so on. To make the cumulative distribution, we add the frequencies up through a particular class, by adding that class’s frequency to the cumulative frequency before it:

    Time          | frequency | cumulative
    ----------------------------------------
    7-9 seconds   | 0         |  0
    10-12 seconds | 3         |  3  (0 + 3)
    13-15 seconds | 4         |  7  (3 + 4)
    16-18 seconds | 2         |  9  (7 + 2)
    19-21 seconds | 1         | 10  (9 + 1)
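The running-total step can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the table above; the variable names are my own, and the frequencies are the ones listed there.

```python
from itertools import accumulate

# Class labels and frequencies copied from the table above.
classes = ["7-9", "10-12", "13-15", "16-18", "19-21"]
frequencies = [0, 3, 4, 2, 1]

# Each cumulative frequency is the running total up through that class.
cumulative = list(accumulate(frequencies))

for label, f, c in zip(classes, frequencies, cumulative):
    print(f"{label:>5} seconds | {f} | {c}")
```

The last cumulative value should always equal the total number of runners, which is a quick check on the addition.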

Therefore, the 7, for example, means that 7 runners took anything up through 15 seconds. A graph would look like this (where I am using the class boundaries as Ted did in his question, rather than the class limits as some authors would do):

Ted’s question is about whether the ogive represents the number less than a given value, or less-than-or-equal, and how to handle those boundaries.

I answered:

There are several things you are missing, all of which are important if you want to fully understand what you are doing.

You say that the meaning of the ogive (CDF) is not clear; but meanings are determined by definitions! What definition have you been given? Here is one:

http://en.wikipedia.org/wiki/Cumulative_distribution_function

The cumulative distribution function of a real-valued random variable X is the function given by ... F_X(x) = P(X <= x), ... where the right-hand side represents the probability that the random variable X takes on a value less than or equal to x. The probability that X lies in the semi-closed interval (a, b], where a < b, is therefore P(a < X <= b) = F_X(b) - F_X(a).

In the definition above, the "less than or equal to" sign, "<=", is a convention, not a universally used one (e.g., Hungarian literature uses "<"), but is important for discrete distributions.

This says that although some sources would use "less than," the standard definition uses "less than or equal." Assuming your class uses the standard definition, there is no ambiguity.

We have seen this several times before, that a convention may not be shared by all mathematicians (not to mention people who merely use math without fully understanding it). In the same way that language varies (I suspect that “ogive” is not nearly as common in the U.S. as in some other countries), definitions themselves can vary.

But we’ll be seeing evidence that Ted’s class does use the “usual” definition.

Second, your example is written assuming that the times are integer numbers of seconds; that is why classes are listed as "7-9," "10-12," and so on, rather than as "7 < x <= 10," "10 < x <= 13," and so on, which would cover all possible values. In this case, you use a class boundary like 9.5 in part to avoid the "or equal" issue: NO value will actually equal the class boundary. So your issue is irrelevant when looking at the actual data. (It would not be irrelevant for other problems involving continuous distributions, but you may not have learned about those yet.)

As I mentioned above, the data are given as a discrete distribution, which is why Ted used boundaries halfway between the limits (e.g. 9.5 between 9 and 10). But this implies that there is no ambiguity in talking about all times less than, or less than or equal to, one of these boundaries, because no time is exactly on one.

What leaves me a little puzzled is that the definition of the CDF fits with continuous classes defined like \(7\lt x \le 10\), but as we’ve seen in recent posts, it is common instead to define them like \(7\le x \lt 10\). I find that to be true among the pages I’ve looked at in writing this post to confirm common usage of the CDF. (On the other hand, several pages I’ve looked at about the ogive, specifically, *have* used \(7\lt x \le 10\). I’ll look into this below.)

Third, when you write "60% of runners took more than x seconds," and then attempt to find x, you are actually asking for an approximation, not an exact value. This is because a grouped distribution IS an approximation (ignoring individual values), so that the ogive you draw for this problem is composed of line segments that assume the actual data are uniformly distributed within each class.

In fact, many of the sources I find for drawing an ogive specifically mention that the curve is to be “hand-drawn” (not perfect straight lines), suggesting that the point of the graph is to indicate (subjectively) what the CDF might look like if we used the original data rather than the classes. That is necessarily approximate.

So concern about precision is misplaced; being off by one would not really matter.

I haven’t really answered his question yet. It will help to make things more concrete:

Let's actually draw the ogive, and consider what it shows us. I'll assume you have been taught to draw the ogive like this, joining given points by straight lines:

    1.0+                          *
       |                    *
       |                 /
       |              *
       |            /
    0.5+           /
       |........./...................
       |        *
       |      /
       |    /
      0+--*-----+-----+-----+-----+--
         9.5  12.5  15.5  18.5  21.5

There is a straight line from (12.5, 0.3) to (15.5, 0.7). You are looking for the point at which this line crosses the horizontal line y = 0.4 (since if 60% took MORE than x seconds, then 40% took LESS THAN OR EQUAL TO x seconds). Algebraically, you can find that x = 13.25. (If you just drew a graph and measured it, you wouldn't be quite so precise, and your 13.5 may be reasonable.)
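The algebra behind x = 13.25 is ordinary linear interpolation along one segment of the ogive. Here is a minimal sketch of that step, assuming the straight-line ogive described above; the function name `ogive_inverse` is my own, and I work with cumulative counts (4 of 10 runners) rather than proportions, which gives the same point.

```python
def ogive_inverse(boundaries, cumulative, target):
    """Return the x at which the piecewise-linear ogive reaches `target`.

    `boundaries` are the upper class boundaries (first entry is where the
    ogive starts at zero); `cumulative` are the matching cumulative counts.
    """
    points = list(zip(boundaries, cumulative))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Skip flat segments; interpolate on the segment containing target.
        if y1 > y0 and y0 <= target <= y1:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target is outside the range of the ogive")


boundaries = [9.5, 12.5, 15.5, 18.5, 21.5]
cumulative = [0, 3, 7, 9, 10]

# 60% took MORE than x, so 40% of 10 runners (4 runners) took x or less.
x = ogive_inverse(boundaries, cumulative, 4)
print(x)
```

On the segment from (12.5, 3) to (15.5, 7), this computes 12.5 + (4 - 3)(3)/(4), in agreement with the value found algebraically.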

This is the graph I drew more exactly above, but this time as a probability distribution, with the vertical axis marked as percentages. Here is my graph with that 40% line added in:

Note also that I quietly corrected a little mistake Ted had made. He said, “if 6 runners took more than *x* seconds, that means 10 – 6 = 4 runners took less than *x* seconds.” The last phrase should have been “took less than *or equal to x* seconds,” which is the proper negation.

Incidentally, what we’ve done here is essentially what I discussed in Finding the Median of Grouped Data.

Now we use the careful definition we had found to interpret what this intersection means, using the points I made earlier:

So the ogive says that

P(x <= 13.25) = 0.4

The answer is "about 13.25 seconds." What this means is that about 4 runners took less than or equal to 13.25 seconds, and about 6 took more than 13.25 seconds.

It doesn't matter much, though, whether we say "or equal," because (a) the times were measured only to the nearest second anyway; and (b) we don't know the actual times of the 4 who took between 12.5 and 15.5 seconds -- so any answer we give is a guess. It's entirely possible that all 4 had times of 15 seconds, and none took 13 or 14, so that in fact 3 had times less than or equal to 13.25, and there is NO time that EXACTLY 4 of the 10 runners took no more than.

This is why I say that anything you determine from an ogive of a discrete distribution will be an approximation, based on a guess about the actual distribution of individuals. Therefore, your concern about "or equal" is swallowed up in bigger inaccuracies.

This is a point I’ve made repeatedly recently: grouping loses information, so everything we say is based on a series of guesses.

The actual cumulative distribution, looking at individuals rather than classes, would not consist of straight lines, or of a smooth curve, but of steps. This is illustrated in the second picture on the Wikipedia page. What you draw is just an approximation of that (and is far more reasonable when the number of individuals is bigger than 10).

Here is the picture I referred to (unless it’s changed since that time):

An ungrouped discrete distribution would be like the top graph, with each individual point causing a jump in the cumulative count. But we don’t have that much information.
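To make the contrast with the straight-line ogive concrete, here is a small sketch of the step-function count. The individual times below are HYPOTHETICAL: they are not in the original question, and were chosen only to match the class frequencies (3, 4, 2, 1) and the "all 4 had times of 15 seconds" scenario mentioned in the answer.

```python
# HYPOTHETICAL individual times, consistent with the grouped table:
# three in 10-12, four in 13-15 (all exactly 15), two in 16-18, one in 19-21.
times = [10, 11, 12, 15, 15, 15, 15, 16, 18, 20]


def count_at_most(data, x):
    """Step-function cumulative count: how many values are <= x."""
    return sum(1 for t in data if t <= x)


# The straight-line ogive estimated "about 4" runners at 13.25 seconds.
at_13_25 = count_at_most(times, 13.25)
# At a class boundary the ogive and the step function must agree.
at_15_5 = count_at_most(times, 15.5)
print(at_13_25, at_15_5)
```

With these made-up times, only 3 runners are at or below 13.25 seconds, while the count at the boundary 15.5 matches the table's 7. The grouped ogive is exact only at the class boundaries.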

Your question turns out to be a very interesting one that reveals a lot of facts worth knowing. But did you notice that the question was worded to fit perfectly with the definition of the ogive? It asked about runners taking MORE than a given time, which is the complement of taking LESS THAN OR EQUAL TO that time -- and that is what the ogive explicitly tells you. So if we didn't have all these issues of accuracy, the ogive would give exactly the answer you need.

This is the evidence I mentioned, that this class is using our “less-than-or-equal” definition for the CDF.

Ted responded in a gratifying way:

A good teacher teaches, whereas a great teacher inspires; and Dr. Math, you have inspired me to learn more. Specifically, I'm going to learn about discrete and continuous probability distributions very soon. :> But I've learnt a lot already. Thanks!

I mentioned variations in how classes are defined, with some sources using the (a, b] style that fits with the usual definition of the CDF, and others using the [a, b) style (e.g. by writing “20 – under 30” or \(20\le x \lt 30\)). Here is an example of another question we got (unarchived) about the CDF from 2017:

Does the cumulative frequency (less than type) corresponding to a data value, include the frequency of that data value and the frequencies of all the values less than it, OR does it include the frequencies of values 'strictly less than' it?

What I know is, in case of continuous data type, the value of the Upper Class Boundary (UCB) of a class is not included in that class (it is included in the next class), so the cumulative frequency corresponding to the UCB does not include the frequency of the UCB, it includes frequencies of values 'strictly less than' the UCB.

In case of discrete data type, say for a simple frequency distribution, the cumulative frequency of the first data value is its absolute frequency i.e, in this case the cumulative frequency is not 'strictly less than', it is 'less than or equal to' type i.e it includes the frequency of the concerned value.

So, my question is, is the definition of cumulative frequency different for different types of data values? Is it 'less than or equal to' for discrete data types and 'strictly less than' for continuous data types? Or, am I wrong somewhere?

Also, is the 'greater than' cumulative frequency 'greater than or equal to' or 'strictly greater than'?

Note that Sam’s source used the [a, b) style for classes. Also, the CDF is called “the less-than type” as opposed to the “greater-than type”; the name is not, apparently, meant to exclude “or equal”.

In reply, I referred to the answer above, and added:

Looking around to see what others do, I see some who call it "the less than type", but define it as less-than-or-equal, so the name is not the definition.

But if you use class boundaries rather than limits for a discrete distribution, it won't matter, since the boundaries lie between data values, and won't be hit anyway. (That's part of the reason we define class boundaries that way.) If you go by class limits, then being inclusive makes the result equivalent to using the boundaries. And it really makes no sense to use strict inequality in this case; the Wikipedia quote in the page above seems to agree.

For a continuous distribution, it doesn't really matter which way you define it (in theory) because any one value has zero probability of being exactly one of the data values.

So I think you should always use the "or-equal" definition, unless you are working from a text that says otherwise.

So, for discrete distributions, we take the boundaries between values, so the highest actual value in a class will be included in the cumulative count; for continuous distributions, any one value in principle has probability zero of occurring, so including it or not theoretically should mean nothing. (This is true, for example, of the normal distribution.) But it is entirely possible that a textbook that excludes the upper boundary from a class, would also exclude it from the cumulative count. It would be interesting to do a survey of books and other authorities to see how these choices are correlated.

One more little issue …

As I’ve said, “ogive” is not the term I learned for this kind of curve; but I was familiar with the word before I’d ever seen it in this context. If you are curious, here is a question from 2002:

Origin of Ogive What is the origin of the term "ogive"? I know what the word means, but I would like to know where the name came from.

Doctor Pat answered:

Part of what is in Math Words and Some Other Words of Interest, at http://www.geocities.com/poetsoutback/etyindex.html under the term ogive is: "The term was applied by Francis Galton to the cumulative normal distribution but is used more generally now. The word originally comes from a term in architecture for a diagonal rib of a Gothic vault or a pointed arch. The term comes from the Late Latin obvita, the feminine past participle of obvire, to resist."

(The link, to “Doctor Pat” Ballew’s own page of math etymology, is broken, and I can’t find the site’s current location, even through the author’s blog. The origin of the word is not entirely certain.)

Here are some architectural ogive arches:

My impression is that the statistical term relates more to the related concept of ogee arch,

or ogee molding,

each of which relates to the typical shape of an ogive graph, particularly in the orientation shown as “cyma recta”. The words are closely related, and have been used somewhat interchangeably.

For comparison, here is the CDF of the normal distribution (the red curve is the standard normal):

A recent question raised a different issue about grouped frequency distributions than we have discussed previously: What do you do when the last class is labelled something like “30 or more”? As we’ll see, there is no one right answer!

Here is the initial question, which came in last month:

I understand that to use the formula to find mean for grouped data I have to find the class midpoint first. How do I find the midpoint when the information I have only states “above 100”?

We looked at the mean of a grouped distribution last time. There we had distributions like

    Class    Frequency
    -----    ---------
    37-46       19
    47-56       23
    57-66       27
    67-76       28

This might represent the number of people in an audience of various ages. We found the mean by multiplying the midpoint \(x_i\) of each class by its frequency \(f_i\), adding these up, and dividing by the total number of people:$$\frac{\sum_{i=1}^{n}x_i\cdot f_i}{\sum_{i=1}^{n}f_i}.$$ The midpoint of each class is the average of its upper and lower limits, such as \((37+46)\div 2 = 41.5\) for the first class. (We saw that we can do the same thing with class *boundaries*, for a continuous distribution.)

What if it looked like this instead?

    Class         Frequency
    -----         ---------
    37-46            19
    47-56            23
    57-66            27
    67-76            28
    77 or more       28

The last class has no upper limit; how can we find its midpoint?

Looking back at old unarchived answers (because there is nothing about this published in the *Ask Dr. Math* archive), I found a couple of questions like this that were never answered, probably because we were too busy even to answer the questions we could answer confidently! Here is one from 2008:

I'm trying to find the mean and median for this frequency distribution:

    Minutes of delay   Shop A   Shop B
    ----------------   ------   ------
    Less than 10         20       15
    10-15                25       20
    15-20                30       30
    20-25                25       15
    25-30                20       10
    30 or more           10        5

How do I calculate the mean if the last interval is open? How do I calculate the midpoint of the last class? Is there another way to do it?

I know that I have to find the midpoint for each interval, multiply it by the frequency of each class, sum it up for all classes and divide it by the total number of observations.

Mean = (Sum mp x f)/n

But I need the midpoint of the last interval!!!

(Note that this distribution is continuous, and has to be interpreted so that “10-15” means “at least 10, and less than 15”, so that 15 is not included in two classes.) The last interval goes, in principle, from 30 to infinity, so its midpoint would be infinite.

A similar question from 2006 included a hint to the intended answer:

The Department of Commerce, Bureau of the Census, reported the following information on the number of wage earners in more than 56 million American homes.

    Number of Earners   Number (in thousands)
    0                          7,083
    1                         18,621
    2                         22,414
    3                          5,533
    4 or more                  2,797

a. What is the median number of wage earners per home?
b. What is the modal number of wage earners per home?
c. Explain why you cannot compute the mean number of wage earners per home.

Hint - the data above is an example of grouped data.

This is not really grouped, as each row pertains to a single value – except for the last, which *is* a group representing all higher numbers! Apparently the author of this problem says that we can’t find the mean, because of the open-ended class. (Note that we *can* find the median and the mode, because neither is affected by the outliers in that last class!)
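To see why the open-ended last row does not get in the way of the median or the mode, here is a short sketch using the census counts quoted above. The variable names are my own, and the value 4 is used only as a label for the open class "4 or more".

```python
from itertools import accumulate

# Values and counts (in thousands of homes) from the table above.
earners = [0, 1, 2, 3, 4]   # 4 is just a label for the open class "4 or more"
homes = [7083, 18621, 22414, 5533, 2797]

total = sum(homes)
cumulative = list(accumulate(homes))

# Median: the first value whose cumulative count reaches half of the total.
median = next(v for v, c in zip(earners, cumulative) if c >= total / 2)

# Mode: the value with the largest count.
mode = max(zip(homes, earners))[1]

print(total, median, mode)
```

Neither calculation ever looks at how large the values inside "4 or more" are; it only matters that the middle of the data and the tallest count both fall in an earlier row.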

I answered, as I often do when there is no definitive answer in my experience, by searching for suitable sources to get a sense of what sort of answers knowledgeable people give. The sources I found are not necessarily authoritative, but are meant to provide a survey of what might be said about the subject. I started my reply with the technically correct answer:

The quick answer is, you really can’t find the mean of an open-ended distribution. Without having both limits for every class, you just don’t have the information you need. That is explicitly stated here:

https://people.richland.edu/james/lecture/m170/ch03-ave.html

The Mean is used in computing other statistics (such as the variance) and does not exist for open ended grouped frequency distributions.

With no upper limit for a class, we can’t find a midpoint, and therefore can’t use the formula I gave above. It simply doesn’t apply. (I myself would not say that the mean *doesn’t exist*, but just that we don’t have enough information to *find* it. If we knew the original data, we could. The author here is referring to the mean of the distribution as presented.)

I didn’t stop there, though:

But there are several ways to deal with this situation, depending on how much you care about accuracy. Since any statistics based on grouped data are only approximations anyway, some guesses can make sense.

As I mentioned last time, such statistics are just estimates based on inadequate data and hopeful assumptions (such as that the data within a class is distributed in such a way that its midpoint is its mean). The formula itself is therefore not perfect, and there is no reason not to try adjusting the data in order to try for the best estimate we can get with even less adequate data!

One possibility is just to make the simplest possible assumption:

Looking around for examples, I found several sources that recommend just assuming an upper limit such that the class width of the open-ended class is the same as its nearest neighbor. This is perhaps the easiest solution; I suspect these rules may be given primarily for students, so that they can always find some answer, even if it is not ideal. An example of this is:

3. Open End Intervals: These are those intervals or classes, which either the lower limit of first interval or the upper limit of last interval or both of these, are not given. Here only an assumption about the length of these intervals is made according to the length of the interval nearest to these intervals.

Let us suppose the given class intervals are:

Less than 10, 10-20, 20-30, 30-40, 40-50, more than 50; Then the desired class intervals i.e. 1st and last are 0-10 and 50-60 respectively; as the length of intervals nearest to these two is also 10 i.e. in intervals 10-20 and 40-50. But if class intervals are not equal, then first interval should be taken equal to second and last equal to penultimate one.

But that is not really a very good guess in many real situations, because the presence of the open-ended class is likely due to the fact that there are extreme values; and, as you probably know, extreme values have a significant effect on the mean.

As I hinted, I felt that the sources I found that made this recommendation typically were meant not for serious statisticians, but for classes whose students just expect a simple rule for every case. I don’t know the credentials of this source, but it probably represents what is taught in some curriculum at that level. The fact that no justification is given makes the whole idea suspect.

In the example given here, it makes good sense to interpret the first class, “less than 10”, as “0 to 10” (not really because we take the same width, but merely because the numbers presumably can’t be negative). But assuming the last class ends at 60 ignores the fact that if no data values were greater than 60, they would have used 60! On the other hand, probably there are not many greater than 60, so maybe the assumption is good enough.

But, as I said in my response, this reasoning ignores the fact that the mean is strongly affected by outliers – or we might do this in order to *deliberately ignore* outliers. If that “more than 50” included a value of 100, the mean would be far larger than if we just use a midpoint of 55.
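The effect of that choice can be seen with a quick calculation. The sketch below uses the Shop A delays from the 2008 question quoted earlier; the midpoints for the two open classes are ASSUMPTIONS of mine (0 to 10 for "Less than 10", and two alternative guesses for "30 or more"), not values given in the data.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data from class midpoints."""
    return sum(x * f for x, f in zip(midpoints, frequencies)) / sum(frequencies)


# Shop A frequencies from the table quoted earlier.
freq_shop_a = [20, 25, 30, 25, 20, 10]

# "Less than 10" is treated as 0-10 (an assumption: delays are not negative).
closed_midpoints = [5.0, 12.5, 17.5, 22.5, 27.5]

# Two ASSUMED midpoints for "30 or more": 32.5 copies the neighboring class
# width (30-35); 45.0 is an arbitrary guess that allows for long delays.
estimates = {}
for last_midpoint in (32.5, 45.0):
    estimates[last_midpoint] = grouped_mean(
        closed_midpoints + [last_midpoint], freq_shop_a
    )
    print(f"last midpoint {last_midpoint}: mean is about {estimates[last_midpoint]:.2f}")
```

Only 10 of the 130 delays are in the open class, yet moving its assumed midpoint shifts the estimated mean by about a minute. A few truly extreme values would shift it further.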

What if you really want an answer, and you want it to be the best you can get with the limited information at hand?

I found a nice, long discussion of better guesses (but still guesses) here:

http://uregina.ca/~gingrich/ch51.pdf (pp 37-42)

Open Ended Intervals. As noted in Chapter 4, data is often presented so that it has open ended intervals. If the mean is to be determined for such a distribution, some value has to be entered for X for the open ended interval when using the formula for the mean. Exactly what this value should be is not readily apparent from the table of the distribution. … About all that can be done in the case of open ended intervals is to pick a value of X which seems reasonable based on what is known about the distribution of the data. Do not pick a value too high, or too low, but pick a value which you think approximately represents the mean value of the variable for the set of cases in the open ended interval.

Note that this is a serious attempt at accuracy, with each choice justified, and with explicit mention of the fact that the conclusions are approximate. For example, in the first example the author deliberately chooses a “midpoint” for the last class that is larger than it would be if that class were the same width as the others, explaining why; and concludes by saying, “The mean is thus 12.195 thousand dollars, or $12,195. *Given the approximations that have been made in this calculation*, it might be best to round the mean to the nearest $100 and report it as $12,200, or perhaps round it to the nearest thousand dollars and report it as $12,000.”

Note that if this is done in a classroom exercise, each student might make a different choice – that is what I mean by “subjective”. Each such choice might be equally reasonable; some might be based on better background knowledge than others. Teachers (or students) who are not comfortable with this, or who think every math problem has to have a single correct answer, may choose to stick with the second method, but they are not being honest about the validity of their method.

After writing this, I ran across an article that gives a perspective similar to mine:

What if I have open-ended classes? For open classes (i.e. classes that don’t have an upper limit or a lower limit), in most cases you can assume those have the same width as the other classes when doing your calculations. In an elementary statistics class, it’s highly unlikely your instructor will throw you a curve ball by creating an unusually wide open-ended class.

However, if you’re working with real-world data (perhaps from a graduate study or work-related study), you may need to use your best judgment when it comes to the midpoint for open classes. If the open class is extremely large, or extremely small, your best guess might be better than a calculated midpoint.

If you are doing a really serious study, and you can’t get more detailed data, then you need to approach the matter scientifically, using a model based on what you know about the subject you are studying:

But when you really want a good number, you would want to model the overall data set in such a way that you can estimate the distribution of values in the tail. One place I found this discussed (just as an example) is

https://arxiv.org/ftp/arxiv/papers/1210/1210.0200.pdf

To answer these questions, we have to estimate the mean and variance from the bins. How can we do that? A simple approach is to assume that each family’s income is at the midpoint of its bin. For example, we might assume that all households with incomes in the bin [$0,$10,000) have an income of exactly $5,000. This assumption is unrealistic, but it can be serviceable if the bins are narrow. If the bins are wide, then the midpoint approximation may be less accurate, since within some bins the distribution of households may be highly variable and may not be centered around the bin midpoint.

The midpoint approximation also runs into practical difficulties if the data are “top-coded” so that the highest bin is unbounded or censored on the right, as in the Rancho Santa Fe school district, where nearly half the households are in the top bin [$200,000, +∞). Analysts commonly handle top-coding by assuming that the incomes within the top bin fit some distribution (e.g., Pareto). But such assumptions can be inaccurate and are hard to test (Hout 2004).

A more sophisticated approach is to fit a flexible distribution not just to the top bin, but to the entire distribution.

Note that if you want this level of accuracy, you shouldn’t use the midpoint even for closed classes. That is far beyond the context of your question! But this is what you would do if it really mattered.

A simpler example of modeling might be to recognize that within any one class (bin), the data is likely more dense on the side toward the mode, and use the overall shape of the distribution to estimate the slope of the underlying curve, and use that to choose a better number than the midpoint to represent that class. (This is just the musing of a non-statistician; I am not aware of a technique that actually does this.) For an open-ended class, however, this would be *extrapolation* rather than *interpolation*, and therefore more risky. The suggestion here is to use knowledge of the nature of the data to choose an appropriate distribution (the choice being justified by the characteristics of the histogram), and then use that to estimate the parameters. This is well beyond my knowledge.

That’s probably a longer answer than you want. Ultimately, if you are in a class, you need to ask your instructor what to do in these cases; if you were doing serious statistical analysis, you might need professional advice.

There are more places than you might think in math, where there is no general consensus, and you just have to find out what conventions are used in your class (or in your field).

Our reader found the answer helpful, even though it was open-ended, as we might say …

Here is a question from 1999:

Statistics of Grouped Data

I am preparing to take a statistics course after many years of being able to avoid doing stats. I am doing some preparatory work before the course starts. Can you please give me answers to the following questions?

Grouped Data

    Income (*$1000)   Midpoint(x)   Number of Purchasers
    ---------------   -----------   --------------------
    20 - 29.99            25                 50
    30 - 39.99                               20
    40 - 49.99                               31
    50 - 59.99                               39
    60 - 69.99                               35
    70 - 79.99                               30
    80 - 89.99                               25
    90 - 99.99                               18

The above table is data from a survey of recent purchasers of superannuation plans.

1. Find the mean and standard deviation of the income of people purchasing superannuation plans.
   Find the mean
   Find the variance
   Find the standard deviation
2. Find the median class.
3. Choose a suitable graph and display the frequency distribution.
4. Summarize the findings.

Tony is asking for basic instruction in calculating the mean, variance, and standard deviation of a frequency distribution. The table (a frequency distribution) shows that, for instance, 50 people in the survey had incomes from $20,000 through $29,999.99 (assuming that 29.99 doesn’t mean, literally, $29,990, but really means “anything less than $30,000”; some authors would write “20 – <30”). These numbers are called “class boundaries”, and are relevant when the data are continuous, allowing in effect any real number from 20,000 up to (but not including) 30,000 in this class.

The midpoint column is not filled in except the first line; it represents the average of the high and low values for the class, in this case \((20 + 30)\div 2 = 25\). The idea is that if the people in this class are uniformly distributed across this interval, their average income would be $25,000. In effect, we will be pretending that there are 50 people with that income, and 20 with $35,000 (the midpoint of the next class), and so on.

Let’s fill in the rest of the midpoints:

    Income (*$1000)   Midpoint(x)   Number of Purchasers
    ---------------   -----------   --------------------
    20 - 29.99            25                 50
    30 - 39.99            35                 20
    40 - 49.99            45                 31
    50 - 59.99            55                 39
    60 - 69.99            65                 35
    70 - 79.99            75                 30
    80 - 89.99            85                 25
    90 - 99.99            95                 18

Doctor Mitteldorf replied, answering each question in turn:

To start: find the mean of the distribution as follows.

First, find the total number of buyers. Do this by adding up the column with the numbers from each income category.

Second, find the total of all their incomes. This you can do approximately, since you have an estimate of the incomes of the people in the group. On each line, multiply the midpoint income times the number of people in the group. Add up the products. This should give a reasonable estimate of the total income.

Divide total income by total buyers to give the mean income.

There are 50 people in the first class, 20 in the second, and so on; so the total is 248. We’ll write this sum at the bottom of that column.

If there are 50 people with 25 thousand dollars, their total income is \(50\times 25 = 1250\) thousand dollars. We do that for each row, and add them up:

    Income (*$1000)   Midpoint(x)   Number     N*x
    ---------------   -----------   ------   -----
    20 - 29.99             25          50     1250
    30 - 39.99             35          20      700
    40 - 49.99             45          31     1395
    50 - 59.99             55          39     2145
    60 - 69.99             65          35     2275
    70 - 79.99             75          30     2250
    80 - 89.99             85          25     2125
    90 - 99.99             95          18     1710
                                    ------   -----
    Total:                            248    13850

So the 248 people have about $13,850 thousand total, and the mean income is \(13850\div 248 = 55.85\) (thousand dollars).
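The arithmetic in this table can be checked with a short Python sketch. It is only an illustration of the method described above; the names `midpoints`, `counts`, and `grouped_mean` are my own, not part of the original answer.

```python
# Midpoints (in thousands of dollars) and class counts from the table above.
midpoints = [25, 35, 45, 55, 65, 75, 85, 95]
counts = [50, 20, 31, 39, 35, 30, 25, 18]


def grouped_mean(midpoints, counts):
    """Estimate the mean, treating every value in a class as its midpoint."""
    total_count = sum(counts)
    total_value = sum(x * n for x, n in zip(midpoints, counts))
    return total_value / total_count


total_count = sum(counts)                                     # number of purchasers
total_income = sum(x * n for x, n in zip(midpoints, counts))  # thousand dollars
mean_income = grouped_mean(midpoints, counts)

print(total_count, total_income, round(mean_income, 2))  # 248 13850 55.85
```

The result is an estimate, since it pretends that everyone in a class earns exactly the midpoint income.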

To find the variance, you should create another column in which you are squaring the midpoint incomes before multiplying by the number of people. Add up those numbers, and divide, as before, by the total number of people to obtain the <mean squared income>. This is not the same as the <mean income> squared - that quantity is just the number you calculated at first, multiplied by itself. In fact: you can subtract the <mean income> squared from the <mean squared income> to give the variance of the income distribution.

Here is our table with two new columns, the square of the midpoint, *x*, and the product with the number, \(Nx^2\):

    Income (*$1000)   Midpoint(x)   Number     N*x    x^2    N*x^2
    ---------------   -----------   ------   -----   ----   ------
    20 - 29.99             25          50     1250    625    31250
    30 - 39.99             35          20      700   1225    24500
    40 - 49.99             45          31     1395   2025    62775
    50 - 59.99             55          39     2145   3025   117975
    60 - 69.99             65          35     2275   4225   147875
    70 - 79.99             75          30     2250   5625   168750
    80 - 89.99             85          25     2125   7225   180625
    90 - 99.99             95          18     1710   9025   162450
                                    ------   -----          ------
    Total:                            248    13850          896200

The mean squared income is \(896200\div 248 = 3613.71\).

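The variance is then \(3613.71 - (55.85)^2 \approx 494.85\). (This carries the unrounded mean, 55.8468, through the calculation; squaring the rounded value 55.85 would give about 494.49 instead.)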

There are other formulas you may see, but this is the easiest.

The standard deviation is the square root of the variance. I hope that gives you a start. Dive in, try it, and report back what you think you understand. Explain as much as you feel comfortable with, and a little more. We'll try to help with a "mid-course correction" if you're not getting the ideas 100%.

So our standard deviation is \(\sqrt{494.85} = 22.25\) (thousand dollars). Tony didn’t answer back.
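Here is a minimal Python sketch of the variance and standard deviation calculation, using the "mean of the squares minus square of the mean" formula from the answer. The function name `grouped_variance` is hypothetical. Like the text, it divides by the total count N; a sample formula dividing by N − 1 would give a slightly larger result.

```python
import math

# Midpoints (in thousands of dollars) and class counts from the tables above.
midpoints = [25, 35, 45, 55, 65, 75, 85, 95]
counts = [50, 20, 31, 39, 35, 30, 25, 18]


def grouped_variance(midpoints, counts):
    """Variance as (mean of the squares) minus (square of the mean), dividing by N."""
    n = sum(counts)
    mean = sum(x * f for x, f in zip(midpoints, counts)) / n
    mean_square = sum(x * x * f for x, f in zip(midpoints, counts)) / n
    return mean_square - mean ** 2


variance = grouped_variance(midpoints, counts)
std_dev = math.sqrt(variance)

print(round(variance, 2), round(std_dev, 2))  # 494.85 22.25
```

Because the code keeps the mean unrounded until the end, it avoids the small discrepancy that appears if a rounded mean is squared by hand.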

In 2000, Laura asked a similar question:

Standard Deviation of Grouped Data

Interval (grouped) data:

    Interval (group)   Frequency
    ----------------   ---------
    37-46                  19
    47-56                  23
    57-66                  27
    67-76                  28

What is the standard deviation of the data?

I have another question. If I had an answer of 31.615 for the mean of a different data set, would I round it off to 32 or 31.62?

Doctor TWE answered in detail, with some small differences that make this worth going through:

I'll break this down by steps.

Step 1: Find the number of data points. To find the number of data points, add up the values in the Frequency column of the table:

    Interval   Freq.
    --------   -----
    37-46        19
    47-56        23
    57-66        27
    67-76        28
               -----
                 97

We’ll need this both for the mean and for the standard deviation.

Step 2: Find the midpoint of each interval range. To find the midpoint, add the top and bottom of each interval range and divide by two. For example, the first interval range is 37 to 46, so the midpoint is:

    (37 + 46) / 2 = 83 / 2 = 41.5

Do this for each interval range. Add a column to your table for this (I'll put it between the Interval and Frequency columns):

    Interval   Midpt.   Freq.
    --------   ------   -----
    37-46       41.5      19
    47-56        :        23
    57-66        :        27
    67-76        :        28
                        -----
                          97

Notice one difference so far from the problem above: the class intervals this time are given as sets of discrete values, rather than as continuous intervals, so the highest value of the first is 46, not “just before 47”. The data are taken to be integers. As a result, the midpoint of “37 through 46” is the average of the first and last values.

The first and last discrete values in a class are called “class limits”, as opposed to the “class boundaries” we had last time. This will come back to haunt us below!

I’ll be leaving the rest of the work “as an exercise for the reader” this time, with the answer at the end.

Step 3: Find the estimated sum of the data. To find the sum, multiply the midpoint of each interval range by the frequency of that interval range. For example, the midpoint of the first interval range is 41.5 and the frequency is 19, so the sum is:

    41.5 * 19 = 788.5

Do this for each interval range. Add another column to your table for this (I'll put it after the Frequency column), then find the sum of that column (I'll just call this S):

    Interval   Midpt.   Freq.    Sum
    --------   ------   -----   -----
    37-46       41.5      19    788.5
    47-56        :        23      :
    57-66        :        27      :
    67-76        :        28      :
                        -----   -----
                          97      S

This S is called the **estimated** sum, because it is based on the assumption that every value in a class is equal to the midpoint, which is almost certainly not true. The sum will be valid if the average of the values in each class is equal to the midpoint; that is probably not exactly true, but may well be a good approximation.

Step 4: Find the estimated mean (or "average") of the data. Divide the sum of the data (S, found in step 3) by the number of data points (found in step 1). In our example,

    Mean = S / 97

Again, we don’t know the exact mean because we don’t have the exact data; but this is the best we can do.

Step 5: Find the squares of the midpoints of each interval range. For each interval range, find the square of the midpoint. Add another column to your table for this (I'll put it after the Sum column). For example, the midpoint of the first interval range is 41.5, so the square is:

    41.5^2 = 1722.25

Do this for each interval range:

    Interval   Midpt.   Freq.    Sum    Midpt^2
    --------   ------   -----   -----   -------
    37-46       41.5      19    788.5   1722.25
    47-56        :        23      :        :
    57-66        :        27      :        :
    67-76        :        28      :        :
                        -----   -----
                          97      S

This is just what we did before.

Step 6: Find the estimated sum-of-the-squares of the data. To find the sum-of-the-squares, multiply the square of the midpoint of each interval range by the frequency of that interval range. For example, the square of the midpoint of the first interval range is 1722.25 and the frequency is 19, so the sum-of-the-squares is:

    1722.25 * 19 = 32722.75

Do this for each interval range. Add another column to your table for this (I'll put it after the Midpt^2 column), then find the sum of that column (I'll just call this S2):

    Interval   Midpt.   Freq.    Sum    Midpt^2   Sum-Sqrs
    --------   ------   -----   -----   -------   --------
    37-46       41.5      19    788.5   1722.25   32722.75
    47-56        :        23      :        :          :
    57-66        :        27      :        :          :
    67-76        :        28      :        :          :
                        -----   -----             --------
                          97      S                  S2

It’s important to note that this is not the sum of the squares *of the midpoints themselves*, but of *all the data* — as if we had 19 41.5’s, and 23 51.5’s, and so on.

Step 7: Find the estimated mean square of the data. Divide the sum-of-the-squares of the data (S2, found in step 6) by the number of data points (found in step 1). In our example,

    Mean square = S2 / 97

We have, in effect, added 97 squares, and then divided by the count, giving us an average square for all the data.

Step 8: Find the estimated variance and standard deviation of the data. To find the variance, square the mean (from step 4), then subtract it from the mean square. Note that the mean square and the square of the mean are not the same! Var = (Mean square) - (Mean)^2 To find the standard deviation, take the square root of the variance. StDev = sqrt(Var) Note that these values are estimates, because with grouped data, you don't have the exact figures to work with. Your means, squares, variance and standard deviation are all based on estimations of the actual data.

Let’s finish the work:

    Interval   Midpt.   Freq.      Sum   Midpt^2    Sum-Sqrs
    --------   ------   -----   ------   -------   ---------
    37-46       41.5      19     788.5   1722.25    32722.75
    47-56       51.5      23    1184.5   2652.25    61001.75
    57-66       61.5      27    1660.5   3782.25   102120.8
    67-76       71.5      28    2002     5112.25   143143
                        -----   ------             ---------
                          97    5635.5             338988.3

So the mean is \(\frac{5635.5}{97} = 58.098\approx 58.1\), the mean square is \(\frac{338988.3}{97} = 3494.7\), the variance is \(3494.7-58.098^2 = 119.35\), and the standard deviation is \(\sqrt{119.35} = 10.9\).
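All eight steps can be collected into one short Python sketch. The function name `grouped_stats` and the layout of `classes` are my own choices for illustration; the numbers are the class limits and frequencies from Laura's table.

```python
import math

# Each entry is ((lower limit, upper limit), frequency).
classes = [((37, 46), 19), ((47, 56), 23), ((57, 66), 27), ((67, 76), 28)]


def grouped_stats(classes):
    """Follow steps 1 to 8: count, midpoints, sums, mean, variance, st. dev."""
    n = sum(freq for _, freq in classes)                         # step 1
    mids = [(low + high) / 2 for (low, high), _ in classes]      # step 2
    freqs = [freq for _, freq in classes]
    s = sum(m * f for m, f in zip(mids, freqs))                  # step 3
    mean = s / n                                                 # step 4
    s2 = sum(m * m * f for m, f in zip(mids, freqs))             # steps 5 and 6
    mean_square = s2 / n                                         # step 7
    variance = mean_square - mean ** 2                           # step 8
    return n, s, s2, mean, variance, math.sqrt(variance)


n, s, s2, mean, variance, std_dev = grouped_stats(classes)
print(n, s, s2)                                            # 97 5635.5 338988.25
print(round(mean, 1), round(variance, 2), round(std_dev, 1))  # 58.1 119.35 10.9
```

As in the text, every figure after the count is an estimate based on the midpoints, not on the actual data.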

As for the final question about rounding:

That depends on the accuracy and precision of the original data. In some scientific fields, there are very specific rules for determining the number of significant figures to leave in an answer, and they can get quite complicated. As a general rule, your final answer should have the same precision (i.e. the same number of decimal places) as the LEAST precise data point. So, for example, if I had the data set: 16.725, 31.0625, 24.5, 22.50, 19.75 I'd compute the mean as: (16.725 + 31.0625 + 24.5 + 22.50 + 19.75) / 5 = 22.9075 Then I'd round it to 22.9 (NOT 22.91) because my least precise data point (the 24.5) had only one decimal place in it.

We’ve discussed significant figures elsewhere. Note that this advice doesn’t really fit the present situation, where we weren’t given any actual data values; my rounding to the nearest tenth seems appropriate.
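Doctor TWE's rule of thumb (round to the precision of the least precise data point) can be sketched with Python's `decimal` module, which remembers how many decimal places each value was written with. The variable names are hypothetical, and the readings are kept as strings so that the trailing zero in 22.50 is not lost.

```python
from decimal import Decimal

# The example data set from the answer, as written.
readings = ["16.725", "31.0625", "24.5", "22.50", "19.75"]
values = [Decimal(r) for r in readings]

# Number of decimal places in the least precise reading (24.5 has one).
places = min(-v.as_tuple().exponent for v in values)

mean = sum(values) / len(values)
rounded = mean.quantize(Decimal(10) ** -places)

print(mean, rounded)  # 22.9075 22.9
```

This only mechanizes the general rule quoted above; it does not settle what to do with grouped data, where no individual values are given.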

In 2009, we got the following question pertaining to the boundaries used in distributions:

Class Intervals in Statistics I can't feel comfortable with the issue of having a negative boundary when we have data which is made up of purely positive numbers. The best way to explain would be with an example: The number of breakdowns in a machine with the data is grouped from 0-4, 5-9, 10-14, 15-19 etc.. The midpoints of each interval would be taken from the midpoints of the lowest and highest boundary. No problem normally: the midpoint of the 5-9 boundary is the midpoint of 4.5 and 9.5, i.e. 7 But what about 0-4? Surely the lower boundary must be zero, giving a midpoint of 2.25? However, textbooks tend to say it should be -0.5, giving a midpoint of 2. I believe that if the data is essentially positive, the boundaries can't go below zero. Trivial it may seem, but I hate ambiguity.

This requires some background. Oliver has been given discrete data as in our second example, where the values are all exactly integers (0, 1, 2, 3, or 4 in the first class, for example). But for some purposes, we want to treat the data as if it were continuous; this is done for histograms, and for particular procedures such as a “normal approximation”. (We don’t know for sure what Oliver’s context is.)

In this process, called “continuity correction”, the boundary between classes 0-4 and 5-9 is taken to be the average of 4 and 5, halfway between the two. So the **boundaries** of the class with **limits** 5 and 9 are 4.5 and 9.5; and the midpoint of that class is \((4.5 + 9.5)\div 2 = 14\div 2 = 7\). (You can also get the midpoint directly from the limits, as I’ve shown above: \((5 + 9)\div 2 = 14\div 2 = 7\).)

That makes sense … until you look at the lower boundary of the first class, which is taken as -0.5. How can we use negative numbers in a problem about non-negative numbers (machine breakdowns)?

I replied:

What's happening here is sort of a pretend "boundary" being used to convert a discrete variable (the number of breakdowns, which must be a whole number) into a continuous variable (location on the x-axis of the histogram). You want columns on a histogram whose MIDPOINTS represent the actual values. If you didn't have classes, there would be columns for 0, 1, 2, 3, 4, and so on; if the midpoint of a column is at 0, and the width is 1, then it must extend from -0.5 to +0.5:

[ASCII histogram: adjoining bars of width 1 centered on 0, 1, 2, 3, 4, ..., each extending 0.5 to either side of its value]

Recall that a bar graph has bars representing counts of discrete things, and the labels just name that thing, such as “1”. But in a histogram, a bar’s width represents an interval of values containing the data. A bar centered around the number 0, with width 1, will naturally extend 0.5 on either side of the 0.

With classes, you will have one bar representing the entire class:

      +---------+
      |         |
      |         |
      |         |
      |         |
      |         |
    ===+=+=+=+=+=+=...
       0 1 2 3 4 5

It should still cover the same interval on the axis, so it goes from -0.5 to 4.5; its midpoint is (-0.5 + 4.5)/2 = 2. That's just a formality, and allows us to pretend that any value, not just whole numbers, is allowed. Note that the midpoint is the same as what it is if you ignore all this and just take the actual discrete values: (0+1+2+3+4)/5 = 2; or if you just treat 0 and 4 as the endpoints (leaving a gap of 1 between bars, which is a no-no): (0+4)/2 = 2.

It’s also worth noting that there are 5 numbers in this class (0, 1, 2, 3, 4), and the *width* of the bar is just what it ought to be: \(4.5 – (-0.5) = 5\). Without the halves on each end, this would not be true.
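A small Python sketch may help make the limits-versus-boundaries distinction concrete. The helper `class_boundaries` is hypothetical; it assumes the data are whole numbers (a step of 1), so each boundary lies half a unit beyond the corresponding limit.

```python
def class_boundaries(lower_limit, upper_limit, step=1):
    """Extend discrete class limits by half a step on each side."""
    half = step / 2
    return lower_limit - half, upper_limit + half


for lower_limit, upper_limit in [(0, 4), (5, 9), (10, 14)]:
    low, high = class_boundaries(lower_limit, upper_limit)
    midpoint = (low + high) / 2   # same as (lower_limit + upper_limit) / 2
    width = high - low            # counts the whole numbers in the class
    print(f"{lower_limit}-{upper_limit}: boundaries {low} to {high}, "
          f"midpoint {midpoint}, width {width}")
```

For the class 0-4 this gives boundaries -0.5 and 4.5, midpoint 2, and width 5, matching the five values 0, 1, 2, 3, 4.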

So you'll never really get a count of -0.5, any more than you'll get a 4.5; these boundaries are equally fictitious! And if you prefer, you never really have to mention -0.5 in your calculations. But it allows us to have a histogram like

      +---------+
      |         |
      |         +---------+
      |         |         +--
      |         |         |
      |         |         |
    ===+=+=+=+=+=+=+=+=+=+=+=...
       0 1 2 3 4 5 6 7 8 9 10

that uniformly covers the axis, rather than

       +-------+
       |       |
       |       | +-------+
       |       | |       | +-
       |       | |       | |
       |       | |       | |
    ===+=+=+=+=+=+=+=+=+=+=+=...
       0 1 2 3 4 5 6 7 8 9 10

where there are gaps, and the endpoints teeter on the edge of their bars.

Oliver replied:

Thank you for your reply, I'm happier with this now, as you mention that it's just a formality to help us create histograms and also the midpoint of the actual discrete values is the same. I mistakenly thought before that including the -0.5 gave a negative bias, but now it's clearer. Much appreciated!

The negative bias would mean that *everything* is shifted over to the left; but in each bar, we have shifted both to the left *and* to the right, widening the bar. There is no bias.

We’ll close with a question from 2016, when a teacher asked about a student’s unusual method for finding a standard deviation:

Standard Deviation, Non-Standard Definition In the formula for standard deviation, we always use 'x' for the midpoint of the group. But can 'x' represent the upper boundary of the group? This comes from a test question that asked my students to find the standard deviation of grouped data. I wrote out my own steps, with x representing the midpoint of each group, and got 10.49 kg. One of my students used 'x' to represent the upper boundary of each group -- and she got 10.49 kg, too.

We used the **midpoint** in our calculations above; but using the **upper boundaries** seems to give the same result. Is there something funny here?

I answered:

The student is wrong, of course, because you need to use the right definition. But we can see why, as long as she is consistent, it turns out not to matter, as far as the standard deviation is concerned. First, note that any attempt to calculate statistics from grouped data is just an approximation -- not the real thing. We don't know the actual data, so we can't find the real mean or standard deviation. We are PRETENDING that all the data in a group (also called a class or a bin) are equal; and we are finding the mean and standard deviation of THOSE values, not the real data. But using the midpoint makes sense, because -- as the name suggests -- it is likely to be in the middle of whatever the actual values are in each class, if they are uniformly distributed over the interval, and therefore will be close to the actual value. So this is how we define the mean and standard deviation in this situation.

So the midpoint is appropriate in the definitions (of mean and of standard deviation), because it is likely to be a good approximation of the data.

But if we take the midpoint-based fake data and replace everything with upper boundaries, we are simply adding half the width of a class to every (fake) data value. The result is that the mean will also be w/2 higher than the mean obtained from the midpoints; and the standard deviation will be exactly the same, because it is based only on the deviations from the mean, which are unchanged. This fact is something students should know (eventually, at least).

So the student’s work will result in the wrong mean, but the right standard deviation, because the latter is all about differences (deviations), which don’t change. The student has biased her work to the right, but that bias doesn’t affect the standard deviation, which measures only the **spread**, not the **location**, of the data. (This is easier to see in the other formula for standard deviation that we haven’t looked at here …)
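The teacher's data were not included in the question, so the following Python sketch borrows Laura's classes from earlier in this post as stand-in data. It uses the deviation-based formula for the standard deviation, and the upper boundaries (46.5, 56.5, and so on) are an assumption: each is the midpoint plus half the class width of 10.

```python
import math


def mean_and_sd(values, freqs):
    """Mean and standard deviation from deviations about the mean (dividing by N)."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return mean, math.sqrt(variance)


freqs = [19, 23, 27, 28]
midpoints = [41.5, 51.5, 61.5, 71.5]
upper_boundaries = [m + 5 for m in midpoints]   # shift every value by w/2 = 5

mean_mid, sd_mid = mean_and_sd(midpoints, freqs)
mean_upper, sd_upper = mean_and_sd(upper_boundaries, freqs)

print(round(mean_mid, 2), round(mean_upper, 2))   # the means differ by 5
print(round(sd_mid, 2), round(sd_upper, 2))       # the standard deviations agree
```

Every deviation `x - mean` is the same in both runs, because the shift of 5 is added to each value and to the mean alike.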

I would expect this student to get the mean wrong but the standard deviation correct. If she got the mean right, I'd be surprised -- and I'd want to ask whether she did what she did intentionally, getting the mean using the midpoints but the s.d. using upper boundaries. Maybe she knew what she was doing!

I suspect not, though. We never heard back to be sure.

Next time, I’ll look at a different sort of boundary issue.

Here is a question from 2003:

Choosing Appropriate Units I don't get how to choose appropriate units for measurements. For example, what units would you use to describe the size of a garbage can?

Some other typical questions of this sort that we have seen are:

Which is the best unit for the length of (a) a paper clip; (b) a pencil; (c) a railroad route; (d) scissors? You may use meters, centimeters, decimeters, or kilometers.

What is the best unit of measure for finding the surface area of a rectangular prism measured in inches? ... for finding the distance around a table measured in cm? ... for finding the space inside a carton measured in feet?

The BEST unit to measure the weight of a person would be: a) ounces b) grams c) milligrams d) kilograms The amount of blood circulating through a person's body at any one time would best be measured in: a) kiloliters b) liter c) milliliters d) gallons

In which case would it be most appropriate to use miles as a unit of measurement? A. length of a soccer field B. distance from Washington to New Orleans C. length of a crayon D. diameter of a penny

What is the more reasonable unit of measurement? A. length of an arm m cm mm B. length of an automobile dkm m dm C. distance from NY to LA km hm dam D. weather satellites orbit earth at the altitude of km hm m

These questions are meant to develop a student’s sense of how large each unit is; and a large part of the thinking involved will be visualization. But there can be more to it than that.

Doctor Ian answered by making up three such questions of his own:

Suppose someone asks you 1. The distance from Los Angeles to New York. 2. The distance from your house to the one across the street. 3. The distance from your elbow to your wrist. These are all examples of length, right? And there are lots of different units of length: millimeters, centimeters, meters, inches, feet, yards, kilometers, miles, light years, city blocks, and so on.

In the present question, we haven’t been told whether to use metric or other units; that is probably an omission by the student. Most such questions we see seem to be about metric units, which in American schools would be emphasized because they are less familiar.

How do you know which units to choose? Well, suppose you tried to tell someone the distance from LA to NY in inches. That would seem silly, right? Why? Because the number would be enormous! And people have difficulty dealing with numbers that are really large. On the other hand, saying 'about 3000 miles' is something you can deal with.

So the main issue is the size of the numbers involved (about 190 million inches for 3000 miles, in this case).

On another occasion (unarchived), Doctor Ian said, “In general, a unit is ‘appropriate’ if it gives you a number that is ‘mind-sized’… a number that you can visualize because it is within your daily experience. Usually this means no smaller than 0.01, and no larger than 1000.”

What about the distance from your elbow to your wrist? To express that in miles would give you a really, really small number! And people have difficulty dealing with numbers that are really small. But saying 'about 10 inches' is something you can deal with.

This would be about 0.00016 miles.

What about the distance from your house to the one across the street? Miles will give you too small a number; inches will give you too large a number. Feet might work, or yards.

If it’s 200 feet, we’d have to say 2400 inches, or 0.038 miles. Or we could say “about 70 yards”. There’s some choice there, isn’t there?
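These conversions are easy to check with a short Python sketch. The function name `feet_in_other_units` is hypothetical; the conversion factors (12 inches per foot, 3 feet per yard, 5280 feet per mile) are the standard ones.

```python
INCHES_PER_FOOT = 12
FEET_PER_YARD = 3
FEET_PER_MILE = 5280


def feet_in_other_units(feet):
    """Express one length, given in feet, in several traditional units."""
    return {
        "inches": feet * INCHES_PER_FOOT,
        "feet": feet,
        "yards": feet / FEET_PER_YARD,
        "miles": feet / FEET_PER_MILE,
    }


across_the_street = feet_in_other_units(200)
for unit, amount in across_the_street.items():
    print(f"{amount:g} {unit}")

# The elbow-to-wrist example: 10 inches, expressed in miles.
elbow_to_wrist_miles = feet_in_other_units(10 / INCHES_PER_FOOT)["miles"]
print(f"{elbow_to_wrist_miles:.5f} miles")
```

Seeing the same length as 2400, 200, about 67, or about 0.038 makes it clear that the choice of unit is really a choice about the size of the number.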

The point is, people strongly prefer numbers that aren't really big or really small. So this is why we _have_ so many different kinds of units! They let us keep the numbers manageable, regardless of whether we're talking about the distance between two atoms (which might be a dozen angstroms) or the distance between two stars (which might be a dozen light years).

One benefit of the metric system is that all the units of a given type (say, length) are related in a simple way, so whether we choose centimeters or meters, all we’re doing is moving the decimal point. It makes a bigger difference in American traditional units.

A teacher wrote us in 2002, wondering if a test question on this subject was inappropriate — and bringing some real-life insight into the question:

Choosing a Unit of Measurement Hi! I am a second-grade teacher at a Florida elementary school. My students just took the County's math assessment and faced this question (worded something like this): If Sue were to measure the length of all the butterflies in her collection, which would be the best unit of measurement for her to use? a-millimeter b-centimeter c-kilometer I observe butterfly and moth biology for a hobby and could not figure this one out for myself so went home and consulted some standard field guides including Audubon's, Peterson's, Simon and Schuster... Some of the guides gave approximate wingspans in centimeters, while some did in millimeters. I expressed this fact to my County administrators, thinking there was a problem with the answer choices to the question. The County claimed the test question was valid because in math, generally, the unit used to measure an object is the one that is the smallest possible unit where the object is not less in length than the unit in question.... I think I am getting the wording mixed up here... For example, by this argument, one would measure a car's length in yards, not inches, feet, or miles. The county also said that second graders were not responsible for understanding measurement in millimeters. I need to check on that too. I am surprised that they did not care to invalidate the question. Some of my students chose cm and some chose mm. Please send me some sort of help for this situation. I am looking for an outside, objective opinion. Thanks!

Hmm … if millimeters are not in the vocabulary of a second grader, why are they in the question? (I’m guessing the question was remembered wrong, inserting millimeters because that’s what some books use.) But more important, scientists (who should know) don’t agree on which unit to use! There’s a lot going on here.

Doctor Douglas was the first to respond:

Hi Karen, and thanks for writing. I'm not exactly sure I understand you here. If the object is *not* less in length (emphasis mine), then we should always choose the smallest unit (e.g. micron, or nanometer, or even smaller) possible. I think probably what is meant is that one should choose the *largest* unit such that the unit (e.g. yard) fits at least once into the object (e.g. car).

Yes, Karen got the wording mixed up a little, as she suspected. The claim is that the best unit to use is the **largest possible unit less than the length of the object**.

Having said this, I don't believe that this rule is the best one to use. In my opinion, one should use the units that give the most convenient numbers, or are consistent with what other people use. In other words, while it is true that cars might be "best" measured in yards (and give nice, small numbers such as 3.18 yards), what is often relevant is how that object compares to OTHER objects in various contexts: How many (thousands of) feet to the toll booth? Is my car too big for my garage? Am I parking too close to the stop sign? Here it becomes clear that feet are probably more convenient than yards, because of the variety of contexts in which we need to know the dimensions of cars. After all, it's rare that we're driving a car on a football field.

So we might choose feet rather than yards for a car because we will be comparing that length to a variety of other things, and it is customary to use feet for other items within a fairly broad range of sizes. Using the same units makes it easier to compare them. (It would be interesting to study where yards are used, and figure out what special feature calls for that choice.)

So we have at least two considerations: **small numbers** greater than 1 in our measurements, and **few different units** used for a range of sizes. (The yard is relatively rare perhaps because it competes for “space” with the foot, which differs from it by only a factor of 3. The metric system is spared this difficulty because everything is in tens.)
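The county's rule, as restated above (the largest unit that fits into the object at least once), is simple enough to sketch in Python. Everything here is illustrative: the function name, the short list of metric units, and the example lengths are my own assumptions, and the sketch deliberately ignores the second consideration, what units other people customarily use.

```python
# Size of each unit in meters (a deliberately short, hypothetical list).
METERS_PER_UNIT = {"mm": 0.001, "cm": 0.01, "m": 1.0, "km": 1000.0}


def largest_unit_that_fits(length_m, units=METERS_PER_UNIT):
    """Return the largest unit that fits into the length at least once."""
    fitting = [name for name, size in units.items() if size <= length_m]
    if not fitting:
        # Nothing fits: fall back to the smallest unit available.
        return min(units, key=units.get)
    return max(fitting, key=units.get)


# Rough, made-up lengths in meters, just to exercise the rule.
examples_m = {
    "paper clip": 0.03,
    "butterfly wingspan": 0.05,
    "car": 4.5,
    "railroad route": 500_000,
}

for name, length_m in examples_m.items():
    unit = largest_unit_that_fits(length_m)
    print(f"{name}: {length_m / METERS_PER_UNIT[unit]:g} {unit}")
```

For a 5 cm butterfly the rule picks centimeters, which is presumably the answer the test wanted; it says nothing about whether millimeters would be equally reasonable.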

Now, what would be "most convenient" for the measurement of butterflies and moths? Here I would probably lean toward cm, because to adequately measure the spread in different sizes, the cm seems to be the most natural unit in that most butterflies and moths that I know are somewhat bigger than a centimeter. However, if there were a multitude of butterflies with wingspans under one cm (or perhaps we are interested in their sizes at various stages in their growth), then I would probably lean toward using mm.

I’ll have another possible issue here below.

I think the County question is poorly written, because both cm and mm seem to be natural choices. If instead the mm is changed to "micron" or something quite a bit smaller than a butterfly, then the "correct" answer would be clear.

In real life, there is often more than one reasonable choice; and there is no “official rule” about this that makes one definitely correct.

Then I joined in, with an additional perspective:

I agree with Dr. Douglas that millimeter and centimeter are both perfectly reasonable units for this case. It's not clear to me whether the problem deals with length or wingspan, but in either case most butterflies will probably be in the range from a centimeter or two to ten or twenty centimeters at the most; and in that range, both units will give small numbers that are reasonably easy to handle.

Since 1 centimeter is 10 millimeters, and 20 centimeters is 200 millimeters, neither gives unreasonable numbers.

In my mind the "best" unit would be one for which all commonly found values will be greater than 1 (to avoid needing small numbers like 0.43, where the decimal point is easy to miss). But also, in many cases, I would like a unit that gives reasonable precision without needing a decimal point at all. I might prefer to say 12 mm rather than 1.2 cm.

If our preference is to avoid decimals, then we will be inclined to choose smaller units. This, I think, is a matter of taste.

Moreover, for some purposes centimeters are to be avoided. The SI metric system recommends keeping to powers of 1000, avoiding "centi-": How Many? A Dictionary of Units of Measurement - Russ Rowlett http://www.unc.edu/~rowlett/units/prefixes.html The prefixes hecto-, deka-, deci-, and centi- are widely used in everyday life but are generally avoided in scientific work. Contrary to the belief of some scientists, however, the SI does allow use of these prefixes. So rules besides the "small number" rule may make us choose millimeters rather than centimeters, just as Dr. Douglas pointed out that we should use whatever units are commonly used by others. In particular, if, say, the wingspans of eagles are typically in the tens of centimeters, we would not choose to use dekameters, both because that is a rarely used unit, and because eagles are likely to be compared to, say, sparrows, which would certainly be measured in centimeters or millimeters.

My suggestion here is that scientists stick to the prefixes for powers of 1000 (kilo, mega, milli, micro, …), whose exponents are multiples of 3, in order to have fewer different units in play, which each cover a broad span of sizes in order to make comparison easy. This could be why some of the field guides use millimeters: They are following up-to-date guidelines for scientific usage. This recommendation seems to be missed at the elementary level – at least in part because authors there are largely dealing with everyday use of units, not scientific usage.

If this much can be said about a problem, then it is certainly too ambiguous!

So, yes, the problem should have been changed.

Although there were several short answers in the *Ask Dr. Math* archive already, our first extended discussion of reading a basic protractor was this, in 1998:

Using a Protractor I need to know how to find angles and to use a protractor. I made two parallel lines - now I have to intersect those lines making a 65-degree angle. Help!

Michelle had a specific task to perform, but her main need was to learn how to use the protractor; so I used the former as an example for the latter:

Hi, Michelle. Protractors can be a little confusing. Let's take out your protractor and look closely at it. Mine is not very round after passing through e-mail, but you should more or less recognize it:

[ASCII sketch of a protractor: a baseline through A, C, and B with C at its center, two scales labeled 0, 45, 90, 135, 180 running in opposite directions, and a mark D on the arc]

I was very new to Ask Dr. Math, and inserting pictures was not easy, so I did it the hard way. Here is a better picture (of a really cheap school protractor!):

What you should see on yours (and on mine if I could draw it better!) are:

- a line along the bottom (AB), which you have to line up with the line you want to make a 65 degree angle with;
- a cross or dot in the middle of that line (C), which marks the center of the angle you are going to measure (so you should first make a small mark there to show the point your line will go through);
- a semicircle around the edge, with little degree marks;
- degree labels around the outside, probably going in both directions (clockwise and counterclockwise) from 0 to 180.

Note that the vertex of the angle does not go on the edge of the protractor (though that might be true for some that are not transparent like mine); the protractor has to be placed over the line you are using, with the vertex directly under the cross, and the line passing under the 0 mark.

Now let’s draw that 65° angle:

To draw a line at a 65-degree angle, facing to the right, look for the numbers that start at 0 on the right (that's the outside set of numbers on mine) and look along them until you come to 60 and 70. Just as on a ruler, there should be some number of divisions between them, and you should be able to identify a mark for 65 degrees halfway between 60 and 70. Make a small mark there (D), and then remove the protractor. Now you can use your ruler to draw a line between the marks (CD), and you're done!

On my protractor, we used the inside numbers. There are 10 spaces between 60 and 70, and we chose the large mark in the middle, at 65.
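Because the two scales run in opposite directions from 0 to 180, the two labels at any one mark always add up to 180. A tiny Python sketch (with a hypothetical function name) shows what you would read by mistake on the other scale:

```python
def other_scale_reading(reading):
    """Convert a reading on one protractor scale to the opposite scale."""
    if not 0 <= reading <= 180:
        raise ValueError("a semicircular protractor reads from 0 to 180 degrees")
    return 180 - reading


print(other_scale_reading(65))  # 115
```

So if an angle that is clearly less than a right angle seems to measure 115°, the reading probably came from the wrong set of numbers.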

Though the question was only about drawing an angle, I added a comment on the opposite task:

If you have to measure an angle you've been given - maybe where the line you just drew meets the other parallel line - then you do the same thing in reverse. You line up the bottom edge with one of the lines that make the angle, with the center mark (C) over the place where the lines meet. Then you find which set of numbers starts at zero on the first line, and follow it until you come to the second line. Then you read the numbers just like a ruler. Have fun drawing and measuring, and let us know if you need more help, or if you discover any great theorems while you explore!

Three years later, someone named Chad asked another question that was added on to this page:

How do you use a protractor to find the three angle measures of a triangle?

Again, the goal is a specific task, but this time it’s not finding and drawing an angle, but measuring one that is already there, as in my extra comment above. There’s a little more to be said about that.

After referring to the answer above (the reason this one was added to the same page), I said,

I'll give you a few additional pointers for measuring a triangle. First, if the triangle is smaller than your protractor, you will have to extend the edges, like this, so they are long enough to reach the edge of the protractor:

            \   /
             \ /
              +
             / \
            /    \
    ------+--------+------
         /          \
        /            \

This is not always needed, but it’s a good thing to check before you get started.

Now put the center of the protractor on one of the vertices, and line up the zero on one of the two curved scales with one of the edges coming from that vertex. Follow along that scale (watching the numbers increase from zero) until you come to the other edge (ray) of that angle. The number you find there is the measure of the angle.

My little triangle has a 65° angle:

You may have to count small marks to decide how many degrees each mark indicates. For instance, if the scale looks like

                        v
    --+---+---+---+---+---+---+---+--
          |                   |
          70                  80

then since there are five spaces between 70 and 80, each is 2 degrees (1/5 of 10 degrees), and the mark I am measuring is at about 77 degrees.

This example has marks every 2 degrees; 1-degree markings like mine are probably most common, but I wanted to show the harder case.
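The counting in this example can be sketched in a few lines of Python. This is only an illustration: the function names are invented for the sketch, and the position of the measured point (3.5 spaces past 70, that is, halfway between the 76 and 78 marks) is an assumption read from the diagram.

```python
def degrees_per_space(low_label, high_label, spaces):
    """Size of one space between two labeled marks on a scale."""
    return (high_label - low_label) / spaces


def read_scale(low_label, high_label, spaces, spaces_past_low):
    """Reading at a point some number of spaces past the lower label."""
    step = degrees_per_space(low_label, high_label, spaces)
    return low_label + spaces_past_low * step


# Five spaces between 70 and 80, so each space is worth 2 degrees.
step = degrees_per_space(70, 80, 5)
# Assumed position of the measured point: 3.5 spaces past the 70 mark.
reading = read_scale(70, 80, 5, 3.5)
print(step, reading)
```

The same two functions work for a protractor with ten spaces between 60 and 70, where each space is 1 degree.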

The 2° markings are common on better magnetic compasses (which, by the way, are more like protractors than like drawing compasses):

Cheaper compasses may have markings only every 5°.

The next year, a student named Rezzi wanted information about various sizes of angles:

Measuring Angles with a Protractor I want to know how to measure acute, reflex and obtuse angles with a protractor.

Doctor Rick first reviewed the basics of protractor use that we have seen:

To measure an acute angle (less than 90 degrees), put the vertex at the center of the straight side of the protractor, lining up one half of that side with one side of the angle, so that the other side is under the protractor. On the curved edge of the protractor are (usually) two sets of numbers; one scale starts with zero on the left edge and increases to the right, while the other starts with zero on the right edge and increases to the left. Choose the scale that starts on the side that you lined up with a side of the angle. Read off the number from this scale at the point where the other side of the angle crosses the protractor.

This is the case we saw above. There’s really nothing different about an obtuse angle:

Measuring obtuse angles (between 90 and 180 degrees) works exactly the same way. It may be that some protractors have only one scale, with zero on both ends and 90 in the middle. If so, then you'll read a number between 0 and 90, and you'll need to subtract it from 180 to get the measure of the obtuse angle. For instance, if you read 45, the angle is really 180 - 45 = 135 degrees.

I couldn’t find any examples of such a protractor online, so this is probably an imagined worst case. Normally, it will just look like this (where the 45 is there, but we didn’t have to use it):

Many readers will not be familiar with the term “reflex angle”; that measures from 180° to 360°:

A reflex angle is the "outside" of an acute or obtuse angle. Use the protractor to read the measure of the "inside" of the angle. Then subtract from 360 to get the measure of the reflex angle. If the inside is the same obtuse angle we just measured, then the measure of the outside is 360 - 135 = 225 degrees.

Here is that 225° angle, in standard position, going counterclockwise from the right; we would have subtracted 225 from 360 in order to decide to make the mark at 135:

Some protractors are complete circles, so you could measure this angle directly.
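The two subtractions used for obtuse and reflex angles are simple enough to write down as a short Python sketch. The function names are made up for illustration, and the single-scale protractor is the hypothetical worst case mentioned above, not a tool I have actually seen.

```python
def obtuse_from_single_scale(reading):
    """Obtuse angle when a (hypothetical) 0-90-0 scale shows `reading`."""
    return 180 - reading


def reflex_from_inside(inside):
    """Reflex angle, given the measured 'inside' angle."""
    return 360 - inside


def inside_for_reflex(reflex):
    """Where to make the mark when drawing a given reflex angle."""
    return 360 - reflex


obtuse = obtuse_from_single_scale(45)   # the reading of 45 from the example
reflex = reflex_from_inside(obtuse)     # the "outside" of that obtuse angle
mark = inside_for_reflex(225)           # mark to make when drawing 225 degrees
print(obtuse, reflex, mark)
```

Measuring and drawing a reflex angle use the same subtraction from 360; only the direction of the conversion changes.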

Here are two other, shorter discussions of how to use a protractor, focusing on how to choose which scale to read:

Using a Protractor
Reading a Protractor

In 2003 Gloria asked a specific question about measuring an angle, which was appended to the one above:

I have different pictures of a protractor with arrows pointing to different numbers on it. None of them starts at 0. This protractor is 0-180. If I have arrows pointing to 80-150, do I start at 0? Or would I just say 70 and subtract from 180? I'm confused!

I replied, guessing at the meaning of the problem:

I think you mean that you are supposed to find the angle between the 80 and 150 marks on the protractor. Although you would normally set the protractor so that one leg of the angle lies at zero, you can also use it this way. Remember that a protractor is really just a curved ruler, and can be used in exactly the same ways (apart from having to put the vertex of the angle in the right place). On a ruler, you can measure starting at zero:

    0   1   2   3   4   5   6   7   8
    +---+---+---+---+---+---+---+---+
    =====================
          5 inches

or between any two positions:

    0   1   2   3   4   5   6   7   8
    +---+---+---+---+---+---+---+---+
            =====================
               7-2=5 inches

You subtract the readings of the two ends to find the distance between them. So the angle between the 80 and 150 marks is just 150-80 = 70 degrees. What you are really doing is subtracting two angles. If the points at the 0, 80, and 150 degree marks are P, Q, and R, and the center of the protractor is at O, then you are measuring angle QOR as POR - POQ. Draw a picture of that and see if it makes sense.
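As a small Python sketch of the "curved ruler" idea, the same subtraction handles both the ruler and the protractor. The function name is my own, chosen only for this illustration.

```python
def span_between(first_reading, second_reading):
    """Distance or angle between two readings on the same scale."""
    return abs(second_reading - first_reading)


ruler_span = span_between(2, 7)     # inches, as on the ruler: 7 - 2
angle_qor = span_between(80, 150)   # degrees: angle POR minus angle POQ
print(ruler_span, angle_qor)
```

Taking the absolute value means it does not matter which of the two readings you name first.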

Here is an entirely different question. In the past, we have talked about the fact that not every angle can be constructed exactly by compass and straightedge (in the post about trisecting angles); we recently looked more generally at the role of compass and straightedge in geometry. A reader named Ivan put these ideas together in an interesting question:

Precision in Measurement: Perfect Protractor? In the reply to a question regarding constructing an angle of one degree it was stated that an angle of one degree cannot be constructed using just a straight edge and a compass because the sine and cosine of one degree both require cube roots and only square roots can be constructed (http://mathforum.org/dr.math/problems/callanta.5.25.00.html). My understanding is that only construction using "perfectly" reliable instruments will give "perfectly" accurate results. If we have to resort to measuring and calculating then there has to be (as I see it) a certain amount of uncertainty regarding the result. Given that protractors are expected to be accurate to the degree and in some instances the minute or second, how are such angles accurately constructed and marked? On the subject of precision, would you please tell me how to "60-sect" a one-degree angle so as to generate the markings for the minutes? Thanks for your help. Unfortunately I just don't have the maths to have even started on this. Once I got past bisection and trisection I referred to this resource and discovered that construction just can't be done (as I read it).

I focused on several different aspects of the question, starting with what compass and straightedge are primarily meant for: *thinking* about geometry, not *making* real things.

You are right that it is impossible to construct all the angles you need with compass and straightedge. But that isn't really necessary. First, there are other tools available to do constructions; the restriction to compass and straightedge is just part of the ancient Greek game of geometry, due to their desire to reduce all truth to the fewest possible starting points, or axioms. There are many ways to trisect an angle, starting with Archimedes' construction using a MARKED straightedge. See the Dr. Math FAQ: Impossible Constructions http://mathforum.org/dr.math/faq/faq.impossible.construct.html

That is, even mathematicians have considered other ways to do a theoretically exact construction of 1/3 of an angle, and the same could be done for other angles, in principle. But that’s just theory; practicality is different:

Second, a perfect construction is not necessary, or even useful, in making a protractor. The kind of perfection we talk about here is only a theoretical perfection: if we had a perfect straightedge, a pencil that actually draws LINES (with no width!), and so on, then we could prove that the result would be exact. But often such constructions, done with real tools, are actually LESS accurate than just measuring and estimating. That's because the construction can get very complicated and introduce errors repeatedly whenever we have to draw a line through a not-quite-perfectly defined "point."

Real pencil lines are never exact anyway!

One very reasonable way to mark a protractor would be to make a special tool with a 60:1 gear ratio, so that for every 60-degree turn of the handle, the marker would move one degree. Having marked degrees, if you are making a protractor large enough to show minutes, you could use the same tool to divide each degree into 60 parts. But usually we don't get that kind of precision. The marks on a typical protractor are not much less than a degree wide, so greater precision would be useless!

The protractor in my images is very imprecise, as you may have noticed (unless it’s just the image that’s distorted); this is probably true of many. But a very exact protractor can be made without much difficulty, given that the precision need be no better than the thickness of the lines drawn. If more precise angles are needed, my guess is that you would not use a protractor at all, but a machine much like what I described.

In summary: outside of a mathematician's mind, there are no "perfectly reliable instruments," nor is there a need for (or the possibility of) "perfectly accurate results." There is always uncertainty in the real world. So use whatever kind of measuring or calculating you wish. One reasonable way would be to use basic trigonometry to construct triangles with, say, 1- and 10-degree angles. (The larger one would allow you to construct the major divisions without having to build up large angles from many small ones, accumulating errors as you go.) You might construct 30- and 60-degree angles with a compass, since they are very simple and reliable, then fill in every 10 degrees, and then fill in the degrees between.

Ivan’s question was originally titled, “Marking my own protractor”, so he was asking about practical methods without fancy machines.
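The answer does not say exactly how the trigonometry would be used, so here is one possible reading as a Python sketch: draw a right triangle whose base has a convenient length, and whose height is the base times the tangent of the angle you want. The 100-unit base and the function name are assumptions made for this illustration.

```python
import math


def opposite_leg(adjacent, angle_degrees):
    """Height of a right triangle with the given base and base angle."""
    return adjacent * math.tan(math.radians(angle_degrees))


# Assumed base of 100 units (say, millimeters) for both triangles.
for angle in (1, 10):
    height = opposite_leg(100, angle)
    print(f"{angle} degrees: rise {height:.2f} over a run of 100")
```

A longer base makes the same measuring error in the height matter less, which is the practical point behind choosing a large triangle.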

We can close, as we did on rulers, with a question from 1999 about word origins:

Meaning of the Words Compass and Protractor Why is a compass called a compass when it protracts and retracts, and why is a protractor called a protractor when it is used to measure degrees? I know this is not really a high school math question, but it intrigues me a great deal.

“Protract” means “extend”, and a compass can be moved in and out, so one could call that protracting! And measuring angles doesn’t sound anything like “protraction”. What’s going on?

I replied:

Hi, Heather. This is not exactly a math question, but I like to talk about words too! The word "compass" means "to go around, encircle"; it originally meant "to measure" by pacing (passus). Since a compass "goes around," this makes some sense.

You can envision “encompassing” something as making a circle around it, “pacing off” a circuit. As far as I can tell, the magnetic compass we briefly looked at above is called that mostly because its scale “goes around”.

"Protract" originally meant "to drag out, lengthen." How is that related to what a protractor does? My dictionary gives one definition as "[surveying] To draw to scale by means of a scale and protractor." I suspect that because scale drawings or maps make heavy use of angles (remember the AAA similarity theorem: a scale drawing is similar to the original), the protractor was originally thought of as a tool for protracting in this sense.

The “tract” in “protract” is the same as in “tractor”, meant for pulling things like plows (or spacecraft, if you’re thinking of a “beam”). But the words “drag”, “draw”, and “draft” are also closely related; in *drawing* or *drafting* a map, you are *dragging* a pencil around. To “protract” a map meant to “draw it out”.

Words don't have to make sense to be used, but I think we can see at least a little sense in the history behind these words.

We can start with an early question and answer (1997) that we regularly used as a reference for students asking about this:

Reading a Ruler I need to read a ruler, and I can't. Can you help me?

Doctor Rob replied:

I hope so. There are two main kinds of rulers in general use, and other, more obscure kinds. We will ignore the obscure ones.

Here is an example of a ruler that combines the two kinds we’ll be seeing:

First there is the ruler marked in inches, and each inch is subdivided into 16 parts. The lines on it look something like this sketch of a part of a one-foot ruler:

     8                               9
     |                               |
     |               |               |
     |       |       |       |       |
     |   |   |   |   |   |   |   |   |
     | | | | | | | | | | | | | | | | |
    -----------------------------------
       D C D B D C D A D C D B D C D

The lines have different lengths to help you figure out what lengths they represent. The shortest lines (D) represent an odd number of sixteenths of an inch. The next shortest lines (C) represent an odd number of eighths of an inch. The next shortest lines (B) represent an odd number of quarters of an inch. The next shortest lines (A) represent an odd number of halves of an inch. The longest lines (8 or 9) represent whole inches, and are numbered. The lines labeled 8 and 9 above mark points on the edge of the ruler that are eight and nine inches from the left-hand end of the ruler. All distances are measured from that same left-hand end of the ruler, which could be, but probably isn't, marked "0".

Some rulers have an actual mark for 0, while others just start at the edge of the stick.

This traditional set of markings is based, not on 10 as in our decimal system of counting, but on 2: the denominator is always a power of 2. (As we’ll see below, the markings may go only to eighths, or go further than 16, but a denominator of 16 is common.) We’ll have different pictures below that may be easier to follow.

The line halfway between them labeled A above marks a point on the edge of the ruler, which is 8 1/2 inches from the end. That makes sense because 8 1/2 is halfway between 8 and 9. The next shorter lines labeled B above are halfway between 8 and 8 1/2, and halfway between 8 1/2 and 9. The former one marks 8 1/4 inches, and the latter marks 8 3/4 inches. These make sense because 8 1/4 is halfway between 8 and 8 1/2, and 8 3/4 is halfway between 8 1/2 and 9. In the words of arithmetic, 8 + ([8+1/2]-8)*(1/2) = [8+1/4], where [8+1/2]-8 is the distance from 8 to [8+1/2], and [8+1/2] + (9-[8+1/2])*(1/2) = [8+3/4], where 9-[8+1/2] is the distance from [8+1/2] to 9. Likewise, the next shorter lines labeled C above are halfway between 8 and [8+1/4], between [8+1/4] and [8+1/2], between [8+1/2] and [8+3/4], and between [8+3/4] and 9. They must therefore mark the distances [8+1/8], [8+3/8], [8+5/8], and [8+7/8], respectively. Finally, the shortest lines labeled D above are halfway between adjacent pairs of longer lines, and mark [8+1/16], [8+3/16], ..., [8+15/16].

That looks as if there were a lot of arithmetic with fractions needed here; it will get simpler in a moment!

When I measure a distance, I put the "0" end of the ruler at one end, and then pick the mark on the ruler which is closest to the other end of the distance. The nearest inch line to the left gives me the number of whole inches. I then figure out whether this line is a 1/16 line (shortest), a 1/8 line, a 1/4 line, a 1/2 line, or an inch line. That tells me what the denominator of the fraction of an inch will be. From the inch line I count the lines the same length as my chosen one using odd numbers: "1, 3, 5, 7, ..." until I find my line. That tells me what the numerator of the fraction of an inch will be. I then combine the number of whole inches with the fraction to get the distance.

This is what replaces the arithmetic above! We count down in size to the type of mark we are looking at (doubling the denominator), then count odd numbers to find the numerator.
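Here is that counting rule as a minimal Python sketch. The function names and the numbering of mark sizes (1 for halves, 2 for quarters, 3 for eighths, 4 for sixteenths) are my own conventions for this illustration, not part of the original answer.

```python
from fractions import Fraction


def mark_value(level, position):
    """Fraction of an inch for a mark.

    level: 1 for half-inch marks, 2 for quarters, 3 for eighths, 4 for sixteenths.
    position: which mark of that same length it is, counting 1, 2, 3, ...
              from the inch line on the left.
    """
    denominator = 2 ** level        # each smaller mark size doubles the denominator
    numerator = 2 * position - 1    # counting by odd numbers: 1, 3, 5, ...
    return Fraction(numerator, denominator)


def reading(whole_inches, level, position):
    """Whole inches combined with the fraction for the chosen mark."""
    return whole_inches + mark_value(level, position)


print(mark_value(3, 3))    # the third eighth-size mark
print(reading(8, 2, 2))    # the second quarter-size mark after the 8
```

The second call gives the 8 3/4 inch mark from Doctor Rob's sketch, written as a single improper fraction.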

The metric system uses decimal numbers rather than fractions, so here we use tens.

The other common kind of ruler measures centimeters instead of inches. Each centimeter is divided into 10 parts (each called a millimeter). The lines on it look something like this sketch of a part of such a ruler:

     13        14
     |         |
     |    |    |
     |||||||||||
    -------------
     CAAAABAAAAC

The longest lines labeled C represent whole centimeters. The next longest line labeled B represents a half centimeter. The shortest lines labeled A represent tenths of centimeters, or millimeters. Since 1/2 = 5/10, the B line also represents 5 millimeters.

The fractional ruler used twos because those are easy to count; here we have a special mark at 5 for the same reason. In effect, the markings use base 2 and 5 (like Roman numerals).

To measure a length, put the left end of the ruler, which could be labeled "0" but probably isn't, at one end, and pick the closest mark on the ruler to the other end. Find the closest centimeter mark to the left of your mark. That will tell you the number of whole centimeters (13 in the above example). The denominator of the fraction of a centimeter is fixed at 10. The numerator is found by counting from the whole centimeter mark you found above, and the medium-length lines B help you count by showing you where 5 tenths or half a centimeter is. If you are close to the "13" mark, you count up as you move to the right starting with "0" for the "13" mark itself. If you are close to the "B" mark, you can count up as you move to the right or down as you move to the left, starting with "5" for the B mark itself. If you are close to the "14" mark, you can count down as you move to the left, starting with "10" for the "14" mark itself. This will tell you the numerator of the fraction of a centimeter.

Normally we either write the length in millimeters, or use decimals rather than fractions. We’ll have an illustration of this later.
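A minimal Python sketch of turning the count into a length might look like this. The function name and the sample reading (13 whole centimeters and 7 tenths) are assumptions for illustration only.

```python
def metric_reading(whole_cm, tenths):
    """Return the length as (millimeters, centimeters).

    whole_cm: the centimeter mark to the left of the chosen mark.
    tenths:   how many small marks past it, from 0 to 9.
    """
    millimeters = 10 * whole_cm + tenths
    return millimeters, millimeters / 10


# Assumed example: the seventh small mark past the 13.
mm, cm = metric_reading(13, 7)
print(f"{mm} mm = {cm} cm")
```

Working in whole millimeters first and dividing only at the end keeps the arithmetic to a single step.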

Our next answer is from 1999, and just fills in some details in the pictures:

Finding Fractions on a Ruler My question is simple. I want to know how to read all the measurements within 1 inch. We know that every line on a ruler or tape measure (whichever) has a fractional meaning. I want to learn how these fractions are arranged. I hope you understand what I mean by all fractions in an inch. Example: where is 7/8 of an inch compared to 5/16 of an inch on a tape measure? I know where the basic measurements are, such as 1/4, 1/2, and 3/4 of an inch. It's just the other measurements that I don't get.

I first referred to the answer above, and then added some more detailed diagrams:

I'll give you a simpler, quick answer that may meet your needs. Here's a ruler showing 16ths, with the meaning of each mark indicated:

    0                                                               1
    |                              1/2                              |
    |                               |                               |
    |              1/4              |              3/4              |
    |               |               |               |               |
    |      1/8      |      3/8      |      5/8      |      7/8      |
    |       |       |       |       |       |       |       |       |
    | 1/16  | 3/16  | 5/16  | 7/16  | 9/16  | 11/16 | 13/16 | 15/16 |
    |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
    +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Some rulers only go to eighths, and it might be easier to start with that:

    0                                                               1
    |                              1/2                              |
    |                               |                               |
    |              1/4              |              3/4              |
    |               |               |               |               |
    |      1/8      |      3/8      |      5/8      |      7/8      |
    |       |       |       |       |       |       |       |       |
    +-------+-------+-------+-------+-------+-------+-------+-------+
    0      1/8     2/8     3/8     4/8     5/8     6/8     7/8      1

As the marks go down in size, the denominator of the fraction doubles. The biggest mark between two inches is the half; between that and either inch mark the next largest mark is the quarter inch; and so on. To find eighths, just go down the scale finding 1/2, then 1/4, then 1/8. All the marks that are that size or bigger are eighths: 1/8, 2/8 (which is the same as 1/4), 3/8, 4/8 (which is the same as 1/2), 5/8, and so on. If you want, you can just count all the marks that are the same size, counting by odd numbers: 1/8, 3/8, 5/8, 7/8.

As Doctor Rob explained, we can count only odd numbers; or we can just count all the lines of at least the length of the one we are reading. This is because all the even numbers correspond to fractions with a smaller denominator (and longer lines). For example, the 3/4 mark is the third line of at least its length, or we can count 1, 3 looking only at lines of the same length.

Try doing the same thing on the 16ths ruler above; the only hard part will be to ignore the smaller 16ths marks while you're looking for 8ths. To find 5/8, for example, find which size marks are 8ths, then count 1, 3, 5/8 until you find it. Now you should be able to work out 32nds yourself: 0 1 | 1/2 | /2 | | | | 1/4 | 3/4 | /4 | | | | | | 1/8 | 3/8 | 5/8 | 7/8 | /8 | | | | | | | | | | 1/16 | 3/16 | 5/16 | 7/16 | 9/16 | 11/16 | 13/16 | 15/16 | /16 | | | | | | | | | | | | | | | | | | 1 | 3 | 5 | 7 | 9 | 11| 13| 15| 17| 19| 21| 23| 25| 27| 29| 31| /32 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The 16th ruler would, of course, just lack the smallest lines here.

In 2000, Doctor Ian answered a very similar question with details on how to find the length of an object. The question was this:

Reading a Ruler II What do the other little lines on the ruler stand for?

Doctor Ian first showed a metric ruler:

A metric ruler divides a meter into 100 centimeters,

    0   1   2          99  100
    |___|___|___ ... ___|___|

and divides each centimeter into millimeters,

    0         1         2
    |         |         |
    |||||||||||||||||||||
              ^        ^
            10 mm    19 mm
             1 cm    1.9 cm
           0.01 m    0.019 m

which means that the distance between any of the smallest lines is 1/1000 of a meter.

This version lacks the longer line at 5; but we can see here that we are counting millimeters (mm), which are tenths of a centimeter, so that 19 mm is the same as 1.9 cm.

After showing fractional rulers as we’ve seen above, he took it further, showing exactly how to read a length:

So let's say I want to measure something with a ruler. I set it down next to the ruler, 0 1 2 3 | | | | | | | | | | | | | | | | | | | | | | | | |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_| ############################################ ############################################ ############################################ ############################################ ^_______________^ 2 to 3 in ^_______^ 2-1/2 to 3 in ^___^ 2-1/2 to 2-3/4 in ^_^ 2-5/8 to 2-3/4 in and I find the two closest inch-marks that are on either side of it. In the picture above, these are the 2- and 3-inch marks. So we know that the thing is between 2 and 3 inches long.

If all we care about is the nearest inch, we’re done! It’s between 2 and 3 inches, and closer to 3. But we want to be as precise as possible:

Next, we look at the 2-1/2 inch mark. The thing extends past it, so we know that the thing is between 2-1/2 and 3 inches long. Next, we look at the mark halfway between those, the 2-3/4 inch mark. The thing doesn't extend quite that far, so we know that the thing is between 2-1/2 and 2-3/4 inches long. And so on, until we've identified the two closest marks that are on either side of the end of the thing we're measuring. At this point, if it's very close to either mark, we can just call that the measurement. Or if it's right in between, we can take the average. In the example above, that would be: 2 and (5/8 + 3/4)/2 inches = 2 and (5/8 + 6/8)/2 inches = 2 and (11/8)/2 inches = 2 and 11/16 inches. Note that you can measure something to the nearest 16th of an inch, even though the ruler is only marked to the nearest 8th of an inch.

An alternative way to do this last calculation is just to see that it’s about 1/16 more than 5/8, which is 10/16, so we estimate it as 11/16.
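The averaging step can be checked with exact fractions. This is just a sketch using Python's standard `fractions` module; the helper name `midpoint` is my own.

```python
from fractions import Fraction


def midpoint(low, high):
    """Value halfway between two marks."""
    return (low + high) / 2


# The end of the object falls between the 2-5/8 and 2-3/4 marks.
estimate = 2 + midpoint(Fraction(5, 8), Fraction(3, 4))

# Split the result back into whole inches and a fraction of an inch.
whole, part = divmod(estimate, 1)
print(f"{whole} and {part} inches")
```

Using `Fraction` rather than decimals keeps the answer in the same form as the marks on the ruler.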

The next question, in 2005, called for a still more detailed explanation:

Reading a Ruler III My daughter came home with a worksheet that had a ruler on it and it has letters on the ruler in which she is supposed to say the place (value) the letter represents on the ruler. Now it's been a while since I have done this. I am looking for simple, basic instructions to teach a child how to read a ruler. Of course I can instantly tell you where a 1/2 is; however, the rest is very vague to me. I somehow remember that we used to count the lines--where the line falls on the ruler is the top number of the fraction and how many lines in between 0 to 1 inch represents the bottom number (the whole). Is this correct? I like your examples on how to read a ruler. However, if you can just count where the line is and put that in the numerator and count how many lines make the whole and put that in the denominator that would be helpful to make it so that everyone truly understands how to read the ruler.

Doctor Ian assumed that Mary had read the answers above, and needed more:

Let's look at an example. Suppose I'm measuring something that's something and 7/16 of an inch long.

     |                               |
     |               |               |
     |       |       |       |       |
     |   |   |   |   |   |   |   |   |
     | | | | | | | | | | | | | | | | |
    -----------------------------------
    xxxxxxxxxxxxxxxx

Here we see just the last inch of the object; the “something” would be the label on the inch mark at the left.

In practice, I'd look at the 1/2 mark, and say "That's too big." So I'd look at the 1/4 mark, and say "That's too small." So now I know it's between 1/4 and 1/2, i.e., between 1/4 and 2/4, which is between 2/8 and 4/8. What's halfway between them? 3/8. So I look at 3/8, and say "That's too small." So now I know it's halfway between 3/8 and 1/2, i.e., between 3/8 and 4/8, which is between 6/16 and 8/16. What's halfway between them? 7/16. And that works. So I'm done.

Another way would be to just see that it is one small mark (1/16) less than 1/2, and subtract; but that requires work with fractions. We’ll get to the “just count” approach Mary wants soon!
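Doctor Ian's "too big, too small" routine is a repeated halving, which can be sketched in Python as below. One assumption is built in: with a real ruler you compare the end of the object to each mark by eye, whereas here the true length (`target`) is handed to the function so that the comparison can be simulated. The function name is invented for this sketch.

```python
from fractions import Fraction


def locate_mark(target, finest=16):
    """Home in on a mark by repeatedly looking at the middle of an interval.

    target: the fractional part of the length (known here only to simulate
            the "too big / too small" judgment made by eye).
    finest: the smallest division on the ruler (16 for sixteenths).
    Returns the mark settled on and the list of marks looked at.
    """
    low, high = Fraction(0), Fraction(1)
    tried = []
    while True:
        middle = (low + high) / 2
        tried.append(middle)
        # Stop on an exact match, or when no finer marks exist.
        if middle == target or middle.denominator >= finest:
            return middle, tried
        if target > middle:
            low = middle      # "That's too small."
        else:
            high = middle     # "That's too big."


mark, tried = locate_mark(Fraction(7, 16))
print(mark, [str(m) for m in tried])
```

For 7/16 the marks examined are 1/2, 1/4, 3/8, and 7/16, in the same order as in the answer above.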

The whole system is based on successively breaking things into halves, so I just go with the flow. That is, you can imagine a ruler where the markings only go down to 1/2 an inch: | | | | | | | | | | | | | | ----------------------------------- 0 1/2 2/2 = 1 If we decide that's not fine enough, we double the denominators, | | | | | | | | | | | | | | ----------------------------------- 0 2/4 4/4 = 1 and that gives us room for more numerators: | | | | | | | | | | | | | | | | | | | | ----------------------------------- 0 1/4 2/4 3/4 4/4 And so on, through 8ths and 16ths and even 32nds, if we can make our markings finely enough. Of course, this is all but impossible to deal with unless you're pretty facile at converting between equivalent fractions with powers of 2 in the denominator, e.g., 1/2 = 2/4 = 4/8 = 8/16 = 16/32 That's the basic skill you need to make the system work.

What if you don’t do this often enough to develop that skill?

The idea of counting all the marks to figure out what the denominator should be would work in theory, but in practice it would be pretty slow. But with some practice, you could do something sort of equivalent but quicker. That is, you could start by identifying what size mark is next to the end of the thing you're measuring:

     |                               |
     |               |               |
     |       |       |       |       |
     |   |   |   |   |   |   |   |   |
     | | | | | | | | | | | | | | | | |
    -----------------------------------
    xxxxxxxxxxxxx^
                 |_________ This is the size I want.

Next, you could look at the 1/2 mark, and back up halfway to the 1/4 mark. If that's the right size, you can now just count quarters. If it's still too small, back up halfway to the 1/8 mark. If that's the right size (in the example above, it is), you can just count eighths. And so on. Note that when you're 'counting quarters' or 'counting eighths', you have to count any marks that are at least as large as the one you're using; but you can ignore any that are smaller. So that's pretty straightforward, but again it requires at least a little facility with powers of 2 in the denominator.

In this example, we’ve identified the mark as an eighth, so we can just count marks this big or longer: 1, 2, 3; or, as we’ve said above, just count the actual eighth marks by twos: 1, 3.
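Both ways of counting can be compared in a short Python sketch. I am assuming a ruler marked in sixteenths and numbering the marks 1 to 16 from the inch line on the left, so the mark in the example is number 6; the function names are made up for this illustration.

```python
from fractions import Fraction


def mark_length(i):
    """Relative length of the i-th mark: the largest power of 2 dividing i."""
    return i & -i


def read_by_counting(n, per_inch=16):
    """Count only the marks at least as long as mark n, up to and including it.

    Returns (numerator, denominator) for the fraction of an inch.
    """
    size = mark_length(n)
    numerator = sum(1 for i in range(1, n + 1) if mark_length(i) >= size)
    denominator = per_inch // size
    return numerator, denominator


# The mark in the example is the sixth of the sixteen marks in the inch.
print(read_by_counting(6))
# Mary's suggestion: count every mark, then reduce the fraction.
print(Fraction(6, 16))
```

Counting every mark gives 6/16, which reduces to the same 3/8, so the two methods agree; the second just leaves the reducing to you.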

Several of us are interested not only in numbers, but also in words. I find the following fascinating:

History of the Word I was just wondering, why is a ruler called a ruler?

Doctor Rick answered:

Do you sometimes look at a word you see every day, and suddenly it seems strange? I do that. I, too, wonder how a word can mean two things that seem very different. The history of words (called etymology) has lots of strange stories that sort of make sense when you think about them. A good place to start answering questions about words is a dictionary. Looking in my dictionary (Random House Webster's College Dictionary), the etymology listing under "rule" says this: [1175-1225; (n.) ME riule, reule < OF riule < L regula straight stick, pattern, der. of regere to fix the line of, direct (see -ULE).] In other words: The word was first used around 1200. Before that (in "Middle English"), it was spelled with an "i" or an "e". Why? Because it was spelled that way in Old French. The French took the Latin word "regula", dropped the g and changed the a to a silent e. (The French did that sort of thing a lot, dropping letters and not pronouncing a lot of the letters they kept!)

English does that too …

The Latin word "regula" meant a straight stick. That's a ruler: a device for making straight lines. Whether the Romans put marks on their rulers to measure lengths, I don't know, but apparently the most important thing about it was that it was straight. Why do I say that? Because it comes from the word regere, which meant "to guide or direct, to make straight." When you use a ruler to make a line, it keeps the line from going astray in either direction; it guides your pencil the way you want it to go.

We still (sometimes) talk about “ruled” paper, meaning it has straight lines drawn on it.

Now that we know that the word from which we get "rule" or "ruler" had "a straight stick" as its primary meaning, our question gets turned around: How did "rule", meaning "a straight stick", come to mean "tell others what to do", or "a law that we have to follow"? But maybe you can already see the answer. A ruler sets the standard; it's a pattern to be followed. If you put a ruler next to a line, you can tell right away if the line is crooked. A rule or regulation (notice where *this* word comes from?) sets a standard, so everyone knows if you "break the rule". Someone who sets the rules for others to follow is called a ruler.

Did you notice that the original meaning of ruler was just “straightedge”, without reference to having marks on it?

There are lots of other words related to rule, ruler and regulation. Something that is "regular" follows the rules; something irregular breaks the rules. The Romans put the prefix de- in front of regere and it became derigere: to straighten or direct--we get the word direct (and the word dirigible, something that can be steered) from derigere. And notice the "rect" part of direct: all sorts of words, such as rectangle, and even right (angle), come from Latin rectus, which means "made straight" from regere, to make straight. There's more on that here: Left Angles http://mathforum.org/library/drmath/view/58420.html If you found that interesting, I hope you can take a Latin class in a few years. You'll learn a lot about words that way. If you lost interest after the first paragraph, ... anyway, I'm done now.

Next time, let’s look at protractors!

I’ll start with this question from 1998:

The Importance of Geometry Constructions I am doing a report on constructions in geometry. I would like to know why constructions are important. I realize that they challenge us to use different tools but there must be more to it than that. So I was wondering if you could give me more of a reason why constructions are so important?

Since many things we ask children to do are largely to get them used to certain ways to use their hands or bodies, it is understandable that Kel would suppose that we teach constructions just because compasses and straightedges are worth knowing how to use. But that isn’t really it. Ultimately, it’s because our *minds* are worth knowing how to use!

I answered:

Hi, Kel. That's a good question. We tend to teach it out of tradition, and forget to think about why it's worth doing! Certainly learning how to use the tools is useful. Some of the techniques are useful in construction (of buildings, furniture, and so on), though in fact sometimes there are simpler techniques builders use that we forget to teach. But I think the main reason for learning constructions is their close connection to axiomatic logic. If you haven't heard that term, I'm talking about the whole idea of proofs and careful thinking that we often use geometry to teach.

I’ve used compass constructions when I helped renovate a church building; but then the “compass” was a length of string. It was the idea behind it that really mattered.

Euclid, the Greek mathematician who wrote the geometry text used for centuries, stated many of his theorems in terms of constructions. His axioms are closely related to the tools he used for construction. Just as axioms and postulates let us prove everything with a minimum of assumptions, a compass and straightedge let us construct everything precisely with a minimum of tools. There are no approximations, no guesses. So the skills you need to figure out how to construct, say, a square without a protractor, are closely related to the thinking skills you need to prove theorems about squares.

A construction is, at root, a theorem: If you follow this sequence of steps, the result will necessarily be the object you claim to be creating, such as the bisector of an angle, or a triangle that meets certain requirements. So learning to design a construction is practice in “constructing” geometrical proofs. Practice in construction is not primarily practice in using your hands, but your mind.

I closed my short answer by referring to the first proof in Euclid’s *Elements*, Proposition I.1:

## Proposition 1

To construct an equilateral triangle on a given finite straight line.

Let AB be the given finite straight line.

It is required to construct an equilateral triangle on the straight line AB.

Describe the circle BCD with center A and radius AB. Again describe the circle ACE with center B and radius BA. Join the straight lines CA and CB from the point C at which the circles cut one another to the points A and B.

Now, since the point A is the center of the circle CDB, therefore AC equals AB. Again, since the point B is the center of the circle CAE, therefore BC equals BA.

But AC was proved equal to AB, therefore each of the straight lines AC and BC equals AB.

And things which equal the same thing also equal one another, therefore AC also equals BC.

Therefore the three straight lines AC, AB, and BC equal one another.

Therefore the triangle ABC is equilateral, and it has been constructed on the given finite straight line AB.

Q.E.F.

This theorem is in reality a construction. Note that the steps involve making circles (with a compass) and making lines (with a straightedge); and at the end he puts “Q.E.F.”, short for “Quod erat faciendum”, Latin for “Which was to be done”. (Euclid, of course, actually used Greek, “ὅπερ ἔδει ποιῆσαι”, “hoper edei poiēsai”.)
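As a numerical sanity check on the proposition (not a substitute for Euclid's proof), here is a minimal Python sketch. It assumes coordinates, which Euclid did not have, and the helper name `equilateral_apex` is made up for illustration. It finds one point C where the circle centered at A through B meets the circle centered at B through A, then measures the three sides.

```python
import math

def equilateral_apex(a, b):
    """Return one intersection point C of the circle centered at a through b
    and the circle centered at b through a (hypothetical helper; assumes a != b)."""
    ax, ay = a
    bx, by = b
    r = math.dist(a, b)                      # common radius AB
    mx, my = (ax + bx) / 2, (ay + by) / 2    # midpoint of AB
    h = math.sqrt(r**2 - (r / 2) ** 2)       # distance from the midpoint to C
    ux, uy = -(by - ay) / r, (bx - ax) / r   # unit vector perpendicular to AB
    return (mx + h * ux, my + h * uy)

A, B = (0.0, 0.0), (4.0, 0.0)
C = equilateral_apex(A, B)
sides = [math.dist(A, B), math.dist(A, C), math.dist(B, C)]
print(sides)  # the three lengths should agree up to rounding
```

The second intersection point (on the other side of AB) would serve equally well; the sketch simply picks one, as Euclid does.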

In 2002, we got a similar question from a teacher, that called for a little more detail on how the axioms (Euclid’s Postulates) relate to the compass and straightedge:

Why Straightedge and Compass Only? My Geometry students want to know why constructions can only be done using a straightedge and a compass. They want to know why they can't just measure a line segment to copy it or use a protractor to construct an angle. What's the difference? We have searched our book as well as some internet sites containing constructions, but to no avail.

I referred back to the previous answer, then elaborated.

There are two ways that I can see to explain the restrictive rules for constructions, which come to us from the ancient Greeks: 1. They are just the rules of a game mathematicians play. There are many other ways to do constructions, but the compass and straightedge were chosen as one set of tools that make a construction challenging, by limiting what you are allowed to do, just as sports restrict what you can do (e.g. touching but not tackling, or tackling but no nuclear weapons) in order to keep a game interesting. Other tools could have been chosen instead; for example, geometric constructions can be done using origami.

Euclid could have started with any tools he wanted; but a major goal was to restrict what could be done, as sort of a game to see how little we can use, to do how much.

For more on axioms or postulates, see my series in July 2018, beginning with Why Does Geometry Start With Unproved Assumptions?

But it’s not just a game; it’s *the* game:

2. They are the basis of an axiomatic system, with the goal of ensuring that geometry is built on a solid foundation. Euclid wanted to start with as few assumptions as possible, so that all of his conclusions would be certain if you just accepted those few things. So he listed five postulates (in addition to some other assumptions even more basic); I've taken these from the reference given in my answer above:

Postulate 1. [It is possible] to draw a straight line from any point to any point.

Postulate 2. [It is possible] to produce a finite straight line continuously in a straight line.

Postulate 3. [It is possible] to describe a circle with any center and radius.

Postulate 4. That all right angles equal one another.

Postulate 5. That, if a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than the two right angles.

He starts with the existence of lines and circles, then adds only two additional facts. (His system is not quite complete, and additional axioms are now known to be necessary.)

The first two postulates say that you can use a straightedge: line it up with two given points, and draw the line between them, or line it up with an existing segment, and draw the line beyond it. That's the first tool you are allowed to use, and those are the only ways you are allowed to use it.

As is often noted, you are not allowed to do other things, like measure or copy a length by making marks on the straightedge. This is not just because Euclid wanted to keep his tools clean! It’s because he wanted to minimize his assumptions, proving as much as possible starting with as little as possible.

The third postulate says you can use a compass to draw a circle, given the center and radius (or a point on the circle). That is the only way you are allowed to use the compass; you can't, for example, draw a circle tangent to a line by adjusting its radius until it _looks_ tangent, without knowing a specific point the circle has to pass through. The last two postulates relate to angles, and are less associated with the construction process itself than with what you see when you are done.

Again, the restrictions are to minimize the assumptions, not because his compass was defective. I actually understated the restriction in this case. (More on that later.)

So really the two tools Euclid required for a construction just represent the assumptions he was willing to make: if these two tools work, then you can construct everything he talks about. For example, you can use these tools, in the prescribed manner, to construct a tangent to a given circle through a given point; but it takes some thought to find how to do so (without just drawing a line that _looks_ tangent), and it takes several theorems to show that it really works.

Here we are back to the challenge! And the goal is not just to make something that looks right, but to be able to **prove** something.

Of course you CAN just measure a line or an angle, if your goal is just to make a drawing - and usually that will be more accurate than a complicated compass construction! But when you use only the tools allowed in this game, you are actually playing within an axiomatic system, getting a feel for how proofs work. You are simultaneously playing a challenging game, and doing one of the few things in life that can give you absolute certainty: if these lines and circles were exactly what they pretend to be (with no thickness, etc.), then the point I construct would be exactly what I claim it is. And it's that sense of certainty that the Greeks were looking for.

I didn’t mention above a special restriction on the compass, which turns out to be entirely theoretical. We got a question about that in 2003:

Collapsible Compass I need to know what a collapsible compass is and what it is used for. All I know is that when you pick it up from the paper, you lose your place.

Again, I answered the question, keeping it brief:

The collapsible compass is not something that is "used"; rather, it represents the fact that Euclid wanted to make as few assumptions (postulates, or axioms) at the base of his proofs as possible. So rather than assume that it was possible to move a line around, keeping the same length (as you could do with a real, fixed compass), or equivalently that you can draw a circle with a given center and length, he assumed only that you can draw a circle with a given center and through a given point. Then he went on to prove that if you could do that, you COULD then construct a circle with a given radius, or move a line to a given place: Collapsible Compass http://mathforum.org/library/drmath/view/52601.html

The reference is to a short answer that links to the proposition I am about to discuss.

Where I quoted Euclid’s postulates above, it may look as if you can just set the compass to any radius you want, contrary to what I’ve said here: “[It is possible] to describe a circle with any center and radius.” But the word “radius” to Euclid does not refer to a *number*, as we think of it today, but to a specific *segment*! This is made explicit in the commentary to the Elements on Joyce’s site that I’ve referred to before, Postulate 3:

Circles were defined in Def.I.15 and Def.I.16 as plane figures with the property that there is a certain point, called the center of the circle, such that all straight lines from the center to the boundary are equal. That is, all the radii are equal.

The given data are (1) a point A to be the center of the circle, (2) another point B to be on the circumference of the circle, and (3) a plane in which the two points lie. …

Note that this postulate does not allow for the compass to be moved. The usual way that a compass is used is that it is opened to a given width, then the pivot is placed on the drawing surface, then a circle is drawn as the compass is rotated around the pivot. But this postulate does not allow for transferring distances. It is as if the compass collapses as soon as it’s removed from the plane.

Proposition I.3, however, gives a construction for transferring distances. Therefore, the same constructions that can be made with a regular compass can also be made with Euclid’s collapsing compass.

Here is Proposition I.3:

## Proposition 3

To cut off from the greater of two given unequal straight lines a straight line equal to the less.

In the construction, Euclid effectively draws a circle whose radius is the length of a given segment located elsewhere. In order to do this, he uses Proposition I.2:

## Proposition 2

To place a straight line equal to a given straight line with one end at a given point.

This is equivalent to setting a compass to the length of the given segment. So with these propositions, in effect we have used the “collapsible compass” to make a new, better compass that doesn’t collapse.

Continuing with my answer,

Once that was proved, you didn't have to use a collapsible compass, but could use a regular one, knowing that any construction you could do this way, you could do with a collapsible compass. Note that a real "collapsible compass" would not work, because the radius would change as you drew the supposed circle! The idea is that it holds together and keeps its radius as long as the center point is in place, but loses it when you pick it up. I don't know of any design for a real-world compass that would work that way; this is a tool that exists only in the mind of a geometer, as a description of Euclid's postulate 3.

Let’s close with a question from 2011, asking about the history of real compasses:

The Collapse of Compasses that Do Not Copy Segments, and the Lengths We Go To http://mathforum.org/library/drmath/view/76724.html Having studied both ancient Euclidean geometry and more recent treatments of the subject, I am wondering exactly when it was that compasses began to be used to copy lengths. I have read that Euclid disallowed compasses from doing this because in his geometry there was no motion. To me, this restriction makes a lot of sense both from a practical and a theoretical viewpoint: one cannot set the compass at a certain length and then truly claim the ability to duplicate that length anywhere other than at a segment radiating from its current center. Anyway, Euclid covers this problem in his second proposition. But in our current geometry textbook, my students and I encounter "Copying a segment is easy: we just set the compass to the length of the segment, and then copy it elsewhere on the page." Perhaps this concession comes only after centuries' worth of students saying, "Why can't we just measure the length and then copy it?" Obviously, this does make a certain type of sense, but to me it seems a sour departure from Euclid's impregnable logic. What really confuses me is that Euclid already HAD a solution to this problem. Granted, that construction is somewhat involved, even difficult -- certainly out of proportion for such a simple objective. But not only does it WORK, it maintains our ability to do consistent geometry without introducing this vague notion of "moving" lengths. My current work on this problem involves the simple observation that when we move compasses, they often shift slightly. Unless we have a drafting compass or something, the "pick-up-the-compass-and-move-it" part will usually involve some alteration in the setting of the compass, however imperceptible.

As I’ve said, it was never really about physical compasses. I referred to the page above and continued:

I can't say that I know anything about the history of compasses. Many modern compasses (and likely many old ones) can hold their settings reasonably well. Since using them that way is simply a shortcut to something that CAN be done with the collapsible compass assumed in Euclid's postulates (a la Proposition 2, to which you refer), to require students not to trust them to do this would make constructions a terrible chore, and would not make math any more understandable, much less enjoyable.

We don’t have to restrict ourselves, because the nature of the compass is nothing more than a reflection of an axiom. (The ones we use in class reflect a theorem instead!)

So how should we teach students about this?

In a course built around postulates, I would certainly mention that the postulate does not in itself allow transferring a length using a compass, but that we can use that process as a shortcut to a longer procedure that could be done with the so-called collapsible compass. But I would not subject students to more than one exercise in which they actually had to perform the long procedure of copying a segment! Once they know it can be done, doing it would be a waste of time. I imagine that may have been true even in Euclid's time.

Mathematicians make jokes about their propensity to skip over anything that has been “reduced to a previously solved problem”. Once you know you *could* do it with a collapsing compass, you are allowed to do it directly.

But there’s some reality there, too:

Of course I should note that there are also many (cheaper) compasses that don't hold their settings well enough to make even a single circle accurately! In my math course for elementary teachers, most of them buy a compass of poor quality, and I have to take some time to teach them how to use it so that it will retain its setting (which involves holding the compass in such a way that all forces are perpendicular to the direction in which it opens). This is probably a good thing for teachers to have to learn! For high school students, good compasses would be a good investment.

But those cheap compasses don’t just collapse between circles like Euclid’s imaginary compass; they slip even as you draw! Never use them for anything that matters.

First, here are a couple paragraphs from the 2017 answer I discussed last time (Even More on Order of Operations), that transition to this final topic:

In talking about the extra “juxtaposition” rule taught in some textbooks, I pointed out,

What many people don't realize is that the "rules" we teach are only an attempt at DESCRIBING what mathematicians did for a long time without explicitly stating what rules they were following. They do not PRESCRIBE what inherently must be done, a priori. In just the same way, English grammar came long after English itself, and has sometimes been taught in a way that is inconsistent with actual practice, in an attempt to make the language seem perfectly rational.

At this point, I referred to the post I’ll be discussing below, on the history of the order of operations. Then I concluded,

In my opinion, the rules as usually taught are not the best possible description of how expressions are evaluated in practice. (This is supported by a recent correspondent who found articles from the early twentieth century arguing that the rules newly being taught in schools misrepresented what mathematicians actually did back then.) Unfortunately, for decades schools have taught PEMDAS as if it must be taken literally, so that one must do all multiplications and divisions from left to right, even when it is entirely unnatural to do so. The better textbooks have avoided such tricky expressions; but others actually drill students in these awkward cases, as if it were important.

I’ll examine that 1917 article later.

We’ve often been asked where the rules came from. The fullest answer we’ve given to that question was to this version in 2000:

History of the Order of Operations I was teaching a computer class and the history of order of operations came up. Where, when and with whom did the order of operations first originate? Was it the Greeks or Romans? Thank you! There is a whole class waiting to hear the answer.

The problem is that not much is written about this history; I had to pull a few ideas (some of them just speculations) from a variety of areas. My answer is, in fact, given as a reference in the Wikipedia article on the subject. I began:

The Order of Operations rules as we know them could not have existed before algebraic notation existed; but I strongly suspect that they existed in some form from the beginning - in the grammar of how people talked about arithmetic when they had only words, and not symbols, to describe operations. It would be interesting to study that grammar in Greek and Latin writings and see how clearly it can be detected.

As mathematicians through the 17th century gradually moved from stating equations entirely in words, to modern symbolic notation, the grammar of the symbols was part of that development, and likely carried along some of the grammar of their languages. For a quick look at what some of the early notations looked like, see here. Every writer used a slightly different notation, which he explained at the beginning of a book or chapter.

Subsequently, mathematicians just informally and tacitly agreed on how to read their various notations; and textbook authors formalized the “rules”, largely in the 1800’s.

At the other end, I think that computers have influenced the subject, so that it is taught more rigidly now than it used to be, since programming languages have had to define how every expression is to be interpreted. Before then, it was more acceptable to simply recognize some forms, like x/yz, as ambiguous and ignore them - something I think we should do more often today, considering some of the questions we get on such issues.

Many people have written to us, convinced that the rules had changed since they were in school. That is, in fact, possible in some areas! Computers need well-defined rules more than people do, so some details that humans had had no trouble working around were formalized in computer languages, and some of that has leaked back into ordinary mathematical writing and teaching.
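To illustrate how a programming language pins down what human readers could leave vague, here is a small sketch using Python, chosen only as one example (other languages may differ in details). It uses the standard `ast` module to inspect how Python groups `x / y * z`. Python has no juxtaposition, so the form x/yz cannot even be written; the closest explicit form is always read left to right.

```python
import ast

# Parse the expression without evaluating it.
tree = ast.parse("x / y * z", mode="eval").body

# The outermost operation is the multiplication; the division is nested
# inside it as the left operand, so the grouping is (x / y) * z.
outer_is_mult = isinstance(tree.op, ast.Mult)
inner_is_div = isinstance(tree.left, ast.BinOp) and isinstance(tree.left.op, ast.Div)
print(outer_is_mult, inner_is_div)

# A numeric check of the same grouping: (8 / 2) * 4, not 8 / (2 * 4).
value = 8 / 2 * 4
print(value)
```

This shows only what one language's grammar decided; it says nothing about how a human mathematician would have read x/yz on paper.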

I spent some time researching this question, because it is asked frequently, but I have not found a definitive answer yet. We can't say any one person invented the rules, and in some respects they have grown gradually over several centuries and are still evolving.

We’ll see some of that current evolution at the end.

The easiest and earliest part seems to have been the central hierarchy of operations:

1. The basic rule (that multiplication has precedence over addition) appears to have arisen naturally and without much disagreement as algebraic notation was being developed in the 1600s and the need for such conventions arose. Even though there were numerous competing systems of symbols, forcing each author to state his conventions at the start of a book, they seem not to have had to say much in this area. This is probably because the distributive property implies a natural hierarchy in which multiplication is more powerful than addition, and makes it desirable to be able to write polynomials with as few parentheses as possible; without our order of operations, we would have to write ax^2 + bx + c as (a(x^2)) + (bx) + c It may also be that the concept existed before the symbolism, perhaps just reflecting the natural structure of problems such as the quadratic.

What I’ve said here is closely related to the reasons for the order of operations discussed in Order of Operations: Why These Rules?.

You can see an example of early notation in "Earliest Uses of Grouping Symbols" at: http://jeff560.tripod.com/grouping.html where the use of a vinculum (an early version of parentheses) shows, both in its presence (around an additive expression) and its absence (around the multiplicative term "B in D") that the rules were implicitly followed: In Van Schooten's 1646 edition of Vieta, \(B \text{ in } \overline{D \text{ quad.} + B \text{ in } D}\) is used to represent B(D^2 + BD).

The example is also found in Cajori’s A History of Mathematical Notations, Vol 1, p. 182, and again (in a discussion of aggregation, or grouping, symbols) on p. 386.

At this point in the development of notation there was a mixture of words and symbols; multiplication was indicated by the word “in” (I’m not sure why!), and not yet by any of our current symbols (much less by juxtaposition). But in order to ensure that the two terms are added before the multiplication by B, they must be grouped; whereas under the vinculum we clearly have two terms, each formed by multiplication before the addition is performed.

2. There were some exceptions early in this development; in particular, math historian Florian Cajori quotes many writers for whom, in the special case of a factorial-like expression such as n(n-1)(n-2) the multiplication sign seems to have had some of the effect of an aggregation symbol; they would write n * n - 1 * n - 2 (using a dot or cross where I have the asterisks) to express this. Yet Cajori points out that this was an exception to a rule already established, by which "nn-1n-2" would be taken as the quadratic "n^2 - n - 2."

This reference is to Cajori Vol. 1, p. 396, where he says, “In \(n\cdot n-1\cdot n-2\), or \(n\times n-1\times n-2\), or \(n, n-1, n-2\), it was understood very generally that the subtractions are performed first, the multiplications later, a practice contrary to that ordinarily followed at that time.”

There was also an early notation in which a multiplication would be replaced by a comma to indicate aggregation: n, n - 1 would mean n (n - 1) whereas nn-1 meant n^2 - 1.

This use of commas is explicitly mentioned on page 390. It does seem helpful to have a symbol that combines multiplication and grouping for cases where that is appropriate. Still, this is all a special case distinct from the multiplication-first order that was already well-established.

If the “rules” evolved gradually through usage, it should not be surprising that some are still not fully settled:

3. Some of the specific rules were not yet established in Cajori's own time (the 1920s). He points out that there was disagreement as to whether multiplication should have precedence over division, or whether they should be treated equally. The general rule was that parentheses should be used to clarify one's meaning - which is still a very good rule. I have not yet found any twentieth-century declarations that resolved these issues, so I do not know how they were resolved. You can see this in "Earliest Uses of Symbols of Operation" at: http://jeff560.tripod.com/operation.html

Cajori makes this statement on page 274.

4. I suspect that the concept, and especially the term "order of operations" and the "PEMDAS/BEDMAS" mnemonics, was formalized only in this century, or at least in the late 1800s, with the growth of the textbook industry. I think it has been more important to textbook authors than to mathematicians, who have just informally agreed without needing to state anything officially.

By “this century” I meant, somewhat belatedly, the 20th century. I don’t have specific information on the earliest uses of these terms, but I’ll get to one piece of evidence for it below.

The rules were never decreed “officially”, and even now are unstable, as some parts are not taught consistently (the topic of the last post):

5. There is still some development in this area, as we frequently hear from students and teachers confused by texts that either teach or imply that implicit multiplication (2x) takes precedence over explicit multiplication and division (2*x, 2/x) in expressions such as a/2b, which they would take as a/(2b), contrary to the generally accepted rules. The idea of adding new rules like this implies that the conventions are not yet completely stable; the situation is not all that different from the 1600s.

As in early writings on symbolic algebra, it is *still* necessary to state the rules one is using!
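To make the disagreement over a/2b concrete, here is a short Python sketch that evaluates it under both readings. The function names are made up for illustration, and exact fractions are used to avoid rounding. Neither function is "the" rule; each simply encodes one of the two conventions described above.

```python
from fractions import Fraction

def strict_left_to_right(a, b):
    """Read a/2b as (a/2)*b: division and multiplication have equal rank."""
    return Fraction(a) / 2 * b

def juxtaposition_first(a, b):
    """Read a/2b as a/(2b): implied multiplication binds more tightly."""
    return Fraction(a) / (2 * b)

a, b = 12, 3
print(strict_left_to_right(a, b))   # (12/2)*3
print(juxtaposition_first(a, b))    # 12/(2*3)
```

With a = 12 and b = 3 the two readings give 18 and 2, which is why stating the convention, or adding parentheses, matters.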

I concluded that some rules are inherent in the way operations work, and are clearly appropriate, while others are more debatable:

In summary, I would say that the rules actually fall into two categories: the natural rules (such as precedence of exponential over multiplicative over additive operations, and the meaning of parentheses), and the artificial rules (left-to-right evaluation, equal precedence for multiplication and division, and so on). The former were present from the beginning of the notation, and probably existed already, though in a somewhat different form, in the geometric and verbal modes of expression that preceded algebraic symbolism. The latter, not having any absolute reason for their acceptance, have had to be gradually agreed upon through usage, and continue to evolve.

That’s where I left it in 2000.

In 2017 we had a long discussion (never archived) with a reader named Karen, in the course of which there was a reference to an interesting article by N. J. Lennes in the *American Mathematical Monthly* of February 1917: Discussions: Relating to the Order of Operations in Algebra. Here are my comments on it:

I agree with some aspects of the article, and in fact said something like it both in my "History of the Order of Operations" and in my comment to you about what my ideal rules would be. When I answer questions about the issue, I take the usual teaching, and the current contradictory rules, for granted, and don't generally dig into whether the rules make sense. But the article is about exactly what I usually leave unsaid.

I generally talk about what we should do *given the way the order of operations is currently taught*, rather than what it would be if I had my say. Here, I have my say, because that is what Lennes is doing at a time in history when that was easier.

After quoting some of what I said above on the history, specifically on disagreements over the order of multiplication and division, I continued:

One interesting thing about Cajori's comment is that he only talks about the order of the obelus (÷) and the explicit multiplication sign (Greek cross, ×), and doesn't mention expressions combining the obelus and implicit multiplication (juxtaposition). The same is true of all the references in the Earliest Uses page except the modern example. The article you found (which I haven't seen before) is from a little before Cajori, and the first section likewise does not mention juxtaposition. It is my impression that the "rules" for order of operation (which as I have mentioned elsewhere are, like many prescriptive "rules" of grammar, really descriptions rather than actual underlying rules) were developed in such a context, using only explicit multiplication, where it feels reasonable since all the signs are the same size! When you start using juxtaposition (as in the second section of your article), things change.

For instance, in \(2\div 3\times x\), the symbols *look* similar and separate the numbers by similar distances, whereas in \(2\div 3x\), the multiplication appears “tighter” and is naturally treated as a single unit. And the latter is where Lennes, but not Cajori, focuses his attention:

As Lennes points out, the "rules" that were (and are now) taught as if they were laws of nature, do not actually reflect what was found in real use, *in cases when juxtaposition is used for multiplication*. The whole idea is really a false extrapolation from what is done in easy cases to a general rule, making everything seem neater than it really should be. (Educators have made that same sort of mistake in other areas as well.) That has led to generations of students being taught a simplistic set of rules that really don't work in mathematicians' own writings. That, ultimately, is what leads to the ambiguity we have been discussing, as people have been forced to fill in the gap between rules and reality in whatever way they can.

When rules don’t fit nature, people don’t follow them.

This is not unlike pseudo-rules of grammar like “never end a sentence with a preposition” that are based on false assumptions about how things work, and not on how real people talk.

Lennes says that Chrystal (whoever he is -- I haven't been able to find such a textbook that might have been an early source of the "order of operations") is careful never to use an obelus followed by a product (which is true of many modern texts as well), but that others do, and interpret, say, 10bc -:- 12a as (10bc)/(12a), so that they are inconsistent with their own stated rules. (My first exposure to this issue came from students asking about similarly inconsistent modern texts.)

Just as I have said about modern textbooks, the best of them avoided examples like this \(10bc\div 12a\), but those that include them too often failed either to follow their own rules, or to state that they are making an exception, and just evaluate as if it were \(\frac{10bc}{12a}\). Why? Because they think that rules are rules, but they are too human to really *follow* them.

I disagree with Lennes in his conclusion, however. He says that the rule should be that all multiplications are to be done first. As I have said to you, if I were free to decree the rule, I would have only implicit multiplication done before divisions, and that perhaps only when the division is expressed with the obelus. Lennes gives no examples of following his rule with explicit multiplication combined with an obelus, which I think would be less convincing. So he is perhaps making the same mistake that Chrystal does.

When you use only one type of example, you can fail to show reality, whichever direction that is.

In the end, he comes to the same sort of comparison I make between math and grammar, saying that treating 12a as the divisor is an "idiom" that must be recognized. As he says, this is a matter not of logic but of history -- it is not something that can be proved, or that can be done by consistently following axioms, but by accurately describing actual use. I agree: the rules as taught are not accurate. I support them only because that is what students learn, and with the strong caveats that parentheses must be used to clarify, and that the obelus is best avoided.

I have taken this position (for example, “the alternative rule is not unreasonable”), ever since my first answer to the question, while also warning against ever writing a division followed by a multiplication (of any sort) without clarifying the meaning by parentheses. It was nice to learn that this view went back a hundred years!

Here is what Lennes says:

*Idiom* is not exactly the right word here, but the idea is important: The order of operations “rules” are not binding, but should only describe actual usage.

Incidentally, Karen also referred to this page by George Bergman,

Order of arithmetic operations; in particular, the 48/2(9+3) question

which as I said, could have been written by me, though its author evidently was new to the issue. As he says of PEMDAS (which he clearly is not familiar with as a teaching tool),

But so far as I know, it is a creation of some educator, who has taken conventions in real use, and extended them to cover cases where there is no accepted convention. …

Should there be a standard convention for the relative order of multiplication and division in expressions where division is expressed using a slant? My feeling is that rather than burdening our memories with a mass of conventions, and setting things up for misinterpretations by people who have not learned them all, we should learn how to be unambiguous, i.e., we should use parentheses except where firmly established conventions exist. If expressions involving long sequences of multiplications and divisions should in the future become common, then there may be a movement to introduce a standard convention on this point. (A first stage would involve individual authors writing that “in this work”, expressions of a certain form will have a certain meaning.) But students should not be told that there is a convention when there isn’t.

It’s good to know I’m not alone in my opinions.

Let’s first look at one of the earlier questions we had about this issue, in 1999, to set the stage:

Order of Operations

The problem was presented like this:

    a = 1.56   b = 1.2   x = 7.2   y = 0.2

    ax/by = ?

Here are two ways that I solved it:

1) I first rewrote the problem as [1.56(7.2)/ 1.2](0.2). Second, a was multiplied by x. The product was 11.232. Then, since no parentheses were present, I followed the order of operations and divided 11.232 by b, which was 1.2. The quotient was 9.36. Then I multiplied 9.36 by y, which was 0.2. The final answer was 1.872.

2) The other way, the first thing I did was multiply a by x. The product, which was 11.232, was set aside for the time being. Then b was multiplied by y, which gave the product of 0.24. The problem was now solved by dividing 11.232 (or ax) by 0.24 (or by) to reach a final answer of 46.8.

Can you please tell us which answer is correct and why?

(Note that at that time, the only way to type division in our email was to use the slash, \(a/b\), which I generally assume represents an expression actually written as \(a\div b\). I will occasionally be inserting an obelus, ÷, where we made rough attempts to simulate it.)

The first way follows PEMDAS literally, as usually taught and as I’ve presented it here, by evaluating from left to right as \(a\cdot x\div b\cdot y = ((a\cdot x)\div b)\cdot y\).

The second sees it as \(ax\div by = (ax)\div (by)\). This isn’t explained as following any taught rule, but just as doing what looks right, either because the division is read as if it were a fraction bar, or just because “*by*” looks like it belongs together as a unit. We’ll be seeing several reasons students have given for doing this.
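The two readings can be checked numerically. This is a minimal sketch using the values from the question; the variable names `left_to_right` and `grouped` are mine, chosen only for illustration.

```python
# Values from the 1999 question.
a, x, b, y = 1.56, 7.2, 1.2, 0.2

# Reading 1: strict left to right, ((a*x)/b)*y.
left_to_right = ((a * x) / b) * y

# Reading 2: the juxtaposed factors are grouped first, (a*x)/(b*y).
grouped = (a * x) / (b * y)

# Rounding hides ordinary floating-point noise in the last digits.
print(round(left_to_right, 3))  # 1.872
print(round(grouped, 3))        # 46.8
```

The arithmetic is the same in both cases; only the grouping differs, and that alone changes the answer by a factor of 25 (that is, by a factor of 1/y², since one reading multiplies by y and the other divides by it).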

Though I had been with *Ask Dr. Math* less than a year, this was already a familiar question, which I wanted to answer thoroughly for the sake of the archive:

You are not alone in wondering about this. We have had several other questions about expressions similar to yours, from confused teachers and students who have found that different books or teachers have different answers, and even calculators disagree.

Note that it is not only students doing what feels right, but also some textbooks and calculators that follow the second method.

I elaborated on the two methods, taking the PEMDAS version as correct (though I’ll have some second thoughts on that):

As written, your expression ax/by should be evaluated left to right: a times x, divided by b, times y. The multiplication is not done before the division, but both are done in the order they appear. Your first solution is right.

Some texts make a rule, as in your second solution, that multiplication without a symbol ("implied multiplication") should be done before any other operations in an expression [except exponents], including "explicit multiplication" using a symbol. Following this rule, you would multiply a by x, then multiply b and y, then divide one by the other. Some (probably most) texts don't mention such a rule - but some of those may use it without saying so, which is far worse.

I think I had made up the term “**implied, or implicit, multiplication**” when I answered my first question on the topic a few months before, to refer to multiplication indicated by just putting two numbers or variables or parenthesized expressions next to one another – “**juxtaposition**“, as others call it – like \( ab\) or \( 2b\) or \( a(b+c)\), as opposed to explicitly writing \( a\times b\) or \( a\cdot b\).

We had seen some questions from students whose textbooks taught only the usual PEMDAS, yet evaluated the second way in examples or solutions, without comment. This might have been due to the answers in the back being written by someone other than the author, but it is an inexcusable inconsistency.

Why would an author make this extra rule? I have had different opinions at various times about whether the rule is a good idea, but have always recognized that it is not what is usually taught:

I don't know of a general rule among mathematicians that implied multiplication should be done before explicit multiplication. As far as I'm concerned, all multiplications fit in the same place in the order of operations. It's not an unreasonable rule, though, since it does seem that implied multiplication ties the operands together more tightly, at least visually; but the idea of Order of Operations (or precedence, as it is called in the computer world) is supposed to be to ensure that everyone will interpret an otherwise ambiguous expression the same way - so if some texts change the rules, or if people do what feels natural, the purpose has been lost.

A rule that is not a rule is worthless, no matter how reasonable it is. Yes, the “new rule” is the natural way to read \(ax\div by\) because \(by\) looks like a single entity; but until everyone teaches that, we can’t do it and expect to be understood by all readers.

In particular, many students assume that it represents a horizontal version of \(\displaystyle\frac{ax}{by}\):

The problem here is that the expression looks as if it were meant to be

     ax
    ----
     by

In the Dr. Math FAQ about writing math in e-mail, one of our recommendations is to use parentheses wherever possible to avoid ambiguity, even where the rules should make it clear, because it can be easy to forget them in some situations. So in e-mail we would write it like this:

    ax/(by)   or   (ax/b)*y

depending on what is intended.

By using parentheses, we can avoid writing something that people who were taught different rules, or who ignore the rules they were taught, might take differently than we intend.

In my research for another Dr. Math "patient," I found that some calculators have experimented with this rule. Calculators have somewhat different needs than mathematicians, since they have to take input linearly, one character after another, so they are forced to make a decision about it. On the TI Web site I learned that they deliberately put this "feature" into the TI 82, and then took it out of the TI 83, probably because they decided it was not a standard rule and would confuse people.

The link there went bad long ago; but when a specific question about a calculator came up in 2008, I quoted from what TI said in their Knowledge Base:

Implied Multiplication and TI Calculators ... Solution 11773: Implied Multiplication Versus Explicit Multiplication on TI Graphing Calculators. Does implied multiplication and explicit multiplication have the same precedence on TI graphing calculators? Implied multiplication has a higher priority than explicit multiplication to allow users to enter expressions, in the same manner as they would be written. For example, the TI-80, TI-81, TI-82, and TI-85 evaluate 1/2X as 1/(2*X), while other products may evaluate the same expression as 1/2*X from left to right. Without this feature, it would be necessary to group 2X in parentheses, something that is typically not done when writing the expression on paper. This order of precedence was changed for the TI-83 family, TI-84 Plus family, TI-89 family, TI-92 Plus, Voyage™ 200 and the TI-Nspire™ Handheld in TI-84 Plus Mode. Implied and explicit multiplication are given the same priority.

This makes it clear that calculator designers have to decide on their own rules, which don’t have to be the same as rules for writing on paper; but educators seem to have convinced them to keep things as much the same as possible for students’ sake.
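To see how a single design decision produces the two behaviors TI describes, here is a small sketch of an evaluator for expressions containing only numbers, single-letter variables, `*`, `/`, parentheses, and juxtaposition. It is not TI's implementation (which I have not seen); the function `evaluate` and its `implied_first` switch are hypothetical names invented for this illustration.

```python
import re
from fractions import Fraction

# A token is a number, a single-letter variable, an operator, or a parenthesis.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[A-Za-z]|[*/()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text, variables, implied_first):
    """Evaluate text; implied_first=True makes juxtaposition bind tighter than * and /."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def starts_factor(token):
        return token is not None and token not in {"*", "/", ")"}

    def factor():
        token = take()
        if token == "(":
            value = product()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if token.isalpha():
            return Fraction(variables[token])
        return Fraction(token)

    def juxtaposed():
        # With implied_first, a run such as 2X or 4(2) is multiplied as one unit.
        value = factor()
        while implied_first and starts_factor(peek()):
            value *= factor()
        return value

    def product():
        value = juxtaposed()
        while peek() is not None and peek() != ")":
            # Juxtaposition that was not grouped above counts as an ordinary "*".
            op = take() if peek() in ("*", "/") else "*"
            right = juxtaposed()
            value = value * right if op == "*" else value / right
        return value

    result = product()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result


print(evaluate("1/2X", {"X": 4}, implied_first=True))   # 1/8
print(evaluate("1/2X", {"X": 4}, implied_first=False))  # 2
```

With the switch on, `1/2X` is read as 1/(2*X), matching what TI reports for the TI-82; with it off, the same input is read as 1/2*X from left to right, as on the TI-83. Nothing else in the evaluator changes, which is the point: the input is identical, and only the convention differs.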

In conclusion (back to the 1999 answer):

So to answer your question, I think both answers can be considered right -- which means, of course, that the question itself is wrong. I prefer the standard way (your first answer) when talking to students, unless their own text gives the "implicit multiplication first" rule; but in practice if I came across that expression, I would probably first check where it came from to see if I could tell what was intended. The main lesson to learn is not which rule to follow, but how to avoid ambiguity in what you write yourself. Don't give other people this kind of trouble.

Subsequently, we had many more questions about this; I’ll just quote a few unique bits from some of those answers.

Here is a typical example of a school conflict, from 2000:

Order of Operations Dispute The problem reads: N ÷ ml where n=12, m=6, and l=3. I believe the correct answer should be .6666, as 12 divided by 18 equals this. My husband agrees with me. My son came home very upset from school, with a note from his teacher that the answer was wrong. She indicated that I should have divided the 6 (m) into 12 (n) before I divided the 3 (l) into the equation. Her answer was 6. My son is very upset with me; his teacher told him I was doing "old fashioned math." Do I need to go back to school?

The problem is \(N\div ml\), and the parents are doing the multiplication first. I replied, in part:

I can give you some good news and some bad news. First, the bad news: according to the usual order of operations rules now taught, your answer is wrong. ...

I explained the standard rules, and added:

BUT... You are not alone in your opinion. This part of the rule - doing multiplication and division together - is probably the last rule to have stabilized; I know that in the 1920's, at least, there was no agreement. It seems that an agreement developed, but it is unraveling now, as I hear from many students whose texts answer questions like this the way you did. It appears that they are adding an unstated rule, which seems entirely reasonable in this context, that an implied multiplication (indicated by simply placing two variables or expressions together, as in "ml") should be done first. It certainly looks as if it should mean that. The problem is that, although I've heard of this rule being followed frequently, I've hardly ever heard of it being taught, so these texts are not following their own stated rules.

I’ll have more on the history next time.

Since this type of expression is so ambiguous, with people disagreeing on the rules, and the rules being easy to overlook, my own opinion is that neither your answer nor the teacher's is right: the question is wrong. No responsible mathematician would write such an expression; we would just say

     n
    ---
    m l

so there would be no question about its meaning. After all, the purpose of rules is to allow us to communicate clearly, not to help us trick students and start fights among families. So you may in fact be "old-fashioned"; or you may be on the cutting edge. In any case, I'm afraid you'll just have to learn how they are doing it in class, and follow along. There shouldn't be many more issues like this to worry about.

More recently, the fights tend to be on social media!

I’ll close with the most recent archived discussion. This question is from 2017:

Even More on Order of Operations

I'm curious to know what the answer is for this:

    8/4(3 - 1)

Following strictly PEMDAS, the answer is 4:

    8/4(2)
    2*2
    4

However, if you follow the distributive property, you get 1:

    8/((4*3) - (4*1))
    8/(12 - 4)
    8/8
    1

Which one would be correct and why? Both are valid, so I'm conflicted as to what would be the correct answer. It should be right or wrong, not two different answers being right.

I answered with a collection of my standard answers to this sort of question; even my first archived answer on the topic in 1999 was largely a standard response I had given to others before. Here, I’ll just look at a few points I made that haven’t been fully covered above.

I first summarized what was going on:

The problem is not a conflict between PEMDAS and distribution; it is that strict interpretation of PEMDAS conflicts with one's natural impression of the meaning of the expression, so that you unknowingly apply an alternative interpretation when you think you are just applying the distributive property.

If you recall earlier statements that PEMDAS is (a) in harmony with the properties of operations, and (b) fitting with the visual impression of our notation, then some alarm bells should already be going off!

When you distributed, you ASSUMED that it was the 4, not 8/4, that was multiplying the (3 - 1). In doing so, you were bypassing the rules and just doing what felt right. If you followed the rules AND distributed, you would get this:

    (((8/4)*3) - ((8/4)*1))
    ((2*3) - (2*1))
    6 - 2
    4

It is not really the distributive property that led to the “wrong” result, but the fact that in distributing, the 4 was seen as the multiplier.
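A quick check makes the point concrete: under either reading, distributing gives the same value as not distributing, so the distributive property cannot be what decides between 4 and 1. This is only a sketch; the names `reading_a` and `reading_b` are labels I chose for the two interpretations.

```python
from fractions import Fraction

eight, four = Fraction(8), Fraction(4)

# Reading A (strict left to right): the multiplier of (3 - 1) is 8/4.
reading_a = (eight / four) * (3 - 1)
distributed_a = (eight / four) * 3 - (eight / four) * 1

# Reading B (implied multiplication first): the multiplier is 4,
# and the whole product 4(3 - 1) is the divisor.
reading_b = eight / (four * (3 - 1))
distributed_b = eight / (four * 3 - four * 1)

print(reading_a, distributed_a)  # 4 4
print(reading_b, distributed_b)  # 1 1

# Python's own grammar gives * and / equal precedence, grouped left to right.
print(8 / 4 * (3 - 1))  # 4.0
```

Note that Python has no juxtaposition, so the last line had to be typed with an explicit `*`; the ambiguity discussed here simply cannot be entered.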

Those who say you should distribute first are putting the cart before the horse: you can't apply tricks to evaluate an expression before you first know what it MEANS, but they are thinking that the distributive property affects the meaning. (In fact, the distributive property is a waste of time here, because it makes you do two multiplications where only one is needed!) The meaning is determined by the order of operations. Is the multiplication supposed to be done before or after the division?

I have observed that many students learn to distribute so well that they automatically do it when it is not helpful, as here, and even when it is not applicable, as when there is no addition at all!

In an unarchived answer in 2012, I listed reasons students have given for doing the multiplication first; on this occasion the question was about the expression \(a^2\div 4b + c\):

In fact, there are several different reasons people have given (this is a very popular question), some of which are better than others. As your friend argues, the rules as usually taught tell us to do all multiplications and divisions left to right (within any cluster of them), and make no exceptions that would cause 4b to be evaluated first. Many of us here would agree with that, and be done with it.

Some people would evaluate 4b first because of a misunderstanding of PEMDAS, thinking it means multiplication should be done before division. I think you know they are wrong.

Another wrong reason, applied to a slightly different sort of expression, is a misunderstanding of parentheses: the rule that parentheses "come before" everything else leads them to believe that in an expression like 12/4(4-1), the multiplication 4(4-1) has to be done first. But the rule about parentheses really only says that what's INSIDE parentheses has to be evaluated first; the result is treated like any other number. (I sometimes call this the "sticky parentheses" view.)

Another reason given in relation to this second type of expression is the idea that the distributive property forces you to do the multiplication first, because they first evaluate 4(4-1) = 4*4-4*1 = 12 and then divide; but this begs the question, because the only reason they took the 4, rather than the 12/4, as the multiplier on the left, is that that's the way it looked to them. And, of course, the distributive property is only a way you may, if you wish, rewrite an expression to give the same value; it is outside of the question of what the expression in itself MEANS.

Ultimately, most people probably do it just because it feels right: the 4b looks closer together, so we naturally tend to want to do that first. But they can point to no rule that justifies that; and since math is about proof, and about doing what you KNOW is right, not just what feels right, this is not good.

For an example of “sticky parentheses”, see

Is the 2 Related to the Numbers in Parentheses?

For an example of seeing the division sign as a fraction bar (and a long discussion of not being swayed by visual appearance), see

Order of Operations and Fractions

Back to the 2017 answer …

In books and handwritten math beyond the elementary level, we hardly ever use the horizontal division symbol, but use fraction bars instead, which leaves no ambiguity. As a result, the math community has never had a need to make a choice on this situation! It's essentially been left undefined, and it is textbook authors who came up with explicit "rules" to describe what is really just a language that developed organically, based not on carefully stated rules but on tacit agreement. So which is the "right" way to read such an expression depends on what rules are in force in a particular community (math class, journal, etc.) -- and what was intended by the writer.

I closed with a plea for peace:

As a result, in problems such as this, the error is being made primarily not by those who give "wrong" answers, but by those who post the problem in the first place (or pass it on). Anyone who really wants to do math correctly will want to communicate clearly about it, and will avoid anything ambiguous or uncertain. They should either fully parenthesize, or use the horizontal fraction bar, which makes the order clear:

       6              6
    --------   or   ---(2 + 1)
    2(2 + 1)         2

Arguments on social media about this sort of thing are a waste of time. But thinking about our conventions can be very enlightening. Next time, I’ll close everything out with a look at history, and some solid reasons to think the “new rule” is in fact correct.

Last time we looked at the subtle distinction between the *order of operations*, which defines the *meaning* of an expression, and *properties* that allow us to *do* something other than what an expression literally says. Here I want to look at one longer discussion that brings out these issues nicely.

Here is the first half of this long question, from Terri in 2010:

Fractions: On the Order of Operations and Simplifying

The 2nd rule in the order of operations says to multiply and divide left to right. I've been thinking that the only reason for this "left to right" part is so I don't divide by the wrong amount. For example, in the problem 3 / 6 * 4, if I didn't follow the order of operations, but instead did the 6 * 4 first, I'd get a wrong answer.

Now, my text says I can avoid having to work left to right if I convert division to multiplication by the reciprocal. This makes sense. My question is: when I write a division problem using the fraction line, do I ever have to worry about following the left to right rule, or does writing it as a fraction void the need for this rule just as writing division as multiplication of the reciprocal did?

It seems that in my math text, when it comes to fractions such as ...

    24(3)x
    ------
     8(3)y

... they cancel and do the division and multiplication within a fraction in any order. For example, I would cancel the 3's and divide the 24 by 8, which isn't doing division and multiplication from left to right, nor does that treat the fraction line as a grouping symbol.

Even multiplication of fractions doesn't seem to go by the left to right rule, because we're multiplying numerators first before we're dividing the numerator by the denominator of each particular fraction.

I can write the problem above as multiplication by the reciprocal and see that I can divide and multiply in any order. So I'm wondering if I can make this a general rule: in fractions, the left to right order is not an issue.

The question was long enough that I want to pause here and look at my answer to this part.

First, on avoiding left-to-right:

Yes, I've said the same thing; in a sense this is the reason for the left-to-right rule, since a right-to-left or multiplication-first rule would give different results.

I discussed this in Order of Operations: Common Misunderstandings.
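Terri's example, 3 / 6 * 4, shows both halves of this. The sketch below (my own illustration, using Python's exact `Fraction` type) compares the left-to-right reading with the multiplication-first reading, and then confirms that once the division is rewritten as multiplication by the reciprocal, every ordering of the three factors gives the same product.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

left_to_right = Fraction(3) / 6 * 4           # (3 / 6) * 4
multiplication_first = Fraction(3) / (6 * 4)  # 3 / (6 * 4)

print(left_to_right)         # 2
print(multiplication_first)  # 1/8

# Rewrite the division as multiplication by the reciprocal: 3 * (1/6) * 4.
factors = [Fraction(3), Fraction(1, 6), Fraction(4)]
results = {prod(order) for order in permutations(factors)}
print(results)  # {Fraction(2, 1)}
```

All six orderings collapse to the single value 2, which is why the reciprocal form removes any need for a left-to-right rule.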

Next, on fractions:

You're partly confusing order of operations (which applies to EVALUATING an expression -- that is, to what it MEANS) with techniques for simplifying or carrying out operations in practice. Properties of operations are what allow us to simplify, or to find simpler ways to evaluate an expression than doing exactly what it says. For example, the commutative property says that if the only operation in a portion of an expression is multiplication, you can ignore order.

This is the main topic I discussed last time, in Order of Operations: Subtle Distinctions. Simplifying (including canceling in fractions) is a step taken *after* understanding how an expression would be evaluated literally, as written, and involves *changing* the expression to one that is equivalent. That is, once we know what an expression means, we can find alternative ways to evaluate it that will take less work. In particular, this includes canceling in a fraction, and “multiplying across” to multiply two fractions.

Terri’s question continued:

Of course, it seems that just when I think I can generalize about something, there's a case where it doesn't hold true, and I'm wondering why, if this is the case, I've never seen it written anywhere. I've been looking on the Internet and in algebra books to see if anyone addresses this particular part of the order of operations in detail, and it seems that most just generalize about the order of operations. I'm wondering if there is an unwritten rule that when you write division using the fraction line, you no longer need to do the division and multiplication from left to right. Another math website stated the order of operations and then said there are a lot of shortcuts that a person can use because of the associative and commutative rules, but the site didn't elaborate. Is writing division using the fraction line one of these shortcuts that allows you to avoid the left to right rule when multiplying and dividing? Thank you for taking the time to read this problem. Sorry to be so long-winded. I appreciate your time and help very much.

It’s true, as I said last time, that this is something not often discussed explicitly. It is not discussed under order of operations, because it is not really part of that! Rather, it’s part of the overall context. My answer continues:

In a fraction, the bar acts as a grouping symbol, ensuring that you evaluate the entire top and the entire bottom before doing the division. Thus, the division is out of the "left-to-right" picture entirely. In fact, since here the division involves top and bottom rather than left and right, I'm not sure what it would even mean to do it left to right.

About the other site’s comment on shortcuts:

Yes, that's what you're talking about -- shortcuts that essentially rewrite an expression (without actually doing so) as an equivalent expression that you can evaluate easily. Again, that is outside of the order of operations. As an example, multiplying fractions is explained here in terms of the properties on which it is based: Deriving Properties of Fractions http://mathforum.org/library/drmath/view/63841.html

The idea here is that whereas a multiplication like \(\displaystyle\frac{4}{15}\cdot\frac{35}{64}\) as written *means* to multiply the first fraction by the second (left to right), in actually *carrying it out*, we can get the same result by canceling common factors anywhere in a numerator and a denominator, without regard to location, and then multiply all numerators and all denominators separately: \(\displaystyle\frac{4}{15}\cdot\frac{35}{64} = \frac{1}{3}\cdot\frac{7}{16} = \frac{7}{48} \). But what we are really doing is applying properties to *rewrite* the original product as the single fraction \(\displaystyle\frac{4\cdot 35}{15\cdot64}\), and then applying further properties to rewrite that by dividing the entire numerator and the entire denominator by the common factor 20.
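The numbers in that example can be confirmed directly. This short sketch uses Python's standard `fractions` and `math` modules; the variable names are mine.

```python
from fractions import Fraction
from math import gcd

# The single fraction that the product is rewritten as: (4 * 35) / (15 * 64).
numerator = 4 * 35      # 140
denominator = 15 * 64   # 960
common = gcd(numerator, denominator)

print(common)                                      # 20
print(numerator // common, denominator // common)  # 7 48

# Multiplying as written, and multiplying after canceling, agree.
print(Fraction(4, 15) * Fraction(35, 64))  # 7/48
print(Fraction(1, 3) * Fraction(7, 16))    # 7/48
```

The factor 20 is exactly what the piecemeal canceling removed: a 4 from the pair 4 and 64, and a 5 from the pair 35 and 15.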

Terri quickly wrote back:

Thank you for your time in answering my question. I appreciate it. If you have time, I have just two more questions to make sure I can get this straight in my head...

You mentioned that, for a fraction, the division is out of the "left-to-right" picture entirely. So, I'm guessing that I can safely say that the left-to-right rule applies only to division that is written on one line.

Last question: another website says that if I have the problem ...

    4(12)
    -----
      3

... then I need to multiply the 4 and 12 first before dividing by the 3, according to the order of operations, using the fraction line as a grouping symbol. But when I cancel, of course, I'm not doing it in this order. So is canceling one of those "properties of operations" you mentioned that allows us to evaluate this without having to stick to the order of operations?

I answered the first question:

Right. When division is written as a fraction, the order is forced by the grouping-symbol aspect of the fraction bar; it's as if division were always written like

    (a * b) / (c * d)

Mathematicians rarely write division in the horizontal form, probably because indicating it vertically makes it so much clearer what order is intended.

Fraction bars, like parentheses, override default rules about order, and make a visible division (no pun intended) between the numerator and denominator. There is no left and right except within the numerator and the denominator separately.

As to the second question, on canceling as a property:

Again, canceling is not the same thing as evaluating; the order of operations only applies to what an expression MEANS, not to how you must actually carry it out.

To EVALUATE this expression, in the sense of doing exactly what it says, I get 48/3 which becomes 16. I followed all the rules.

To SIMPLIFY the expression, I can follow the rule of simplification. This says that if I divide ANY factor of the numerator (wherever it falls -- it doesn't matter because of commutativity) and ANY factor of the denominator by the same number, the resulting fraction is equivalent.

The reason I can use the properties is because the canceling is equivalent to this sequence of transformations:

    4(12)   4 * 4 * 3    4     4     3     4     4
    ----- = --------- = --- * --- * --- = --- * --- * 1 = 16
      3     1 * 1 * 3    1     1     3     1     1

All sorts of properties of multiplication come into play here, but the idea of canceling wraps it all into a simple process in which, again, the order doesn't matter. But that only works when it is ONLY multiplication in either part.

This is what I demonstrated above, but expressed a little differently.

Terri responded again the next day:

Thank you very much for your help. I guess my questions must have sounded very confusing; I was confused, looking at the expression ...

    10
    --- * 2
     5

... as being 2 steps in the order of operations -- a division of 10 by 5 and a multiplication -- like the expression 10 divided by 5 times 2 written all on one line (with no fractions). But now I see that in my first example above, the fraction is considered to be just one number for the purposes of the order of operations so there is just 1 step -- a multiplication of the fraction times 2. Even though the fraction line means division, it doesn't count as division in the order of operations. Hope I got this right. A HUGE thank you for taking the time to make sense out of my confusion!!! Have a great week!!

(Terri’s expression was accidentally modified when the question was archived, making the question and my answer a little confusing; I have fixed it here to match the original.)

I answered,

For many purposes it is easiest to say that a fraction is just treated as a number in the order of operations (in fact, I usually do that); but you don't have to, and that isn't what I've been saying, because I don't think it's what you've been asking about. Your example certainly CAN be treated as a division followed by a multiplication, and it doesn't violate anything; you are still working left to right. What's different from the horizontal expression 10 / 5 * 2 is just that everything isn't left or right of everything else, so left-to-right isn't the only rule applied.

In \(\frac{10}{5}\cdot 2\), the division *must* be done first simply in order to get a value that will then be multiplied, because the fraction/division is the first operand; there is no way to read it so that the multiplication would be done first. In \(10\div 5\cdot 2\), an order of operations must be invoked.

The fraction bar primarily serves to group the numerator and the denominator, as I've said; I suppose, though I haven't said this, that it also groups the entire division relative to anything to its left or right, since it forces you to do the division first. A clearer example would be ...

        10
    2 * ---
         5

... which amounts to 2 * (10 / 5), where we technically have to divide first (so in a sense we are deviating from the left to right order). However, this is one of those cases where it turns out not to matter, because the commutative property and others conspire to make that expression EQUIVALENT to ...

    2 * 10
    ------
       5

... and therefore if you multiply first and then divide, you get the same answer. But this is NOT really left-to-right, because the 5 is not "to the right of" the division in the original form. It's just a simplified version -- a NEW expression that has the same value, not the way you directly evaluate it. And that's been my main point: HOW you actually evaluate something need not be identical to WHAT the expression means, taken at face value.

Rewriting is a feature of most of what we do with fractions, if you think about it.

Observe here that in \(2\cdot\frac{10}{5}\), the fraction must be treated as a single quantity (as if it were in parentheses), simply because of the typography: the entire fraction is written as the second factor. Writing on one line, \(2\cdot 10\div 5\) would not have that same constraint, so in order to have the same literal *meaning*, it would have to be written as \(2\cdot (10\div 5)\). But, again, this still gives the same *value* as \(2\cdot 10\div 5\) evaluated left-to-right, because both mean \(2\cdot 10\cdot \frac{1}{5}\), to which the associative property applies. (That’s what I meant by my mistaken reference to the *commutative* property.)
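The equivalence claimed here is easy to confirm with exact arithmetic. In this sketch (variable names are my own), the first two lines are the two *meanings*, and the last two show the regrouping of the product \(2\cdot 10\cdot \frac{1}{5}\) that the associative property permits.

```python
from fractions import Fraction

fifth = Fraction(1, 5)

grouped = 2 * (Fraction(10) / 5)        # 2 * (10 / 5): the stacked fraction as second factor
left_to_right = (2 * Fraction(10)) / 5  # 2 * 10 / 5 read strictly left to right

# Both are the product 2 * 10 * (1/5), grouped in the two possible ways.
right_grouping = 2 * (10 * fifth)
left_grouping = (2 * 10) * fifth

print(grouped, left_to_right, right_grouping, left_grouping)  # 4 4 4 4
```

The expressions differ in meaning, in the sense of which operation is applied to what, yet they are guaranteed to agree in value.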

But these examples, instructive as they are, don’t have the same features as in the original examples.

Your questions until now were about something different -- where the numerator or denominator was not just a single number -- so it couldn't really be considered a mere fraction. For example, you asked about

    4(12)
    -----
      3

There, you can't just say the fraction is treated as a single number; you have to use the grouping properties of the fraction bar to determine the meaning of the expression.

Indeed, until the mention of \(\frac{10}{5}\cdot 2\), we saw no fractions except in the technical sense of an algebraic fraction, which is really a division expressed in a certain way.

To summarize, the fraction bar groups at two levels, first forcing the numerator and denominator to be evaluated separately, and then forcing the entire division to be done before anything to the left or right. Thus, this expression ...

        2 + 3
    1 + ----- * 6
        4 + 5

... means the same as this:

    1 + ((2 + 3) / (4 + 5)) * 6

In simple cases, where the numerator and denominator are single numbers, this implies that the one will be divided by the other before anything else, so for all practical purposes you can think of the fraction as a single number (the result of that division).
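As a check on that last expression, the sketch below evaluates the fully parenthesized one-line form, and, for contrast, what the same symbols would mean if the fraction bar were replaced by a slash with no parentheses added. The second expression is my own addition for comparison, not something from the original exchange.

```python
from fractions import Fraction

# The stacked form, written on one line with both levels of grouping made explicit.
with_grouping = 1 + (Fraction(2 + 3) / (4 + 5)) * 6

# The same symbols typed on one line with the grouping dropped:
# now only 3 is divided by 4, and 5 * 6 is a separate term.
without_grouping = 1 + 2 + Fraction(3) / 4 + 5 * 6

print(with_grouping)     # 13/3
print(without_grouping)  # 135/4
```

The fraction bar carries two pairs of parentheses that have to be typed explicitly once the expression is flattened onto one line.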

Terri concluded:

Thank you for your patience in answering my questions which I'm guessing were a headache to answer. I apologize for my inconsistency and confusion in writing them. I have not seen "spelled out" in my algebra books the relationship between order of operations and evaluating versus shortcuts like simplifying. I've read and reread your answers, and I think I'm hopefully understanding it. Thanks again. Have a good week!

There is a lot hidden in the way we write expressions, isn’t there?
