We’ve looked at the scalar (dot) product and the vector (cross) product; but there is one answer in the *Ask Dr. Math* archives that was too long to fit in either post. Here we’ll see again where the two familiar products come from, while looking deeper into the math behind and beyond them.

Here is the question, from 2012:

Vector Products and Possibilities

I have been studying maths for a long time, but one thing keeps puzzling me: how do we get to define vector dot products and, more importantly, cross products? When solving problems, you hear, "Now we take the vector cross product ..." or, "We take the dot product ..." I have always failed to see how they get to this step without knowing the answer in advance. What are the justifications for these definitions? I do realize that in physics there are certain phenomena that depict these definitions, but for me that cannot be satisfactory motivation for them.

Doctor Fenton provided the information we’ve already seen on the dot and cross products. But Cyrus felt that his question was deeper than that, and wrote back:

Thank you very much for your reply; however, it does not address my problem. I do know how to derive the cross product and dot product. It is about THE VERY DEFINITION of these two -- the justification of these definitions -- wherein lies my problem.

How did we get the notion of defining the vector cross product as it is defined? Is it purely because of observation in natural phenomena, or what? In all the textbooks I have looked at, they start by defining the vector product and the cross product, and go on from there to work them out BASED on these definitions. I do not see where they get these definitions.

Thank you for your consideration.

I think his question was really answered in what Doctor Fenton said, as in my last two posts on vectors, but it takes subtlety to fully understand how the derivations justify the definitions.

Doctor Fenton tried to clarify the question:

I'm not sure what you are asking. Is the idea of finding a vector perpendicular to two given vectors not sufficient motivation? There are certainly many applications where orthogonal directions are particularly convenient for computation, to say nothing of physical phenomena, such as the motion of a charged particle in a magnetic field where a force acts at right angles to the field and the velocity vector.

If you read the references on the dot product, the basic problem is to find the angle between two vectors, and the formula for the dot product drops out of the computation. Similarly, if you need to determine a direction orthogonal to a given plane (or two non-collinear vectors), the cross-product drops out.

We make definitions because they are useful, and it is useful to be able to compute the angle between vectors, and to find a vector perpendicular to a given pair of vectors or a plane. What else are you looking for?

Entities that arise in the midst of attempts to solve useful problems deserve names, and that is what is happening here. Cyrus tried again:

Thank you for your kind answer. I do appreciate it. But it seems that I fail to make my problem clear. It is clear to me that from the dot product, you get the angle between the two vectors; and from cross product, you get the vector which is perpendicular to the plane of the two vectors. My question is: are these definitions simply based on the observation of physical phenomena or on something else? Could one have defined them differently?

After yet another attempt that was even less successful, Doctor Fenton asked other Math Doctors to join in, and two days later, Doctor Jacques offered a long essay on Cyrus’s last question.

I think your question is, "Are there other possible definitions for vector products?" The answer to that question is a big YES. I will show you below how to define other kinds of vector products. Not all of them are interesting; but some of them are used in many applications, although less often than the dot and cross products.

The reason why the dot and cross products are encountered so frequently is probably linked to their use in physics. There are also deeper, purely mathematical, reasons, which I will discuss. I will not enter into too many details, nor give proofs of all the claims I will make; I will rather focus on the reasons behind the concepts.

Incidentally, the concepts of vectors we have now arose from some of the bigger ideas to be discussed, so in some sense we will be seeing the true historical background of the dot and cross products – which are more complicated than we usually want to get into!

The discussion here is based on the concept of **vector spaces**, which is taught in Linear Algebra. I won’t try to define them here, but basically they are just sets of vectors, such as all the vectors in a plane, or all the vectors in 3D space.

In all generality, a vector product is an operation that takes two vectors u and v as operands (arguments) and produces a vector w. We can see that as a function:

f : U x V -> W   [1]

Here, u is in U, v is in V, w is in W, and w = f(u, v). U, V, and W are vector spaces (possibly different, but over the same scalar field).

Unless explicitly specified otherwise, we will restrict ourselves to a particular case:

* U = V
* The base field is R (the real numbers).
* The vector spaces have finite dimension.

We will let n = dim U = dim V, and m = dim W.

For example, in the dot product, n can be anything, and m = 1 (i.e., the dot product is a scalar, which is essentially the same thing as a vector space of dimension 1). In the cross product, we have n = m = 3 (the reason of the explicit 3 will be given below).

So we’re looking for ways to define a product of two vectors *in the same space* that produces a *vector*. Note that last paragraph: The *real numbers* themselves can be thought of as vectors (they have magnitude and direction!), so this does not exclude the dot product, whose result is a real number!

If we consider arbitrary functions f, there is not very much to say. To have a more interesting structure, we add some conditions that express that the product operation is "compatible" with the vector space structure. Specifically, we require the function f to be bilinear, i.e., for all vectors a, b, c in U and for all scalars k in R, we must have:

(1) f(a + b, c) = f(a, c) + f(b, c)

(2) f(a, b + c) = f(a, b) + f(a, c)

(3) f(ka, b) = f(a, kb) = kf(a, b)

That is, we wouldn’t call something a product if it didn’t follow these rules, which amount to the familiar distributive and associative properties that are true of ordinary multiplication:

Note that we can also write the product in operator notation. For example, we can write f(a, b) = a # b, where '#' is a symbol that describes the specific product (we write a.b for the dot product and a x b for the cross product). Using that notation, the bilinearity axioms above can be written as:

(1') (a + b) # c = (a # c) + (b # c)

(2') a # (b + c) = (a # b) + (a # c)

(3') (ka) # b = a # (kb) = k(a # b)

(1') and (2') are very similar to the distributive law. (3') is vaguely similar to the associative law, although it is not the same thing, because k, a, and b do not belong to the same set: k is a scalar, and a and b are vectors (and a # b can be a vector of another dimension). This property is sometimes called "mixed associativity."

This shows that requiring the axioms (1) - (3) is not unreasonable.

That’s a mathematician’s reason for setting those rules for a “product”; they are similar to ideas I discussed in a different context in What is Multiplication … Really?.
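As a minimal sketch (not part of the original exchange), the three rules can be spot-checked numerically for the two familiar products. The helper names and the sample vectors here are arbitrary choices for illustration, and a check on a few integers is of course not a proof.

```python
# Spot-check the bilinearity rules (1)-(3) for the dot and cross products.
# The vectors a, b, c and the scalar k are arbitrary illustrative choices.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(k, u):
    return tuple(k * ui for ui in u)

a, b, c, k = (1, 2, 3), (-4, 0, 5), (2, -1, 7), 3

# Rule (1): (a + b) # c = (a # c) + (b # c)
dot_rule1 = dot(add(a, b), c) == dot(a, c) + dot(b, c)
cross_rule1 = cross(add(a, b), c) == add(cross(a, c), cross(b, c))

# Rule (2): a # (b + c) = (a # b) + (a # c)
dot_rule2 = dot(a, add(b, c)) == dot(a, b) + dot(a, c)
cross_rule2 = cross(a, add(b, c)) == add(cross(a, b), cross(a, c))

# Rule (3): (ka) # b = a # (kb) = k(a # b)
dot_rule3 = dot(scale(k, a), b) == dot(a, scale(k, b)) == k * dot(a, b)
cross_rule3 = (cross(scale(k, a), b) == cross(a, scale(k, b))
               == scale(k, cross(a, b)))

print(dot_rule1, dot_rule2, dot_rule3)        # True True True
print(cross_rule1, cross_rule2, cross_rule3)  # True True True
```

Integer components are used so that the comparisons are exact rather than subject to floating-point rounding.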

In addition, (bi-)linearity is very important in physics. Many physical laws are linear; this is somehow related to the fact that space and time "look the same" everywhere.

From now on, a vector product will mean a function like [1] that meets all the requirements above.

Now we can try inventing some products, starting in the simplest case:

The case m = 1
--------------

Let us consider the case where m = 1, i.e., when f(u, v) is a scalar. Such a vector product is called a bilinear form. (The expression "scalar product" would also be appropriate, but it is often used with a more restricted meaning.)

Recall that *m* is the dimension of the output, so we are multiplying two vectors and getting a one-dimensional “vector”, a scalar.

The simplest possible product we can define is the trivial product: f(u, v) = 0 for all u and v. It is easy to see that this satisfies all the requirements, but I think you will agree that it is not very interesting.

This just squashes everything down to zero, which isn’t useful.

Another simple bilinear form is the elementary product given by: f_{ij}(u, v) = u[i] * v[j] Here, i and j are *fixed* integers in the range {1, ..., n}, and u[i] is the i-th component of the vector u. This is as simple as a non-trivial bilinear form can get. However, such a product depends heavily on the choice of a coordinate system.

This means that, for example, we might define \(\langle a,b,c\rangle\#\langle d,e,f\rangle = b\cdot f\), always multiplying the second component of the first by the third component of the other. Ignoring all but one component of each factor isn’t very useful. (But there may be times when that’s exactly what you want!)

We can next combine elementary products to get something more interesting. Let us define:

f(u, v) = SUM (a[i, j]*u[i]*v[j])

Here, i and j range from 1 to n, and the a[i, j] are a set of fixed real numbers. It is easy to check that this defines a bilinear form. If we write A for the matrix {a[i, j]}, and if we represent u and v as column vectors, we can write the above product as:

f(u, v) = u' A v

Here, u' is the transpose of the vector u (i.e., u' is a row vector).

Now, the important point is that any bilinear form can be written in this way: given a bilinear form f, you simply define a[i, j] as f(e_i, e_j), where the {e_i} are basis vectors.

The trivial product corresponds to the matrix A = 0. The elementary product f_{ij} corresponds to a matrix that has 1 in position [i, j] and 0 everywhere else.

As a simple example, in two dimensions we might (arbitrarily) define $$\mathbf{A} = \begin{bmatrix}1 & 2\\ 0 & 3\end{bmatrix}$$ so $$\mathbf{u}\#\mathbf{v} = \begin{pmatrix}u_1 & u_2\end{pmatrix}\begin{bmatrix}1 & 2\\ 0 & 3\end{bmatrix}\begin{pmatrix}v_1\\v_2\end{pmatrix} = u_1v_1+2u_1v_2+3u_2v_2$$
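To make the matrix description concrete, here is a small Python sketch, assuming vectors are plain tuples and the matrix is a list of rows. The function names `bilinear_form` and `matrix_of` are hypothetical, chosen only for this illustration. It builds the form from the matrix above and then recovers the matrix again from the values f(e_i, e_j).

```python
def bilinear_form(A):
    """Return the function f(u, v) = u' A v for a fixed square matrix A."""
    n = len(A)
    def f(u, v):
        return sum(A[i][j] * u[i] * v[j] for i in range(n) for j in range(n))
    return f

def matrix_of(f, n):
    """Recover a[i][j] = f(e_i, e_j) from a bilinear form f on R^n."""
    basis = [[1 if r == i else 0 for r in range(n)] for i in range(n)]
    return [[f(basis[i], basis[j]) for j in range(n)] for i in range(n)]

A = [[1, 2], [0, 3]]
f = bilinear_form(A)

u, v = (2, 5), (7, -1)
value = f(u, v)              # u1*v1 + 2*u1*v2 + 3*u2*v2 = 14 - 4 - 15
recovered = matrix_of(f, 2)

print(value)      # -5
print(recovered)  # [[1, 2], [0, 3]]
```

Replacing `A` with the identity matrix `[[1, 0], [0, 1]]` turns `f` into the ordinary dot product.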

None of the specific products we’ve tried so far are interesting. But …

Another simple case is when A is the identity matrix. In that case, we have:

f(u, v) = SUM (u[i]*v[i])

You will recognize this as the dot product: f(u, v) = u.v. This product is also very simple, and it has more symmetry than the previous ones: it is invariant under many coordinate transformations, including a permutation of the basis vectors. This is one of the reasons why it occurs so frequently.

We’re defining it here as: $$\mathbf{u}\cdot\mathbf{v} = \begin{pmatrix}u_1 & u_2\end{pmatrix}\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\begin{pmatrix}v_1\\v_2\end{pmatrix} = u_1v_1+u_2v_2$$

There’s a lot more to be said for the dot product:

Bilinear forms can have some additional properties. A bilinear form f is symmetric if f(u, v) = f(v, u) for all vectors u and v. This will be the case if the matrix A is symmetric; in particular, the dot product is symmetric.

A bilinear form f is said to be positive-definite if f(u, u) > 0 for all non-zero vectors u (the axioms imply that f(0, 0) = 0).

Symmetric bilinear forms are interesting because they allow us to define the length of a vector: we define the length |u| of the vector u as sqrt(f(u, u)), which is legitimate if f(u, u) >= 0. The square root is important, because it ensures that multiplying a vector by a real k multiplies its length by |k|. This allows us to define a geometric structure on the vector space.

The dot product is positive-definite, because u.u = SUM (u[i]^2). A very important point is that some kind of converse is true: any real symmetric and positive-definite bilinear form is equivalent to the dot product in a suitable coordinate system. This universality property is probably one of the main reasons for the importance of the dot product.

The dot product has also many physical interpretations; for example, if a force F produces a displacement u, the work done is the dot product F.u.

There are other interesting bilinear forms that are not positive-definite. For example, in special relativity, one deals with 4-dimensional vectors (t, x, y, z), and the form defined by this plays a special role:

f((t1, x1, y1, z1), (t2, x2, y2, z2)) = c^2*t1*t2 - x1*x2 - y1*y2 - z1*z2

This form is not positive-definite, because, for example, if u = (0, 1, 0, 0), then f(u, u) = -1.

This is part of the weirdness of relativity!
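Two of these claims are easy to check numerically. The sketch below is only an illustration: the sample vectors are arbitrary, and the default `c=1.0` is an assumption (units in which the speed of light is 1) that does not matter for the vector tested, since its time component is zero.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def length(u):
    """Length defined from the dot product: |u| = sqrt(u.u)."""
    return math.sqrt(dot(u, u))

def minkowski(u, v, c=1.0):
    """The relativity form c^2*t1*t2 - x1*x2 - y1*y2 - z1*z2."""
    (t1, x1, y1, z1), (t2, x2, y2, z2) = u, v
    return c**2 * t1 * t2 - x1 * x2 - y1 * y2 - z1 * z2

# Scaling a vector by k multiplies its length by |k|.
u = (3.0, -4.0)
k = -2.5
scaled = tuple(k * ui for ui in u)
print(length(u), length(scaled))   # 5.0 12.5

# The Minkowski form is not positive-definite.
w = (0, 1, 0, 0)
print(minkowski(w, w))             # -1.0
```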

Before talking about the cross product, we will first describe two other, more general, vector products.

The tensor product
------------------

In the case of m = 1, we have seen that any vector product can be constructed as a linear combination of the elementary products f_{ij}. In the general case, we can use these elementary products in another way: we can consider all the products u[i]*v[j] as separate coordinates of a vector in another space W. As there are n^2 such products, the dimension of W will be m = n^2. This product is called the tensor product, and we write f(u, v) = u ⊗ v. A basis of W is the set {e_i ⊗ e_j}, where the e_k are basis vectors of U.

This product is, in some sense, the most general product that can be defined on U. Any such product f: U x U -> E, where E is a real vector space, can be defined by a normal linear map:

g : W (= U ⊗ U) -> E

The map g can be described by a matrix G of dimension m by n^2, where m = dim E.

This universality makes the tensor product (and other generalizations) omnipresent in many fields of mathematics and physics.

We can get other universal constructions by imposing some additional conditions, like symmetry or antisymmetry, on the tensor product.

There is, of course, a lot more that could be said!
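Here is a hedged sketch of that universality for n = 3, with the tensor product stored as a flat list of the nine products u[i]*v[j] (0-based indices, which is a choice made for this illustration). Both the dot product and the cross product are then obtained by applying an ordinary matrix G to that list; the helper names `tensor`, `linear_map`, and `row` are hypothetical.

```python
def tensor(u, v):
    """Coordinates of u (x) v: all products u[i]*v[j], listed row by row."""
    return [ui * vj for ui in u for vj in v]

def linear_map(G, w):
    """Apply the matrix G (a list of rows) to the vector w."""
    return [sum(g * x for g, x in zip(r, w)) for r in G]

def row(plus, minus, n=3):
    """A row of G with +1 at coordinate `plus` and -1 at coordinate `minus`."""
    r = [0] * (n * n)
    r[plus[0] * n + plus[1]] = 1
    r[minus[0] * n + minus[1]] = -1
    return r

u, v = (1, 2, 3), (4, 5, 6)
w = tensor(u, v)              # 9 coordinates, since n^2 = 9

# Dot product: a 1-by-9 matrix that adds up the coordinates with i == j.
G_dot = [[1 if i == j else 0 for i in range(3) for j in range(3)]]

# Cross product: a 3-by-9 matrix, one row per component.
G_cross = [row((1, 2), (2, 1)),   # u[1]*v[2] - u[2]*v[1]
           row((2, 0), (0, 2)),   # u[2]*v[0] - u[0]*v[2]
           row((0, 1), (1, 0))]   # u[0]*v[1] - u[1]*v[0]

print(linear_map(G_dot, w))    # [32]
print(linear_map(G_cross, w))  # [-3, 6, -3]
```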

The wedge product
-----------------

We turn now to antisymmetric vector products, i.e., products where f(u, v) = -f(v, u). Note that this implies that f(u, u) = 0.

Like the tensor product, we can define a "most general" antisymmetric vector product by taking these elements as coordinates of a vector in a vector space W:

w[i,j] = u[i]*v[j] - u[j]*v[i]

This product is called the wedge product, and written u ∧ v. As w[i, i] = 0 and w[i, j] = -w[j, i], there are n(n - 1)/2 independent coordinates, and we can take W as a vector space of dimension n(n - 1)/2.

Like the tensor product, this product is "universal" in the sense that any antisymmetric vector product can be obtained from a linear map defined on the wedge product.

In the particular case n = 3, we have n(n - 1)/2 = 3, and W has the same dimension as U. This suggests that we may try to associate the product u ∧ v with a vector of U, as they have the same dimension. It turns out that, if we impose some additional conditions (like the behavior under changes of coordinates), there are essentially two ways to do this, corresponding to two possible orientations of the space.

The usual cross product is simply a representation of the universal wedge product with a particular choice of orientation (a "right-hand rule"). The fact that this is equivalent to a universal construction is one of the main reasons for its importance. Note that this only works with n = 3, because only in that case have we n = n(n - 1)/2.

So the cross product only exists for 3D vectors, effectively because the number of components is the same as the number of pairs of coordinates.
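The counting argument and the identification with the cross product can both be sketched in a few lines of Python. This is only an illustration: indices are 0-based here, and the sign in `cross_from_wedge` encodes the usual right-hand-rule choice of orientation.

```python
from itertools import combinations

def wedge(u, v):
    """Independent coordinates w[i, j] = u[i]*v[j] - u[j]*v[i] for i < j."""
    n = len(u)
    return {(i, j): u[i] * v[j] - u[j] * v[i]
            for i, j in combinations(range(n), 2)}

def cross_from_wedge(w):
    """Read the three wedge coordinates (n = 3) as a right-handed vector."""
    return (w[(1, 2)], -w[(0, 2)], w[(0, 1)])

u, v = (1, 2, 3), (4, 5, 6)
w = wedge(u, v)
print(w)                    # {(0, 1): -3, (0, 2): -6, (1, 2): -3}
print(cross_from_wedge(w))  # (-3, 6, -3)

# Number of independent coordinates is n(n - 1)/2; it equals n only for n = 3.
matching = [n for n in range(1, 11) if n * (n - 1) // 2 == n]
print(matching)             # [3]
```

In four dimensions, for instance, `wedge` returns six coordinates, so there is no way to read the result as a vector in the same space.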

The case m = n
--------------

A vector space U with a (bilinear) vector product and range of U, itself, is called an algebra. In that case, the vector product is an internal operation. Depending on that product's additional properties, you can have different types of algebras -- for example, commutative or associative.

As we have seen, if n = 3, the cross product defines an algebra on R^3; this algebra is anti-commutative, and is not associative.

This, of course, is not what we mean by “algebra” in high school!
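Both properties show up already with the standard unit vectors. The following sketch is just one illustrative counterexample to associativity, not a general proof of anything.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Anti-commutative: swapping the factors flips the sign.
print(cross(i, j))   # (0, 0, 1), which is k
print(cross(j, i))   # (0, 0, -1), which is -k

# Not associative: the grouping changes the answer.
left = cross(cross(i, i), j)    # (i x i) x j = 0 x j
right = cross(i, cross(i, j))   # i x (i x j) = i x k
print(left)    # (0, 0, 0)
print(right)   # (0, -1, 0)
```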

Conclusion
----------

To summarize, it is possible to define many things that can be called vector products. The fact that the examples above are more frequently encountered stems from the fact that these products are, in some sense, the most general products that can be defined under some reasonable conditions. This comes also from their importance in physics, although the two reasons are probably related.

I hope that this sheds some light on the subject; please feel free to write back if you want to discuss this further.

Cyrus responded,

Thank you very much for your kind and detailed answer. I really did appreciate it. Not being a mathematician but rather an engineer, I may not have followed all the arguments presented. However, you resolved a very important, nagging curiosity that I had, and that was that, certainly, other definitions of vector product can and do exist, but more popularly we take these two definitions simply because they fit the natural physical laws that we observe. In effect, these definitions are merely the result of physical observations of nature.

I would rather say these definitions are used because they have been found useful, *both* abstractly *and* practically.

A recent question (from May) about approximating the binomial distribution with the normal distribution led to some (accidental and otherwise) insights about the method.

I have to solve this problem:

A manufacturing company uses an acceptance scheme on items from a production line before they are shipped. The plan is a two-stage one.

Boxes of 20 items are readied for shipment, and a sample of 10 items is tested for defectives. If any defectives are found, the entire box is sent back for 100% screening. If no defectives are found, the box is shipped.

Now, suppose that the manufacturing company develops a new scheme: an inspector takes one item at random, inspects it, and then replaces it in the box; next 8 inspectors do likewise. Finally, a tenth inspector goes through the same procedure. The box is not shipped if any of the ten inspectors find a defective.

Q: What is the probability that a box containing only one defective will be sent back for screening? Use normal approximation for answering.

I know that this is a binomial random variable, but I am having a hard time using normal approximation to answer this question.

Am I supposed to use z score to solve this problem since it is normal approximation?

I am not sure but I found that:

Probability of non-defective (p) = 19/20

Probability of defective (q) = 1/20

This is easy to misread; the “old scheme” called for testing one sample of 10 (half a box), while the “new scheme” has ten independent samples of 1. We are evaluating only the “new scheme”.

As we’ll see, the requirement to use the normal approximation is problematic. But Sydney is thinking well so far. We have ten independent repetitions of the inspection process, and each time (assuming there is exactly one defective item in the box) the probability of choosing (and correctly identifying) the defective item is \(\frac{1}{20}\). This is what a binomial distribution is.

I replied,

Hi, Sydney.

Yes, for the new scheme the random sample can be considered binomial with p=19/20. (You could instead think of a defective item as a “success”.) And the z-score is used whenever you apply the normal distribution. The trick here will be, the z-score corresponding to what value?

Please show your attempt at using the normal approximation, so I can see where you are having trouble; at the least, you can show the formula or method you were taught for that, as different books or instructors may express details in different ways, and I want to work with what you are being taught.

It is actually quite common to think of something bad (finding a defective part, getting sick, …) as the “success” in a binomial distribution; and often it makes things easier.

Sydney replied with two questions. First, accepting my suggestion,

Since the defective item is success, then would p = 1/20 and q = 19/20 ??

Or is it the other way around?

To which I replied, explaining the value of this choice:

You can take either event as a “success”; that’s an arbitrary choice. I only suggested treating defective as success because then the question asks about P(x = 1), that is, there is one defective item. You don’t have to do it that way; but if you do, then p = P(defective) = 1/20.

In saying \(P(x = 1)\) without thinking fully, I’ve made my first mistake, which will lengthen our conversation a bit!

Second, Sydney showed some details:

Then, I would find μ = np and σ = √(npq) to compute to z = (x − np) / √(npq) .

Then would x be 0.5??

The formulas here are correct; we use the mean and standard deviation of the binomial distribution for our normal distribution. I suspected the 0.5 Sydney mentioned was related to the “continuity correction” we’ll be discussing, but I needed to be sure where it came from. I answered,

You’re right about finding μ and σ. I’m not sure what you mean about x being 0.5, but you can clarify that by going ahead and doing the work as you understand it. You may mean the right thing, but you actually need two values of x.

What I was looking for from you are specific facts about the normal approximation, specifically the “continuity correction”. That may or may not be what you are asking about.

If you need more explanation than you were given, these might help (the first explaining more about what it means, the second giving explicit instructions):

The normal approximation to the binomial, Simonoff

Normal Approximation to Binomial, Jones

Again, just do what you have in mind, which is probably at least partly right, and we’ll have more to discuss.

The two links are to good brief explanations I found of the technique, as there is not much about it in *Ask Dr. Math*. We’ll see later that my comment about needing two values of *x* is actually wrong, a result of my earlier hasty statement.

Now Sydney could show some work:

Yes, I would use normal approximation, which is P (z ≤ (x + 0.5 − np) / √(npq) )

Since p = 1/20, q = 19/20, and n = 20,

μ = 1, σ = √(19/20)

Now I just need to find x and compute all the information that I have.

Yet, I understood this question as to find P(x=1).

But since z score is ≤, I am confused on what x score I am supposed to use.

We’ll get back to that “≤” (which is not quite right) … and also the “*n* = 20”. For now, pretend it’s correct that we are looking for \(P(x = 1)\), that is, the probability that exactly one defective item is found; that lets us talk about the full idea of the normal approximation.

I answered:

As indicated in the pages I referred you to, the normal approximation to P(x=1) is P(0.5 < x < 1.5) using the normal distribution. So you need to find z both for 0.5 and for 1.5, find P(Z≤z) for each of these values, and subtract to find the probability between them.

The second link includes a table with this example:

Discrete … Continuous

x = 6 … 5.5 < x < 6.5

The first link, similarly, says this:

The continuity correction requires adding or subtracting .5 from the value or values of the discrete random variable X as needed. Hence to use the normal distribution to approximate the probability of obtaining exactly 4 heads (i.e., X = 4), we would find the area under the normal curve from X = 3.5 to X = 4.5, the lower and upper boundaries of 4.

When you wrote P(z ≤ (x + 0.5 − np) / √(npq)), this is only one side of the interval for which you need to find the probability. Do you see that?

Here is a typical illustration of the normal approximation to the binomial, showing the role of the continuity correction:

The bars represent the binomial distribution (in this case with *n* = 6 and \(p = \frac{1}{2}\)). Each bar should have an area equal to that under the normal distribution curve between \(x\ -\ \frac{1}{2}\) and \(x +\frac{1}{2}\).
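For the distribution in that illustration (*n* = 6, \(p = \frac{1}{2}\)), the comparison between each bar and the matching slice of the normal curve can be sketched with Python's standard library. The function names are hypothetical, and the mean and standard deviation are the usual binomial ones, np and √(npq).

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Exact P(X = k) for a binomial distribution."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_approx_pmf(k, n, p):
    """Continuity-corrected approximation: P(k - 0.5 < X < k + 0.5)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    dist = NormalDist(mu, sigma)
    return dist.cdf(k + 0.5) - dist.cdf(k - 0.5)

n, p = 6, 0.5
for k in range(n + 1):
    exact = binomial_pmf(k, n, p)
    approx = normal_approx_pmf(k, n, p)
    print(f"k={k}  exact={exact:.4f}  approx={approx:.4f}")
```

The middle bar, for example, has exact height 20/64 = 0.3125, and the normal slice from 2.5 to 3.5 comes out within about half a percentage point of that.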

What Sydney wrote would in fact be correct for \(P(x\le 1)\); and that is closer to what we really need than what I’m teaching at this point! But we’ll get there.

Sydney could now move forward:

I think I misunderstood the textbook and now I understand what is going on.

Thank you for your explanation.

Just for clarification, I am supposed to find P(0.5 < x < 1.5).

Then P (-.513 < z < .512)

Then P (z < .512) – P(z < -.512)

= .696 – .302 = .392

So, 39.2% would be the answer right?

This was good, as far as it went. But we weren’t quite finished:

I just realized I didn’t check your numbers. You said n = 20; that’s wrong. The entire population is 20! What is the sample size?

The problem was:

Now, suppose that the manufacturing company develops a new scheme: an inspector takes one item at random, inspects it, and then replaces it in the box; next 8 inspectors do likewise. Finally, a tenth inspector goes through the same procedure. The box is not shipped if any of the ten inspectors find a defective.

(By the way, though this doesn’t affect your work, I’d like to make a comment on the problem. Have you observed how silly it would be for each inspector to put back an item that has already been inspected, so that it might be inspected more than once? The reason for that is that without that feature, this would not be a binomial problem, and would be harder to work with. They are giving you an unrealistic problem to keep it simple for you!)

Anyway, try it again with the correct n; and I’d like you to show a little more of your work (such as what actual value you get for σ in the sampling distribution) so I can make sure of a couple things; I think you made an error in your calculations last time.

Sydney provided a redo:

So then n = 10

μ=10 × 0.05 = .5

σ=√(10 × .05 × .95) = .6892

Then, P(0.5 < x < 1.5).

P (0 < z < 1.451)

= P(z < 1.451) – P(z < 0)

= .92661 – .5 = .42661

Would this be the correct calculation?

But, no …

Good work … but we’re not yet answering the right question.

I apologize for missing another detail. I guess I’m giving you extra practice with this topic …

We have to think about what the event is that they are asking about. It was, “What is the probability that a box containing only one defective will be sent back for screening?”, which in turn means, in effect, “What is the probability that at least one of the ten inspectors finds a defective item.” They didn’t explicitly say “at least one”, but that is how we have to interpret “any”.

So I misstated the goal from the start. We don’t want P(x=1); we want P(x≥1). Right?

Luckily, this is very easy now. You’ve done more than enough work.

And, as it turns out, it could have been answered using the binomial distribution itself, or even without it.

You’ll find, if you do that, that your answer is different; that isn’t a problem, as the answer they are asking for is known to be an approximation.

So Sydney’s use of an inequality, and therefore a one-sided test, was right; it was just in the wrong direction.

Now Sydney could really finish:

We are finding P(x≥1), and we still need to use normal approximation.

So, using normal approximation, we will need to find P(x > 0.5), right?

= 1 – P (x <= 0.5)

so, P ( z <= 0)

Is this right?

To finish this off, \(P(z\le 0) = 0.5\), since this is the probability of being less than the mean in a symmetrical distribution, and therefore \(P(x > 0.5) = 1 - 0.5 = 0.5\). So the answer, by the normal approximation, is that the probability that such a box would be sent back is 50%.
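The same calculation can be sketched in Python, using the standard library's `NormalDist` for the standard normal distribution; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

n, p = 10, 1 / 20                # ten inspections, P(defective) = 1/20
mu = n * p                       # mean of the binomial: 0.5
sigma = sqrt(n * p * (1 - p))    # standard deviation: about 0.6892

# P(x >= 1) becomes P(x > 0.5) after the continuity correction.
z = (0.5 - mu) / sigma
approx = 1 - NormalDist().cdf(z)

print(round(sigma, 4))   # 0.6892
print(round(approx, 4))  # 0.5
```

The boundary 0.5 happens to coincide with the mean, so z = 0 and exactly half of the normal curve lies above it.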

I replied:

Correct. So the answer is incredibly simple after all that work, isn’t it?

Have you tried finding the exact probability, without the normal approximation?

This is not part of the problem, but is a valuable thing to try. When we use an approximation, but the exact value is within reach, we should (in an educational context, at least) find both, in order to learn something about the approximation.

Sydney did it, and did it well:

So we need to find

P(at least 1 out of 10 inspectors selected defective item)

Because the events “at least 1 out of the 10 inspectors selected a defective item” and “all 10 inspectors selected non defective items” are mutually exclusive, we have to do

P(at least 1 out of 10 inspectors selected defective item) = 1 – P(all 10 inspectors selected non defective)

There is 1 defective item among 20 items in the box, so there are 19 non-defective items in the box. So, the probability of selecting a non-defective item from the box is 19/20.

Since all 10 inspections are independent, we do (19/20)^10 = .5987

1- .5987 = .4013

This is the answer I got, but I got a different answer using normal approximation.

And that is exactly right! I concluded:
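As a quick check, the exact value can be computed two ways in Python: by the complement, as Sydney did, and by adding up the binomial probabilities for 1 through 10 defectives found. This is only a sketch, with arbitrary variable names.

```python
from math import comb

n, p = 10, 1 / 20

# Complement: 1 - P(all ten inspectors pick a non-defective item).
p_none = (1 - p) ** n
exact = 1 - p_none

# Direct sum of the binomial probabilities P(x = 1) + ... + P(x = 10).
exact_by_sum = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(1, n + 1))

print(round(p_none, 4))        # 0.5987
print(round(exact, 4))         # 0.4013
print(round(exact_by_sum, 4))  # 0.4013
```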

Excellent.

The normal approximation is closer to reality when the numbers are larger; the significant difference here is to be expected.

You may have learned the rule of thumb that the normal approximation is considered reasonable when both np and nq are at least 5; here they are respectively 10*0.05 = 0.5 and 10*0.95 = 9.5, so it doesn’t fit that criterion, and we don’t expect a good approximation.

I wouldn’t be surprised if this problem were designed to help you see this situation; if it wasn’t, then I’ve done so myself!

I unintentionally made Sydney solve the problem three times, using the wrong *n*, then the wrong inequality, and then (as instructed, not by accident) the inappropriate approximation – and then exactly, which the problem didn’t say to do. Now everything has been practiced that one might need to learn …
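The rule of thumb quoted above is simple enough to write as a one-line check. The function name and the `threshold` parameter are hypothetical; some textbooks use 10 rather than 5.

```python
def normal_approx_reasonable(n, p, threshold=5):
    """Rule of thumb: both np and nq should be at least `threshold`."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approx_reasonable(10, 0.05))  # False: np = 0.5, nq = 9.5
print(normal_approx_reasonable(10, 0.5))   # True:  np = nq = 5
```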

To finish off, let’s look briefly at how the binomial distribution and its normal approximation compare in this inappropriate case. First, if we had *p* = 0.5 as in the usual pictures we see (like the one I showed from Wikipedia above), the approximation would be quite close (note that *np* and *nq* are both 5), as shown in this side-by-side bar graph:

Here is a rough picture of the normal distribution itself superimposed on the binomial:

(The normal is shifted a little to the right, because of how I made this in Excel.)

But for our actual problem, here is how the approximation fares:

As we found, the probability that *x* is greater than 0 is 0.5, according to the approximation, but it’s only 0.4 in reality. Here are the binomial and normal superimposed:

They don’t look alike at all. Clearly, the instructions to use the approximation were not a good recommendation – except if you want to learn why you shouldn’t do it!

As we saw last time, the dot product is hard to explain in simple geometric terms, so that it seems unlikely that anyone would invent it in that form. Rather, it is a simple way to combine two vectors algebraically, that has properties suitable for multiplication, and that happens to incorporate the angle between the two vectors, specifically its cosine, making it very useful. The same is true of the cross product, which is complicated to describe geometrically, but is (in some sense) very neat algebraically, and turns out to involve the *sine* of the angle between them.

I’ll start, as I did last time, with a question about the essential meaning of the operation; this time I have to use a question from 2016 that was not published in the *Ask Dr. Math* archive. It starts with the geometric description, with a mention of the algebraic form using determinants, which we’ll talk more about below.

What does the cross product of two vectors actually represent?

Ex. a X b = |a||b|sin(angle between the vectors).u, where u is a unit vector in the direction of a X b

My Physics professor once stated that there is a physical definition of cross product of vectors. Though he didn't elaborate, I am still thinking what it might be.

I tried breaking the vectors and using the determinant technique to evaluate the result. I carefully examined the determinant and the definition to understand. Upon finding that the determinant represents a parallelogram's area, I concluded that it might represent vector area of the parallelogram. However, I still don't understand what the physical meaning is.

Here is an example showing a cross product:

Vectors **u** and **v** form a parallelogram with diagonal **u** + **v**; the cross product **u**×**v** is perpendicular to that plane, and its magnitude is \(|\mathbf{u}| |\mathbf{v}|\sin(\theta)\), which is the area of that parallelogram.

I replied, first giving references to the three pages we’ll be looking at below and next time, then continuing:

These give successively more detailed answers to what the cross product is. I think your description as "vector area of a parallelogram" (or "directed area") is good. I don't know just what your professor might have had in mind as a "physical definition"; it is *defined* mathematically, not by physics, but it can be *applied* in physics in various ways that justify its use. It appears in several rather distinct fields of physics, so no one of those can be called the meaning of the cross product in general. Maybe you should ask your professor!

Avinash’s term “vector area”, which I suggested calling a “directed area”, refers to the fact that the **magnitude** of the cross product is the area of the parallelogram defined by the vectors, while its **direction** is perpendicular to the plane of the two vectors.
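To make the idea of a directed area concrete, here is a minimal numeric sketch. The vectors `u` and `v` are hypothetical examples chosen so the parallelogram has base 3 and height 2, and `cross` uses the standard component formula, which is an assumption at this point in the discussion since the formula is only derived further on.

```python
import math

def dot(u, v):
    """Sum of products of corresponding components."""
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(u, u))

def cross(u, v):
    """Standard component formula for the cross product in 3D."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Hypothetical example: a parallelogram in the xy-plane, base 3, height 2.
u = (3.0, 0.0, 0.0)
v = (1.0, 2.0, 0.0)
w = cross(u, v)

print("u x v =", w)
print("magnitude (area of parallelogram) =", norm(w))
print("perpendicular to u and v:", dot(w, u) == 0 and dot(w, v) == 0)
```

The magnitude carries the area, and the direction (here along the z-axis) records which plane the parallelogram lies in.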

For a particularly simple example, consider the cross product of the unit vectors **i** and **j**, which is a vector perpendicular to both, with magnitude 1, the area of a 1 by 1 square:

Avinash replied,

When replying to my question, you mentioned "directed area". I am a bit confused, as to how area being a scalar can be called a vector. I know, I talked about vector area and so you might assume that I know about it. But the truth is that I had to accept the fact as is. I hope that you can give me a convincing explanation of vector area.

I answered,

Well, that's why we don't just call it area! Area itself is a scalar; but we can associate it with a direction, and it happens that the cross product naturally combines them. I think you made up the term "vector area"; when we talk about that or "directed area", we simply mean attaching a direction to an area. That is mentioned here: https://en.wikipedia.org/wiki/Cross_product In mathematics and vector calculus, the *cross product* or *vector product* (occasionally *directed area product* to emphasize the geometric significance) is a binary operation on two vectors in three-dimensional space (R^3) and is denoted by the symbol x. This directed area is just the area times a unit vector perpendicular to the plane (which defines the direction of the plane).

The term “vector area” is in fact used, though not usually in this context.

We can also talk about "signed area", which is closely related; this commonly arises in connection with a formula for the area of the region enclosed by a polygon, which gives a positive area if you go around counterclockwise, but a negative area in the other direction. This formula is closely related to the cross product. Here are two discussions of this: Quadrilateral Area http://mathforum.org/library/drmath/view/60583.html Geometric Proof of Area of Triangle Formula http://mathforum.org/library/drmath/view/72141.html

This last link mentions the cross product, and provides an introductory proof of what we’ll be seeing later; I previously discussed it in Polygon Coordinates and Areas.

Avinash pursued the question:

Thank you for your response. You helped me a lot, but could you please explain why do we associate direction with an area? It is kind of hard to come up with an idea of combining area with direction. I mean, how could someone think of using directions in area, a scalar quantity? Is it because of applications in finding flux or something alike?

This is the core of the question: **Why do all this at all?**

I said,

The first reason is probably that when we look for a way to make a vector perpendicular to the plane of two given vectors, it just happens that the magnitude of the cross product, which does that job, is equal to the area. So we accept that as a gift! It's not that we were seeking that goal; it was just handed to us as a nice result. It also turns out that this idea is useful, as you mention, for concepts like flux. Someone might have thought of that first; I don't know. But however it came about, it's a good idea. Everything works together very neatly.

As we pursue a proof below, we’ll see that what I said is exactly right. Avinash concluded:

Thank you, thank you, thank you, thanks a million for answering all my questions without grumbling. If I had asked all these questions to my math professor, he would have regarded it as 'stupid' and 'irrelevant'. Glad that the members of Dr.Math are always ready to help.

To which I replied,

You're welcome. I actually like the odd questions that don't really have answers, but that make you look deeper into things! I don't get those enough in my classes.

Now let’s look at the first of the archived answers I referred Avinash to.

This question is from 2012:

Crossed up by Scalars and Vectors I have never really understood why dot products and cross products have been defined the way they are. Why is the result of a dot product a scalar, and that of a vector product a vector? I have consulted many books, and I have searched the net, but nobody seems to give any plausible explanation for this.

Doctor Jerry replied, starting with the **dot product**, much as we saw last time:

Hello Karthik, Thanks for writing to Dr. Math. Given the idea of a vector, it is natural to look for a way to calculate the angle t between vectors (a1, a2, a3) and (b1, b2, b3). Letting ||a|| be the length of the vector a, if you apply the Law of Cosines to the triangle with sides a, b, and a - b, you will see: (a1 - b1)^2 + (a2 - b2)^2 + (a3 - b3)^2 = a1^2 + a2^2 + a3^2 + b1^2 + b2^2 + b3^2 - 2||a||*||b||*cos[t]. Simplify and you will find this, where "x y" means "x times y": a1 b1 + a2 b2 + a3 b3 = ||a||*||b||*cos[t]. This makes it clear that the left side is a useful combination of a and b.

This is about as concise an answer as one could give. We can imagine starting with the Law of Cosines in the hope of finding the angle. Having found this simple formula, we just define the left-hand side as $$\mathbf{a}\cdot\mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3$$ and we have proved that $$\mathbf{a}\cdot\mathbf{b} = ||\mathbf{a}|| ||\mathbf{b}||\cos(\theta)$$
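As a quick sanity check of that derivation, the following sketch finds the angle from the Law of Cosines alone (using the triangle with sides a, b, and a - b) and then compares the component sum with the product of the lengths and the cosine. The two example vectors are hypothetical; any nonzero pair should behave the same way, up to floating-point rounding.

```python
import math

def dot(a, b):
    """Component form: a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length of a vector."""
    return math.sqrt(dot(a, a))

def angle_by_law_of_cosines(a, b):
    """Angle t between a and b, from the triangle with sides a, b, a - b."""
    diff = tuple(x - y for x, y in zip(a, b))
    cos_t = (norm(a) ** 2 + norm(b) ** 2 - norm(diff) ** 2) / (2 * norm(a) * norm(b))
    return math.acos(cos_t)

# Hypothetical example vectors.
a = (1.0, 2.0, 2.0)
b = (2.0, -1.0, 3.0)

t = angle_by_law_of_cosines(a, b)
component_sum = dot(a, b)
geometric_form = norm(a) * norm(b) * math.cos(t)

print(component_sum, geometric_form)
print("agree:", math.isclose(component_sum, geometric_form))
```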

(By the way, notice that for ease of typing, we have often used \(|\mathbf{u}|\) for magnitude, i.e. length, but the proper notation is \(||\mathbf{u}||\).)

Now we can try something similar to invent the **cross product**:

For the cross product, if you have vectors a = (a1, a2, a3) and b = (b1, b2, b3), it seems clear that a vector that is perpendicular to both of a and b would be useful. So, we seek a vector (c1, c2, c3) such that a.c = 0 and b.c = 0. If you solve this system of two equations for c1 and c2, you will find c1 = (a3 b2 - a2 b3)/(a2 b1 - a1 b2) * c3 and c2 = (-a3 b1 + a1 b3)/(a2 b1 - a1 b2) * c3. If you choose c3 as a2 b1 - a1 b2, this will give a solution and clean up the mess a bit.

Having defined the dot product, we use it to find whether two vectors are perpendicular. The resulting system of equations looks like this: $$\left\{\begin{matrix}a_1c_1+a_2c_2+a_3c_3=0\\ b_1c_1+b_2c_2+b_3c_3=0\end{matrix}\right.$$ There are only two equations for three variables, so the solution is not unique; as it turns out, it only determines the proportions (i.e., the *direction* of the resulting vector).

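He treats \(c_3\) as a constant in order to solve (a quick way would be to use Cramer’s Rule), observing that the solution really expresses \(c_1\) and \(c_2\) as multiples of \(c_3\), so that taking \(c_3\) as the denominator of both fractions gives a convenient result: $$\mathbf{c} = (a_3b_2-a_2b_3, a_1b_3-a_3b_1, a_2b_1-a_1b_2)$$ Any nonzero multiple of this vector is also perpendicular to both \(\mathbf{a}\) and \(\mathbf{b}\); the standard convention takes the opposite sign, \(c_3 = a_1b_2-a_2b_1\), so that \(\mathbf{i}\times\mathbf{j} = \mathbf{k}\): $$\mathbf{a}\times\mathbf{b} = -\mathbf{c} = (a_2b_3-a_3b_2, a_3b_1-a_1b_3, a_1b_2-a_2b_1)$$ The sign does not affect the magnitude, so the components of \(\mathbf{c}\) can still be used to compute \(|\mathbf{a}\times\mathbf{b}|\). This has a nice pattern to it, which can be restated in various ways, including using determinants, which are just a shorthand way to write exactly these kinds of expressions.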
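Here is a small check of that solution, as a sketch with hypothetical integer vectors. The function `perpendicular_from_system` is an illustrative name for the vector obtained from Doctor Jerry's choice of c3; `standard_cross` is the usual right-handed textbook formula, which is its negative. Both are perpendicular to a and b, since the system only fixes the direction up to a multiple.

```python
def dot(a, b):
    """Component form of the dot product."""
    return sum(x * y for x, y in zip(a, b))

def perpendicular_from_system(a, b):
    """Solution of a.c = 0, b.c = 0 with the choice c3 = a2*b1 - a1*b2."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a3 * b2 - a2 * b3, a1 * b3 - a3 * b1, a2 * b1 - a1 * b2)

def standard_cross(a, b):
    """Usual right-handed convention, i.e. the choice c3 = a1*b2 - a2*b1."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

# Hypothetical example vectors.
a = (1, 2, 3)
b = (4, 5, 6)

c = perpendicular_from_system(a, b)
print("c =", c, " a.c =", dot(a, c), " b.c =", dot(b, c))
print("standard a x b =", standard_cross(a, b))
print("i x j =", standard_cross((1, 0, 0), (0, 1, 0)))
```

Using integers keeps the arithmetic exact, so the dot products come out as exactly zero rather than approximately zero.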

Notice that this approach, as I suggested earlier, focuses on making a perpendicular vector; *we don’t even know yet the significance of its magnitude*. We’ll get to that in a moment.

Note, again, that the original goal was not to find a scalar product and a vector product! It was to find an *angle*, which led to a scalar, and a *perpendicular vector*. When these were discovered, it was natural to name them the scalar product and the vector product, and eventually to use the dot and cross multiplication symbols to distinguish them. The actual development of these ideas started from a different perspective, as vectors themselves were only gradually being developed; the dot and cross representations came around 1880, after 40 years or so of development.

We’ve derived the definition of the cross product as a vector perpendicular to the given vectors, but its magnitude was just whatever arose from the easiest choice. Can we prove that the magnitude is what we say it is, the area of the parallelogram? I don’t find that in any published answers, so I’ll use two that were not archived. First, consider this one from 2003:

How can one prove that for 2 vectors a and b that the vector product (a cross b)= mod(a).mod(b).sin(theta).n Where does the sine come in to the equation - is it through the determinant definition of the cross product?

Doctor Fenton answered:

Hi Peter, I assume that n is a unit vector in the direction of (a x b), so your question is essentially why |a x b| = |a||b|sin(θ), where θ is the angle between a and b. Here, |a| is the modulus of a vector a. This quantity is also the area of the parallelogram with sides a and b. [Figure: parallelogram with base b and side a meeting at angle θ, with altitude h drawn perpendicular to b.] The altitude h is |a|sin(θ), so the area is |a||b|sin(θ).

The fact that the magnitude is the area is not part of the proof, but a useful fact. Now, where can we get a sine from? You might think of the Law of Sines, as we used the Law of Cosines for the dot product, but I don’t think anything would come of that. But since we already know that the dot product is related to the *cosine*, we might try to use that:

sin^2(θ) = 1 - cos^2(θ) = 1 - [(a.b)/(|a||b|)]^2 = [|a|^2|b|^2 - (a.b)^2] / (|a|^2 |b|^2), so we want [|a||b|sin(θ)]^2 = |a|^2|b|^2 - (a.b)^2. Just expand the right side, simplify, and you will have |a x b|^2.

We now have to show that the right side agrees with our algebraic definition of the cross product. Doctor Fenton left that to Peter; let’s give it a try:

$$|a|^2|b|^2 - (a\cdot b)^2 \\= (a_1^2+a_2^2+a_3^2)(b_1^2+b_2^2+b_3^2) - (a_1b_1+a_2b_2+a_3b_3)^2\\ =(a_1^2b_1^2+a_1^2b_2^2+a_1^2b_3^2+a_2^2b_1^2+a_2^2b_2^2+a_2^2b_3^2+a_3^2b_1^2+a_3^2b_2^2+a_3^2b_3^2) \\- (a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+2a_1a_2b_1b_2+2a_2a_3b_2b_3+2a_1a_3b_1b_3)\\ =a_1^2b_1^2+a_1^2b_2^2+a_1^2b_3^2+a_2^2b_1^2+a_2^2b_2^2+a_2^2b_3^2+a_3^2b_1^2+a_3^2b_2^2+a_3^2b_3^2\\- a_1^2b_1^2-a_2^2b_2^2-a_3^2b_3^2-2a_1a_2b_1b_2-2a_2a_3b_2b_3-2a_1a_3b_1b_3\\ =a_1^2b_2^2+a_1^2b_3^2+a_2^2b_1^2+a_2^2b_3^2+a_3^2b_1^2+a_3^2b_2^2-2a_1a_2b_1b_2-2a_2a_3b_2b_3-2a_1a_3b_1b_3$$

But what is \(|a \times b|^2\)?

$$|a \times b|^2 \\= |(a_3b_2-a_2b_3, a_1b_3-a_3b_1, a_2b_1-a_1b_2)|^2 \\= (a_3b_2-a_2b_3)^2+(a_1b_3-a_3b_1)^2+(a_2b_1-a_1b_2)^2 \\= (a_3^2b_2^2-2a_2a_3b_2b_3+a_2^2b_3^2)+(a_1^2b_3^2-2a_1a_3b_1b_3+a_3^2b_1^2)+(a_2^2b_1^2-2a_1a_2b_1b_2+a_1^2b_2^2) \\= a_1^2b_2^2+a_1^2b_3^2+a_2^2b_1^2+a_2^2b_3^2+a_3^2b_1^2+a_3^2b_2^2-2a_1a_2b_1b_2-2a_2a_3b_2b_3-2a_1a_3b_1b_3$$

They are equal!
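If you would rather not trust the hand expansion, a brute-force check is easy. This sketch tests the identity on many random integer vectors, using the same components for the cross product as in the expansion above; the helper name `identity_gap` is my own. A numeric check is not a proof, but it catches algebra slips quickly.

```python
import random

def dot(a, b):
    """Component form of the dot product."""
    return sum(x * y for x, y in zip(a, b))

def cross_components(a, b):
    """The components used in the expansion above (sign does not matter once squared)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a3 * b2 - a2 * b3, a1 * b3 - a3 * b1, a2 * b1 - a1 * b2)

def identity_gap(a, b):
    """|a|^2 |b|^2 - (a.b)^2 - |a x b|^2, which should be exactly 0."""
    c = cross_components(a, b)
    return dot(a, a) * dot(b, b) - dot(a, b) ** 2 - dot(c, c)

random.seed(0)  # fixed seed so the run is repeatable
trials = [(tuple(random.randint(-9, 9) for _ in range(3)),
           tuple(random.randint(-9, 9) for _ in range(3)))
          for _ in range(1000)]

print("identity holds on all trials:", all(identity_gap(a, b) == 0 for a, b in trials))
```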

The hard part there was the algebra at the end. With more expansive understanding of the algebra of vectors (provable from the purely algebraic definition we are using), it is possible to make the work look a little neater, at least. Here is an unarchived question from 2006; I’m going to abridge this discussion heavily, removing a number of wrong paths along the way:

Prove that ||A x B|| = ||A||.||B||.sin θ using the properties of cross product. Properties are: 1. A x B = -B x A 2. A x (B + C) = A x B + A x C 3. A x A = 0 4. r(A x B) = r(A) x B = A x (rB) 5. A x (B x C) = (C.A)B - (A.B)C 6. A.(B x C) = (A x B).C 7. B.(B x C) = (B x C).C = 0 I have no idea where to start! I have been given the hint of "Use the properties, especially 5, 6, and 7", but to me the properties seem useless because the question I'm trying to prove doesn't seem to be related to the properties. I'm getting really, really confused by the fact that I'm trying to do something with magnitude, but I'm supposed to be manipulating properties that don't involve the magnitude symbols.

Doctor Fenton answered again:

Hi Lucy, Start with ||v||^2 = v.v for any vector v. Applying that to this problem gives you a dot product of two vectors which are themselves cross-products. Property 6 will let you rewrite that, and then you need to use property 5 to further rewrite the result. You will also need to use the property of dot products that A.B = ||A|| ||B|| cos(θ) where θ is the angle between A and B.

Lucy responded (omitting some side trails):

Hi Dr. Fenton, I've looked at your suggestions and I've gotten a lot farther, but I'm still not quite there yet. So far I have: Since ||v||^2 = v.v, we can square both sides and get (||A x B||)^2 = (A x B).(A x B)

Doctor Fenton answered:

Hi Lucy, For convenience, I rewrite the properties: Properties are: 1. A x B = -B x A 2. A x (B + C) = A x B + A x C 3. A x A = 0 4. r(A x B) = r(A) x B = A x (rB) 5. A x (B x C) = (C.A)B - (A.B)C 6. A.(B x C) = (A x B).C 7. B.(B x C) = (B x C).C = 0 First, apply property 6. It's somewhat confusing to have A, B, and C represent both specific vectors (in your problem) and generic vectors (in the formulas). Let me rewrite property 6 in terms of generic vectors U, V, and W: U.(V x W) = (U x V).W . Now apply it to (A x B).(A x B), with U = A x B, V = A, and W = B: (A x B).(A x B) = ((A x B) x A).B . Next, you want to apply property 5 to simplify the triple vector product on the right side, but that requires the first cross-product to be in the second factor, so you must first apply property 1, and then you can apply property 5, to get -((A.B)A - (A.A)B).B . Now, you have a vector dotted with the difference of two vectors. The dot products A.B and A.A are just constants (scalars): it's like -(sA - tB).B . Compute that dot product, using the distributive property, before applying the dot product identity. Then, put back in the values of s and t.

Lucy did that:

After applying the distributive property, I get -(((A.B)A).B - ((A.A)B).B)

Doctor Fenton continued:

This expression is equal to -(A.B)(A.B) + (A.A)(B.B) = ||A||^2 ||B||^2 - (A.B)^2 Now apply the identity A.B = ||A|| ||B|| cos(θ) .

Lucy was able to finish:

I got it!! Now I have ||A||^2 ||B||^2 - (||A|| ||B|| cos(θ))^2 ||A||^2 ||B||^2 - ||A||^2 ||B||^2 cos^2(θ) ||A||^2 ||B||^2 (1 - cos^2(θ)) ||A||^2 ||B||^2 (sin^2(θ)) ||A|| ||B|| sin(θ) Which is the end of the proof! Thank you so much for helping me, I really appreciate this wonderful favour you have done for me and the time you took to type out your thorough answers.

Putting all this together, here is the work, which parallels what we did with components before:

$$(||A \times B||)^2 = (A \times B)\cdot(A \times B) \\= ((A \times B) \times A)\cdot B \\= -((A\cdot B)A - (A\cdot A)B)\cdot B \\= -(((A\cdot B)A)\cdot B - ((A\cdot A)B)\cdot B) \\= -(A\cdot B)(A\cdot B) + (A\cdot A)(B\cdot B) \\= ||A||^2 ||B||^2 - (A\cdot B)^2 \\= ||A||^2 ||B||^2 - (||A|| ||B|| \cos(\theta))^2 \\= ||A||^2 ||B||^2 - ||A||^2 ||B||^2 \cos^2(\theta) \\= ||A||^2 ||B||^2 (1 - \cos^2(\theta)) \\= ||A||^2 ||B||^2 (\sin^2(\theta))$$

$$||A \times B|| = ||A|| ||B|| \sin (\theta)$$
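The proof leans on properties 5 and 6, so it may be reassuring to see them hold numerically. The sketch below checks both properties, and the resulting identity for the squared magnitude, on random integer vectors. It assumes the standard component formula for `cross`; the function names are my own, not part of the original discussion.

```python
import random

def dot(u, v):
    """Component form of the dot product."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Standard component formula for the cross product in 3D."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def property_5_holds(A, B, C):
    """A x (B x C) = (C.A)B - (A.B)C"""
    lhs = cross(A, cross(B, C))
    rhs = tuple(dot(C, A) * b - dot(A, B) * c for b, c in zip(B, C))
    return lhs == rhs

def property_6_holds(A, B, C):
    """A.(B x C) = (A x B).C"""
    return dot(A, cross(B, C)) == dot(cross(A, B), C)

def squared_norm_identity_holds(A, B):
    """(A x B).(A x B) = ||A||^2 ||B||^2 - (A.B)^2"""
    AxB = cross(A, B)
    return dot(AxB, AxB) == dot(A, A) * dot(B, B) - dot(A, B) ** 2

def rand_vec():
    return tuple(random.randint(-5, 5) for _ in range(3))

random.seed(1)  # fixed seed so the run is repeatable
triples = [(rand_vec(), rand_vec(), rand_vec()) for _ in range(500)]
all_ok = all(property_5_holds(A, B, C)
             and property_6_holds(A, B, C)
             and squared_norm_identity_holds(A, B)
             for A, B, C in triples)
print("properties 5 and 6 and the squared-norm identity hold:", all_ok)
```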

For a very different perspective, you may want to read this, from 2001, which was the second link I gave to Avinash:

Cross Products; Rotating in Three Dimensions

Next time we’ll look at the third link, a long discussion of dot and cross products and a lot more.

A recent question provided an opportunity to examine some ideas about ratios, and also ways to tame a potentially huge product.

Here is the question, received from Sithum in late April:

Hi Dr. Math!

There are some problems that involve a given ‘equation with ratios (a:b=c:d)’ to prove a connection between two other ratios. I tried many methods but I haven’t got the slightest answer for the sums. Here’s a few I got stuck solving.

1) if a:b=c:d, prove that

i) (2a+3b):(2c+3d)=(2a-3b):(2c-3d)

ii) (3a+5b):(3a-5b)=(3c+5d):(3c-5d)

2) if (2a+3b)(2c-3d)=(2a-3b)(2c+3d) prove that a:b=c:d

Other problems are the same and I couldn’t solve any of them. I have never met anything like that before and am totally baffled. I should say I’m confident in algebra so far and have good basic knowledge in maths. It would be wonderful if you could offer some help. (Sorry if I’m not clear with my English.)

Thank you!

Problems (1 i) and (2) are converses; (1 ii) is much like (1 i), yet a little different in the arrangement of signs.

Doctor Rick answered:

Hi, Sithum, thanks for writing to The Math Doctors.

I don’t know for sure what is taught in your school system; I know that in some places, specialized rules for ratios are taught, so perhaps there is a way to prove the propositions that I do not know. However, plain algebra is quite sufficient to prove things like these — algebra is a very general and powerful tool.

I will give you one hint to get you started. Knowing that a:b = c:d, we can write this as a/b = c/d (except in the special case of b = d = 0, which is allowed for ratios but not for division). Now, knowing that the two quantities a/b and c/d represent the same quantity, we can give a name to that quantity! Let’s give it the name “r”:

a/b = c/d = r

Do you see what to do with this idea? Hint: Write expressions to replace a and c.

The “specialized rules for ratios” he referred to are theorems like “componendo and dividendo” with old Latin names that I see mostly in math history books, but that are still taught in some parts of the world (it seems especially popular in India), and are quite useful if you need to work heavily with ratios. This one in particular says that $$\text{If }\frac{a}{b} = \frac{c}{d}\text{, then }\frac{a+b}{a-b} = \frac{c+d}{c-d}.$$ It can be easily proved (though it might take you a while to think of it) by supposing that \(\displaystyle\frac{a}{b} = \frac{c}{d} = r\), so that \(a = rb\) and \(c = rd\). Then $$\frac{a+b}{a-b} = \frac{rb+b}{rb-b} = \frac{(r+1)b}{(r-1)b} = \frac{r+1}{r-1},$$ and likewise $$\frac{c+d}{c-d} = \frac{rd+d}{rd-d} = \frac{(r+1)d}{(r-1)d} = \frac{r+1}{r-1},$$ so they are equal.
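For a concrete instance of the theorem, here is a minimal sketch using exact rational arithmetic. The numbers 6:4 = 9:6 are a hypothetical example with r = 3/2, and the function name is my own. The check assumes b, d, a - b, and c - d are all nonzero, since the fraction form needs that.

```python
from fractions import Fraction

def componendo_dividendo(a, b, c, d):
    """Return both sides of (a+b)/(a-b) = (c+d)/(c-d), given a/b = c/d."""
    if Fraction(a, b) != Fraction(c, d):
        raise ValueError("a:b and c:d are not the same ratio")
    return Fraction(a + b, a - b), Fraction(c + d, c - d)

# Hypothetical example: 6:4 = 9:6, so r = 3/2.
r = Fraction(6, 4)
left, right = componendo_dividendo(6, 4, 9, 6)

print("left =", left, " right =", right)
print("(r+1)/(r-1) =", (r + 1) / (r - 1))
```

All three printed values agree, matching the algebra: each side reduces to (r+1)/(r-1).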

By memorizing several such theorems, you can manipulate ratios in many useful ways. In American practice, we emphasize more general tools of algebra which require less memory, and leave special tools for specialists who need them. It’s possible that this problem is designed as a playground for special tools.

The approach Doctor Rick suggested, you’ll observe, is the same technique I just used to prove componendo-dividendo. I have heard students call it “the *k*-method” because they use *k* where we used *r*.

We’ll have a little discussion soon of the comment about zero.

Sithum “took the idea and ran with it”:

Hi Dr. Rick! Thank you very much.

Though I tried many different systems to equate ratios I never thought about that! After taking your hint and applied it to the problems, it was really flowing. Solved all the ratio problems and reckon they are correct.

He attached this (correct) work for problem (1 i):

You can see how similar this is to my proof above, apart from the fact that it is written in terms of ratios rather than fractions; it is indeed a powerful method, and it applies to *anything that looks similar* to componendo and dividendo!

But he had a question pertaining to the converse type of problem (#2):

Just one thing. You said “a:b = c:d, we can write this as a/b = c/d (except in the special case of b = d = 0, which is allowed for ratios but not for division)” in the reply. So its inverse must also be true, right? So if

bc = ad

then

(c/d) = (a/b)

couldn’t be written as

c:d = a:b ?

The thing is that I got a line at the end of a problem (attachment below) saying bc = ad and I was supposed to prove c:d = a:b. I got no other way around except to flip fraction in to a ratio. I wonder if it’s correct.

Doctor Rick had said that we can only rewrite *a*:*b* = *c*:*d* as *a*/*b* = *c*/*d* if *b* and *d* are non-zero, because whereas the second term in a ratio may be zero, the denominator of a fraction can’t. Sithum is asking about rewriting in the other direction (which would be the *converse*): going from *bc* = *ad* to *c*/*d* = *a*/*b* to *c*:*d* = *a*:*b*, but without knowing that *b* and *d* are non-zero. Here is the attachment:

Doctor Rick replied:

Hi, Sithum.

The converse (or inverse, which is different but equivalent) of a true statement is not necessarily true, and in this case the converse is not true. We can always write a/b = c/d as a:b = c:d, because the two are equivalent whenever b and d are not 0, and if we’ve got a/b = c/d, it is already implicit that b and d are not zero.

So his step from *a*/*b* = *c*/*d* to *a*:*b* = *c*:*d* is always valid, because the former can’t be written unless *b* and *d* are non-zero. But …

What is not quite true in your work is the step going from bc = ad to a/b = c/d, because to get there you must have divided both sides by (bd). This is not valid if bd is zero, that is, if either b or d is zero. But we can actually skip directly to the ratio statement, a:b = c:d, by the converse (valid in this case) of the rule of “means and extremes”. For more on this, and on the relation between fractions and ratios, see this item from the Ask Dr. Math Archive:

That is, it is not necessary to go through the fraction form, because we can conclude directly from *bc* = *ad* that *a*:*b* = *c*:*d*; if any term is zero, then at least one other term must be zero, in such a way that the proportion is true. (But this includes indeterminate cases, like 0:0 = anything:anything.)
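The difference between the two routes shows up clearly in code. In this sketch, `proportion_holds` is a hypothetical helper that takes the "means and extremes" test ad = bc as the meaning of a:b = c:d, which is my reading of the rule rather than a formal definition from the archive; the fraction route is shown only to illustrate where division breaks down.

```python
from fractions import Fraction

def proportion_holds(a, b, c, d):
    """Test a:b = c:d by means and extremes (ad == bc); no division involved."""
    return a * d == b * c

def fractions_equal(a, b, c, d):
    """Test a/b == c/d; fails when b or d is zero."""
    return Fraction(a, b) == Fraction(c, d)

print(proportion_holds(2, 3, 4, 6))   # ordinary case
print(proportion_holds(2, 0, 5, 0))   # second terms zero: fine for ratios
print(proportion_holds(0, 0, 3, 7))   # indeterminate case 0:0 = anything

try:
    fractions_equal(2, 0, 5, 0)
except ZeroDivisionError:
    print("the fraction form cannot even be written when b or d is 0")
```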

Sithum added a further example, which raised a different issue:

This is a ratio problem I worked out. Reckon it is OK.

Doctor Rick responded,

I presume that this image is a new problem in which you were given the equation at top and asked to prove that a:b = c:d. I can’t follow the work; I had an idea of what you might be doing, but if I was right, you had some sign errors at the least. Could you please put in more words so I can follow your thinking? I know it’s really tedious to write out everything step by step, but at least explain the significance of the arrows and different-colored underlines.

If I am understanding the problem correctly, I like your idea of grouping the terms in each factor into pairs, but I would group them differently. It can be done so that each side of the equation is of the form

(X + Y)(X – Y)

which is a familiar and often useful form. I think that will make the work simpler and easier to present clearly.

A large multiplication like this is greatly simplified when you arrange the parts so that the first steps lead to enough canceling to make subsequent steps smaller, as we’ll see. I think Sithum is probably doing some canceling in his head, which makes it hard to check, and hard to explain to others.

I also think that Sithum changed the order of factors on the R.H.S., which looked like sign errors. With clues from the next response, the work looks correct.

Sithum answered,

Hi Dr. Rick

I thought to add that question you asked about but forgot to post it. Like you said it’s a proof and I ought to have done it the way you suggested. What I thought was if I had to expand the whole expression, I would get some 32 pieces as a result. So I grouped them around and figured out which would cancel each other. The pairs I underlined in same colours would cancel each other after expansion and pairs with arrows would remain intact after being multiplied by each other. It wasn’t the most effective step – I should’ve done it clearly with a few more steps. I did the whole thing over again in the way I first solved it and also using your hint. I don’t know whether I did it right but I didn’t find errors in the first method. Please tell me if I am wrong.

The second time I tried to be bit more descriptive as you could notice easily if I have errors.

I looked up the pages you referred to. They were really illuminating.

Thanks for everything.

This looks correct, but …

Doctor Rick answered,

Thanks for the reworking of your solution, Sithum. It is clearer for the most part, but there is still one point where I stumbled while reading it, because you omitted a step and it looked at first as if you had mistakenly dropped some terms.

After you say “As L. H. S. = R. H. S.”, I would have written:

(3a + 6b)(c – 2d) – (3a – 6b)(c + 2d) = (3a – 6b)(c + 2d) – (3a + 6b)(c – 2d) [other terms equal on both sides]

But here R. H. S. is the negative of L. H. S., so both must equal 0.

or more formally,

2(3a + 6b)(c – 2d) – 2(3a – 6b)(c + 2d) = 0 [subtracting R. H. S. from L. H. S.]

(3a + 6b)(c – 2d) – (3a – 6b)(c + 2d) = 0 [dividing through by 2]

(3a + 6b)(c – 2d) = (3a – 6b)(c + 2d)

In any case, you got it right (unless you really did make the mistake I thought at first, but you got lucky).

I like that you observed that you’d “get some 32 pieces as a result” — you looked ahead at the complexity of the problem, and that led you to look for a way to reduce the complexity. That’s what I did too; finding the conjugate pairs (x+y)(x-y) was my response, I think a more effective way to reduce the complexity. But observing and trying something in response are useful thinking steps.

The reference to 32 pieces comes from the fact that the original equation has two products of 4 terms times 4 terms, each of which expands to 4×4 = 16 terms. It is, indeed, wise to arrange for as much as possible to cancel early, because with less work to write, there will be fewer errors.

Here is my interpretation of Doctor Rick’s approach using conjugate pairs:

$$(3a+6b-c-2d)(3a-6b+c-2d) = (3a+6b+c+2d)(3a-6b-c+2d)\\
((3a-2d)+(6b-c))((3a-2d)-(6b-c)) = ((3a+2d)+(6b+c))((3a+2d)-(6b+c))\\
(3a-2d)^2-(6b-c)^2 = (3a+2d)^2-(6b+c)^2\\
(9a^2-12ad+4d^2)-(36b^2-12bc+c^2) = (9a^2+12ad+4d^2)-(36b^2+12bc+c^2)\\
9a^2-12ad+4d^2-36b^2+12bc-c^2 = 9a^2+12ad+4d^2-36b^2-12bc-c^2\\
-12ad+12bc = 12ad-12bc\\
24bc = 24ad\\
bc = ad\\
a:b = c:d$$
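The algebra above amounts to saying that the difference of the two sides is exactly 24(bc - ad). A short sketch can confirm that on random integers, which is a handy way to catch sign errors in a long expansion; the function names `lhs` and `rhs` are just labels for the two sides of the given equation.

```python
import random

def lhs(a, b, c, d):
    """Left-hand side of the given equation."""
    return (3 * a + 6 * b - c - 2 * d) * (3 * a - 6 * b + c - 2 * d)

def rhs(a, b, c, d):
    """Right-hand side of the given equation."""
    return (3 * a + 6 * b + c + 2 * d) * (3 * a - 6 * b - c + 2 * d)

random.seed(2)  # fixed seed so the run is repeatable
samples = [tuple(random.randint(-20, 20) for _ in range(4)) for _ in range(1000)]
difference_ok = all(lhs(a, b, c, d) - rhs(a, b, c, d) == 24 * (b * c - a * d)
                    for a, b, c, d in samples)
print("LHS - RHS == 24(bc - ad) on all samples:", difference_ok)

# When a:b = c:d (here 1:2 = 3:6), the two sides agree.
print(lhs(1, 2, 3, 6), rhs(1, 2, 3, 6))
```

So the two sides are equal precisely when bc = ad, which is the proportion we wanted.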

About 40 days later (mid-June) Sithum replied:

Dr.Rick,

I’m terribly sorry I couldn’t reply sooner. I was offline for more than a month. The lockdown was strict around here and only now did I get a reload. I did some similar problems and they weren’t really that hard after all the practice. Thank you awfully for the help. Really appreciate the support you’ve given all the time.

Looking forward to ask the math doctors,

Doctor Rick closed:

You are welcome. Thanks for writing, and I’m glad you are getting back to your activities after the lockdown.

We’ve had a number of questions about what the dot product is, and have given almost contradictory answers, because it can be defined in two different ways. I’m going to try to put them together to make a coherent whole. Here is our first question, from 1998:

Explaining the Dot Product I am currently a high school student teacher teaching trigonometry. We are doing a unit on vectors. When inner (dot) product was taught, many questions were raised. Everyone understood that when given vector u and vector v, the dot product is ||u|| times ||v|| times the cosine of the angle between them, but we had a problem when we got the answer. Everyone understood that the answer was a scalar, not a vector, but there is no graphical representation for what this scalar stands for. I have checked numerous sources (every text book I could get my hands on, the internet, the math department at the university I am attending, and the math department where I am student teaching). I have had the students look for an answer on the internet and in the library. We did an activity drawing vectors and comparing the dot product with the vectors. None of us has been able to find an understandable meaning of dot product. We have exhausted our resources and hope you can help us. We have done problems involving work and the dot product, so we have seen a real world application, but we are still confused as to what it really is. QUESTION: Exactly what does the dot product represent? Is there a graphical explanation for the resulting scalar? Please help us clear the confusion. Thank you.

Paul has been taught that the dot product of two vectors **u** and **v** is the product of their magnitudes and the cosine of the angle between them. To him, evidently, this doesn’t seem tangible enough, or important enough, to be its essence. What is it, really?

Doctor Anthony answered, changing the starting point:

You are perhaps thinking of dot products the wrong way around. The dot product of the vector (x1, y1, z1) and the vector (x2, y2, z2) is written down in a few seconds: v1.v2 = x1.x2 + y1.y2 + z1.z2 Now, having this, we can find the angle between v1 and v2, since: cos(theta) = v1.v2 / (|v1|.|v2|) where |v1| = sqrt(x1^2 + y1^2 + z1^2), and similarly |v2| = sqrt(x2^2 + y2^2 + z2^2). By expressing v2 as a unit vector, we can also write down the component of v1 in the direction of v2. We can test for two vectors being perpendicular, since if they are perpendicular, cos(theta) = 0 and v1.v2 = 0.
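Doctor Anthony's recipe translates almost word for word into code. This is a minimal sketch with hypothetical example vectors; `angle_between` and `is_perpendicular` are names I have chosen for the two uses he describes.

```python
import math

def dot(v1, v2):
    """x1*x2 + y1*y2 + z1*z2"""
    return sum(p * q for p, q in zip(v1, v2))

def norm(v):
    """|v| = sqrt(x^2 + y^2 + z^2)"""
    return math.sqrt(dot(v, v))

def angle_between(v1, v2):
    """Angle in radians, from cos(theta) = v1.v2 / (|v1| |v2|)."""
    return math.acos(dot(v1, v2) / (norm(v1) * norm(v2)))

def is_perpendicular(v1, v2):
    """Perpendicular vectors have cos(theta) = 0, so v1.v2 = 0."""
    return dot(v1, v2) == 0

# Hypothetical examples.
theta = angle_between((1, 0, 0), (1, 1, 0))
print("angle in degrees:", math.degrees(theta))
print("perpendicular:", is_perpendicular((1, 2, 3), (3, 0, -1)))
```

The exact-zero test in `is_perpendicular` suits integer components; with floating-point data a tolerance such as `math.isclose(..., abs_tol=...)` would be the safer choice.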

There are two fundamental facts about the dot product, either of which can be taken as its definition:

$$\mathbf{v_1}\cdot\mathbf{v_2} = |\mathbf{v_1}| |\mathbf{v_2}|\cos\theta$$

where \(\theta\) is the angle between them, and

$$\mathbf{v_1}\cdot\mathbf{v_2} = (x_1, y_1, z_1)\cdot(x_2, y_2, z_2) = x_1 x_2 + y_1 y_2 + z_1 z_2$$

So from Doctor Anthony’s perspective, what the dot product **is**, is just a simple way to combine the components of two vectors, and nothing more, *in itself*. The component-wise definition is more fundamental to him; it is a not-unreasonable thing to try doing to two vectors. This then happens to have some **implications** that make it useful. Perhaps the most visual of these relates to the component (or projection) of one vector in the direction of another. In particular, if \(\mathbf{u}\) is a unit vector, then \(\mathbf{v}\cdot\mathbf{u}\) is the length of the **projection** of \(\mathbf{v}\) on \(\mathbf{u}\):

Therefore, the dot product \(\mathbf{v}\cdot\mathbf{w}\) can be thought of as the length of one, times the length of the projection of the other on it:

Or we could say that the dot product is the projection of either unit vector on the other, multiplied by both lengths. This, in my mind, is the answer to Paul’s central question. But Doctor Anthony is right that this is not something whose importance would be initially obvious, leading one to make this definition.
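The projection picture can be checked with a few lines of arithmetic. In this sketch, `scalar_projection` is a hypothetical helper that dots one vector with the unit vector along the other; the example vectors are chosen so the numbers are easy to follow by eye.

```python
import math

def dot(v, w):
    """Component form of the dot product."""
    return sum(p * q for p, q in zip(v, w))

def norm(v):
    """Length of a vector."""
    return math.sqrt(dot(v, v))

def scalar_projection(v, w):
    """Signed length of the projection of v on the direction of w."""
    return dot(v, w) / norm(w)

# Hypothetical example vectors.
v = (3.0, 4.0, 0.0)   # length 5
w = (2.0, 0.0, 0.0)   # length 2, along the x-axis

# Length of one vector times the projection of the other on it, both ways round.
via_w = norm(w) * scalar_projection(v, w)
via_v = norm(v) * scalar_projection(w, v)

print(dot(v, w), via_w, via_v)
```

Both products reproduce the dot product, which is the symmetry described above: it does not matter which vector is projected on which.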

Since it is just as easy to work with vectors in 3 dimensions as in 2 dimensions, you will find that most 3D geometry is done using vectors, and the dot product turns up in just about every problem you can think of; for example, finding the distance of a point from a plane or from a line, or the shortest distance between two lines in space, or the equation of a plane defined by three points. Some of these can also be solved using VECTOR products, but that is a more advanced concept.

It is the usefulness of the dot product, not what it represents in itself, that makes it worth defining. Because of the close connection of the dot product to the angle between vectors, many calculations that would involve complicated trigonometry can be done quickly by vectors.

In short, we don't set out to find the dot product. We set out to find angles between vectors, the component of a vector in some direction, the distance of a point from a line or plane, the equation of a plane, and so on and so on, and we use dot products in getting the answers to these questions. In a similar way, you don't multiply two numbers for the fun of it. You multiply numbers to answer some question which requires the technique of multiplication as an essential aid.

So what **is** the dot product? A simple tool that has powerful applications. It can be hard to picture, but its power comes from something deeper. (By the way, have you ever tried to explain what multiplication of numbers really *is*, when you are not just working with whole numbers? It can get complicated!)

That answer begged the question of how the two “definitions” of the dot product, using components and using the angle, manage to come together. Later in 1998, we got this question:

Vector Angles: Prove A.B = |A||B|cosA

Hi Doctors, I just finished my pre-calc class, and at the end we covered vectors, including this theorem:

A.B = |A||B|cosA

dot product = length(length)cos(angle between them).

If the proof isn't too complicated, what is the basis for that theorem? Any help would be appreciated. Bryan

This assumes that the dot product has first been **defined**, presumably as the sum-of-products-of-components that Doctor Anthony started with, so that then a **theorem** can express it in a different form. Evidently it was presented in class just as an assertion, without a proof. Doctor Rick answered:

Hi, Bryan. No, I don't think this proof is very complicated. Let's see if you agree. To start with, let's draw a vector A with length a, making an angle alpha with the x-axis:

[Figure: vector A of length a drawn from the origin at angle alpha to the x-axis, with a right triangle dropped to the x-axis.]

The x component of the vector is a*cos(alpha), and the y component is a*sin(alpha), from the right triangle I have drawn.

Now make another vector B, of length b and making angle beta with the x-axis:

A = (a*cos(alpha), a*sin(alpha))

B = (b*cos(beta), b*sin(beta))

The dot product is now (just multiplying the x components and the y components, and adding them together)

A . B = a*b*cos(alpha)*cos(beta) + a*b*sin(alpha)*sin(beta)

= a * b * (cos(alpha)*cos(beta) + sin(alpha)*sin(beta))

All that's left is to remember the trigonometric identity,

cos(alpha - beta) = cos(alpha)*cos(beta) + sin(alpha)*sin(beta)

and we have

A . B = |A| * |B| * cos(alpha - beta)

But (alpha - beta) is the angle between the two vectors.

So we started from the definition of the dot product as a sum of products of components (in two dimensions), and proved the cosine formula. This would have been harder to do in three dimensions, where a different approach would be needed.
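As a quick numerical check of this two-dimensional result, here is a small Python sketch. The lengths and angles are arbitrary values I picked, and `from_polar` and `dot2` are hypothetical helper names, not part of Doctor Rick's answer.

```python
import math

def from_polar(r, angle_deg):
    """Components of a vector of length r at the given angle from the x-axis."""
    t = math.radians(angle_deg)
    return (r * math.cos(t), r * math.sin(t))

def dot2(p, q):
    """Dot product in two dimensions, as a sum of products of components."""
    return p[0] * q[0] + p[1] * q[1]

# Arbitrary sample values: lengths a, b and angles alpha, beta (in degrees).
a, alpha = 2.0, 70.0
b, beta = 3.0, 25.0

A = from_polar(a, alpha)
B = from_polar(b, beta)

component_form = dot2(A, B)
angle_form = a * b * math.cos(math.radians(alpha - beta))

print(component_form, angle_form)
```

The two printed numbers should match up to rounding, which is exactly what the angle-difference identity guarantees.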

There was a delay in that answer, and in between, Bryan had asked the same question just a little differently, indicating that the context was in fact three dimensions:

Hi Doctors, In determining some angles in 3-D figures, I used the theorem: A.B = |A||B|cosA - pertaining to vectors I was thinking this looked like the law of cosines, but I couldn't make the mental connection. Could you please help me to understand how the above theorem was derived?

The Law of Cosines is indeed a good idea; hold that thought!

Doctor Anthony answered, sort of reversing what he’d said in his previous answer:

This is not so much a 'theorem' as a 'definition'. You DEFINE the scalar products of two vectors to be the product:

a.b = |a| x |b| x cos of angle between a and b.

Following this definition you get the very convenient result that two vectors given in component form such as:

a = (x1,y1,z1)

b = (x2,y2,z2)

then the scalar product is:

a.b = x1.x2 + y1.y2 + z1.z2

This, of course, is the reverse approach from what we’ve said so far: If you *define* the dot product in terms of the angle, then you have to prove the component-wise form from that. The important thing is that both characterizations of the dot product can be shown to be equivalent. So let’s do that next.

A couple months later, we had another similar question, to which we gave another two answers:

Deriving the Dot Product

I would like a very thorough explanation of how we derive the formula |u| |v| cos(x) = u.v, the dot product. I have looked in several books and they merely write this as a given instead of explaining how and why we get this equation. I need the formula so as to calculate the angle between two points on the surface of a sphere, but I want to understand what I am doing.

The context suggests that this is another question about three dimensional vectors, but we’ll still be holding off on that. Note that Nick clearly has defined the dot product based on components, and treats this as a theorem; but he’s written it with the trigonometric form on the left.

Doctor Schwa replied first, with an approach that starts with the trig:

Good question! There are books that do a better job of explaining this, but not a lot of them, so I'm not surprised you had trouble finding an answer. The idea is to find the angle between two vectors, and show that it is u.v divided by |u| |v|, where I'll use "." to stand for the dot product and |u| to stand for the length of u. Probably the best way to do this is to look at the angles made with the x-axis. We want to know the difference between the two angles, which I'll call theta_u and theta_v. Similarly I'll let the vector u have two components (x_u, y_u) and v be (x_v, y_v) (the standard notation is to use _ to denote subscripts).

x = theta_u - theta_v

Like Doctor Rick, he is going to work only in two dimensions; in effect, this will be the same proof approached in reverse. He starts with the angle-difference identity, as Doctor Rick ended with it:

I want to find cos(x), which is:

cos(theta_u - theta_v) = cos(theta_u) cos(theta_v) + sin(theta_u) sin(theta_v)

= (x_u / |u|) (x_v / |v|) + (y_u / |u|) (y_v / |v|)

and simplifying, you get

u.v / |u||v|

Very simple, it turns out, when you look at it the right way. I'm just about to be teaching these ideas to my 11th grade class here at Gunn HS in Palo Alto so answering your question was a good reminder for me of what to emphasize. Thanks for asking it!

The fact that \(\displaystyle\cos(\theta_u) = \frac{x_u}{|\mathbf{u}|}\) is equivalent to what we used before, that \(x_u = |\mathbf{u}|\cos(\theta_u)\), or, in Doctor Rick’s notation, \(a_x = a\cos(\alpha)\). This makes a nice, concise proof.

Then Doctor Ken answered, again sticking to two dimensions:

Hi Nick, Some books actually use |U|*|V|*Cos(X) as the definition of the dot product. There's another definition that's pretty common though; for two dimensional vectors, where U = (u1, u2) and V = (v1, v2), it is:

U.V = u1*v1 + u2*v2

For three-dimensional vectors, it's:

U.V = u1*v1 + u2*v2 + u3*v3

I bet you see the generalization.

So he has explicitly stated the important point: either can be taken as the definition; the goal is to prove that they are equivalent. He explicitly starts with the trig definition and derives the component form from it:

So let's see whether we can show that |U||V|Cos(X) = u1*v1 + u2*v2. We'll write out the left side in terms of the coordinates of U and V, and then simplify. We'll do it in 2 dimensions, but the proof isn't very different in 3 dimensions. Imagine a plane with two vectors, and add the coordinates to the diagram:

[Figure: vectors U = (u1, u2) and V = (v1, v2) drawn from the origin O, with X marking the positive x-axis and t the angle between U and V.]

Angle t is the angle from U to V. It equals the angle from the x-axis to V, minus the angle from the x-axis to U. Let's call these angles XOV and XOU. So we have:

|U||V|Cos(x) = |U||V|Cos(t)

= |U||V|Cos(XOV - XOU)

= |U||V|[Cos(XOV)Cos(XOU) + Sin(XOU)Sin(XOV)]

= |U||V|[(v1/|V|)(u1/|U|) + (v2/|V|)(u2/|U|)]

= |U||V|[(v1u1 + v2u2)/(|V||U|)]

= v1u1 + v2u2

I’ve added some labels to his diagram, and called the angle *t*, to make it easier to follow.

You should be able to see how this is the same as our previous proofs, just given a different twist.

Make sense? Now, to make sure you understand it, since your problem is in 3 dimensions, why don't you try to derive the same result for 3-dimensional vectors? If you're slick, you can actually use the 2-dimensional result (hint: there's a plane that contains the two vectors and the origin).

I haven’t found a nice way to extend this method of proof to three dimensions; when we have been asked about that comment, we have generally offered the method given in the next answer, which extends easily to that case. It is possible that the extension of the angle-difference method requires more advanced knowledge of either vectors or geometry than I want to assume. (If anyone finds a clear way to do it, let me know!) So let’s finish with the most powerful proof:

In 2006, Sheri asked it this way:

How Does the Dot Product of Two Vectors Work?

I'm a senior in high school studying advanced math and we are now studying about the dot product. My question is, how do we describe the identity function in terms of the dot product formula? Our teacher was talking about how they're related but I'm absolutely confused! I don't even know what the dot product is about and how does multiplying two vectors give you the angle they form together?

I’m not sure what Sheri means by “the identity function”; she may mean the angle-difference identity used in all our answers so far! If so, she got something different. Doctor Jerry replied:

Hello Sheri, Thanks for writing to Dr. Math. I'll be referring to this image: The vectors u and v are given, t is the angle between them (typing "t" is easier than typing "theta"), and w is the length of the line joining the tips of u and v. The components of u and v are u1, u2, v1, and v2. You may usually write u = u1*i + u2*j.

In the diagram, Doctor Jerry used the standard notation for vectors, \(\langle a, b\rangle\). In the form we have been using here, the vectors are \(\mathbf{u} = (u_1, u_2)\) and \(\mathbf{v} = (v_1, v_2)\). We’ve seen previously that they can be written using unit basis vectors as \(\mathbf{u} = u_1\mathbf{i} + u_2\mathbf{j}\) and \(\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j}\).

The length of w can be calculated using the law of cosines. For this, notice that the coordinates of the tips of u and v are (u1,u2) and (v1,v2). From the law of cosines and letting sqrt(u1^2 + u2^2) = |u|, the length of u, and sqrt(v1^2 + v2^2) = |v|, the length of v, we see that:

w^2 = |u|^2 + |v|^2 - 2*|u|*|v|*cos(t)

Because w^2 = (u1-v1)^2 + (u2-v2)^2, we see that:

(u1-v1)^2 + (u2-v2)^2 = |u|^2 + |v|^2 - 2*|u|*|v|*cos(t)

Many people, when they first see the formula for the dot product, recognize its similarity to the Law of Cosines, so this is a natural thing to try.

If we now replace |u|^2 and |v|^2 by u1^2 + u2^2 and v1^2 + v2^2 we find:

(u1-v1)^2 + (u2-v2)^2 = u1^2 + u2^2 + v1^2 + v2^2 - 2*|u|*|v|*cos(t)

Expanding the left side and simplifying:

u1*v1 + u2*v2 = |u|*|v|*cos(t)

The left side is usually given the name "dot product" and written as "u.v", so:

u.v = |u|*|v|*cos(t)

As I hope you find out, this is a very useful result.

This is easy to extend to three dimensions:

$$|(u_1,u_2,u_3) - (v_1,v_2,v_3)|^2 = |\mathbf{u}|^2 + |\mathbf{v}|^2 - 2|\mathbf{u}| |\mathbf{v}|\cos\theta$$

$$(u_1-v_1)^2 + (u_2-v_2)^2 + (u_3-v_3)^2 = u_1^2 + u_2^2+u_3^2 + v_1^2 + v_2^2 + v_3^2 - 2|\mathbf{u}| |\mathbf{v}|\cos\theta$$

$$u_1^2 - 2u_1v_1 + v_1^2 + u_2^2 - 2u_2v_2 + v_2^2 + u_3^2 - 2u_3v_3 + v_3^2 = u_1^2 + u_2^2+u_3^2 + v_1^2 + v_2^2 + v_3^2 - 2|\mathbf{u}| |\mathbf{v}|\cos\theta$$

$$- 2u_1v_1 - 2u_2v_2 - 2u_3v_3 = - 2|\mathbf{u}| |\mathbf{v}|\cos\theta$$

$$\mathbf{u}\cdot\mathbf{v} = |\mathbf{u}| |\mathbf{v}|\cos\theta$$
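The three-dimensional argument can be checked numerically with a short Python sketch. The vectors `u` and `v` below are sample values of my own, and `norm` is a hypothetical helper; the sketch finds the angle from the three side lengths by the Law of Cosines and compares the result with the sum of products of components.

```python
import math

def norm(p):
    """Length of a vector of any dimension."""
    return math.sqrt(sum(c * c for c in p))

# Sample 3D vectors chosen only for illustration.
u = (1.0, 2.0, 2.0)
v = (4.0, 0.0, 3.0)

# Third side of the triangle: the segment joining the tips of u and v.
w = norm(tuple(a - b for a, b in zip(u, v)))

# Law of Cosines, solved for the cosine of the angle between u and v.
cos_t = (norm(u) ** 2 + norm(v) ** 2 - w ** 2) / (2 * norm(u) * norm(v))

angle_form = norm(u) * norm(v) * cos_t
component_form = sum(a * b for a, b in zip(u, v))

print(angle_form, component_form)
```

Nothing in the sketch uses the dot product to find the angle, so the agreement of the two printed values is a genuine check rather than a circular one.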

The first question is from 2002:

Unit Vectors

I am trying to solve a math problem that I truly do not understand. The problem reads: "Find the two unit vectors that are collinear with each of the following vectors. (a) vector A = (3, -5)" That's the first question in this problem, anyway. I don't understand what this problem is even asking me to do. Is a unit vector only ever equal to 1? I've done a lot of research in my book and on the internet and I still don't understand. Any help you could provide would be GREATLY appreciated. Thanks ever so much.

A **unit vector** is one whose length (magnitude) is 1; **collinear** vectors lie along the same line (so they can go in the same or opposite directions). Doctor Ian answered:

A unit vector can have any direction, but its length is equal to 1. So the following are all unit vectors:

(0,1) length^2 = 0^2 + 1^2 = 1

(1,0) length^2 = 1^2 + 0^2 = 1

(1/2, sqrt(3)/2) length^2 = (1/2)^2 + (sqrt(3)/2)^2 = 1

The length, or magnitude, of a vector is found by the Pythagorean theorem: $$|(a,b)| = \sqrt{a^2+b^2}$$

Here are his three vectors, showing that they all have length 1:

In fact, if you pick any point on the unit circle (i.e., the circle centered at the origin, whose radius is 1), the vector from the origin to the point is the unit vector (cos(a),sin(a)), where a is the angle from the positive x-axis to the point.

Here is an example, where my angle \(\theta\) is 133°, so its components are \((\cos(133°), \sin(133°)) = (-0.68, 0.73)\):

The easiest way to get a unit vector that is collinear with a vector (a,b) is to find the magnitude of the vector,

|(a,b)| = sqrt(a^2 + b^2)

and divide both components by that:

1/|(a,b)| * (a,b) = (a/|(a,b)|, b/|(a,b)|)

Do you see why this will always be collinear with the original vector, and why its length will always be equal to 1? (Note that the unit vector that points in the _opposite_ direction is also collinear.)

Dividing both components by \(|\mathbf{a}|\) reduces the length to 1 without changing the direction. Multiplying by negative 1 reverses the direction. Here are our vector \(\mathbf{a} = (3, -5)\) and the two unit vectors \(\mathbf{u}_a = \left(\frac{3}{\sqrt{34}}, -\frac{5}{\sqrt{34}}\right)\) and \(-\mathbf{u}_a = \left(-\frac{3}{\sqrt{34}}, \frac{5}{\sqrt{34}}\right)\):
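Doctor Ian's recipe is easy to try out. Here is a minimal Python sketch for the two-dimensional case; the function name `unit_vectors_collinear_with` is my own invention for illustration.

```python
import math

def unit_vectors_collinear_with(v):
    """Return the two unit vectors collinear with the 2D vector v."""
    magnitude = math.hypot(v[0], v[1])  # sqrt(a^2 + b^2)
    if magnitude == 0:
        raise ValueError("the zero vector has no direction")
    same = (v[0] / magnitude, v[1] / magnitude)  # same direction as v
    opposite = (-same[0], -same[1])              # opposite direction
    return same, opposite

u_pos, u_neg = unit_vectors_collinear_with((3.0, -5.0))
print(u_pos, u_neg)
print(math.hypot(*u_pos), math.hypot(*u_neg))  # both lengths should be 1
```

For (3, -5) the magnitude is the square root of 34, so the first vector printed is approximately (0.514, -0.857).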

This 1998 question is from a student whose goals are far beyond the basics, but who needs help starting:

Unit and Basis Vectors in Three Dimensions

Please give me a simple explanation of:

1. Unit vector. My books (e.g., _Vector and Tensor Analysis_ by Borisenko) are not clear and assume I already understand this. Also, what use is a unit vector?

2. Basis vector. Again, my other sources are not clear.

P.S. I study relativity on my own, and this is why I'd like to understand the basics, like tensor algebra.

Doctor Anthony answered, giving the basic definition for vectors that we discussed last time, because it is applicable (with some little modifications) to physics:

A vector is a physical quantity, like velocity, displacement, or force, having both magnitude and direction. Think of a vector as represented by a straight line pointing in a particular direction. The length of the line represents the magnitude of the vector. So in the case of a unit vector, the length of the line is 1 unit. It is convenient to use unit vectors when working on problems. If we let u represent a vector in a certain direction and of unit magnitude, then 3u, 7u, and -8u are immediately understandable as vectors of magnitudes 3, 7 and -8 all in the direction of u (except -8u, as the negative sign means "in the opposite direction to +u").

Just as, above, we started with a vector **a** and found a unit vector in the same direction, here we can reverse the process and describe a vector as a unit vector in the right **direction**, multiplied by its **length**. In doing this, we are splitting the vector into its magnitude (a number) and its direction (a unit vector). Here are a unit vector **u** and the multiples that were mentioned:

But we can do much more by picking a standard set of unit vectors to use as a “basis” for describing any vector at all:

It is very common to use i, j, and k as unit vectors in the directions of the x, y, and z axes, respectively, in 3D space. This means that EVERY vector in space can be given in terms of its "components" parallel to those three axes. So, for example, 5i + 2j - 6k is a vector in space, and its magnitude would be represented by the length of a line joining the origin (0,0,0) to the point (5,2,-6). Incidentally, this answers your second question: i, j, k are called "base" vectors because they are used as the basis for expressing all other vectors. Every other vector in 3D space whatsoever can be given in terms of i, j, and k.

I’ll stick to two dimensions here. The unit basis vectors **i** and **j** are the same as **u** and **v** that we used in the first illustration above:

Sometimes it is convenient to use other vectors as base vectors. Any two non-parallel vectors could be used as base vectors to give any point in the plane of the two vectors. That is, every other vector in that plane could be expressed in terms of the base vectors, just as we say 6i + 4j to express a vector in the xy plane. Similarly, any three non-coplanar vectors could be used as base vectors "spanning" 3D space. Again, the most common base vectors are i,j, k, but there are occasions when an entirely different set of base vectors are used. Finally, vectors are not confined to 1, 2, or 3 dimensions. You can have multi-dimensional vectors expressed in terms of 4, 5, 6, and higher base vectors. The number of base vectors will equal the dimension of the space under consideration.

Here is our vector 6**i** + 4**j**, which can also be called (6, 4) using its components:
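To illustrate the remark that any two non-parallel vectors can serve as base vectors in the plane, here is a small Python sketch. The function `coordinates_in_basis` and the alternative basis vectors `p` and `q` are hypothetical choices of mine; the sketch solves the two linear equations by Cramer's rule.

```python
def coordinates_in_basis(target, p, q):
    """Return (s, t) such that s*p + t*q equals target, for 2D vectors."""
    det = p[0] * q[1] - q[0] * p[1]
    if det == 0:
        raise ValueError("p and q are parallel, so they cannot serve as a basis")
    s = (target[0] * q[1] - q[0] * target[1]) / det
    t = (p[0] * target[1] - target[0] * p[1]) / det
    return s, t

target = (6.0, 4.0)  # the vector 6i + 4j

# In the standard basis i, j the coordinates are just the components.
standard = coordinates_in_basis(target, (1.0, 0.0), (0.0, 1.0))

# In a different (assumed) basis p, q the same vector gets different coordinates.
p, q = (1.0, 1.0), (-1.0, 1.0)
s, t = coordinates_in_basis(target, p, q)

print(standard, (s, t))
print((s * p[0] + t * q[0], s * p[1] + t * q[1]))  # rebuilds the target vector
```

The vector itself has not changed; only the numbers used to describe it depend on which base vectors we choose.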

The next question, from 1998, involves vectors whose direction is expressed as the angle from the positive *x*-axis:

Vector Components, Magnitude, and Direction

Vector M of magnitude 4.75 m is at 58.0 degrees counter-clockwise from the positive x-axis. It is added to vector N, and the resultant is a vector of magnitude 4.75 m, at 39 degrees counterclockwise from the positive x-axis. Find: (a) the components of N, and (b) the magnitude and direction of N. I drew a graphical illustration of the problem. But I really can't solve it because I don't know how.

Here we have two vectors being added, and one of them and the sum are described in terms of magnitude and direction. We want to find vector **n**, both in terms of components and of direction (angle) and magnitude:

Doctor Rick answered, suggesting the most likely method:

Hi, Kristine, I will get you started on solving this kind of problem. There are two tools you need to do this: (1) converting between magnitude/direction and components of a vector and (2) adding vectors. The first requires some trigonometry, so I hope you've had some.

(1) You are given the magnitude and direction of vectors M and P (the sum of M and N). Before you can add them, you must find their components. Remember this diagram:

[Figure: vector M of length L drawn from the origin O at angle a to the x-axis, with components Mx and My, and a vector of length 1 in the same direction with components cos(a) and sin(a).]

A vector of length 1 has components (cos(a), sin(a)). By similar triangles, a vector M of length L has components Mx = L*cos(a), My = L*sin(a). Do this with both vectors M and P to get their components (Mx, My) and (Px, Py).

As we saw above, we can think of the vector **m** as a unit vector in the given direction multiplied by its length. The trigonometric functions cosine and sine give the *x* and *y* components of the unit vector, as we saw in our first answer. For our vector **m**, the angle is 58° and the length is 4.75, so the vector is $$(m_x, m_y) = (4.75\cos(58°), 4.75\sin(58°)) = (2.517, 4.028)$$

Similarly, for vector **p** = **m** + **n**, we have angle 39° and length 4.75, so the vector is $$(p_x, p_y) = (4.75\cos(39°), 4.75\sin(39°)) = (3.691, 2.989)$$

(2) You know that M + N = P. To add vectors, add their components:

Mx + Nx = Px

My + Ny = Py

You know Mx, My, Px, and Py, so you should be able to figure out Nx and Ny. These are the components of vector N.

To find the components of **n**, we just subtract: $$\mathbf{n} = (n_x, n_y) = (p_x-m_x, p_y-m_y) = (3.691 - 2.517, 2.989 - 4.028) = (1.174, -1.039)$$

That’s the answer to part (a).

(3) You were also asked for the magnitude and direction of vector N. To do this, you have to reverse step 1. Here's how, using the figure (remember, you'll be doing this for vector N, not vector M).

Magnitude(M) = sqrt(Mx^2 + My^2) (the Pythagorean Theorem, where ^2 means square)

tangent(a) = sin(a)/cos(a) = My/Mx (by similar triangles again)

So Direction(M) = a = inverse tangent of My/Mx = arctan(My/Mx)

The Pythagorean theorem gives our length as $$|\mathbf{n}| = \sqrt{n_x^2+n_y^2} = \sqrt{1.174^2+(-1.039)^2} = \sqrt{2.457797} = 1.568$$

The tangent of our angle is the slope of the vector: $$\tan(\theta) = \frac{n_y}{n_x} = \frac{-1.039}{1.174} = -0.885$$

So the angle itself is $$\theta = \tan^{-1}(-0.885) = -41.5°$$
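The whole calculation for Kristine's problem can be reproduced in a few lines of Python. This is only a sketch: the helper `components` is a name I made up, and I use `math.atan2` rather than a plain inverse tangent so that the quadrant of the angle comes out right automatically.

```python
import math

def components(magnitude, angle_deg):
    """Convert magnitude and direction (degrees from the positive x-axis) to (x, y)."""
    t = math.radians(angle_deg)
    return (magnitude * math.cos(t), magnitude * math.sin(t))

m = components(4.75, 58.0)  # vector M
p = components(4.75, 39.0)  # resultant P = M + N

# Since M + N = P, subtract components to get N.
n = (p[0] - m[0], p[1] - m[1])

n_magnitude = math.hypot(n[0], n[1])                # Pythagorean theorem
n_direction = math.degrees(math.atan2(n[1], n[0]))  # angle from the positive x-axis

print(n, n_magnitude, n_direction)
```

Rounded to three decimal places, this gives the same components, magnitude, and direction as the hand calculation above.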

Those are the tools you'll need. See if you can do the job now. Write back if you're still confused after you've tried it.

Presumably, Kristine did just what we’ve done here.

This 1997 question hopes for a way to indicate the direction of a 3-dimensional vector similar to the angle or slope in the previous type of problem:

Formula for Slope of 3-D Line

Thank you for answering our question about finding the length of a line in three dimensions. Now we would like to know the formula to find the slope of a three-dimensional line. We searched through textbooks and tried to adapt the formula, but with no success.

Doctor Rob answered, first talking about planes rather than lines:

There is no direct analogue of the idea of slope in two dimensions. The subject you are discussing is analytic geometry of three dimensions. The following facts should help a little. A linear equation in x, y, and z, such as ax + by + cz = d, is the equation of a plane, not a line. Such equations can be put into one standard form by dividing by sqrt(a^2+b^2+c^2). The resulting coefficients of x, y, and z have the property that the sum of their squares is 1. Another standard form is gotten by dividing by d, and writing the equation as x/(d/a) + y/(d/b) + z/(d/c) = 1. From this form you can read off the intercepts with the x-, y-, and z-axes: (d/a, 0, 0) is the x-intercept, (0, d/b, 0) is the y-intercept, and (0, 0, d/c) is the z-intercept (provided all of a, b, c, and d are nonzero). One way of regarding the "slope" of a plane is to write down a unit vector which is perpendicular to it, called the normal vector. It is given by (a*I + b*J + c*K)/sqrt(a^2+b^2+c^2), where I, J, and K are the unit vectors in the x, y, and z directions. The coefficients of I, J, and K in this expression are called the direction cosines of the vector, because they are the cosines of the angles between the vector and the x-, y-, and z-axes, respectively.

The two standard forms he mentions for a plane are, in effect, $$a'x + b'y + c'z = d'$$ where the vector \((a', b', c')\) is a unit vector called the unit normal vector (this is something we will see later in this series or a subsequent series), and $$\frac{x}{A} + \frac{y}{B} + \frac{z}{C} = 1$$ where *A*, *B*, and *C* are the intercepts on the three axes. The normal vector represents the direction of the plane.

But the question was about a line, and the “direction cosines” just mentioned for the unit normal vector show up here, too:

A line is specified as the intersection of two nonparallel planes. This means you need two linear equations in x, y, and z to determine a line. There are several standard forms for the equations of a line, but a commonly used one is

(x - x0)/a = (y - y0)/b = (z - z0)/c

Here (x0, y0, z0) is a point on the line, and the numbers a, b, and c determine the direction along the line: the vector a*I + b*J + c*K is parallel to the line. (Note: This form only works when the line is not parallel to any of the xy-, xz-, or yz-planes, i.e., when none of a, b, or c is zero).

Note that this form is not a single equation, but a pair of equations that set three quantities equal. In this form, $$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}$$ the vector (*a*, *b*, *c*) gives the direction of the line, which is the best answer to the question, much like the normal vector for the plane.

In some sense, the direction cosines are the closest analogue to the slope. In two dimensions, they are just the cosine of the inclination, which is the angle with the x-axis, and the cosine of its complement, which is the sine of the inclination. The slope is the ratio of these two, the tangent of the inclination. There is no exact analogue because there is no "ratio" of three direction cosines, or of any three numbers.

We could say, however, that the triple ratio *a* : *b* : *c* is a reasonable analogue of the slope of a line, even though it is not a number; the direction cosines, as we’ll see below, are just the components of the *unit vector* in the direction of the line.

As he suggests, we can do all this in two dimensions for comparison, which is quite instructive. We can write a line as $$\frac{x - x_0}{a} = \frac{y - y_0}{b}$$ which (solving for *y*) can be rewritten as $$y = \frac{b}{a}(x - x_0) + y_0$$ The slope is the number \(\displaystyle\frac{b}{a} = \frac{\cos(\theta_y)}{\cos(\theta_x)} = \frac{\sin(\theta_x)}{\cos(\theta_x)} = \tan(\theta_x)\), showing that the ratio *a* : *b* is closely related to the slope, which is the tangent of the angle to the *x*-axis.

For a fuller picture of direction cosines, we’ll close with this question from 2003:

Why They're Called Direction Cosines

I would like to know how to find the angles between a 3D vector and the 3 coordinate axes, given the components of the vector.

Doctor Ian answered, using a concept we’ll be getting to next week, the dot product:

Hi Kristen, If you have two vectors, A and B, and you want to find the angle between them, one way is to use the dot product:

dot(A,B) = |A||B|cos(theta)

Does that look familiar? To find the angle between a vector and a particular axis, you can just make B a unit vector. For example, if A is (a,b,c), then to find the angle with the x-axis,

a*1 + b*0 + c*0 = sqrt(a^2 + b^2 + c^2) cos(theta)

a / sqrt(a^2 + b^2 + c^2) = cos(theta)

What he’s done here is to apply his formula for the angle between two vectors to the given vector **a** and the unit vector \(\mathbf{i} = (1,0,0)\). The general formula for the angle between vectors **a** and **b** is $$\cos(\theta) = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}||\mathbf{b}|}$$ and if we replace **b** with a unit vector **u** (either **i**, **j**, or **k**), we have $$\cos(\theta) = \frac{\mathbf{a}\cdot\mathbf{u}}{|\mathbf{a}|}$$ where the numerator is just the appropriate component of **a**.

So the cosine of the angle between a vector and the *x*-axis is just the *x*-component of the vector divided by the magnitude of the vector. This is true for each of the components. But you may recognize that this cosine is simply a component of the unit vector:

Note that if you make A a unit vector (which you can do by dividing all the components by the magnitude of A), you end up with

(a/|A|, b/|A|, c/|A|) = (cos(theta_x), cos(theta_y), cos(theta_z))

For this reason, the components of a unit vector are often called the 'direction cosines' of the vector. Does this help?

So for example, if we have the vector (3, 4, 5), whose magnitude is \(\sqrt{3^2 + 4^2 + 5^2} = \sqrt{50} \approx 7.07\), then the unit vector in the same direction is $$\left(\frac{3}{\sqrt{50}}, \frac{4}{\sqrt{50}}, \frac{5}{\sqrt{50}}\right) = (0.424, 0.566, 0.707)$$ so the angles with the axes are the inverse cosines of these numbers, 64.9°, 55.6°, and 45°, respectively:
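This example is simple to reproduce in Python. The following is a minimal sketch with function names of my own choosing (`direction_cosines` and `axis_angles_deg`); it divides each component by the magnitude and then takes inverse cosines.

```python
import math

def direction_cosines(v):
    """Components of the unit vector in the direction of v."""
    magnitude = math.sqrt(sum(c * c for c in v))
    return tuple(c / magnitude for c in v)

def axis_angles_deg(v):
    """Angles, in degrees, between v and each coordinate axis."""
    return tuple(math.degrees(math.acos(c)) for c in direction_cosines(v))

v = (3.0, 4.0, 5.0)
cosines = direction_cosines(v)
angles = axis_angles_deg(v)

print(cosines)
print(angles)
print(sum(c * c for c in cosines))  # squares of direction cosines sum to 1
```

The last line is a useful sanity check: because the direction cosines are the components of a unit vector, their squares must add up to 1.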

Kristen replied,

Thank you for your help. That will make my life SO much easier!

Next week, we’ll look at ways to multiply two vectors.

A question we got at the end of March asked about a standard kind of algebra word problem that can be solved in a couple very different ways. It illustrates several choices that can be made (both about the meaning of the problem and how to solve it), as well as why we ask students to show what they have tried when they ask a question! It’s also a good reminder that sometimes a nudge is all you need.

Here is the question, from Elise:

How do I solve this question:

Damien drives his car at an average that is 40km/h faster than his average speed on a bicycle. He drove his car to his local maintenance shop, because it needed a tune-up. He had his bike on the roof of the car so he could ride his bike home while the car was being worked on. The total distance to the garage and back home is 30km and the total trip to drive to the garage and ride his bike home took 2 h. Determine his average speed on the bike from the garage to the home.

Normally, when a student shows no work, I ask to see what they have tried, and perhaps give a hint to get them started in case they have no idea. This time I chose to go a little further in my hint than usual, in part because of an interpretational issue I saw. I first restated the problem in a more readable form to help me find the information, and then started the process of turning it into equations:

Hi, Elise.

I presume you are taking algebra; if not, let me know what is the context of this problem.

The problem is:

Damien drives his car at an average that is 40 km/h faster than his average speed on a bicycle.

He drove his car to his local maintenance shop, because it needed a tune-up.

He had his bike on the roof of the car so he could ride his bike home while the car was being worked on.

The total distance to the garage and back home is 30 km and the total trip to drive to the garage and ride his bike home took 2 h.

Determine his average speed on the bike from the garage to the home.

I would start by thinking about what we know, and what we want to find out.

We know how the two speeds are related, but not what they are.

We know the total distance of the trip, and the total time.

We know how speed, distance, and time are related: d = rt (where d is distance, r is rate (speed), and t is time).

We aren’t told that the trips each way follow the same route and have the same distance, though that seems likely; so we probably shouldn’t assume the trip is 15 km each way … but we might change our mind on that!

So it looks like we can either define a variable for each speed, or just one variable for one of the speeds.

This process of taking inventory is an important first step, as we focus our attention on what is known and what is unknown. Sometimes two variables are needed, and sometimes we could use only one. But for students familiar with using more than one variable, doing so makes the initial translation step easier. The most important thing at this step is clarifying what the problem means, and revealing a possible ambiguity to keep in mind.

I continued,

Now, let’s remove unnecessary information to reduce the problem to its essentials:

The car’s speed is 40 km/h faster than the bicycle’s speed.

The total distance of the combined trip is 30 km.

The total time is 2 h.

We want the speed of the bike.

I see two reasons there for defining a variable like this:

R = (average) speed of bike

One reason is that this is what we want to find out; the other is that if we know this, we can easily write an expression for the speed of the car. What would that be?

___ = (average) speed of car (in km per hour)

Sometimes “what we want to find” and “what we can use to find other quantities in the problem” are not the same thing, and we have to choose the latter in order to proceed. But when the two criteria coincide, I would go for it!

But we have to say something about time, too:

We could write an expression for the distance each vehicle goes if we knew how much of the 2 hours was spent on each; so we might define another variable, and write another expression:

T = time spent on bike (in hours)

___ = time spent in car (in hours)

Or, we could use a variable for the time spent in the car, or for the distance the bike is ridden, …

So you have some choices to make; I think my suggestions are reasonable ones, but not the only good ones.

Let’s see you take it from here. Try writing expressions for other quantities you need, then use those to write two equations.

As I often do, I wrote those general thoughts before actually working on the problem, in order to show my initial thinking without bias. But …

Having written all that, I tried solving the problem, and found that I could only write one equation with two variables. So I’m guessing that we might be required to assume the same distance each way.

Here’s what we need now: Please tell me what topics you have been learning, assuming this problem is for a class; then show me whatever work you tried, and where you got stuck. There are several places I can see where you might have stopped because you weren’t sure of your work!

I had recognized from the start, without mentioning it, that many problems like this specify that the return trip is along the same route, but this one doesn’t say that. I didn’t want to make that assumption initially; but when I realized there was not enough information to solve the problem as I was interpreting it, I had to change my interpretation. That sometimes happens!

The problem as now defined is:

Damien drives his car at an average speed that is 40 km/h faster than his average speed on a bicycle.

The total distance to the garage and back home **(on the same route)** is 30 km **(so each leg is 15 km)**, and the total trip to drive to the garage and ride his bike home took 2 h.

Determine his average speed on the bike from the garage to the home.

We’ll be using a different approach below, so let’s finish my method now.

I defined

\(R\) = (average) speed of bike

\(T\) = time spent on bike (in hours)

so that

\(R + 40\) = (average) speed of car (in km/h)

\(2 - T\) = time spent in car (in hours)

Applying the formula *d* = *rt* to each leg of the trip, I get the equations

\(RT = 15\)

\((R + 40)(2 - T) = 15\)

This is a nonlinear system of equations, which can be solved by substitution; one way to do this is to solve the first equation for *T* (since we want to solve for *R*):

\(\displaystyle\left(R + 40\right)\left(2 - \frac{15}{R}\right) = 15\)

Expanding this and then clearing fractions, we get

\(\displaystyle2R - 15 + 80 - \frac{600}{R} = 15\)

\(2R^2 + 50R - 600 = 0\)

\(R^2 + 25R - 300 = 0\)

We’d like to factor this, but can’t. I’ll finish by completing the square:

\(R^2 + 25R + 156.25 = 300 + 156.25\)

\((R + 12.5)^2 = 456.25\)

\(R = -12.5\pm\sqrt{456.25} \approx 8.86,-33.86\)

That was a little harder than I was originally expecting. The speed of the bike (which has to be positive) is 8.86 km/h.
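As a quick numerical check of that work, here is a minimal Python sketch. The function name `bike_speed` and its parameters are my own, chosen for illustration; it assumes, as above, that each leg is 15 km. It solves the quadratic obtained by clearing fractions and then confirms that the two legs take 2 hours in total.

```python
import math

def bike_speed(leg_km=15.0, extra_kmh=40.0, total_h=2.0):
    """Positive root of the quadratic from leg/R + leg/(R + extra) = total.

    Clearing fractions gives
        total*R^2 + (total*extra - 2*leg)*R - leg*extra = 0,
    which is 2R^2 + 50R - 600 = 0 for the numbers in this problem.
    """
    a = total_h
    b = total_h * extra_kmh - 2 * leg_km
    c = -leg_km * extra_kmh
    disc = b * b - 4 * a * c
    return (-b + math.sqrt(disc)) / (2 * a)

R = bike_speed()                       # bike speed in km/h
trip_hours = 15 / R + 15 / (R + 40)    # bike time plus car time
print(round(R, 2), round(trip_hours, 6))
```

The negative root is discarded inside the function by taking only the `+` branch of the quadratic formula, since a speed has to be positive.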

Elise replied, showing two attempts, one very wrong and the other nearly right though quite different from mine:

I’m actually in grade 11 and we are covering Rational Expressions. Here are some attempts I made but none of the answers I came up with seem correct. Please help!

First way:

Car speed (C) = v+40

Bike speed (B) = v

Distance (D) = 30 km

Total time (T) = 2 hours

Time it took the car to get to the garage: 15/(v+40)

Time it took to ride the bike: 15/v

avg = t + t + 40

30 km/2 h = 15 km/h

So

15 = 2t + 40

15 – 40 = 2t + 40 – 40

-25 = 2t

-12.5 km/h = speed of bike

She nicely listed various quantities, defining them either as known constants or using the one variable *v*, and used the formula *t* = *d*/*r* (derived from *d* = *rt*) to find the time for each part of the trip. Then she added the two speeds (evidently intending to average them but not dividing by 2) and set that equal to the average speed calculated from total time and distance. This gave an invalid equation, whose solution was clearly wrong.

It wouldn’t have worked even if she had divided by 2. This is a common error, due to the fact that averaging speeds over different parts of a trip weights them improperly (because the faster lap takes less time). For a good explanation, see

Average Speed of a Caterpillar
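To see why averaging the two speeds fails, here is a small Python sketch with made-up numbers (two 15 km legs at 60 km/h and 20 km/h, not the speeds from this problem). The helper `average_speed` is a hypothetical name; it simply divides total distance by total time.

```python
def average_speed(legs):
    """legs: list of (distance_km, speed_kmh) pairs.

    Average speed is total distance divided by total time,
    not the mean of the individual speeds.
    """
    total_distance = sum(d for d, _ in legs)
    total_time = sum(d / s for d, s in legs)
    return total_distance / total_time

legs = [(15, 60), (15, 20)]        # illustrative numbers only
naive = (60 + 20) / 2              # mean of the speeds: 40 km/h
true_avg = average_speed(legs)     # 30 km in 0.25 h + 0.75 h: 30 km/h
print(naive, true_avg)
```

The slower leg takes three times as long, so it counts three times as heavily in the true average.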

Then she tried again:

Second way:

Car speed (C) = v+40

Bike speed (B) = v

Distance (D) = 30 km

Total time (T) = 2 hours

Time it took the car to get to the garage: 15/(v+40)

Time it took to ride the bike: 15/v

T = C + B

2 = 15/(v+40) + 15/v

2[v(v+40)] = 15/(v+40)[v(v+40)] + 15/v [v(v+40)]

2v^2 + 80v = 15v + 15v + 600

2v^2 + 50v – 600 = 0

This time the equation was good, formed by setting the known total time (in hours) equal to the sum of the times for each part. She started the work by multiplying by the LCD to clear fractions, leading to a quadratic equation. She offered two ways to continue from there:

To finish the equation, it could be this?

v^2 + 25v – 300 = 0

(v + 5)(v – 5) = 300

v + 5 = 300 or v – 5 = 300

v = 295 km/h or 305 km/h ?

Or it could be this?

v^2 + 25v – 300 = 0

v(v + 25) = 300

v = 300 or v = 325 ?

These are, respectively, a wrong way and a right way to factor *the wrong expression*. Now we know exactly where help is needed.

I replied:

Thanks. I should have just asked for your work and context from the start, as we usually do.

Knowing you are studying rational expressions changes my approach a little; my method used my variables R and T largely to avoid rational expressions! The points at which I saw potential difficulties included a brief use of a rational equation anyway, a resulting quadratic equation, and a resulting irrational solution. But my solution is a reasonable speed for a bike, as most of yours are not …

I see that you did assume the same distance each way, as I had concluded we must. My equations were RT = 15 and (R+40)(2-T) = 15.

We saw my work with these equations above. I mentioned them just in case Elise later wanted to try my ideas.

Now I had to deal with the main issue, the quadratic equation:

Looking at your solutions, the first makes no sense to me, but the second is very good; you end up with the same equation I got, v^2 + 25v – 300 = 0.

But it isn’t useful to factor a quadratic equation unless the right-hand side is 0; when you get (v+5)(v-5) = 300, you can’t conclude that v+5 = 300 or v-5 = 300! Moreover, that isn’t the factorization of v^2 + 25v! In your last attempt you factored correctly, but again that is useless.

It turns out that v^2 + 25v – 300 can’t be factored; so the appropriate thing to do is to complete the square, or use the quadratic formula. That will lead to the correct answer!

So your error is not in rational equations, but in solving quadratic equations. That is not uncommon.

I look forward to seeing your finished work soon.
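The claim that this quadratic can’t be factored (over the integers) can be checked with the discriminant: a quadratic with integer coefficients has rational roots exactly when \(b^2 - 4ac\) is a perfect square. Here is a minimal Python sketch of that test; the function name is my own.

```python
import math

def has_rational_roots(a, b, c):
    """True if a*x^2 + b*x + c (integer coefficients, a != 0) has rational
    roots, i.e. its discriminant is a non-negative perfect square."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return False
    root = math.isqrt(disc)
    return root * root == disc

print(has_rational_roots(1, 25, -300))   # discriminant 1825, not a square
print(has_rational_roots(1, 25, -150))   # discriminant 1225 = 35^2
```

The second call uses a made-up neighbor of our equation, \(v^2 + 25v - 150 = (v + 30)(v - 5)\), to show what a factorable case looks like.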

Elise answered with good work:

Okay sorry about that. I didn’t realize that I just needed a quadratic formula to solve it. It’s been a while since I’ve done quadratic equations. Thanks for helping me with this. You’ve been a big help! Okay here’s what I think is the answer. Let me know your thoughts!

Car speed (c) = v+40

Bike speed (b) = v

Distance (d) = 30 km

Time it took the car to get to the garage: 15/v+40

Total time (T) = 2 hours

Time it took to ride the bike: 15/v

T = C + B

2 = 15/(v + 40) + 15/v

2[v(v + 40)] = 15/(v + 40)[v(v + 40)] + 15/v [v(v + 40)]

2v^2 + 80v = 15v + 15v + 600

2v^2 + 50v – 600 = 0

v^2 + 25v – 300 = 0

v = [-b +- sqrt(b^2 – 4ac)]/[2a]

v = [-25 +- sqrt(25^2 – 4(2)(-300))]/2

v_a = 8.86

v_b = -33.86

Bike’s average speed is 8.86 km/h

The first part of this is a quicker way to get to the quadratic equation than mine; the rest is equivalent to my work apart from using the formula.

I replied:

Yes, you solved it – exactly as I did.

Of course, the next thing to do would be to check that your answer fits the story, by finding the other speed and the two times, which do add up to 2 hours.

As I mentioned, it is common for students working with rational equations to need a nudge when a quadratic equation arises! And when the solution is irrational … well, that’s how real life works, but we protect students from that reality a little too much!

Good work.

Elise closed with this:

Thanks so much for your help. Yes, I totally needed a nudge. If I had known I just needed a quadratic equation man I wouldn’t have been so confused for so long.

And if I’d just asked for her work (or been given it initially), we would have gotten to the answer more quickly – but I wouldn’t have shown an alternative method! Only a little help was ultimately needed, yet all of this was worth discussing.

We can start with this question from 2002:

What is a Vector?

I am having trouble understanding exactly what a vector is and cannot seem to find a simple, straightforward explanation. Please help!

Patrick was only 10, so a fairly simple answer was appropriate. I answered:

Hi, Patrick.

Vectors can be formally defined in several complicated ways, but I can give a basic introduction to the concept. Simply put, a vector is a directed quantity.

We’ll look at a more advanced view below. But here we need the basics.

We'll start with a one-dimensional vector. This is just the same as a number. Draw a number line, and draw an arrow starting at zero and ending at 5. This is the vector (5). Draw another arrow starting at the 5 and ending at the 8; this is the vector (3). It doesn't matter where a vector starts; all that matters is how long it is and how far it goes. So this second vector is 3 units long and points to the right, making it identical to a vector starting at 0 and going to 3. By putting two vectors end to end, as I did, I just added the vectors (5) and (3) to get the vector (8):

      0   1   2   3   4   5   6   7   8   9
    --+---+---+---+---+---+---+---+---+---+--
      o------------------>o---------->
               5               3
      o------------------------------>
                   5+3=8

We can think of a vector in this sense as representing a motion. The vector (3) means “moving 3 units to the right”, regardless of where you start. And addition of vectors just means making one motion after another.

If you are familiar with negative numbers, you can see that a vector pointing to the left would correspond to a negative number. If we add the vectors (5) and (-5), we get the vector (0), a vector with no size at all. For any vector you draw, if you move it so that it starts at 0, it will point to its name.

Since a one-dimensional vector is nothing but a (signed) number, it's nothing new. But this introduces the essential concept: only size and direction (left or right in this case) count, not position. Now we can look at two-dimensional vectors, in a plane, where things start to get interesting.

Draw an arrow on a piece of paper, pointing in any direction, and you have a vector: a length with a direction. Draw another arrow starting at the tip of the first one, and you have added two vectors:

                    --+
                   /  |\
             u+v  /     \v
                /        \
              /           \
             o------------->o
                    u

If you draw vectors on a coordinate grid, you can give them names:

       V(-2,3)   ^        W(3,3)
            +    |       +
             \   |      / \
             v\  | u+v /   \v
               \ |    /     \
                \|   /       \
                 o------------->o------>
                        u       U(5,0)

Here is that picture, slightly improved:

I’ve drawn the vectors **u** and **v**, both starting at the origin; then I’ve made a copy of **v** starting at the end of **u** in order to add them. This shows that \((5,0) + (-2,3) = (3,3)\):

Place each vector so that it starts at the origin (0,0), and name it for the point where it ends, just as we did on the number line. Our vectors are u = (5,0) and v = (-2,3), since they end at points U and V as shown. Their sum w = u+v is (3,3). Do you see how to add two vectors? You just add their x coordinates and their y coordinates; u goes 5 to the right and v goes 2 to the left, so w goes 5-2 = 3 to the right. (Actually, we use the word "coordinate" only for points; for vectors, we use the word "component.")

So for vectors with components \((a, b)\) and \((c, d)\), the sum would be \((a+c, b+d)\). In the example, walking 5 feet due east, then 2 west and 3 north, is equivalent to walking 3 feet east and 3 feet north. Addition produces the net result of two or more motions.

You can do the same for vectors in three-dimensional space, but I won't bother drawing that. The important thing is that vectors give us a way to talk about anything that has both size and direction, but not position - things like velocities, wind speeds, forces, and so on. If I row my boat in the direction of vector u, but the water itself is moving along vector v, then I will actually be moving along vector u+v, so the sum of the two vector velocities tells me how fast, and in what direction, I am really going.

In three dimensions, we can still use components, like \((a, b, c)\), and we still add vectors by putting them end-to-end.

There are two specific ideas about vectors that are not explicitly mentioned there. One is that the **magnitude** (length) of a vector can be found by the “distance formula”. For a two-dimensional vector \(\mathbf{u} = (a,b)\), this is $$|\mathbf{u}| = |(a,b)| = \sqrt{a^2+b^2}$$

Second, in addition to being able to add vectors, we can **multiply** a vector by a number (called a **scalar** to distinguish it from vectors) in the obvious way: Doubling a vector would mean adding it to itself, doubling each component, and generally $$k\mathbf{u} = k(a,b) = (ka,kb)$$
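These three operations (addition, magnitude, and multiplication by a scalar) are easy to try out. Here is a minimal Python sketch that represents vectors as plain tuples of components; the helper names `add`, `scale`, and `magnitude` are my own choices, not a standard library.

```python
import math

def add(u, v):
    """Add two vectors component by component."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    """Multiply the vector u by the scalar k."""
    return tuple(k * a for a in u)

def magnitude(u):
    """Length of u, by the distance formula."""
    return math.sqrt(sum(a * a for a in u))

u, v = (5, 0), (-2, 3)
w = add(u, v)                 # (3, 3), as in the picture above
print(w, scale(2, v), magnitude((3, 4)))
```

Because the helpers loop over components, the same code works unchanged for three-dimensional vectors such as `(1, 2, 2)`.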

At a (much) higher level, this basic picture of vectors expands into the more abstract concept of “vector spaces”, just as numbers expand from what you learn in kindergarten (1, 2, 3) into real numbers and beyond. Here is a question from a month after the last question, asking how this fits with the elementary definition:

Definition of a Vector

Now we need to examine further how vectors are combined. Here is a question from 2010:

Magnitudes of Vectors Don't Add Up

I don't understand how adding vectors results in a triangle in which the third side is equivalent to the sum of the original two vectors. In particular, I don't understand how the sum of the two added vectors can have the same magnitude as the vector sum.

A vector is defined as something with magnitude and direction, so vectors are equal if and only if they have the same magnitude and direction. The addition of vectors means combining two vectors, so the result of vector addition should give a vector with the same direction and magnitude as that of the combination of the added vectors, right?

I can see how the sum of vectors A and B, if combined, would have the same direction as the third side of a triangle. What I don't understand here is how the magnitude of the third side can be equal to the magnitude of the other two sides. That would mean two sides of a triangle sum to the third side, wouldn't it?

The fact is, the magnitude of the sum *can’t* equal the sum of the magnitudes, so David has misunderstood something. Doctor Ian answered:

Hi David,

Suppose you are standing on a giant grid. You are given two numbers (a,b). You move a units to the east, and b units to the north. Now you are given two more numbers (c,d). You move c units to the east, and d units to the north.

What is the total distance you've moved to the east? It's a + c, right? And what is the total distance you've moved to the north? It's b + d, right? So you could have got to the same final point by being given the numbers (a + c, b + d) along with the same instructions.

Does this make sense? Do you see how it illustrates the rule for vector addition?

In this example, we are adding \((3,1)+(2,4) = (3+2,1+4) = (5,5)\).

Now Doctor Ian comments on two of David’s specific questions. First:

"A vector is defined as something with magnitude and direction, so vectors are equal if and only if they have the same magnitude and direction. The addition of vectors means combining two vectors, so the result of vector addition should give a vector with the same direction and magnitude as that of the combination of the added vectors, right?"

Right. And they are combined by adding their components, as illustrated in the example above.

Don’t misread this: the direction and magnitude of the sum (resultant) are the direction and magnitude *of the combination*; the magnitude is not the *sum* of their magnitudes.

Next:

"I can see how the sum of vectors A and B, if combined, would have the same direction as the third side of a triangle. What I don't understand here is how the magnitude of the third side can be equal to the magnitude of the other two sides. That would mean two sides of a triangle sum to the third side, wouldn't it?"

The magnitudes don't add directly. If you add two vectors, the magnitude of the resulting vector will be somewhere between zero and the sum of the individual magnitudes. The latter occurs when they have the same direction, e.g.,

    (3,0) + (4,0) = (7,0)

The vectors on the left have magnitudes of 3 and 4, and the sum has a magnitude of 7.

The former occurs when they have opposite directions, but the same magnitude, e.g.,

    (3,0) + (-3,0) = (0,0)

The vectors on the left have magnitude 3, but they cancel each other out, leaving a null vector, with no direction or magnitude.

His statement about the possible magnitudes of the sum can be strengthened to the “triangle inequality”, which we’ll see soon. We’ll see that the *smallest* possible magnitude of the sum is the *difference* of the individual magnitudes. It was zero in this case only because the two magnitudes were the same.

In between, we might have something like

    (3,0) + (0,4) = (3,4)

Here, the vectors on the left have magnitudes 3 and 4, but the sum has a magnitude of 5. That would correspond to a situation like

    Two guys are pushing on a box. One pushes to the east with a force of 3 lbs, while the other pushes to the north with a force of 4 lbs. What is the resultant force on the box?

We can add the vectors to get (3,4). The magnitude of that is

    sqrt(3^2 + 4^2) = sqrt(25) = 5

The direction of that is

    tan^-1(4/3) = about 53 degrees

So we could replace the two guys with one guy, pushing with a force of 5 lbs, at an angle of 53 degrees from the x-axis, and the box would move in the same way as when the two guys push it.

Now, why doesn't the combined force have a magnitude of 7 lbs? Well, think of it this way: the box is moving at an angle to the force being applied by the guy pushing to the east. So only SOME of his force is going into moving the box. And the same is true for the guy pushing to the north. So we should expect the resultant force to be less than the sum of the individual forces.

Now try thinking about those other kinds of cases. In one, the two guys are both pushing east, and their forces add up -- so the box moves to the east, under a force of 3 + 4 = 7 lbs. In the other, one guy is pushing east while the other pushes west, and since they apply the same magnitude of force, the box doesn't move at all. That is, it's like a force of 3 + -3 = 0 lbs is being applied to it.

So the magnitude of the resultant depends on how much the original vectors are “helping each other” or “fighting each other”.
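The three cases Doctor Ian describes can be computed directly. This is a minimal Python sketch, with forces given as (east, north) components in pounds; the function name `resultant` is hypothetical.

```python
import math

def resultant(forces):
    """Add 2D force vectors given as (east, north) components.

    Returns (magnitude, direction in degrees counterclockwise from east).
    """
    x = sum(f[0] for f in forces)
    y = sum(f[1] for f in forces)
    return math.hypot(x, y), math.degrees(math.atan2(y, x))

helping = resultant([(3, 0), (4, 0)])     # same direction: magnitude 7
sideways = resultant([(3, 0), (0, 4)])    # east and north: magnitude 5
fighting = resultant([(3, 0), (-3, 0)])   # opposite directions: magnitude 0
print(helping, sideways, fighting)
```

For the null vector the reported direction is meaningless; `atan2(0, 0)` just happens to return 0.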

In terms of triangles, you can think of it this way. The hands of a clock always form two sides of a triangle, right? And the third side of that is the line connecting the hands. Would you expect the length of that third side to always be the sum of the lengths of the individual hands? Or does the angle between them have something to do with it?

My guess is that David’s difficulty was largely in confusing “the magnitude of the combination” with “the sum of the magnitudes”.

Finally, here is a 1996 question about sums of vectors:

Sum of Two Vectors

My daughter is having trouble learning about vectors. How do I explain the concept that the sum of two vectors a+b can be equal to or less than the sum of vector a + vector b ? To put it another way, |a + b| <= |a| + |b|.

I tried to explain it in terms of force being applied in the same direction, vs. in slightly different directions. However, no success yet. Is there an easier way to express or prove this relation to be true?

Note that what this (anonymous) parent says is missing a word: It is the *magnitudes*, not the vectors themselves, that are involved in the inequality. The symbolic form makes this clear.

Doctor Tom answered, changing the image from invisible forces to tangible motions:

I'd avoid a concept like "force," which may seem vague to her. Why not this:

I start at home and walk 3 miles in a fixed direction. Then I walk 4 miles in a fixed direction, but not necessarily in the same direction as the first walk. How far from home am I?

It's easy to draw pictures with vectors to see all the possibilities. If you happened to continue in exactly the same direction, you'd be 7 miles away, but it should be clear that every other path will wind up closer than 7 miles. In fact, if you happen to make a 180 degree turn, you'll only wind up a mile from home.

If we walk the 4 miles in the same direction we walked the first 3, the total distance is 7:

If we walk the 4 miles in the opposite direction, we end up 1 mile from the start:

And we can end up any distance in between, such as 6 miles:

No point on the blue circle (possible end points) is farther than 7 miles from the start. And similarly, no point is closer than 1 mile from the start. This demonstrates the two sides of the Triangle Inequality: $$\left||\mathbf{a}|-|\mathbf{b}|\right|\le|\mathbf{a}+\mathbf{b}|\le|\mathbf{a}|+|\mathbf{b}|$$

This is called the triangle inequality because it is true of any triangle; each side is no longer than the sum of the lengths of the other two sides. Or, stated differently, a straight line is the shortest distance between two points.
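Doctor Tom’s walk can be simulated by sweeping the turn angle through a full circle. The following Python sketch (the function name and the choice of one-degree steps are mine) confirms that the distance from home never leaves the range from 1 to 7 miles.

```python
import math

def distance_from_home(first, second, turn_degrees):
    """Distance from the start after walking `first` miles, turning by
    `turn_degrees`, and then walking `second` miles."""
    t = math.radians(turn_degrees)
    x = first + second * math.cos(t)
    y = second * math.sin(t)
    return math.hypot(x, y)

distances = [distance_from_home(3, 4, d) for d in range(0, 361)]
print(min(distances), max(distances))   # approximately 1 and 7
```

The maximum occurs at a turn of 0 degrees and the minimum at 180 degrees, matching the two extreme pictures above.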

We haven’t done much with vectors here, though there have been many problems of that sort lately. Let’s look at a recent question that touches on the basics, yet is by no means a simple problem.

This came from Stefan in March:

Determine the angle between vectors a and b if

- (a + b) is perpendicular to (7a – 5b), and
- (a – 4b) is perpendicular to (7a – 2b).
I don’t really know how to start.

I know that cos θ = (a*b)/(|a|*|b|) but I’m not sure how to use it here.

Stefan knows the key formula to be used to find the angle between vectors, using their dot product, and just needs some help getting to the point of using it. If you are not familiar with the dot product, I plan to have a post on that soon!

I answered,

I’d start by observing that the two pairs of perpendicular vectors imply that

(**a** + **b**) • (7**a** – 5**b**) = 0 and

(**a** – 4**b**) • (7**a** – 2**b**) = 0

Expand each equation, and see what you can determine about **a**•**b**.

If I did my work correctly, you will find that the answer is numerically unpleasant, but the work is conceptually straightforward.

If you need more help, be sure to show your work as far as you get, so I can check it and make any appropriate suggestions for a next step or a correction.

In my answer I demonstrated a better way to represent vectors and their operations in typing; with two different multiplication operations on vectors, the symbol “*” can be ambiguous, but since our site (though not the best in handling math) provides a way to insert special symbols, it is not too hard to use the dot for the dot product. To represent vectors, we can use either the arrow, \(\vec{a}\), or bold, \(\mathbf{a}\). The latter is easier to just type, so I’ll be using that.

The key idea is to see that the dot product is useful not only to find an angle, but also to express the fact of perpendicularity. Since \(\mathbf{a}\cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\theta\), when **a** and **b** are perpendicular, \(\mathbf{a}\cdot \mathbf{b} = 0\).
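Both uses of the dot product, finding an angle and testing perpendicularity, can be illustrated with a short Python sketch. The vectors below are made-up examples in component form, and the helper names are my own.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as tuples of components."""
    return sum(a * b for a, b in zip(u, v))

def angle_degrees(u, v):
    """Angle between u and v, from cos(theta) = u.v / (|u| |v|)."""
    cos_theta = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

print(dot((1, 2), (-2, 1)))              # 0, so these are perpendicular
print(angle_degrees((1, 0), (1, 1)))     # about 45 degrees
```

In Stefan’s problem we have no components to work with, so the same facts have to be used symbolically instead.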

But I’d carried out the work (which I don’t always do initially), and found the answer to be an ugly radical expression; I wanted to mention that as an encouragement, since an ugly answer might otherwise lead to unnecessary doubt.

Stefan replied,

Hi, Doctor Peterson. Thank you for responding so fast.

So this is what I did:

(a + b)*(7a – 5b) = 7a^2 – 5ab + 7ab – 5b^2 = 0

(a – 4b)*(7a – 2b) = 7a^2 – 2ab – 28ab + 8b^2 = 0 →

7a^2 – 2ab – 28ab + 8b^2 = 7a^2 – 5ab + 7ab – 5b^2 →

-32ab = -13b^2 →

a/b = 13/32

Now if I can assume that this represents cos(13/32) then I get the angle 66°.

So if it’s right then great, but is it?

He presumably meant to say, \(\cos(\theta) = 13/32\), which if correct would indeed imply that \(\theta = \cos^{-1}(13/32) = 66.03°\).

But the work, while partly correct, suggests that he is not paying enough attention to the fact that **a** and **b** are vectors. This is a natural mistake when one is first learning about vectors, as the notation looks mostly like ordinary algebra with numbers (scalars). I replied,

The trouble is that you are not clearly distinguishing between vectors and scalars, so some of what you did makes no sense. (You can’t divide vectors.) Also, there is no reason to imagine that your “a/b”, even if it meant something, would be the cosine of the angle between them, is there?

What you really have is this, where I have put vectors in bold, indicated the dot product explicitly (which is necessary), and made the magnitude of a vector explicit as |**a**|, using the fact that **a**•**a** = |**a**|^2:

(**a** + **b**)•(7**a** – 5**b**) = 7|**a**|^2 – 5**a**•**b** + 7**a**•**b** – 5|**b**|^2 = 0

(**a** – 4**b**)•(7**a** – 2**b**) = 7|**a**|^2 – 2**a**•**b** – 28**a**•**b** + 8|**b**|^2 = 0

I would not rush to set these equal to one another, which loses the important information that not only are they equal, but they are both zero. I would first simplify each equation. Note that you can then solve each of them for **a**•**b** in terms of |**a**| and |**b**|, if you find that useful. Or (big hint) you could eliminate **a**•**b** between them.

As I mentioned, you can’t say that **a**/**b** = 13/32, because **a** and **b** are vectors, and there is no division operation on vectors; the step before you wrote that is really -32**a**•**b** = -13|**b**|^2, and you can’t divide by **b**.

But what you did up to that point will be useful, because your goal is to find **a**•**b**/(|**a**||**b**|), which is rather close to what you have. If you can only find how |**a**| and |**b**| are related …

The equation he got can be used to express \(\mathbf{a}\cdot \mathbf{b}\) (and therefore the angle) in terms of the magnitudes of **a** and **b**, but we need more to get an actual value.

Stefan replied, taking my hint by solving the first equation for \(\mathbf{a}\cdot \mathbf{b}\) and putting it into the second:

So I guess to find the relationship I would have to do this:

7|**a**|^2 + 2**a**•**b** – 5|**b**|^2 = 0

**a**•**b** = (5|**b**|^2 – 7|**a**|^2)/2

now insert that into

7|**a**|^2 – 2**a**•**b** – 28**a**•**b** + 8|**b**|^2 = 0

and get

112|**a**|^2 – 67|**b**|^2 = 0 →

112|**a**|^2 = 67|**b**|^2 →

|**a**|^2 = (67/112)|**b**|^2 →

|**a**|/√(67/112) = |**b**|

Then

-32**a**•**b** = -13|**b**|^2 →

(**a**•**b**)/|**b**|^2 = (13/32) →

(**a**•**b**)/(|**b**| * |**b**|) = (13/32) →

(**a**•**b**)/(|**b**| * |**a**|/√(62/112)) = 13/32

(**a**•**b**)/(|**b**| * |**a**|) = (13/32) * √(62/112) = cosθ

This is the only thing I can think of, sorry, my monkey brain is slow when it comes to math.

I answered,

I think you’ve got a pretty good “monkey brain”! You thought of almost exactly the right thing!

You just made two little slips: You miscopied 67 as 62, and messed up the final step.

Fix that, then get a decimal value for the cosine, take the inverse cosine, and you’ll have the answer!

Now, I took a slightly different path that led to a slightly more complicated expression, (91/134)√(67/112), which turns out to be equivalent to yours (after correction). Your method is a little nicer than mine, and we both missed some simplification of the fractions, which would have made the similarity more obvious.

He wrote back,

So it’s almost right, besides the miscopied numbers.

I’m not sure what you meant by messing up the final step?

I know that cosθ doesn’t equal to the angle but arccos (or cos^-1, I’m not sure if there is a difference?) but I just wrote it that way.

I answered, explaining the subtle error in the last step, and finishing:

When you solved

(**a**•**b**)/(|**b**| * |**a**|/√(67/112)) = (13/32)

for (**a**•**b**)/(|**b**| * |**a**|), you should have multiplied both sides by (1/√(67/112)), that is, by √(112/67), to get

(**a**•**b**)/(|**b**| * |**a**|) = (13/32) * √(112/67).

You didn’t flip the radical over. This simplifies (though this is not needed) to

cos(θ) = (13/8)*√(7/67) = 0.525249,

so θ = 58.315°.

(You are correct that arccos and cos^-1 are two ways to say what we are doing here.)

I have attached a picture of a pair of vectors **a** and **b** that satisfy this, with the correct angle and ratio of magnitudes. This is how I verified that my answer was correct!

(By the way, the answer I originally got was (91/134)√(67/112), which looked very different from yours; that’s why I had to simplify both to make sure they agreed.)

The desired vectors **a** and **b** are in green, and the two pairs of perpendicular resultants are in red and blue, respectively. Once I arbitrarily created vector **a**, vector **b** was determined by the angle \(\cos^{-1}\left(\frac{13}{8}\sqrt\frac{7}{67}\right)\) (which could have been in either direction) and the magnitude, \(4\sqrt\frac{7}{67}\) times that of **a**.
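The same verification can be done numerically instead of graphically. In this Python sketch, I arbitrarily take **a** = (1, 0), build **b** from the angle and the magnitude ratio found above, and check that both dot products vanish. The helper names `dot` and `combo` are my own.

```python
import math

cos_theta = (13 / 8) * math.sqrt(7 / 67)   # cosine of the angle found above
ratio = 4 * math.sqrt(7 / 67)              # |b| / |a|
theta = math.acos(cos_theta)

a = (1.0, 0.0)                             # arbitrary choice of a
b = (ratio * math.cos(theta), ratio * math.sin(theta))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def combo(p, q):
    """The vector p*a + q*b."""
    return (p * a[0] + q * b[0], p * a[1] + q * b[1])

check1 = dot(combo(1, 1), combo(7, -5))    # (a + b) . (7a - 5b)
check2 = dot(combo(1, -4), combo(7, -2))   # (a - 4b) . (7a - 2b)
print(math.degrees(theta), check1, check2)
```

Both checks come out as zero up to rounding error, and the angle prints as about 58.315 degrees. Using `-theta` instead gives the mirror-image solution.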

He responded,

Ah, I see, that was a stupid mistake…

Thanks for helping me with this, I appreciate it greatly!

I closed with,

We all do that!

That’s why I teach my students that checking your work, as well as your answer, is half the work. (And I manage to demonstrate that at least once a lesson, by making mistakes for them to catch!)

And thanks for asking the question, which was an interesting challenge.

While preparing for an upcoming series on vectors, I ran across this 2006 problem, which is quite similar in some respects:

Finding the Angle between Two Vectors

There's two vectors A and B, which both have equal magnitudes. In order for the magnitude of A+B to be 120 times larger than the magnitude of A-B, what must the angle between them be?

Doctor Luis answered:

Hi Victor,

Using vector norm notation, the problem informs us that A and B are two vectors such that

    |A| = |B|

Further, they want us to determine the angle T (between A and B) such that

    |A+B| = 120 * |A-B|

Ok. Now that we have expressed the requirements in concise mathematical notation, let's solve the problem.

As above, we can use the dot product to relate the various magnitudes:

The easiest way to find T is probably to use the dot product between A and B (denoted A.B). I'm sure you'll recognize the identity

    A.B = |A| * |B| * cos(T)

Solving for cos(T) we get

    cos(T) = A.B/(|A| * |B|)

if we use |A|=|B|, then

    cos(T) = (A.B)/|A|^2

And, as above, we can distribute the dot products to make usable equations:

Now, it is clear that the problem will be easier if we find the value (A.B) in terms of |A|^2. We can do that from the following relationship between the dot product and vector norm:

    v.v = |v|^2

(which is actually an instance of the identity above, applied to the same vector v, so that T=0, or cos(T)=1).

Well the important thing is to realize that we can apply v.v = |v|^2 to |A+B|^2 and to |A-B|^2 (and then applying the distributive rule of the dot product),

    |A + B|^2 = (A + B).(A + B)
              = A.(A+B) + B.(A+B)
              = (A.A + A.B) + (B.A + B.B)
              = |A|^2 + 2(A.B) + |B|^2
              = 2|A|^2 + 2(A.B)          (using |A|=|B|)

Similarly,

    |A - B|^2 = (A - B).(A - B)
              = A.(A-B) - B.(A-B)
              = (A.A - A.B) - (B.A - B.B)
              = |A|^2 - 2(A.B) + |B|^2
              = 2|A|^2 - 2(A.B)          (using |A|=|B|)

This turns the equation we had into something we can actually solve (I’ll correct a small error in the original):

Now, we'll use that second equation that the problem gave us:

    |A+B| = 120 |A-B|

or

    |A+B|^2 = 120^2 * |A-B|^2

    (2|A|^2 + 2(A.B)) = 120^2 * (2|A|^2 - 2(A.B))

You can use this last equation to solve for A.B in terms of |A|^2, which will allow you to find the ratio A.B/|A|^2 = cos(T), from which you can finally determine the value of T.

Let’s finish the work. We have $$2|A|^2 + 2(A\cdot B) = 14{,}400 (2|A|^2 - 2(A\cdot B))$$ Distributing and rearranging, we get $$2|A|^2 + 2(A\cdot B) = 28{,}800|A|^2 - 28{,}800(A\cdot B)$$ and then $$28{,}802(A\cdot B) = 28{,}798|A|^2$$ so that $$A\cdot B = \frac{28{,}798}{28{,}802}|A|^2 \approx 0.99986|A|^2$$

Therefore, $$\cos(T) = \frac{A\cdot B}{|A|^2} = \frac{28{,}798}{28{,}802} \approx 0.99986$$ $$T = \cos^{-1}\left(\frac{28{,}798}{28{,}802}\right) \approx 0.9549°$$

As a sanity check, you should notice that the answer is small (at least relative to 180 degrees), which means that A and B are pointing in almost the same direction. This makes sense, since they'll reinforce each other when added, but almost cancel out when subtracted. This is how |A+B| can manage to be 120 times larger than |A-B|, even though the two vectors A and B have the same magnitude.

Again, for confirmation, I’ve constructed these vectors in GeoGebra, though it’s hard to see:

The program tells me that the ratio \(\displaystyle\frac{|a+b|}{|a-b|} = 120\). In effect, we have constructed a rhombus such that the ratio of its diagonals is 120:1; looking at it that way, the angle between the sides of the rhombus is \(2\cot^{-1}(120) = 0.9549°\), just as we found by explicitly using vectors.
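For readers without GeoGebra, the same confirmation takes only a few lines of Python. This sketch assumes unit vectors (any common magnitude would do) and uses the exact fraction for the cosine rather than the rounded decimal.

```python
import math

cos_t = 28798 / 28802            # exact value of cos(T) found above
t = math.acos(cos_t)

# Two unit vectors with angle t between them.
A = (1.0, 0.0)
B = (math.cos(t), math.sin(t))

sum_len = math.hypot(A[0] + B[0], A[1] + B[1])    # |A + B|
diff_len = math.hypot(A[0] - B[0], A[1] - B[1])   # |A - B|
ratio = sum_len / diff_len

print(math.degrees(t), ratio)    # about 0.9549 degrees and 120
print(math.degrees(2 * math.atan(1 / 120)))       # rhombus formula, same angle
```

The last line uses the fact that \(\cot^{-1}(120) = \tan^{-1}(1/120)\), since Python’s `math` module has no arccotangent function.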

We’ll start with this question from 2001:

Commas in Numbers Why are we taught in grade school to put the comma after the third digit in a large number? (9,876,543,210). Why not the fourth digit? (98,7654,3210). And does the comma really change the number value?

I responded:

Hi, LaToya. The comma has no effect on the value of a number. All it does is make it easier for humans to read it. We put a comma every three digits to match the way we say numbers in English, by thousands. Your number reads naturally as

9 billion, 876 million, 543 thousand, 210

Divided every four digits, you would want to read it as something like

98 myrion, 7654 myriad, 3210

(I made up the word "myrion," but "myriad" really does mean "ten thousand".) The commas are purely for convenience.

So if we used different names for large numbers, we would likely place commas differently; and if we placed the commas differently, we might have different words to use. Hold on to that thought!

If “myrion” existed, it would mean a hundred million.
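To make the point concrete that the commas are only a display choice, here is a small Python sketch. The function `group_digits` is a hypothetical helper of my own, not a library routine; it splits the digits into groups of any size, counting from the right.

```python
def group_digits(n: int, size: int = 3, sep: str = ",") -> str:
    """Insert sep between groups of `size` digits, counting from the right."""
    digits = str(abs(n))
    groups = []
    while digits:
        groups.append(digits[-size:])  # peel off the rightmost group
        digits = digits[:-size]
    sign = "-" if n < 0 else ""
    return sign + sep.join(reversed(groups))


n = 9876543210
by_threes = group_digits(n)          # grouped to match thousand, million, billion
by_fours = group_digits(n, size=4)   # grouped to match myriad and "myrion"
print(by_threes)
print(by_fours)

# Stripping the commas recovers the same value either way
print(int(by_threes.replace(",", "")) == int(by_fours.replace(",", "")) == n)
```

With groups of four, each group counts units of 10^4, so the third group from the right counts units of 10^4 × 10^4 = 10^8, which is why "myrion" would have to mean a hundred million.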

A couple months earlier we had this very similar question:

Why Commas after Every Three Places? Why do we put commas after every three places in a number like 200,000? Thank you.

Doctor Floor (that’s his first name) answered, from a very different perspective than mine. But he started out the same:

Hi, Larry, Thanks for writing. The use of commas in large numbers is not necessary, it is a service to the reader. The commas are exactly in the right places to show thousands, millions, billions, trillions, etc., so they fit the way we pronounce the numbers. Also, it is quite comfortable to read numbers in parts of three digits, when you have to "spell" them.

Again, the commas agree with the language; and if there were more digits per group, they would be more awkward to work with. But:

Commas are not used everywhere in the world to do this job. It can be done by using spaces as well, so instead of

3,978,654,128

one sometimes sees

3 978 654 128

In my home country, the Netherlands, the roles of dot and comma are reversed, so here we would write:

3.978.654.128

and the decimal number 3.978 is written as 3,978 in the Netherlands. Be careful when you are outside your own country!

So what we write as 12,345.78 in America would be 12.345,78 in many places. And, yes, this does sometimes cause a little confusion when we get a question about, say, 1,234 from another country, and it turns out to be a decimal!
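Here is a short Python sketch of that confusion. Both functions are hypothetical helpers written for this illustration; the key assumption is that the reader has to be told which symbol is the decimal separator, because the text alone cannot say.

```python
# Swap "," and "." in one pass (replacing one at a time would clobber the other)
SWAP = str.maketrans({",": ".", ".": ","})


def swap_separators(text: str) -> str:
    """Turn American-style separators into continental ones, and vice versa."""
    return text.translate(SWAP)


def parse_number(text: str, decimal: str = ".", grouping: str = ",") -> float:
    """Parse text once the caller has said which symbol plays which role."""
    return float(text.replace(grouping, "").replace(decimal, "."))


print(swap_separators("12,345.78"))
print(parse_number("12.345,78", decimal=",", grouping="."))

# The same characters, read under the two conventions
print(parse_number("1,234"))                              # American reading
print(parse_number("1,234", decimal=",", grouping="."))   # continental reading
```

The last two lines give a thousand-fold difference from identical input, which is exactly the trouble we sometimes have with questions from abroad.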

The obvious next question is, “Why?” This is from 2007, referring to the decimal point vs. decimal comma (or, more generally, decimal separator or “separatrix”):

Commas and Decimal Points in Currency Notations Why do the Eurolanders use the comma in currency instead of the decimal point as in the States?

This is not only in writing about money, of course. I answered:

Hi, Katherine. During the 1600's there were many competing notations for decimals, of which the comma and period were the two winners. Here's what Cajori, in A History of Mathematical Notations, says about one point in the conflict (p. 328): In the eighteenth century, trials of strength between the comma and the dot as the separatrix were complicated by the fact that Leibniz had proposed the dot as the symbol of multiplication .... As a symbol for multiplication the dot was seldom used in England during the eighteenth century, Oughtred's X being generally preferred. For this reason, the dot as a separatrix enjoyed an advantage in England during the eighteenth century which it did not enjoy on the continent. In the end, the comma won on the continent of Europe; the dot was used in England, though commonly raised rather than on the line (with the low dot used for multiplication). In America the usage varied at first, but has settled down to the low dot for the decimal separator, and a raised dot for multiplication.

So in England at that time, “×” (and later “.”) was used for multiplication and “·” for the decimal separator; while on the continent, “·” was used for multiplication and “,” for the decimal separator. In America, we ended up with “·” (or “×” at the elementary level) and “.” respectively, the reverse of the British usage. And this is the simplified version!

The next question is from 2003 (before I wrote the answer above – I had probably forgotten it):

Writing Commas in Numbers Is it true that in France where we in America use decimals in math, they use commas, and where we use commas they use decimals? An example would be we would write 3,998.60 and they would instead write 3.998,60.

I answered:

Hi, Megan. Yes, this is true of much of the non-English-speaking world. I have long wanted to find a good reference that would show me where each convention is used; one that you might find interesting is the Regional and Languages Options box in Windows XP (other versions of Windows may do it differently), where you can select from a long list of countries and see the default choices they give for writing numbers, currency, times, and dates. I don't know whether I always trust Microsoft to know the correct way to use any language, but the variety of options is fascinating! Try picking any European or South American country.

Here is what my current Windows 10 computer shows under Region > Additional Settings when I select Dutch (Netherlands) as for Doctor Floor:

We see the comma for the “decimal symbol”, and dot for “digit grouping symbol”. Here is English (Sweden):

There they use a space for digit grouping.

Now, we would just look it up in Wikipedia, here!

Here are some references I have found that tell a little more about it: Commas in Numbers - Grammar Slammer, English Plus http://englishplus.com/grammar/00000087.htm Many European countries use a comma in place of the decimal point and use periods or blank spaces to separate every third digit. United States: 2,367.48 francs. France: 2.367,48 francs or 2 367,48 francs

This agrees with what we heard. (That link still works; none of those that follow do.)

Cross Cultural Comparisons: Numbers http://www.geocities.com/Broadway/1906/cultr14.html France: The decimal point is a comma, as in all of Europe. Certainly not a dot, except on computers.

Technology sometimes forces a convention on people (e.g. calculators with “.” on a button, even when they’d naturally use a comma).

The following is a favorite site of mine, which I hadn’t realized until now had gone away:

How Many: Using Numbers and Units - Russ Rowlett http://www.unc.edu/~rowlett/units/numbers.html In English-speaking countries, the decimal point (decimal marker) is the period. In continental Europe and most other places, the decimal marker is the comma. ... Since the comma often means a decimal point, the International System (SI) requires that large numbers, like the billions above, be represented as groups of three digits separated by narrow spaces, not by commas.

Since SI is meant to be international, they have to standardize as much as possible. So they allow either comma or dot for the decimal separator, but spaces for thousands, so that there is never a possibility of confusion.

Then there are technical specifications for “regionalization” of software, much as we saw in Windows settings:

Country-Specific Data Formats http://w3.pppl.gov/misc/motif/MotifStyleGuide/en_US/Country-Specific_Data_Formats.html

Thousands Separators: The comma, period, space, and apostrophe are examples of valid separators for units of thousands as shown in the following examples: 1 234 567, 1.234.567, 1'234'567, 1,234,567

Decimal Separators: The period, comma, and the center dot are examples of valid separators for decimal fractions as shown in the following examples: 5,324, 5.324, 5·324

After those quotations, I added:

Incidentally, in some countries (such as in India) digits are not even grouped in threes as we do, because they have different traditional names in their languages from our "thousand" and "million." They call 1,00,000 a lakh and 1,00,00,000 a crore. So the way we write numbers is not at all standard around the world.

We’ll see more on India soon.

Getting back to commas, why do we use the grouping we do? This is from 2003:

Placement of Commas in Writing Numbers Why is the ones period named for the ones place value and why is the thousands period named for the thousands place? Why is there not a hundreds period?

If you’re unfamiliar with this use of the word “period”, it refers to each set of three digits separated by commas. In the “short word form” we’ve seen, “*x* millions, *y* thousands, *z* ones”, the *x*, *y*, and *z* are the periods, while “millions”, “thousands”, and “ones” are their names. In effect, they are the “digits” in a “base 1000” number.

I answered again:

Hi, Beth. We do, of course, have a PLACE for hundreds; but PERIODS are divided by thousands, to match the way we say numbers aloud: 123,456,789 = 123 million, 456 thousand, 789 [units] So each period is named for the value of its least significant digit, and each period contains three digits. There is no real mathematical reason for doing this, apart from the value of consistency; it is just a good match with English (and most other languages).

If there is a mathematical reason, it’s as I said above, that we are working in “base 1000”, which provides a consistent format, a useful regularity.

You may be interested in the fact that in India, commas are put in irregularly, to match an irregular set of number names: Numbers in Hindi and Urdu http://mathforum.org/library/drmath/view/57179.html In their language, a lakh is what we call one hundred thousand (100,000), and a crore is our ten million (10,000,000). In order to match how they say numbers, they write a lakh as 1,00,000, and a crore as 1,00,00,000.

We’ll be looking at that page in a moment. For now, notice that this grouping is irregular, so its only value is to match the words they use, rather than anything mathematical.

So, since we read numbers as "XXX thousand, Y hundred, ZZ", we could very well have chosen to write numbers as XXX,Y,ZZ. But I think it's good that we don't, because it would be confusing, and would not help much if any. Perhaps we were saved from that by the fact that we don't normally put a comma after "hundred" when we write out numbers (or a pause when we read them), so the comma did not seem necessary. Also, the fact that we say "hundred" within the other periods would force us to write X,XX,Y,ZZ, and we just don't need that many commas.

This is an interesting thought. It is ultimately only the desire for regularity, and the fact that we think of “*x* hundred *y*ty-*z*” as a single number, that saves us from writing 1,23,4,56 for “one hundred, twenty-three thousand, four hundred, fifty-six”. Those extra commas would mess things up a little.

Here is the question I referred to about India, from 2000:

Numbers in Hindi and Urdu How many crores make one billion?

Doctor Rick took this:

I had never heard of a crore, but I did a simple Web search on the word crore, and I found this, which looks like just what you need: Numbers in Hindi and Urdu - About.com, Inc. http://hindiurdu.about.com/aboutuk/hindiurdu/library/weekly/aa051900a.htm I quote a bit: There are two terms in particular that are worth discussing: lakh and crore. A lakh is one hundred thousand (100,000), a crore is ten million (10,000,000). The South Asian numbering system progresses as follows: ten (das), hundred (sau), thousand (hejar in Hindi, hezar in Urdu), one hundred thousand (lakh) and ten million (crore). Commas are usually placed to show the number of lakhs and crores, so one lakh is written 1,00,000 and one crore is written 1,00,00,000. The structure of the numbering system affects usage. In English "half a million" and "five hundred thousand" are essentially interchangeable. In Hindi and Urdu, the only possibility is "five lakhs". India's population recently reached one billion, but South Asian papers reported this as "one hundred crore".

This quotation directly answers the question about a billion: 1,000,000,000 to us is 100,00,00,000 to them. Their (non-periodic) “periods” are the hejars, lakhs, and crores.

Interestingly, Windows does not show lakh-and-crore commas (by default) for Urdu; in Pakistan it shows essentially American style, and in India it shows the same but with Arabic digits (and either left-to-right or right-to-left):

Here is Hindi:

In 2003 I added this (in my response to a reader who pointed out a small error in the quote above, which has been corrected):

For another reference, see: How Many? A Dictionary of Units - Russ Rowlett http://www.unc.edu/~rowlett/units/ lakh or lac a traditional unit of quantity in India, equal to 10^5 or 100 000. In India the lakh is used commonly instead of the million and commas are used to isolate the number of lakh; for example, the number 5 300 000 is called 53 lakh and written "53,00,000". See also crore. crore a traditional unit of quantity in India, equal to 10^7 or 10 million. Large numbers are usually described in India using the crore and the lakh (10^5); for example, the number 25 600 000 is called 2 crore 56 lakh and written "2,56,00,000".

Note how confusing it would be to read this in their terms if the commas were put in our places! Of course, if they wrote numbers their way in a question written in English, we would wonder whether there were a typo.
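The naming scheme in these quotations amounts to repeated division, first by a crore (10^7), then by a lakh (10^5), then by a thousand. Here is a minimal Python sketch of that idea; the function name and the output wording are my own choices for illustration, not a standard.

```python
CRORE = 10**7
LAKH = 10**5
THOUSAND = 10**3


def in_lakh_and_crore(n: int) -> str:
    """Describe a non-negative integer using crore, lakh, and thousand."""
    crore, rest = divmod(n, CRORE)
    lakh, rest = divmod(rest, LAKH)
    thousand, rest = divmod(rest, THOUSAND)
    parts = []
    if crore:
        parts.append(f"{crore} crore")
    if lakh:
        parts.append(f"{lakh} lakh")
    if thousand:
        parts.append(f"{thousand} thousand")
    if rest or not parts:  # keep a bare "0" for n == 0
        parts.append(str(rest))
    return " ".join(parts)


print(in_lakh_and_crore(5_300_000))      # Rowlett's first example
print(in_lakh_and_crore(25_600_000))     # Rowlett's second example
print(in_lakh_and_crore(1_000_000_000))  # the billion from the original question
```

The last call also answers the original question directly: a billion divided by a crore is 100.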

Math is *not* a universal language!

Let’s look at two quick questions about lakhs and crores that were not archived. First, from 2006:

How would you read RS1452000 in India?

Doctor Camilo answered:

Hello, Nancy. That amount, Rs. 14,52,000, would be read as fourteen lakh, fifty-two thousand rupees. In Indian counting:

1,000 = one thousand
10,000 = ten thousand
1,00,000 = one lakh
10,00,000 = ten lakh
1,00,00,000 = one crore

Note that the commas in Indian counting are in different places from where they would be in the counting of many other countries. 1,000,000 (one million) equals 10,00,000 (ten lakh)
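The comma placement in this table can be sketched in a few lines of Python. The helper `indian_grouping` is hypothetical, and it makes one assumption worth stating loudly: everything above the lakh group is kept together as a count of crores, which matches the "100,00,00,000" we wrote for a billion above. Conventions for numbers that large may vary.

```python
def indian_grouping(n: int) -> str:
    """Place commas to mark thousands, lakhs, and crores."""
    digits = str(abs(n))
    groups = [digits[-3:]]   # ones period: the last three digits
    rest = digits[:-3]
    for _ in range(2):       # count of thousands, then count of lakhs: two digits each
        if rest:
            groups.append(rest[-2:])
            rest = rest[:-2]
    if rest:                 # assumption: all remaining digits count crores
        groups.append(rest)
    sign = "-" if n < 0 else ""
    return sign + ",".join(reversed(groups))


for value in (1_000, 10_000, 100_000, 1_000_000, 10_000_000, 1_452_000):
    print(value, "->", indian_grouping(value))
```

Running it on Nancy's amount, 1452000, gives the grouping Doctor Camilo used.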

Finally, from 2014:

I am from Nepal where we write the numerical value such as 1,00,000 and call it "lakh". While writing a numerical value, we use comma after 3rd digit and after that every 2nd digit. I would like to know, is this method of expressing numerical value wrong according to international standards. Should I write one lakh as 100,000 or our style of writing is okay. I am confused whether to follow the American style of expressing numerical values and putting comma after every 3rd digit or my style is also okay. One of my Thai friends told me that my style of writing is very wrong.

I answered:

Hi, Nitin. This is not a matter of mathematics, so much as it is a matter of language. You write a number so that it can be easily read in the language of the reader. In English, and many other languages, we have special words for 1,000 (thousand), 1,000,000 (million), and so on, so using commas to mark those makes sense. In your language (or culture) you have names for other quantities, so you naturally mark them as you read them. For international use, it is probably best not to depend on your own local language. The SI specifies using spaces, not commas, and putting them in every third place: http://www.bipm.org/en/publications/si-brochure/section5-3-4.html So one lakh would be written as 100 000 and a crore would be 10 000 000

On one hand, Nitin should not be ashamed of his own culture! But on the other, when writing for international use, we need international standards. I have to admit that I might find it awkward to have to use a different style than I was taught, in order to communicate with colleagues in other countries, as if that made America a second-class country; I would have to think of it as adapting to others. But the same is true of using spaces instead of commas (as the SI specifies, as we saw above), or of using the metric system rather than feet and inches for use in international markets.
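Adapting in this way is mechanical, since the digits never change. As a minimal sketch, the hypothetical helper below strips whatever commas the writer used and regroups in threes with spaces. It uses an ordinary space for simplicity; a narrower space, as Rowlett describes, would be a typographic refinement.

```python
def to_si_grouping(text: str) -> str:
    """Regroup a comma-separated whole number in threes, separated by spaces."""
    n = int(text.replace(",", "").replace(" ", ""))  # the value ignores the separators
    return f"{n:,}".replace(",", " ")                # group in threes, then swap in spaces


print(to_si_grouping("1,00,000"))     # one lakh, written Nitin's way
print(to_si_grouping("1,00,00,000"))  # one crore
print(to_si_grouping("9,876,543,210"))
```

It does not matter whether the input was grouped the Nepali way or the American way; the output is the same.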
