What Do dx and dy Mean?

We’ve looked at the meaning of the derivative, and of its various notations, including dy/dx. This leads to the next question: What does dx or dy mean on its own? This was touched on last time, but there’s a lot more to say that I couldn’t fit there. We’ll look at more advanced approaches to differentials in themselves, then at two perspectives on what they mean in integrals.

Differentials as functions

We’ll start with the page two of us referred to in our answers last time, which comes from 1998:


I have to reach this conclusion:

If you can get the differentials of a function, you can differentiate it, but if you can differentiate it, you can not necessarily get its differentials.

Please help.

As we’ve already seen, differentials can be discussed from several different perspectives. This question, lacking clear context, doesn’t indicate what kind of function is in view, or what approach to differentials is being taken. What does it mean here to “get a function’s differentials”? Doctor Jerry answered by suggesting one possible context, giving a definition quite different from what we’ve seen so far, where differentials were treated as infinitesimal numbers:

Hi Maria,

The standard definition of the differential of a real-valued function f of a real variable is:

   At a given point x, the differential df_x
   (df sub x; usually the x is omitted) of f
   is the linear function defined on R by:
       df_x(h) = f'(x) * h

Everyday usage of the differential often suppresses the fact that the differential is a linear function. For example, if y = f(x) = x^2, then we write:

       dy = df = 2x * dx

where dx is used instead of h. This is for good reason. The finite numbers dy and dx appearing in dy = 2x * dx can be manipulated to obtain:

    dy/dx = 2x

I feel that I haven't replied directly to your question. I think that this is because I don't fully understand your question.  

Please write again if my answer has not helped.

This definition takes the differential of a function to be itself a function, namely the function whose value is the vertical change \(\Delta y\) along the tangent line for a given horizontal change (h or \(\Delta x\) or dx). In this way, we don’t have to think of dy as a number-that-is-not-really-a-number (an infinitesimal), yet we get the action of multiplying the derivative by any number dx.

In his example, the differential of \(f(x) = x^2\) at x = 3 is \(df(h) = df_3(h) = f'(3)\cdot h = 6h\). From this perspective, the usual way of writing the differential as if it were a number is just a shortcut. Retaining the variable x, we could say, fully, \(df_x(dx) = 2x dx\), or briefly, just \(dy = 2x dx\). For a very slightly different version of this definition, see here.
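To make the "differential as a function" idea concrete, here is a minimal Python sketch of my own (it is not part of Doctor Jerry's answer). The names `differential`, `f_prime`, and `df_3` are made up for illustration, and the derivative 2x is entered by hand rather than computed.

```python
def f(x):
    return x ** 2


def f_prime(x):
    # Derivative of x^2, written out by hand for this example
    return 2 * x


def differential(fprime, x):
    """Return the linear function df_x, which sends h to f'(x) * h."""
    def df_x(h):
        return fprime(x) * h
    return df_x


# The differential of f at x = 3 is the function h -> 6h
df_3 = differential(f_prime, 3)

for h in (1.0, 0.1, 0.01):
    tangent_change = df_3(h)         # change along the tangent line
    actual_change = f(3 + h) - f(3)  # change along the curve itself
    print(h, tangent_change, actual_change)
```

The loop compares the change along the tangent line with the actual change in f; the two should agree more and more closely as h shrinks, which is the sense in which dy approximates \(\Delta y\).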

Maria asked for more, giving a little more context but still not quite making it clear what level she is at:

Thanks for your answer. I know that the question is a little bit confusing, and at the beginning I thought it was a problem of the translation from English of the Math books. Your answer helped a little, so I am going to try to rephrase it.

What is the difference between finding the derivatives of a function (dy/dx), and finding its differentials (dy, dx)?

In the books I've seen they define differentials supposing that f(x) is differentiable.

My teacher gave a hint to reach this conclusion: if you can find the differentials of f, then f is differentiable, but if f is differentiable you can't necessarily find its differentials.

That is why I can prove this, starting with a function that is differentiable.

It is still unclear what “the derivatives of a function” means; perhaps she doesn’t intend a plural.

Doctor Jerry started his answer by restating the previous definition:

Hi Maria,

Suppose f(x) = x^2. To find the derivative of f we use the definition of derivative: f'(x) is the limit as h->0 of the quotient

   f(x+h) - f(x)
   -------------
         h

For this function, f'(x) = 2x.

Okay, this much is clear; there is no possible ambiguity.

The differential of f at x is defined to be the linear function df, which is defined on all of R by:

   df(h) = f'(x) * h

Often, the notation df(h) is shortened to df or, if y = f(x), then we write dy instead of df. Then the above definition is:

   dy = f'(x)*dx    or

   dy/dx = f'(x)

Unless you are studying differential geometry, in which dx is interpreted slightly differently, dx is not the differential of a function. It is a variable, the same as h.

I’m going to omit the rest of the answer, because I don’t think the question and its context were ever clarified, so it isn’t clear what answer is needed.

If you want to dig deeper …

Doctor Jerry mentioned differential geometry in passing, as a place where differentials are defined more deeply. We have only occasionally gone into that territory; I want to just quote the conclusion to an unarchived answer to a question about differentials, by Doctor Fenton in 2009, in case you are interested:

There is also a more sophisticated viewpoint in which what is integrated is not a function f(x), but rather what is called a "differential form". This viewpoint involves a lot of complicated mathematical structure and is more commonly seen in calculus of functions of several variables (see, for example,

   http://en.wikipedia.org/wiki/Differential_form  )

but it can also be used in one-dimensional calculus as well (e.g. in David Bressoud's book _Second Year Calculus_).

So, the easiest viewpoint is the purely formal one, in which you do useful but basically meaningless computations (du=g'(x)dx which does the bookkeeping), but there is also a more complicated viewpoint in which the computations are not meaningless, but they require you to learn more abstract mathematics.  For example, the one-dimensional differential form dx becomes a mapping from intervals on the real line to R, and

   dx([a,b]) = b-a ,

while the differential form 3x^2dx (to use one of Bressoud's examples) is the mapping which takes the interval [a,b] to

   /b
   | 3x^2 dx = b^3 - a^3 .
   /a

This becomes the viewpoint used in modern differential geometry.
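As a rough illustration of the idea that a one-dimensional differential form is a mapping from intervals to numbers, here is a small Python sketch of my own. The names `dx`, `form`, and `omega` are hypothetical, and the midpoint sum is only a numerical stand-in for the actual integral.

```python
def dx(interval):
    """The form dx sends the interval [a, b] to its length b - a."""
    a, b = interval
    return b - a


def form(g, n=10_000):
    """Return the mapping that sends [a, b] to (approximately) the integral of g(x) dx."""
    def omega(interval):
        a, b = interval
        w = (b - a) / n
        # Midpoint sum: each small piece contributes g(midpoint) times its width
        return sum(g(a + (i + 0.5) * w) * w for i in range(n))
    return omega


# The form 3x^2 dx, viewed as a function of intervals
three_x_squared_dx = form(lambda x: 3 * x ** 2)

print(dx((1.0, 4.0)))
print(three_x_squared_dx((1.0, 2.0)))
```

On the interval [1, 2], the second value should be close to \(2^3 - 1^3 = 7\), the integral of \(3x^2\) over that interval.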

Differentials in definite integral notation

Last week we talked about the use of differentials within symbols for the derivative. Let’s look at a couple of questions about their use in integration. First, we have this, from 2002:

The Meaning of 'dx' in an Integral

No matter how many times it's explained to me, and even though I've taken several advanced math courses (diff eq, linear algebra, etc), nobody has ever given me a satisfactory explanation for the meaning of the notation in which an integral has dx appended to the end if x is the variable which we are integrating with respect to. In physics, for example, dx seems to mean a very small amount of x, and then we use it in an integral to integrate whatever physical quantity is being discussed. I just don't understand. 

Or, when a differential is defined, all of a sudden the dx has a meaning, but then when an integral is being evaluated, the teacher says, "Oh, the dx is just a formality." 

So, sometimes it's a formality, sometimes a vital concept, sometimes a physical quantity, sometimes a derivative: What is it?

When we write \(\int f(x) dx\), we read it as “the integral [or antiderivative] of f(x) with respect to x,” assigning no meaning to “dx” other than telling us what variable we care about. (In fact, sometimes the dx can just be omitted entirely, when the variable is clear!) This is not very different from its use in a derivative, where it also means “with respect to x”. What does it mean here?

Doctor Jeremiah took the question, focusing on the idea of a definite integral:

Hi Nosson,

Think about it this way:

An integral gives you the area between the horizontal axis and the curve.  Most of the time this is the x axis.


                           |                    |
                         --|--              ----|---- f(x)
                       /   |   \          /     |
                      /    |     --------       |
            |        /     |                    |
       -----|-------       |                    |
            |              |                    |
            |              |                    |
  ----------|--------------+--------------------|----- x
            a                                   b

And the area enclosed is:

         /b
  Area = | f(x) dx
         /a

This is a definition of the definite integral, in a broad sense; what follows defines how it can be calculated in principle (and therefore, how it is formally defined):

But say you didn't want to use an integral to measure the area between the x axis and the curve.  Instead you just calculate the average value of the graph between a and b and draw a straight flat line y = avg(x)   (the average value of f(x) in that range).

Now you have a graph like this:


                           |                    |
                         - | -              - - | - - f(x)
            |          /   |   \          /     |
       -----|-----------------------------------|---- avg(x)
            |        /     |                    |
       - - -|- - - -       |                    |
            |              |                    |
            |              |                    |
  ----------|--------------+--------------------|----- x
            a                                   b

And the area enclosed is a rectangle:

  Area = avg(x) w          where w is the width of the section

The height is avg(x) and the width is w = b-a or in English, "the width of a slice of the x axis going from a to b."

His width w would often be called \(\Delta x\); we’ll see that later.

But say you need a more accurate area.  You could break the graph up into smaller sections and make rectangles out of them. Say you make 4 equal sections:


                           |                    |
                      |----|---|        |-------|---- f(x)
                      |    |   |        |       |
                      |    |   |--------|       |
            |         |    |   |        |       |
       -----|---------|    |   |        |       |
            |         |    |   |        |       |
            |         |    |   |        |       |
  ----------|---------|----+---|--------|-------|----- x
            a                                   b

And the area is:

 Area = section 1  + section 2  + section 3  + section 4
      = avg(x,1) w + avg(x,2) w + avg(x,3) w + avg(x,4) w

where w is the width of each section.  The sections are all the same size, so in this case w=(b-a)/4 or in English, "the width of a thin slice of the x axis or 1/4 of the width from a to b."

We are starting to develop the Riemann integral (though many details are needed to make a complete definition; for example, the widths don’t really have to be the same).

And if we write this with a summation we get:

          4
         ---
  Area = \    avg(x,n) w
         /
         ---
         n=1

But it's still not accurate enough.  Let's use an infinite number of sections.  Now our area becomes a summation of an infinite number of sections. Since it's an infinite sum, we will use the integral sign instead of the summation sign:

         /b
  Area = | avg(x) w
         /a

where avg(x) for an infinitely thin section will be equal to f(x) in that section, and w will be "the width of an infinitely thin section of the x axis."

So instead of avg(x) we can write f(x), because they are the same if the average is taken over an infinitely small width.

Again, a lot of details are being omitted to keep things intuitive.

And we can rename the w variable to anything we want. The width of a section is the difference between the right side and the left side.  The difference between two points is often called the delta of those values. So the difference of two x values (like a and b) would be called delta-x.  But that is too long to use in an equation, so when we have an infinitely small delta, it is shortened to dx.

If we replace avg(x) and w with these equivalent things:

         /b
  Area = | f(x) dx
         /a

So, as in the infinitesimal approach to the derivative, dx is thought of (informally) as a very small change in x.

So what the equation says is:

Area equals the sum of an infinite number of rectangles that are f(x) high and dx wide (where dx is an infinitely small distance).

So you need the dx because otherwise you aren't summing up rectangles and your answer wouldn't be total area.

dx literally means "an infinitely small width of x".

This, of course, applies specifically to the definite integral. From this perspective, we can think of the indefinite integral as inheriting the same notation via the Fundamental Theorem of Calculus, which ties the two together.
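The process of cutting the area into more and more rectangles can be tried numerically. The following is only a sketch under my own assumptions: the example function \(x^2\) on [0, 3] is chosen arbitrarily, `riemann_sum` is a made-up name, and the height of each rectangle is taken at the midpoint of its section as a stand-in for the "average value" in the explanation above.

```python
def riemann_sum(f, a, b, n):
    """Add up n rectangles of equal width w between a and b.

    The height of each rectangle is f at the midpoint of its section,
    used here as a simple stand-in for the average value on that section.
    """
    w = (b - a) / n  # the width that becomes "dx" as n grows
    return sum(f(a + (i + 0.5) * w) * w for i in range(n))


def f(x):
    # Hypothetical example function; its exact area on [0, 3] is 9
    return x ** 2


for n in (1, 4, 100, 10_000):
    print(n, riemann_sum(f, 0.0, 3.0, n))
```

As n grows, the width w shrinks and the sums should settle toward the exact area, which is the informal sense in which w "becomes" dx.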

The differential doesn’t have to be at the end!

One consequence of teaching students that the differential in an integral means only “… with respect to x” can be seen in the following question, from 2003, about a relatively unusual variation in the notation:

Integral Notation - Missing Integrands

I have seen some integral notation used that I am not familiar with. It looks like this:

 | dx f(x) + ...

There does not seem to be an integrand (i.e. a function being integrated). I'm not sure if f(x) is to be integrated. I have two theories, but I can't see the point in writing the expression as it is if either of my theories is correct.

My theories about what this might mean:

1) The above notation is the same as writing:

 | 1 dx f(x) + ... (note the explicit 1 here)

which would give:

(x + C) * f(x) + ... (where C is a constant of integration)

2)  The rest of the expression is to be integrated with respect to x.

If (1) is correct, then what was the point of writing the integral - why wasn't (x + C) just written instead? If (2) is correct, then how does one know when to "stop integrating" (i.e. if there is some term to be added on to the expression that is not to be integrated, how is it distinguished?).

I have seen this recently in multi-variate calculus, i.e. when x is in R^n rather than R: does this situation justify the use of the integral notation somehow?

Chris’s first guess is that the dx closes off the integral, so that what follows is to be multiplied; the second (which is correct) is that it doesn’t matter where the dx is placed.

He is right that this notation is particularly common in calculus with more than one variable. One might write, for example, $$\int_0^b dy\int_0^a dx f(x,y)$$ or $$\int_0^b dy\int_0^a f(x,y) dx$$ rather than $$\int_0^b\int_0^a f(x,y)dx dy$$ to indicate that we are to integrate first with respect to x, and then integrate the result with respect to y. One benefit is that it makes it easier to see which limits go with which variable.

I answered:

Hi, Chris.

It is common to learn about integration in such a way that the "dx" seems to be a marker for the end of the integral, as if the "long S" were a left parenthesis and the "dx" were the right parenthesis. But it doesn't work that way. In fact, what you are integrating is the product of a function and dx; and multiplication is commutative! So these mean the same thing:

   /                 /
   | f(x) dx   and   | dx f(x)
  /                 /

If you then add something, you must use parentheses if it is to be part of the integral:

   /                    /
   | dx f(x) + g(x) = [ | f(x) dx] + g(x)
  /                    /

is the sum of an integral and a function, while

   /                      /
   | dx (f(x) + g(x)) =  | (f(x) + g(x)) dx
  /                      /

is the integral of the sum of two functions.

That is, presumably the integral has higher precedence than addition, so you "stop integrating" at the first plus sign. But even then, I'm not positive that this rule I just made up is always followed; let me know if you think it doesn't fit the practice in your text, and show me an example.

Seeing the differential as part of a product is necessary in order to understand the notation. This can be done whether you think of dx as a mere notation, so that the “product” is as illusory as the “quotient” in a derivative, or you think explicitly about the Riemann sum.

I don’t see my ideas about parentheses followed universally; it is not uncommon to see \(\int x^2-2x+3 dx\) rather than \(\int (x^2-2x+3) dx\). This is probably due to the common use of the differential to terminate the integrand, and the fact that it would be meaningless to take the dx as associated only with the last term, despite the usual order of operations. This laxity may carry over into integrals where dx is written first, though the ambiguity is much greater there. Too often, as in some other aspects of order of operations, you ultimately just have to recognize what interpretation makes sense in context.

In writing this, it has occurred to me that my reference to commutativity is not quite valid, specifically when it comes to definite integrals. The following are not the same: $$\int_0^b\int_0^a f(x,y)dx dy\ne\int_0^b\int_0^a f(x,y)dy dx$$

That’s because the order of the differentials determines the meaning of the limits of integration. Everything about calculus notation is a little slippery.
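A quick numerical check shows how much the order of the differentials can matter when the limits stay put. This is a sketch with choices of my own: the integrand \(f(x,y) = x\), the limits a = 1 and b = 2, and the helper `integrate` (a plain midpoint rule) are all just for illustration.

```python
def integrate(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    w = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * w) * w for i in range(n))


def f(x, y):
    # Hypothetical integrand, chosen so the two readings differ
    return x


a, b = 1.0, 2.0

# "dx dy": the inner integral is over x from 0 to a, the outer over y from 0 to b
dx_dy = integrate(lambda y: integrate(lambda x: f(x, y), 0.0, a), 0.0, b)

# "dy dx": the inner integral is over y from 0 to a, the outer over x from 0 to b
dy_dx = integrate(lambda x: integrate(lambda y: f(x, y), 0.0, a), 0.0, b)

print(dx_dy, dy_dx)
```

Working by hand, the first reading gives \(\int_0^2 \tfrac{1}{2}\,dy = 1\) and the second gives \(\int_0^2 x\,dx = 2\), so the two expressions really are different integrals.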

Chris replied,

Doctor Peterson,

Thank you for your quick and helpful reply.

I was indeed taught that integration begins with the "long S" and ends with the (for example) dx.

I have, however, seen the following notation:

 |      dx
 | ------------
 |  f(x) + g(x)

and assumed it was a convenient notation rather than being a justifiable mathematical expression.

Perhaps I need to go and look at calculus from first principles again to see why this is the case.

That is both convenient notation and justifiable! Again, we are thinking of the dx as being multiplied by a fraction, and therefore equivalent to part of the numerator.

A particularly good example of the usefulness of the differential in an indefinite integral arises in the substitution method, where we can replace the dx with an expression that we actually multiply:

Why Does Integration by Substitution Work?

I looked at that page in the post Integration by Substitution.
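As a small check of the idea that dx can be replaced by something we actually multiply, here is a numerical sketch of my own. The example integral, the substitution \(u = x^2\) (so that \(du = 2x\,dx\)), and the helper `integrate` are all assumptions made for illustration.

```python
import math


def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    w = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * w) * w for i in range(n))


# Original integral: cos(x^2) * 2x dx, for x from 0 to 1
in_x = integrate(lambda x: math.cos(x ** 2) * 2 * x, 0.0, 1.0)

# After substituting u = x^2 and du = 2x dx: cos(u) du, for u from 0 to 1
in_u = integrate(lambda u: math.cos(u), 0.0, 1.0)

print(in_x, in_u, math.sin(1.0))
```

Both sums should agree with each other and with sin(1), the exact value, up to the small error of the midpoint rule.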
