What Derivative Notations Mean

Last week we looked at the meaning of the derivative. In doing so, we mostly used the notation “\(f'(x)\)”, but mentioned another in passing. Here I want to look at our answers to several questions about the different notations you will see.

Several different notations

We’ll start with this question from 2013:

Differences in Differentiation Notation?

Hello, I am currently studying calculus, and am doing derivatives. I know how to take derivatives of various functions, but I am confused about the notation.

I know that d/dx(f(x)) means "the derivative of function f." But sometimes, when I'm reading the book, I see they write dy/dx, d/dy, or however they write it.

I believe dy/dx means "derivative of y with respect to x," or something similar to that, but I also get VERY confused when they say "with respect to x;" I don't know what that means.

Can you please give me a general overview of the meanings of the different derivative notations?

Thanks!

The question is specifically about what is called Leibniz notation, after its inventor, one of the creators of calculus; but I will also mention a couple others.

Leibniz notation: dy/dx

I answered:

Hi, Zanzabar.

The basic idea is that we write dy/dx to remind us that the derivative is defined as

                 delta y
       lim      ---------
  delta x -> 0   delta x

That is, it is a slope: the ratio of a change in y to a change in x. But we think of those changes as being very, very small -- just looking at the limit.

Showing the notation as we couldn’t back then, this definition is $$\frac{dy}{dx} = \lim_{\Delta x\rightarrow 0}\frac{\Delta y}{\Delta x}$$ This is equivalent to the forms of the definition we discussed last week, where we called \(\Delta x\) “h” and used \(f(x)\) for y.
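To see the limit at work numerically, here is a small Python sketch. It is my own illustration, not part of the original answer; the function names and the sample function \(y = x^2\) are just choices made for the example.

```python
def difference_quotient(f, x, dx):
    """Return (change in y) / (change in x) for a step dx starting at x."""
    return (f(x + dx) - f(x)) / dx


def square(x):
    return x * x


# Shrink the change in x and watch the ratio settle toward the slope at x = 3.
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, difference_quotient(square, 3.0, dx))
```

For this function the ratio works out to \(6 + \Delta x\), so the printed values close in on 6, which is the derivative of \(x^2\) at \(x = 3\).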

When calculus was first invented, they actually thought of dx and dy as very tiny quantities, but that led to difficulties in logic -- sometimes you'd be thinking of dx as actually being zero, and sometimes not. (A philosopher complained that mathematicians were talking about "the ghosts of departed quantities," and it was a valid complaint!) The idea of limits was created to avoid those problems; but we can still informally think of them that way.

So "d" can be thought of as meaning "a very small change in ...," and this ...

   df(x)
   -----
    dx

... means "the ratio of a very small change in the value of the function to the very small change in x that caused it." (Don't tell a mathematician that I told you this ... ;-)

I was exaggerating the disapproval of mathematicians. In fact, there is a field of mathematics today that has formally defined these concepts of “infinitesimals”, which Doctor Jordi discussed here:

Nonstandard Analysis and the Hyperreals, by Jordi Gutierrez Hermoso

This approach is used at an introductory level in the textbook Elementary Calculus: An Infinitesimal Approach, by H. Jerome Keisler. So the limit approach is not the only mathematically valid way to learn about derivatives.

But we usually think of the notation as merely looking like a fraction.

When we say "derivative with respect to x," we mean that we are talking about the rate of change of the function value in relation to the change in x, as opposed to some other variable that might be floating around. (This becomes a lot more important when we are dealing with functions of more than one variable, but you won't get to that for a while.) For example, I could tell you how fast the temperature is rising "with respect to time," in order to emphasize that I am talking about how much the temperature is increasing per day, say, rather than how much hotter it is for every mile you drive south. The former would be a ratio of temperature to time; the latter, a ratio of temperature to distance. We usually know what that bottom variable is going to be from context, but sometimes it's important to mention, and the notation makes it clear.

So "the derivative with respect to x" basically means "the derivative with dx on the bottom of the notation."

One place you will see multiple variables involved in one problem is when we use the chain rule: $$\frac{dy}{du}\cdot\frac{du}{dx}=\frac{dy}{dx}$$ There, it is very important which variable is used at each step.
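As a rough numerical check of that formula, here is a Python sketch of my own. The functions \(u = 3x + 5\) and \(y = \sin u\), the point \(x = 0.7\), and the helper name `slope` are all assumptions made for the illustration.

```python
import math


def u_of_x(x):
    return 3 * x + 5


def y_of_u(u):
    return math.sin(u)


def slope(f, t, h=1e-6):
    """Central difference quotient approximating the derivative of f at t."""
    return (f(t + h) - f(t - h)) / (2 * h)


x = 0.7
u = u_of_x(x)

dy_du = slope(y_of_u, u)                       # y differentiated with respect to u
du_dx = slope(u_of_x, x)                       # u differentiated with respect to x
dy_dx = slope(lambda t: y_of_u(u_of_x(t)), x)  # y differentiated with respect to x

print(dy_du * du_dx, dy_dx)
```

The two printed numbers should agree to several decimal places, and each slope is taken with respect to a different "bottom" variable.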

The more involved cases I referred to involve partial derivatives, a much later topic in calculus, where a different symbol \(\displaystyle\frac{\partial z}{\partial x}\) is used for clarity. Here, z might be a function of several variables, say \(z = f(x, y)\), and we are treating y temporarily as a constant.
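To make "treating y temporarily as a constant" concrete, here is a short Python sketch. The function \(z = x^2y + y^3\) and the helper names are my own choices for the example, not anything from the original answer.

```python
def f(x, y):
    return x**2 * y + y**3


def partial_x(f, x, y, h=1e-6):
    """Approximate the partial derivative in x: only x changes, y is held fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def partial_y(f, x, y, h=1e-6):
    """Approximate the partial derivative in y: only y changes, x is held fixed."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


print(partial_x(f, 2.0, 3.0))  # compare with 2xy = 12
print(partial_y(f, 2.0, 3.0))  # compare with x^2 + 3y^2 = 31
```

The same function gives two different rates of change at the same point, depending on which variable is on the bottom.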

Lagrange notation: f'(x)

I turned to the other common notation focusing on function notation, which many mathematicians today prefer:

The notation f'(x) is more formal, and avoids some of the dangerous implications of the df/dx notation. It doesn't suggest that the derivative actually IS a fraction (rather than the limit of a fraction); it just says we have a "derived function" -- a new function that we obtained from another by this process.

In one sense, this notation hides its meaning, making it less intuitive; any “derived function” could have been named this way. Its benefit is largely in not carrying any meaning that might distract you!

What makes this notation especially valuable is that it focuses attention on the function itself rather than on the variables. When we differentiate, we aren't really doing anything to the variables, but only making a new function, the values of which turn out to be rates of change of the given function. On the other hand, it can hide what variable we are differentiating with respect to. (It's the argument of the function, which isn't always showing; and when more than one variable is in view, we have to modify the notation.)
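The idea that differentiation makes a new function out of an old one can be sketched in Python as a function that takes a function and returns a function. This is only a numerical approximation under my own assumptions (a small fixed step `h`); the names `derivative` and `cube` are invented for the example.

```python
def derivative(f, h=1e-6):
    """Return a new function that approximates f' using a central difference."""
    def f_prime(x):
        return (f(x + h) - f(x - h)) / (2 * h)
    return f_prime


def cube(x):
    return x**3


cube_prime = derivative(cube)   # a new function, playing the role of f'
print(cube_prime(2.0))          # compare with 3 * 2^2 = 12
```

Notice that `derivative(cube)` never mentions a variable at all; the variable only appears when we evaluate the new function at a point.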

This notation, like the d/dx notation, can be applied either to a named function, like \(f'\), or to a variable (thought of as a function), e.g. \(y'\).

The modified notation used for functions of more than one variable looks like \(\displaystyle f_x=\frac{\partial f(x,y)}{\partial x}\).

Leibniz notation as an operator: d/dx

Now, when we write d/dx separately, as in d/dx f(x), we're thinking of it as an operator -- it tells us what we're doing to the function f, and is essentially equivalent to putting the prime mark on it (f'). The notation comes again from the analogy to fractions. Just as we can write ...

    2        2 * 6
   --- * 6 = ----- = 4
    3          3

... we can write

   d         df(x)
   -- f(x) = ----- = f'(x)
   dx         dx

It means "the derivative with respect to x of ...".

So just as $$\frac{2}{3}f(x) = \frac{2f(x)}{3},$$ we can write $$\frac{d}{dx}f(x) = \frac{df(x)}{dx}.$$ It’s mere notation.

I then referred to the two pages we’ll be looking at below.

Euler (\(Df\)) and Newton (\(\dot{y}\)) notations

I should mention here a couple other notations that are used in special contexts. One is \(Df\), which was mentioned last week. The D operator is essentially the same as \(\frac{d}{dx}\). You can see an example of it in action here:

Particular Solution of Differential Equation

Another notation is \(\dot{x} = \frac{dx}{dt}\), which was used by Newton, and is still found particularly in physics. The dot specifically indicates differentiation with respect to time; otherwise, it is equivalent to the prime notation applied to a variable, like \(y'\). We don’t seem to have ever discussed it, except for a couple unarchived answers.

Is dy/dx a fraction? Yes and no …

A question from 2004, the first of the two I directed Zanzabar to, looked more closely at the dy/dx notation:

Can dy/dx Be Treated as a Fraction?

When I learned about derivatives, I learned that dy/dx was a notation that implied "derivative of y with respect to x."  I understood that.  But I am confused about whether or not the notation dy/dx can be treated as a fraction, giving individual meanings to dy and dx.

For example, in integration by substitution:

integrate (sin(3x+5)dx)
  u = 3x+5
  du/dx = 3
  dx = du/3

That is the part that confuses me.  How can the dx be solved for?  What exactly is "dx"?

I understand that dy/dx is a limit, and that it is a slope.  But the idea of it being simply a notation doesn't help me understand how you can multiply out the bottom.  Any help would be appreciated...

This separate use of dx and dy is particularly common in integration, which is the inverse of differentiation, but is also used in other ways. Is it really legal to break dy/dx apart like this?

This topic was previously discussed in less depth, in the post Why Do People Treat dy/dx as a Fraction?, where I quoted an unarchived answer from 2015, and gave links to some of the pages we’re looking at here.

No …

Doctor Vogler answered:

Hi Amit,

Thanks for writing to Dr Math.  The easy answer to your question is that your definition for dy/dx is correct; it means the derivative of y with respect to x, and dy and dx are meaningless when written alone, so that

  dx = du/3

is not a meaningful expression but should be written

  dx/du = 1/3.

This follows the formal definition, in which dy/dx is a single symbol (the result of applying the operator d/dx to y), not an actual fraction. But certain things you can do with a derivative look an awful lot like what you can do with fractions:

And when certain nice things happen that *look* like fractions, such as:

            1
  dy/dz = -----
          dz/dy

and

  dz/dx = dz/dy * dy/dx

then this is actually just the Chain Rule at work.  And the reason that

  integral( f(g(x)) g'(x) dx ) = integral( f(u) du )

is not that

  u = g(x)

implies

  du = g'(x) dx

but rather the Chain Rule again.

The chain rule, and the related concept of substitution in an integral, “just happen” to look as if you can juggle dx and dy and du, called differentials, as if they were numbers themselves. But the rules for substitution in an integral are proved by the chain rule, not by treating differentials as real entities.
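Whatever the justification, we can at least check numerically that the substitution in Amit's example gives the same answer. The following Python sketch is my own; the limits of integration (x from 0 to 1, so u from 5 to 8) and the simple midpoint rule are assumptions chosen for the illustration.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Original integral in x, from x = 0 to x = 1.
in_x = midpoint_integral(lambda x: math.sin(3 * x + 5), 0.0, 1.0)

# After substituting u = 3x + 5 and replacing dx with du/3:
# the limits become u = 5 to u = 8.
in_u = midpoint_integral(lambda u: math.sin(u) / 3, 5.0, 8.0)

# Exact value from the antiderivative -cos(u)/3.
exact = (math.cos(5.0) - math.cos(8.0)) / 3

print(in_x, in_u, exact)
```

All three numbers should agree to many decimal places, as long as the limits are changed along with the variable.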

… But, yes, in fact

All of that is true, except that I should qualify the "not a meaningful expression."  You see, something is only meaningless until somebody gives it a formal meaning.  Then you hope that the meaning they gave it has useful properties (such as, that it relates to derivatives...).  In fact, this has been done, and there is a good deal of mathematics that has gone into the theory of differentials, and it fits into integrals, and putting the differential "dx" at the end of every integral also makes sense according to this theory, and so on.  One math doctor alluded to some of this on

  Differentials
  http://mathforum.org/library/drmath/view/53678.html

We’ll be looking at this next week.

You can also get books that discuss this in more detail.  But the fact is that most people who use calculus don't really need all of the theory of differentials, and the Chain Rule indeed suffices to verify most facts that you would get from treating dy/dx as a fraction.  The reason you can treat it as a fraction is that dy/dx is the limit of a fraction, and so most of the operations you would do to the fraction you can do before you take the limit.  In other words, *before* the limit is taken, it *is* a fraction, so you can treat it as one.  But then you take the limit and it becomes a derivative.

We can say that the derivative “inherits” some (not all) of the behavior of fractions from the fraction (difference quotient) that is at the heart of its definition. That fraction is the motivation for the notation, and the notation makes it easy to remember things like the chain rule, but you have to remember that in spite of all this, it isn’t really a fraction, and you can only treat it as a fraction where there is a theorem that says you can.
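One of the fraction-like facts Doctor Vogler listed, that dy/dz is the reciprocal of dz/dy, is easy to test numerically. Here is a Python sketch of my own, using \(y = x^3\) and its inverse \(x = y^{1/3}\) at the point \(x = 2\), \(y = 8\); the helper name `slope` is invented for the example.

```python
def slope(f, t, h=1e-6):
    """Central difference quotient approximating the derivative of f at t."""
    return (f(t + h) - f(t - h)) / (2 * h)


def y_of_x(x):
    return x**3


def x_of_y(y):
    # Inverse of y = x^3 for positive y.
    return y ** (1.0 / 3.0)


dy_dx = slope(y_of_x, 2.0)   # compare with 3 * 2^2 = 12
dx_dy = slope(x_of_y, 8.0)   # compare with 1/12

print(dy_dx, dx_dy, dy_dx * dx_dy)
```

The product should come out very close to 1. This works because there is a theorem about derivatives of inverse functions behind it, not merely because the notation looks like a fraction.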

One place where we actually write differentials on their own, besides integration, is in estimation, which ties in with the theory of differentials:

Finally, there is also the theory of estimating with derivatives, where I always say to think of

  dx = change in x
  dy = change in y
  x = unchanged value of x
  y = unchanged value of y

For example, to estimate (1.98)^6, we use

  y = x^6
  dy = 6*x^5 dx
  x = 2
  dx = -0.02

(so that x + dx = 1.98), and therefore

  y = 2^6 = 64
  dy = 6*x^5 dx = 6*32*-0.02 = -3.84

which implies that

  y + dy = 64 - 3.84 = 60.16,

which is, in fact, a pretty close approximation to (1.98)^6.  And this is essentially treating dy/dx as a fraction before we've taken the limit, since dx doesn't go all the way to zero but only to -0.02.

In our intuitive idea of the meaning of the derivative, we can think of dx and dy as infinitesimal (very tiny) changes in the variables; here, we allow them to have any size, while recognizing that our estimate will not be very good if they get too large.
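Doctor Vogler's estimate is easy to reproduce and compare with the true value. In this Python sketch (my own, with variable names chosen to match his), `dy` is computed from the derivative exactly as in his work:

```python
def f(x):
    return x**6


def f_prime(x):
    return 6 * x**5


x = 2.0            # the "unchanged" value of x
dx = -0.02         # the change in x, so that x + dx = 1.98

y = f(x)                 # 64
dy = f_prime(x) * dx     # 6 * 32 * (-0.02) = -3.84
estimate = y + dy        # about 60.16
actual = f(x + dx)       # (1.98)^6, about 60.2547

print(estimate, actual, actual - estimate)
```

The estimate is off by a little under 0.1. Try a larger change such as `dx = -0.5` to see how quickly the estimate gets worse.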

Does this help you to understand how differentials are a fraction in some sense but not in others?  If you have any questions about this or need more help, please write back and show me what you have been able to do, and I will try to offer further suggestions.

What’s squared in the second derivative?

The second page I recommended was this, also from 2004:

Meaning of Second Derivative Notation

What does the second derivative notation, (d^2*y)/(d*x^2), really mean?  

I understand that the notation in the numerator means the 2nd derivative of y, but I fail to understand the notation in the denominator.  Isn't it supposed to mean with respect to x?  Why is there an x^2 in the notation?

We haven’t until now shown any second derivatives, which look like this: $$f''(x) = D^2y = \frac{d^2y}{dx^2}$$ In Newton’s notation, $$\frac{d^2x}{dt^2} = \ddot{x}$$

I answered this one:

Hi, Jamie.

I don't think this is explained nearly as often as it should be!  There is no x^2 in this notation, and in fact no multiplication (i.e., it is _not_ d*x^2 as you say).  It is

  d^2y
  ----
  dx^2

and the "d" represents the "differential operator", which evidently has higher precedence than exponentiation.  That is, "dx" as a whole is thought of as a quantity (think of it as a small change in x), and the denominator is "(dx)^2".

Having written a lot on the order of operations, I’m sensitive to the fact that if \(dx^2\) were a multiplication, it would mean \(d\cdot(x^2)\), with the exponent applying only to x. Instead, you have to think of d as an operator that “binds tightly to” the variable, making dx a single unit. The second derivative notation comes from once again imagining that the derivative is really a fraction:

But here is where it comes from: the second derivative is just the derivative of the derivative, or

  d  dy    d(dy)    d^2y
  --(--) = ------ = ----
  dx dx    (dx)^2   dx^2

You might read it as "the second derivative of y, with respect to x TWICE"; that last word is the reason for the "dx^2".  When you have functions of more than one variable you can see things like

  d^2z
  -----
  dx dy

(though a modified "d" is used to avoid some confusion); this means you are taking one derivative with respect to x and another with respect to y:

  d  dz
  --(--)
  dx dy

These last things, partial derivatives, look like $$\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2z}{\partial x\partial y}$$
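Going back to the one-variable case, the \((dx)^2\) in the denominator shows up naturally if we work with small finite changes. In this Python sketch (my own illustration; the function names and the choice of \(y = x^3\) are assumptions), the numerator is a "change in the change of y", like \(d(dy)\), and it has to be divided by the step squared:

```python
def second_difference_quotient(f, x, dx):
    """(change in the change of y) divided by (dx)**2."""
    first_change = f(x + dx) - f(x)             # one small change in y
    next_change = f(x + 2 * dx) - f(x + dx)     # the following small change in y
    return (next_change - first_change) / dx**2


def cube(x):
    return x**3


for dx in (0.1, 0.01, 0.001):
    print(dx, second_difference_quotient(cube, 2.0, dx))
```

The values approach 12, which is the second derivative \(6x\) at \(x = 2\). Dividing by `dx` only once would give numbers that shrink toward zero instead.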

I closed with a reference to the same page Doctor Vogler had mentioned:

This notation is based on analogies to fractions, and it can be dangerous to imagine that the dx and dy and d alone actually stand for numbers; but the notation works very well in making many formulas memorable.  See this page for more on differentials:

  Differentials
  http://mathforum.org/library/drmath/view/53678.html

Again, we’ll be looking at that next week.

Another view of the second derivative

Let’s look at one more question about this, from 2009:

Differential Notation

Why is the second derivative written as d^2y/dx^2?

Doctor Vogler replied, with thoughts similar to mine:

Hi Jordan,

Thanks for writing to Dr. Math.  It seems mysterious how the numerator repeats the d but the denominator repeats the dx, but it makes sense if you think about it the right way.

You are familiar with the notation

  dy
  --
  dx

to mean "the derivative of y with respect to x".  Sometimes, when you want to use a formula instead of a variable, you can put that in the place of y, as in

  d(cos x)
  --------
     dx

or, more conveniently (especially for complicated formulas),

  d
  --(cos x).
  dx

As we saw earlier, we can put the variable or function either within the fractional notation, or outside of it.

So then if that is the derivative of cos x, then what is its derivative?  Naturally, it would be

  d   d
  --( --(cos x) ).
  dx  dx

Does that make sense?  And when you write it this way, it is natural that one would abbreviate this by using the "squaring" notation on the d in the numerator and the dx in the denominator, as in

  d^2
  ----(cos x).
  dx^2

Of course, you're not really squaring, because it's not multiplication; but it's a convenient notation, especially when the 2 becomes 12 or an unspecified n.  Similarly, the derivative of dy/dx would be

  d  dy
  --(--)
  dx dx

which is why you get d^2y/dx^2.
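The "apply d/dx, then apply d/dx again" idea can be imitated in Python by writing the operator as a function that turns one function into another. This is a numerical approximation of my own (the name `d_dx` and the step size are assumptions), not an exact derivative:

```python
import math


def d_dx(f, h=1e-4):
    """Return a function approximating the derivative of f (central difference)."""
    def df(x):
        return (f(x + h) - f(x - h)) / (2 * h)
    return df


first = d_dx(math.cos)            # d/dx (cos x), approximately -sin x
second = d_dx(d_dx(math.cos))     # d/dx ( d/dx (cos x) ), approximately -cos x

x = 1.0
print(first(x), -math.sin(x))
print(second(x), -math.cos(x))
```

Nothing is being squared here; the operator is simply applied twice, which is what the "2" in the notation records.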

He also comments on the order of operations issue:

I should like to point out that we are *NOT* squaring the x in the denominator, like d/d(x^2), but are squaring the dx.  It might have been better if we wrote (dx)^2 instead of dx^2, but that is not the notation in common usage by mathematicians, who instead treat the "dx" as a single piece.

Does this explanation make sense?  I hope I have cleared things up for you.  If you have any questions about this or need more help, please write back, and I will try to offer further suggestions.

Next time: Differentials.
