Differentiability, along with the concept of derivative, is by far the most central
concept of modern analysis and has useful applications in all realms of
mathematics and quantitative science. It is therefore paramount to develop a deep
and hands-on understanding of it. We illustrate its geometric meaning by a very
simple one-variable example. Take the two real-valued functions
$$
f(x)=x+\frac{x^2}{2}\text{ and }g(x)=\frac{1}{2}(x+|x|),\: x\in \mathbb{R}.
$$
We are interested in their behavior about the point $x=0$. A plot of these functions
helps us get started.
They both seem to be reasonably well-behaved, simple functions. Since we are interested in
the local behavior about the point $x=0$, we zoom in to get a clearer picture. You
will notice that the function $g$ (in red) looks the same in both pictures in
that it exhibits a kink at the origin and, if no scale were indicated, you could not
tell the two plots apart. The function $f$, on the other hand, looks like
a line at this scale. Notice that the graph of $f$ is dashed in the second figure to
keep both functions visible for $x\geq 0$.
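For the reader who wants to reproduce the two pictures, here is a minimal plotting sketch (assuming numpy and matplotlib are available; the window half-widths are arbitrary choices):

```python
import numpy as np
import matplotlib.pyplot as plt

f = lambda x: x + x**2 / 2           # smooth at the origin
g = lambda x: (x + np.abs(x)) / 2    # kink at the origin

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
for ax, half_width in zip(axes, [2.0, 0.05]):    # unit scale, then zoomed in
    x = np.linspace(-half_width, half_width, 1001)
    # dash f in the zoomed panel so both graphs stay visible for x >= 0
    ax.plot(x, f(x), 'b--' if half_width < 1 else 'b-', label='f')
    ax.plot(x, g(x), 'r-', label='g')
    ax.set_title(f'window half-width {half_width}')
    ax.legend()
plt.show()
```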
What do we learn from this? Well, for one thing, we understand that there are
functions, like $f$, which start to look like lines once you zoom into a small
enough neighborhood of a point, and functions, like $g$, which do not enjoy this property.
If a function does indeed look like a line in small neighborhoods of a
given point, then it arguably is very easy to understand its behavior
in that neighborhood. This simple insight is at the heart of the proof
of the important Implicit Function Theorem presented in the next lecture.
Next we would like to predict when this important property holds for a
given function. First notice that, for a line-function such as
$l(x)=mx+k$ for $x\in \mathbb{R}$ and $m,k\in \mathbb{R}$, it
trivially holds that
$$
\frac{l(y)-l(x)}{y-x}\equiv m,\: x\neq y\in \mathbb{R}.
$$
A function $f:\mathbb{R}\to \mathbb{R}$ would behave approximately like a line about
a point $x_0\in \mathbb{R}$ if
$$
\frac{f(y)-f(x)}{y-x}\approx m\text{ for }x_0\approx x\neq y\approx x_0,
$$
and for some slope $m=m(x_0)$, which will, in general, depend on
$x_0$. A natural way to formalize this is to require that $\lim _{x\to
x_0}\frac{f(x)-f(x_0)}{x-x_0}$ exist, which is, of course, the definition
of differentiability at $x_0$. Differentiability therefore amounts to local
approximability by lines. This point of view remains valid in any dimension, that is
for functions $f:\mathbb{R}^n\to\mathbb{R}^m$ and any $m,n\in \mathbb{N}$ (and,
even more generally, for functions between possibly infinite-dimensional vector
spaces and between smooth manifolds). One needs, however, to replace "lines'' by
"planes'' (at least for $m=1$). Differentiable functions are therefore functions whose
graph can be well approximated locally by planes, or, equivalently, functions which are
well approximated by linear (affine, actually) functions. Take
$f:\mathbb{R}^n\to\mathbb{R}^m$ and $x_0\in \mathbb{R}^n$. Then $f$ is
differentiable at $x_0$ iff there exists (a matrix) $M=M_{x_0}\in
\mathbb{R}^{m\times n}$ (depending on $x_0$ in general) for which
$$
f(x_0+h)\approx f(x_0)+Mh\text{ for small }h\in \mathbb{R}^n.
$$
This is not a precise definition and you should gain a good grasp of
the formal definition given in the lecture.
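As a numerical illustration of this approximate identity, here is a minimal sketch (the map $f(x,y)=(x^2+y,\,xy)$ and the base point are arbitrary choices made for this example): the error $|f(x_0+h)-f(x_0)-Mh|$ should vanish faster than $|h|$.

```python
import numpy as np

def f(v):
    # Example map f: R^2 -> R^2, f(x, y) = (x^2 + y, x*y).
    x, y = v
    return np.array([x**2 + y, x * y])

x0 = np.array([1.0, 2.0])

# Analytic Jacobian M at x0; its rows are the gradients of the components.
M = np.array([[2 * x0[0], 1.0],
              [x0[1],     x0[0]]])

# f(x0 + h) - f(x0) agrees with M h up to an o(|h|) error.
for scale in [1e-1, 1e-2, 1e-3]:
    h = scale * np.array([1.0, -1.0])
    err = np.linalg.norm(f(x0 + h) - f(x0) - M @ h)
    print(f"|h| = {np.linalg.norm(h):.1e},  error/|h| = {err / np.linalg.norm(h):.2e}")
```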
Differentiation
Definition (Differentiability)
Let $f:[a,b]\to \mathbb{R}$ be given and take $x_0 \in (a,b)$. Then $f$ is said to
be differentiable at $x_0$ if and only if
$$
\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x-x_0}
$$
exists, in which case it is denoted by $f'(x_0)$. In other words, $f$ is
differentiable at $x_0$ if and only if there is a real number $f'(x_0)$ such that
$$
\lim_{x \to x_0} \frac{|f(x) - f(x_0) - f'(x_0)(x-x_0)|}{|x-x_0|} = 0.
$$
The latter condition is often rewritten as
$$
f(x)-f(x_0)-f'(x_0)(x-x_0)=o\bigl(|x-x_0|\bigr)\quad\text{ as }x\to x_0\, .
$$
Remark
It is useful to notice that a function $f$ is differentiable at $x_0$
if and only if it is well approximated (to better than first order) by
an affine function "around $x_0$''. The affine function is then
uniquely determined and given by
$$
x\mapsto f(x_0)+f'(x_0)(x-x_0)\, .
$$
Continuity of $f$ at a point $x_0$ can also be formulated similarly as
$$
f(x)-f(x_0)=o(1)\quad \text{ as }x\to x_0\, ,
$$
and thus amounts to $f$ being well approximated (to better than zeroth
order) by $f(x_0)$ "around $x_0$''.
Notation (Little o and Big O)
Let $g$ be a non-negative real-valued function of a real variable and let
$f:\mathbb{R}\to \mathbb{R}$ be given. For $x_0\in \mathbb{R}$, it is
said that
$$
f=O(g)\text{ as }x\to x_0\:\Longleftrightarrow\: \exists\:
C\in \mathbb{R},\:\delta>0\text{ s.t. }|f(x)|\leq C\, g(x)\text{ if }
|x-x_0|\leq \delta.
$$
Notice that, as long as $g$ is strictly positive near $x_0$, this is the same as
requiring that $\limsup_{x\to x_0}\frac{|f(x)|}{g(x)}<\infty$. Analogously it is said that
$$
f=o(g)\text{ as }x\to x_0\:\Longleftrightarrow\:\lim_{x\to
x_0}\frac{f(x)}{g(x)}=0.
$$
This very useful notation means, in the big O case, that $f$ grows (in
size) at most at a rate comparable to that of $g$ as $x$ approaches $x_0$: they are,
at worst, of the "same order". Similarly, the little o notation indicates that
$f$ vanishes "faster" (to a higher order) than $g$ as $x$ tends to
$x_0$.
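For instance, as $x\to 0$ one has $\sin x=O(|x|)$ and $1-\cos x = o(|x|)$, since $\frac{|\sin x|}{|x|}\to 1$ while $\frac{1-\cos x}{|x|}\to 0$; on the other hand, $\sin x\neq o(|x|)$.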
Exercise
Find examples of concrete functions $f$ and $g$ which are and are not
little o or big O of each other.
Proposition
If $f$ is differentiable at $x_0$, then $f$ is continuous at $x_0$.
Exercise
Give a proof of this simple observation.
Proposition (Elementary Properties)
Let $f$ and $g$ be differentiable at $x$, i.e. let $f'(x)$ and $g'(x)$ exist. Then
(i) (Linearity) $(f \pm g)'(x) = f'(x) \pm g'(x)$.
(ii) (Product rule) $(fg)'(x) = f'(x)g(x) + f(x) g'(x)$.
(iii) (Quotient rule) $\big(\frac{f} {g}\big)'(x)= \frac{f'(x)g(x) - g'(x)f(x)}{g^2(x)}$, provided $g(x)\neq 0$.
(iv) (Chain Rule) $(f \circ g)'(x) = f'\bigl(g(x)\bigr)g'(x)$, where here $f$ is assumed to be differentiable at $g(x)$ rather than at $x$.
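As an illustration of how the little o calculus interacts with these rules, here is a sketch of the proof of the product rule: writing $f(x+h)=f(x)+f'(x)h+o(h)$ and $g(x+h)=g(x)+g'(x)h+o(h)$ as $h\to 0$, multiplying out gives
$$
f(x+h)g(x+h)=f(x)g(x)+\bigl[f'(x)g(x)+f(x)g'(x)\bigr]h+o(h)\quad\text{as }h\to 0,
$$
since all remaining terms are at least quadratic in $h$. The coefficient of $h$ is then $(fg)'(x)$.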
Example
Let
$$
f(x) =\begin{cases} x\sin(1/x),&x\ne 0\, ,\\0\, ,&x=0,\end{cases}
$$
and discuss the differentiability of $f$ on $\mathbb{R}$.
(i) $f$ is not differentiable at $x=0$. Indeed
$$
\frac{f(x) -f(0)}{x-0} =\frac{x \sin (1/x) -0}{x-0} =\sin(1/x)\, ,
$$
which has no limit as $x$ tends to $0$.
(ii) $f$ is differentiable on $\mathbb{R}\setminus \{0\}$ by the
product rule since $x$ and $\sin(1/x)$ are differentiable there.
Example
Define
$$
g(x) =\begin{cases} x^2 \sin(1/x)\, ,&x\ne0\, ,\\
0\, ,&x=0\, .\end{cases}
$$
Then
$$
g'(0)=\lim_{x\to 0} \frac{g(x)-g(0)}{x}=\lim_{x\to 0}\frac{x^2 \sin(1/x)-0}{x}=\lim_{x\to 0}x\sin(1/x)=0
$$
and
$$
g'(x) = \begin{cases}2x \sin(1/x) - \cos(1/x),&x\ne0\, ,\\0\, ,&x=0.\end{cases}
$$
The derivative $g'(x)$ exists at every point $x\in \mathbb{R}$, but
notice that $g'$ is not continuous at $x=0$. This shows that
differentiability does not imply continuity of the derivative (as a
function).
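A quick numerical sketch of this phenomenon (plain Python, for illustration only):

```python
import math

def g(x):
    return x**2 * math.sin(1 / x) if x != 0 else 0.0

# The difference quotients at 0 are h*sin(1/h), so they shrink like |h| ...
for h in [1e-1, 1e-3, 1e-5]:
    print(f"h = {h:.0e},  (g(h) - g(0))/h = {(g(h) - g(0)) / h:+.2e}")

# ... while g'(x) = 2x sin(1/x) - cos(1/x) keeps oscillating between
# roughly -1 and +1 at the points x = 1/(k*pi), however small.
for k in (1000, 1001):
    x = 1 / (k * math.pi)
    print(f"x = {x:.3e},  g'(x) = {2 * x * math.sin(1/x) - math.cos(1/x):+.3f}")
```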
Mean value theorems
Theorem (Rolle)
If $f$ is continuous on $[a,b]$, differentiable on $(a,b)$, and
satisfies $f(a) = f(b)$, then there is $x_0 \in (a,b)$ such that
$f'(x_0) = 0$.
Since $f$ is continuous on $[a, b]$, it attains a maximum and a
minimum. Now, as $f(a) = f(b)$, there must be $x_0 \in (a,b)$ such
that $f(x_0)$ is either a maximum or a minimum of $f$ on $[a, b]$
unless $f$ is constant (in which case the theorem is proved). Without
loss of generality, we may assume $f(x_0)$ is a minimum.
$$
f'(x_0) = \lim_{x \to x_0^{+}} \frac{f(x) - f(x_0)}{x-x_0} \geq 0
\quad\text{and}\quad
f'(x_0) = \lim_{x \to x_0^{-}} \frac{f(x) - f(x_0)}{x-x_0} \leq 0.
$$
Therefore, $f'(x_0) = 0$.
Theorem (Lagrange's Mean Value Theorem)
Let $f$ be continuous on $[a,b]$ and $f'(x)$ exist for all
$x\in(a,b)$. Then, there is $x_0 \in (a,b)$ with
$$
f(b) - f(a) = f'(x_0)(b-a)\quad \text{ or }\quad f'(x_0) = \frac{f(b)
- f(a)}{b-a}.
$$
Define
$$
g(x) = f(x) - \bigl[f(a) + \frac{f(b) - f(a)}{b-a}(x-a)\bigr]
$$
and observe that $g \in C[a,b]$, $g$ is differentiable on $(a,b)$, and
$g(a)=g(b)=0$.
Rolle's Theorem therefore yields $x_0 \in (a,b)$ with $g'(x_0) = 0$,
which, in turn, means
$$
g'(x_0) = f'(x_0) - \frac{f(b) - f(a)}{b-a} = 0\, .
$$
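As a concrete instance, take $f(x)=x^3$ on $[0,1]$: the theorem guarantees a point $x_0\in(0,1)$ with
$$
3x_0^2=f'(x_0)=\frac{f(1)-f(0)}{1-0}=1,\quad\text{i.e. }x_0=\frac{1}{\sqrt{3}}.
$$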
Next we formulate the general version of the mean value theorem.
Theorem (Mean Value Theorem)
Let $f$ and $g$ be continuous on $[a,b]$ and differentiable on $(a,b)$.
Then there is $x_0 \in (a,b)$ such that
$$
\big[g(b) - g(a)\big]f'(x_0) = \big[f(b) -f(a)\big]g'(x_0)\, .
$$
Set $h(x)=f(x)\big[g(b)-g(a)\big]-g(x)\big[f(b)-f(a)\big]$ and apply
Rolle's theorem using that $h(b)-h(a)=0$.
Corollary
Let $f$ be differentiable on $(a,b)$. Then:
(i) If $f'(x) \ge 0$ on $(a,b)$, then $f$ is non-decreasing on $(a,b)$.
(ii) If $f'(x) \le 0$ on $(a,b)$, then $f$ is non-increasing on $(a,b)$.
(iii) If $f'(x) = 0$ on $(a,b)$, then $f$ is constant on $(a,b)$.
All three statements follow by applying Lagrange's theorem to arbitrary subintervals $[x_1,x_2]\subset(a,b)$.
Remark
If $f$ is continuous on $[a,b]$, then, given any $y$ between $f(a)$
and $f(b)$, $x \in (a,b)$ can be found with $f(x)=y$ by the
intermediate value theorem.
Notice that an intermediate value theorem for
$f'$ holds even without assuming that $f'$ is continuous. Indeed:
Theorem (Intermediate Value Theorem for $f'$)
Let $f$ be differentiable on $[a,b]$ and assume that $f'(a) <
f'(b)$. Then, for any $m\in\bigl(f'(a),f'(b)\bigr)$, there is $x_0\in (a,b)$ such
that $f'(x_0) =m$.
Define $g(x) = f(x) - m(x-a)$ and observe that
$$
g'(a) = f'(a) - m < 0
$$
and hence that there is $x_1>a$ such that $g(x_1)< g(a)$. Similarly
$g'(b) = f'(b) - m>0$ and $x_2< b$ can be found such
that $g(x_2)< g(b)$. Since $g$ is continuous on the compact interval
$[a,b]$, it attains its minimum there and, by the above, it cannot do so
at $a$ nor at $b$. Thus there is $x_0\in (a, b)$ such that
$g(x_0)$ is the minimum of $g$ on $[a, b]$ and so $g'(x_0)=0$,
i.e. $f'(x_0)=m$.
Theorem (L'Hôpital)
Let $f$ and $g$ be differentiable in $(a,b)\ni x_0$ and $g'(x_0)\ne
0$. If
$
\lim_{x \to x_0} f(x) = \lim_{x \to x_0}g(x) = 0,
$
then
$$
\lim_{x \to x_0} \frac{f(x)}{g(x)} =\frac{f'(x_0)}{g'(x_0)}.
$$
Differentiability at $x_0$ amounts to
$$
f(x)=f(x_0)+f'(x_0)(x-x_0)+o(|x-x_0|)\text{ and
}g(x)=g(x_0)+g'(x_0)(x-x_0)+o(|x-x_0|)\text{ as }x\to x_0.
$$
Since $f(x_0)=g(x_0)=0$ by continuity, it follows that
$$
\frac{f(x)}{g(x)}=\frac{f'(x_0)(x-x_0)+o(|x-x_0|)}{g'(x_0)(x-x_0)+o(|x-x_0|)}
=\frac{f'(x_0)+o(|x-x_0|)/(x-x_0)}{g'(x_0)+o(|x-x_0|)/(x-x_0)}\text{
as }x\to x_0.
$$
Using the definition of little o and the fact that $g'(x_0)\neq 0$,
the claim follows from taking the limit.
Example
One has that $\lim_{x \to 0+} x \log \frac{1}{x}=0$ since
$$
\lim_{x\to 0+}x \log \frac{1}{x}=\lim_{y\to\infty}\frac{\log y}{y}=
\lim_{y\to\infty}\frac{1}{y}=0.
$$
Question
Why is the above not a correct example of a direct application of
L'Hôpital's rule? How can you fix it?
Theorem (Taylor's Expansion)
Let $f$ be differentiable up to order $(n+1)$ on $[a,b]$. Then there
is $\xi\in(a,b)$ such that
$$
f(b) = \sum_{j=0}^n \frac{f^{(j)}(a)}{j!} (b-a)^j+
\frac{f^{(n+1)}(\xi)}{(n+1)!}(b-a)^{n+1} \, .
$$
For $M$ to be determined later, consider the function
$$
g(x) = f(x) - \sum_{j=0}^n \frac{f^{(j)}(a)}{j!} (x-a)^j
-\frac{M}{(n+1)!}(x-a)^{n+1},
$$
and observe that it satisfies
$$
g^{(k)}(x)=f^{(k)}(x)-\sum
_{j=k}^n\frac{f^{(j)}(a)}{(j-k)!}(x-a)^{j-k}-\frac{M}{(n+1-k)!}(x-a)^{n+1-k},
$$
and consequently $g^{(k)}(a)=0$ for $k=1,\dots, n$. Let now $M$ be the
number such that
$$
g(b)=f(b) - \sum_{j=0}^{n} \frac{f^{(j)}(a)}{j!} (b-a)^j
-\frac{M}{(n+1)!}(b-a)^{n+1}=0.
$$
It follows that $g(a) = 0$, $g(b) = 0$. By Rolle's Theorem, there is
then $x_1 \in (a,b)$ such that $g'(x_1) = 0$. Then $g'(a)=g'(x_1)=0$
and Rolle's theorem yields $x_2\in(a,x_1)$ such that
$g''(x_2)=0$. Continuing in this fashion we obtain $x_n\in(a,x_{n-1})$
with the property that $g^{(n)}(x_n)=0=g^{(n)}(a)$ and one last use
of Rolle's theorem yields $\xi\in(a,x_n)\subset(a,b)$ with $g^{(n+1)}
(\xi) =0$.
Now $g^{(n+1)}(x) = f^{(n+1)} (x) - M$, so that $f^{(n+1)}(\xi) - M=
0$ and thus
$$
f(b)= \sum_{j=0}^n \frac{f^{(j)}(a)}{j!} (b-a)^j +
\frac{f^{(n+1)}(\xi)}{(n+1)!}(b-a)^{n+1}\, .
$$
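The remainder term makes the expansion quantitative. As a numerical sketch (the choice $f=\exp$, $a=0$, $b=1$ is made purely for illustration), since $f^{(n+1)}=\exp\leq e$ on $[0,1]$, the error of the $n$-th Taylor polynomial is bounded by $e/(n+1)!$:

```python
import math

b = 1.0
for n in range(2, 9, 2):
    # Taylor polynomial of exp at a = 0: all derivatives f^{(j)}(0) equal 1.
    taylor = sum(b**j / math.factorial(j) for j in range(n + 1))
    error = abs(math.exp(b) - taylor)
    bound = math.e / math.factorial(n + 1)   # sup of f^{(n+1)} on [0,1] is e
    print(f"n = {n}:  error = {error:.2e}  <=  bound = {bound:.2e}")
```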
Definition (Real Analyticity)
We say that a function $f$ is analytic at $x_0$
if and only if
$$
f(x)=\sum_{j=0}^\infty \frac{f^{(j)}(x_0)}{j!} (x-x_0)^j,\ \hbox{ for
} \ |x-x_0|<\delta
$$
and some $\delta>0$. If $f$ is analytic at every point in $[a,b]$, we
say that $f$ is real analytic in $[a, b]$.
Notation
The space of analytic functions on $[a,b]$ is denoted by
$\operatorname{C}^\omega\bigl([a,b]\bigr)$. Furthermore it will be
useful to have
$$
\operatorname{C} ^k(a,b):=\big\{ f:(a,b)\to \mathbb{R}\, :\,
f,f',\dots,f^{(k)} \in\operatorname{C}(a,b)\big\}
$$
and
$$
\operatorname{C}^{\infty}(a,b):=\big\{ f\in \operatorname{C}(a,b)\,
:\, f^{(k)}\in\operatorname{C}(a,b)\, ,\: k=0,1,2,\dots\big\}.
$$
Example
Find a function $f \in C^{\infty}(-\infty,\infty)$ which is not real
analytic at $x=0$.
Define
$$
f(x)=\begin{cases} e^{-1/x^2} \, ,& x \ne 0\\ 0\, ,& x= 0.\end{cases}
$$
Observe that $1/x^2 \in C^{\infty}(\mathbb{R} \setminus\{ 0\})$ and
that
$$
f^{(k)}(x)=f(x)\, p_{3k}(1/x)\, ,\quad x\neq 0,
$$
where $p_{3k}(x)$ is a polynomial of degree $3k$. It follows that $f \in
C^{\infty}(\mathbb{R} \setminus\{ 0\})$. In order to prove that $f\in
C^\infty (-\infty, \infty)$, it suffices to show that $f^{(j)}(0)=0$
for $j=0, 1,2,\dots$. This follows from the fact that
$$
\lim_{x\to 0}\frac{1}{x^{3m}} \exp(-1/x^2)=0\text{ for }m\in
\mathbb{N},
$$
which can be seen by substituting $y=1/x^{2}$ and using that $y^{p}e^{-y}\to 0$ as $y\to\infty$ for every $p>0$.
However, $f$ cannot be analytic at $x=0$, since otherwise
$$
f(x)=\sum_{j=0}^\infty \frac{f^{(j)}(0)}{j!}x^j=0\text{ for small }x,
$$
contradicting the fact that $f(x)\neq 0$ for $x\neq 0$.
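Numerically, the flatness of $f$ at the origin is quite striking (a minimal sketch in plain Python):

```python
import math

def f(x):
    return math.exp(-1 / x**2) if x != 0 else 0.0

# Symmetric second difference quotients at 0 vanish extremely fast,
# consistent with f^{(j)}(0) = 0 for every j.
for h in [0.5, 0.2, 0.1]:
    q = (f(h) + f(-h) - 2 * f(0)) / h**2
    print(f"h = {h}:  second difference quotient = {q:.3e}")
```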
Example
Suppose $f$ is defined on $(x_0-\delta, x_0 +\delta)$ and that
$f''(x_0)$ exists for some $x_0$ and $\delta>0$. Show that
$$
\lim_{h \to 0} \frac{ f(x_0+h) + f(x_0-h) -2 f(x_0) }{h^2} =
f''(x_0).
$$
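A sketch of the argument: since $f''(x_0)$ exists, Taylor's expansion (with Peano remainder) gives $f(x_0\pm h)=f(x_0)\pm f'(x_0)h+\frac{1}{2}f''(x_0)h^2+o(h^2)$ as $h\to 0$, and adding the two expansions yields
$$
\frac{f(x_0+h)+f(x_0-h)-2f(x_0)}{h^2}=f''(x_0)+\frac{o(h^2)}{h^2}\longrightarrow f''(x_0).
$$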
Proposition (Convexity)
(i) Let $f:[a,b]\to \mathbb{R}$ possess a second order derivative
everywhere. Then
$$
f\text{ is convex on }[a,b] \iff f''(x) \ge 0\, ,\: x\in [a,b].
$$
(ii) Let $f:[a,b]\to \mathbb{R}$ possess a first order derivative
everywhere on $(a,b)$. Then
$$
f\text{ is convex on }(a, b) \iff f(x)\ge f(x_0)+f'(x_0)(x-x_0)\,
,\: x_0, x\in (a, b) .
$$
"$ \Leftarrow$" Assume that $f''\ge 0$ on $(a,b)$. For arbitrary $x_1, x_2
\in[a,b]$ with $x_1 < x_2$ and $\tau \in [0,1]$, there are
$$
\xi_1 \in\bigl(x_1,\tau x_1 + (1-\tau)x_2\bigr)\, ,\:\xi_2\in
\bigl(\tau x_1+(1-\tau) x_2, x_2\bigr)\text{ and } \xi_3\in
(\xi_1,\xi_2)
$$
such that
\begin{eqnarray*}
\tau f(x_1) + (1- \tau) f(x_2) - f\bigl(\tau x_1 + (1-
\tau)x_2\bigr)
&=&
\tau\bigl[f(x_1) - f\bigl(\tau x_1 + (1-\tau)x_2\bigr)\bigr] +
(1-\tau)\bigl[f(x_2) - f\bigl(\tau x_1 + (1-\tau)x_2\bigr)\bigr]\\
&=&\tau f'(\xi_1)\big[(1-\tau)x_1 - (1-\tau) x_2\big] +
(1-\tau)f'(\xi_2)\big[-\tau x_1 + \tau x_2\big]\\
&=&\tau (1-\tau)f'(\xi_1)(x_1 - x_2) + (1-\tau)\tau
f'(\xi_2)(x_2 - x_1)
\\
&=&\tau(1-\tau)(x_2-x_1)[f'(\xi_2) - f'(\xi_1)]\\
&=&
\tau (1-\tau)(x_2-x_1)f''(\xi_3)(\xi_2 - \xi_1)\ge 0
\end{eqnarray*}
Hence
$$
f\bigl(\tau x_1 + (1-\tau)x_2\bigr) \le\tau f(x_1) +
(1-\tau) f(x_2),
$$
and convexity is obtained.
"$\Rightarrow$'' Conversely
$$
f''(x_0)= \lim_{h \to 0} \frac{ f(x_0+h) + f(x_0-h) -2 f(x_0)
}{h^2}\ge 0
$$
for any $x_0\in (a, b)$, since $f(x_0+h) + f(x_0-h) -2 f(x_0)\ge 0$
by assumption.
Example
Let $f$ be continuous on $[0, \infty)$ and differentiable on $(0,
\infty)$. Assume that $f'$ is monotone increasing and $f(0) = 0$. Show
that
$$
g(x):= \frac{f(x)}{x}\, ,\: x\in(0,\infty),
$$
is increasing.
Example
Let $f:[0,\infty)\to \mathbb{R}$ be twice differentiable with $f,\, f',\, f''$ bounded, and let
$$
\| f\| _\infty= \sup\{ | f(x)|: x\in [0, \infty)\}\, .
$$
Then
$$
\| f'\| _\infty ^2\leq 4 \| f\|_\infty \| f''\|_\infty\, ,
$$
and the inequality is sharp.
(i) Given $x \in (0, \infty)$ and $h >0$, Taylor's expansion yields $\xi
\in (x, x+2h)$ such that
$$
f(x+2h) = f(x) + f'(x)2h + \frac{1}{2!}f''(\xi) (2h)^2\, .
$$
This implies that
$$
f'(x) = \frac{1}{2h} [ f(x+2h) - f(x)] - f''(\xi) h.
$$
Thus,
$$
|f'(x)|\leq\frac{1}{2h} 2\| f\|_\infty+\| f''\| _\infty h\, ,
$$
and
$$
\| f'\| _\infty\leq\frac{1}{h}\| f\|_\infty + h\| f''\| _\infty \:
\text{ for all }\: h>0.
$$
Minimizing the right-hand side over $h>0$ yields the minimum value
$2\sqrt{\| f\|_\infty\, \| f''\| _\infty}$ at $h=\sqrt{\| f\| _\infty/\| f''\| _\infty}$
(if $\| f''\| _\infty=0$, then $f'$ is constant and, $f$ being bounded, necessarily
zero, so the inequality is trivial), and thus the claimed inequality follows.
(ii) To see that the constant $4$ cannot be improved, consider the classical
two-piece function
$$
f(x)=\begin{cases} 2(x-1)^2-1\, ,& 0\leq x<1\, ,\\[2pt]
\dfrac{(x-1)^2-1}{(x-1)^2+1}\, ,& x\geq 1\, .\end{cases}
$$
It is easy to see that $\| f\| _\infty=1$. Next we compute
$$
f'(x)=\begin{cases} 4(x-1)\, ,& 0\leq x<1\, ,\\[2pt]
\dfrac{4(x-1)}{\bigl((x-1)^2+1\bigr)^2}\, ,& x\geq 1\, ,\end{cases}
\qquad
f''(x)=\begin{cases} 4\, ,& 0\leq x<1\, ,\\[2pt]
\dfrac{4\bigl(1-3(x-1)^2\bigr)}{\bigl((x-1)^2+1\bigr)^3}\, ,& x\geq 1\, ,\end{cases}
$$
where the one-sided formulas match at $x=1$, so that $f$ is indeed twice
differentiable on $[0,\infty)$. It follows that $\| f'\| _\infty=|f'(0)|=4$ and
$\| f''\| _\infty=4$, whence $\| f'\| _\infty ^2=16=4\, \| f\|_\infty\, \| f''\|_\infty$.
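A quick numerical sanity check of this equality, sampling the three functions on a grid (numpy assumed; a sketch, not a proof):

```python
import numpy as np

def f(x):
    u = x - 1.0
    return np.where(x < 1.0, 2 * u**2 - 1, (u**2 - 1) / (u**2 + 1))

def f1(x):   # first derivative of the two-piece function
    u = x - 1.0
    return np.where(x < 1.0, 4 * u, 4 * u / (u**2 + 1)**2)

def f2(x):   # second derivative
    u = x - 1.0
    return np.where(x < 1.0, 4.0, 4 * (1 - 3 * u**2) / (u**2 + 1)**3)

x = np.linspace(0.0, 50.0, 500_001)
M0, M1, M2 = (np.abs(v).max() for v in (f(x), f1(x), f2(x)))
print(M1**2, 4 * M0 * M2)   # both print 16.0 (up to grid resolution)
```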