3 Differentiation
Differentiation grew out of the problem of instantaneous velocity. Velocity can only easily be measured as an average over a time interval:¹ if an object travels d meters in t seconds, then its average velocity is v_av = d/t ms⁻¹. An early 'definition' (dating to the 1300s) makes the instantaneous velocity equal to the constant velocity that would be observed if a body were to stop accelerating: while useless for the purposes of measurement, this is essentially Newton's first law regarding inertial motion (1687). We also see the concept of the tangent line beginning to appear: if one graphs position against time, then a couple of things should be clear:
• The graph of inertial (constant speed) motion is a straight line whose slope is the velocity.
• The tangent line to a curve at a point has slope equal to the instantaneous velocity at that point.
The problem of finding, defining and computing instantaneous velocity thus morphed into the consideration of tangent lines to curves. With the advent of analytic geometry in the early 1600s, mathematicians such as Fermat and Descartes pioneered versions of the familiar secant ('cutting') line method for computing tangents.
[Figures: instantaneous velocity equals the constant velocity corresponding to the tangent line; secant lines approximate the tangent line as t → a]
The average velocity of the particle over the time interval [a, t] is the slope of the secant line, namely
v_av(a, t) = (d(t) − d(a))/(t − a)
Since the secant lines approximate the tangent line as t approaches a, it seems reasonable that we should compute the instantaneous velocity in this manner:
v(a) = lim_{t→a} v_av(a, t) = lim_{t→a} (d(t) − d(a))/(t − a)
This is, of course, the modern definition of the derivative.
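The convergence of secant slopes can be sketched numerically. The free-fall position function d(t) = 4.9t² below is a hypothetical choice for illustration, not from the text:

```python
# Secant slopes v_av(a, t) = (d(t) - d(a))/(t - a) approach the
# instantaneous velocity as t -> a.

def d(t):
    return 4.9 * t**2  # hypothetical position function (free fall)

def v_av(a, t):
    return (d(t) - d(a)) / (t - a)

a = 2.0
for step in [1.0, 0.1, 0.01, 0.001]:
    print(step, v_av(a, a + step))
# The slopes settle toward d'(2) = 9.8 * 2 = 19.6 as the step shrinks.
```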
¹ Even a modern technique such as Doppler-shift compares measurements separated by the extremely small period of a light or sound wave. These are still therefore average velocities, albeit taken over very small time intervals.
28 Basic Properties of the Derivative
Definition 3.1. Let f : U → ℝ and let a ∈ U. We say that f is differentiable at a if the following limit exists (is finite!)
lim_{x→a} (f(x) − f(a))/(x − a)
We call this limit the derivative of f at a and denote its value by either f′(a) or df/dx |_{x=a}.
If f′(a) exists for all a ∈ U, then f is differentiable (on U); the derivative becomes a function f′(x) = df/dx.
Notation The contrasting styles are partly attributable to the primary founders of calculus, Isaac Newton and Gottfried Leibniz. Each has its pros and cons, and you should be comfortable with both.
One-sided derivatives Since the defining limit is two-sided, differentiability only makes sense at interior points of U. Left- and right-derivatives may be defined via one-sided limits; differentiability is equivalent to both existing and being equal. All results in this section hold for one-sided derivatives with suitable (sometimes tedious) modifications. It is quite common, though strictly incorrect, to say that f is differentiable on an interval [a, b) if it is differentiable on the interior (a, b) and right-differentiable at a; however, we will strictly adhere to differentiable meaning two-sided.
Examples 3.2. 1. Let f(x) = x² + 4x. Then, for any a ∈ ℝ,
lim_{x→a} (f(x) − f(a))/(x − a) = lim_{x→a} (x² + 4x − a² − 4a)/(x − a) = lim_{x→a} (x − a)(x + a + 4)/(x − a) = lim_{x→a} (x + a + 4) = 2a + 4
Note how the definition of lim_{x→a} allows us to cancel the x − a terms from the numerator and denominator. We conclude that f is differentiable (on ℝ) and that f′(x) = 2x + 4.
2. Let g(x) = (x + 1)/(2x − 3). Then, for any a ≠ 3/2,
lim_{x→a} (g(x) − g(a))/(x − a) = lim_{x→a} (1/(x − a)) [ (x + 1)/(2x − 3) − (a + 1)/(2a − 3) ] = lim_{x→a} (5a − 5x)/[(x − a)(2x − 3)(2a − 3)] = lim_{x→a} −5/[(2x − 3)(2a − 3)] = −5/(2a − 3)²
g is therefore differentiable on its domain ℝ \ {3/2} with derivative g′(x) = −5/(2x − 3)².
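The computed derivative can be sanity-checked against small-h difference quotients; the step size and sample points below are arbitrary choices:

```python
# Compare g'(a) = -5/(2a - 3)^2 with the difference quotient
# (g(a + h) - g(a))/h for small h, at points away from x = 3/2.

def g(x):
    return (x + 1) / (2*x - 3)

def g_prime(a):
    return -5 / (2*a - 3)**2

h = 1e-6
for a in [0.0, 1.0, 2.0, 5.0]:
    quotient = (g(a + h) - g(a)) / h
    assert abs(quotient - g_prime(a)) < 1e-4
```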
The familiar expressions
f′(a) = lim_{h→0} (f(a + h) − f(a))/h,   f′(x) = lim_{h→0} (f(x + h) − f(x))/h
are equivalent to the original definition (see Exercise 5). While seemingly simpler, they sometimes lead to nastier calculations: see what happens if you try the previous example in this language. . .
We now turn to possibly the most well-known result of Freshman Calculus.
Theorem 3.3 (Power Law). Let r ∈ ℝ. Then f(x) = x^r is differentiable with f′(x) = r x^{r−1}.
The domains of f and f′ depend messily on r, but the above certainly holds on the interval (0, ∞). We leave a complete proof to the exercises and instead consider a few generalizable examples.
Examples 3.4. 1. If n ∈ ℕ and a ∈ ℝ, a simple factorization yields
lim_{x→a} (xⁿ − aⁿ)/(x − a) = lim_{x→a} (x − a)(x^{n−1} + a x^{n−2} + ··· + a^{n−2} x + a^{n−1})/(x − a)   (∗)
 = lim_{x→a} (x^{n−1} + a x^{n−2} + ··· + a^{n−2} x + a^{n−1}) = n a^{n−1}
We conclude that d/dx xⁿ = n x^{n−1}.
2. If f(x) = x⁻¹ and a ≠ 0, then
lim_{x→a} (x⁻¹ − a⁻¹)/(x − a) = lim_{x→a} (a − x)/(ax(x − a)) = lim_{x→a} −1/(ax) = −1/a²
from which we conclude that f′(x) = −x⁻². A similar approach followed by the factorization (∗) proves the power law for all negative integer exponents:
(x⁻ⁿ − a⁻ⁿ)/(x − a) = (aⁿ − xⁿ)/(aⁿ xⁿ (x − a)) = ···
3. To differentiate x^{1/n}, simply substitute x = yⁿ and observe case 1. If g(x) = x^{1/3} and a ≠ 0, then y = x^{1/3} and b = a^{1/3} yield
lim_{x→a} (x^{1/3} − a^{1/3})/(x − a) = lim_{y→b} (y − b)/(y³ − b³) = 1/(3b²) = (1/3) a^{−2/3}   ⟹   g′(x) = (1/3) x^{−2/3}
Note that g is not differentiable at x = 0!
We could similarly compute the derivative for all rational exponents, though it is much easier to wait
for the chain rule. The power law for irrational exponents is somewhat more ticklish.
Corollary 3.5 (Basic Transcendental Functions). Recalling our development of power series in the previous chapter, the power law (for positive integers!) is all we need to see that
d/dx exp(x) = exp(x),   d/dx sin x = cos x,   d/dx cos x = −sin x
It is also possible to develop these results independently of power series (see e.g. Exercise 9).
Failure of differentiability
It is instructive to consider when a function can fail to be differentiable. First a simple result shows
that functions are not differentiable at discontinuities.
Theorem 3.6. If f is differentiable at a then f is continuous at a.
Proof. Simply take the limit (think carefully why this works!):
lim_{x→a} f(x) = lim_{x→a} [ (f(x) − f(a))/(x − a) · (x − a) + f(a) ] = f′(a) · 0 + f(a) = f(a)
It remains to consider situations when a function is continuous but not differentiable.
Examples 3.7. The following cover all situations where a function is continuous on an interval and
differentiable everywhere except at a single interior point; similarly to isolated discontinuities, these
are classified by considering the three ways in which the derivative limit might not exist.
1. A vertical tangent line occurs when the derivative is infinite. For instance, g(x) = x^{1/3} at x = 0.
2. Corners occur when the one-sided derivatives are unequal (could be infinite). For instance, f(x) = |x| is not differentiable at zero, the one-sided derivatives being
lim_{x→0⁺} (|x| − |0|)/(x − 0) = lim_{x→0⁺} x/x = 1   and   lim_{x→0⁻} (|x| − |0|)/(x − 0) = lim_{x→0⁻} −x/x = −1
Indeed f is differentiable everywhere except at zero, with
f′(x) = { 1 if x > 0;  −1 if x < 0 }
3. A singularity is where left- and/or right-derivatives do not exist. The standard example in this case is
f(x) = { x sin(1/x) if x ≠ 0;  0 if x = 0 }
which is continuous on ℝ and differentiable everywhere except at zero: the details are in Exercise 8.
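The failure at zero can be made concrete: the difference quotient of this f at a = 0 is (f(x) − f(0))/(x − 0) = sin(1/x), which oscillates. The two test sequences below are illustrative choices:

```python
import math

# Difference quotient of f(x) = x*sin(1/x), f(0) = 0, at a = 0:
# (f(x) - f(0))/(x - 0) = sin(1/x), which never settles as x -> 0.

def quotient(x):
    return (x * math.sin(1/x) - 0) / (x - 0)  # = sin(1/x)

# Along x_n = 1/(2n*pi + pi/2) the quotient is always 1 ...
xs = [1 / (2*n*math.pi + math.pi/2) for n in range(1, 5)]
assert all(abs(quotient(x) - 1) < 1e-9 for x in xs)

# ... while along y_n = 1/(n*pi) it is (numerically) always 0:
ys = [1 / (n*math.pi) for n in range(1, 5)]
assert all(abs(quotient(y)) < 1e-9 for y in ys)
# Two different subsequential limits (1 and 0), so the limit cannot exist.
```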
Singularities and vertical tangent lines can also prevent one-sided differentiability.
More esoteric examples of non-differentiability are also possible:
• Utilizing series, we can create functions which are continuous on an interval but nowhere differentiable! For a classic example, see page 28.
• It is also possible to construct a function which is differentiable (and thus continuous) at precisely one point; can you think of an example?
The Basic Rules of Differentiation
Theorem 3.8. Let f, g be differentiable and k, l be constants.
1. (Linearity) The function kf + lg is differentiable with (kf + lg)′ = kf′ + lg′.
2. (Product rule) The function fg is differentiable with (fg)′ = f′g + fg′.
3. (Inverse functions) If f is bijective with non-zero derivative, then f⁻¹ is differentiable and
d/dx f⁻¹(x) = 1/f′(f⁻¹(x))
Proof. Parts 1 and 2 follow from the limit laws:
lim_{x→a} ((kf + lg)(x) − (kf + lg)(a))/(x − a) = lim_{x→a} [ k (f(x) − f(a))/(x − a) + l (g(x) − g(a))/(x − a) ] = k f′(a) + l g′(a)
lim_{x→a} (f(x)g(x) − f(a)g(a))/(x − a) = lim_{x→a} [ (f(x) − f(a))/(x − a) · g(x) + f(a) (g(x) − g(a))/(x − a) ] = f′(a) g(a) + f(a) g′(a)
Note where we used the continuity of g in the second line (lim g(x) = g(a)). Part 3 is an exercise.
The inverse function rule is intuitive since the graphs of f and f⁻¹ are related by reflection in the line y = x; gradients at corresponding points are therefore reciprocal. In Leibniz notation the result reads
dx/dy = (dy/dx)⁻¹
Examples 3.9. 1. Linearity allows us to differentiate any polynomial: for instance
d/dx (7x² + 13x⁴) = 7 d/dx x² + 13 d/dx x⁴ = 14x + 52x³
2. The product rule extends the reach of differentiation somewhat:
d/dx (x⁴ sin x) = (d/dx x⁴) sin x + x⁴ d/dx sin x = 4x³ sin x + x⁴ cos x
3. The inverse trigonometric functions can now be differentiated. For instance,
y = sin⁻¹ x   ⟹   d/dx sin⁻¹ x = dy/dx = (dx/dy)⁻¹ = 1/cos y = 1/√(1 − sin² y) = 1/√(1 − x²)
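The arcsine formula is easy to corroborate with a central difference quotient (the step size and sample points are arbitrary choices):

```python
import math

# Check d/dx arcsin(x) = 1/sqrt(1 - x^2) numerically.
h = 1e-6
for x in [-0.5, 0.0, 0.3, 0.9]:
    numeric = (math.asin(x + h) - math.asin(x - h)) / (2*h)
    exact = 1 / math.sqrt(1 - x**2)
    assert abs(numeric - exact) < 1e-4
```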
4. Define natural log to be the inverse of the (bijective!) exponential function exp(x):
y = ln x   ⟺   x = exp y
It follows that
d/dx ln x = (dx/dy)⁻¹ = 1/exp y = 1/x
The full details, and the justification that exp x = e^x, form an optional exercise.
Theorem 3.10 (Chain Rule). If g is differentiable at a and f is differentiable at g(a), then f ∘ g is differentiable at a with derivative
(f ∘ g)′(a) = f′(g(a)) g′(a)
In Leibniz notation this reads
d(f ∘ g)/dx = df/dg · dg/dx
which looks like a simple cancellation of the dg terms!²
Proof. Define γ : dom(f) → ℝ via
γ(v) = { (f(v) − f(g(a)))/(v − g(a)) if v ≠ g(a);  f′(g(a)) if v = g(a) }   (∗)
Since f is differentiable at g(a), we see that γ is continuous and lim_{v→g(a)} γ(v) = f′(g(a)).
Since g is differentiable at a, there exists an open interval U ∋ a for which x ∈ U ⟹ g(x) ∈ dom(f).
Now compute: for any x ∈ U \ {a}, let v = g(x) in (∗), whence
(f(g(x)) − f(g(a)))/(x − a) = γ(g(x)) · (g(x) − g(a))/(x − a)
Take limits as x → a for the result.
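A quick numerical check of the chain rule on the illustrative composition sin(x²) (not an example from the text): the derivative should be f′(g(x))·g′(x) = cos(x²)·2x.

```python
import math

# Chain rule check: d/dx sin(x^2) = cos(x^2) * 2x.
h = 1e-6
for a in [0.5, 1.0, 1.7]:
    numeric = (math.sin((a + h)**2) - math.sin((a - h)**2)) / (2*h)
    exact = math.cos(a**2) * 2*a
    assert abs(numeric - exact) < 1e-4
```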
Corollary 3.11 (Quotient Rule). Suppose f and g are differentiable. Then f/g is differentiable whenever g(x) ≠ 0. Moreover
(f/g)′ = (f′g − fg′)/g²
The proof is an exercise.
Examples 3.12. 1. By the quotient rule,
d/dx tan x = d/dx (sin x/cos x) = (cos² x + sin² x)/cos² x = sec² x
2. We can now differentiate highly involved combinations of elementary functions:
d/dx [ tan(e^{4x²}) − 7x/sin x ] = 8x e^{4x²} sec²(e^{4x²}) − (7 sin x − 7x cos x)/sin² x
² This is completely unjustified since dg does not (for us) mean anything on its own! The same problem appears in the famously faulty one-line 'proof' of the chain rule:
lim_{x→a} (f(g(x)) − f(g(a)))/(x − a) ?= lim_{x→a} (f(g(x)) − f(g(a)))/(g(x) − g(a)) · lim_{x→a} (g(x) − g(a))/(x − a)
The second limit cannot exist unless g(x) ≠ g(a) for all x near, but not equal to, a. The faulty argument is repaired by replacing the second difference quotient with f′(g(a)) whenever g(x) = g(a), before taking the limit. This is precisely what γ(g(x)) does in the correct proof.
Exercises 28 1. Use Definition 3.1 to calculate the derivatives.
(a) f(x) = x³ at x = 2   (b) g(x) = x + 2 at x = a
(c) f(x) = x² cos x at x = 0   (d) r(x) = (3x + 4)/(2x − 1) at x = 1
2. Differentiate the function f(x) = cos(e^x(5 − 3x)) using the chain and product rules.
3. (a) Prove the quotient rule (Corollary 3.11) by combining the chain and product rules.
(b) Prove the inverse derivative rule (Theorem 3.8, part 3).
(Hint: You can't simply differentiate 1 = dx/dx = d/dx f(f⁻¹(x)) using the chain rule; why not?)
4. (a) Find the derivatives of secant, cosecant and cotangent using the quotient rule.
(b) Why did we choose the positive square-root when computing d/dx sin⁻¹ x? What is the standard domain of arcsine, and what happens at x = ±1?
(c) Find the derivatives of the inverse trigonometric functions using the inverse function rule.
5. Using the definition of the derivative, and supposing that f is differentiable at a, prove that
f′(a) = lim_{h→0} (f(a + h) − f(a))/h = lim_{h→0} (f(a + h) − f(a − h))/(2h)
6. Prove that the function f(x) = x|x| is differentiable everywhere and compute its derivative.
7. Show that the following function is differentiable everywhere and compute its derivative:
f(x) = { x² sin(1/x) if x ≠ 0;  0 if x = 0 }
Moreover, prove that the derivative f′ is discontinuous at x = 0.
8. Show that the following function is differentiable everywhere except at zero:
f(x) = { x sin(1/x) if x ≠ 0;  0 if x = 0 }
9. (a) Suppose 0 < h < π/2. Use the picture to show that
0 < (1 − cos h)/h < sin(h/2)   and   sin h < h < tan h
Hence conclude that lim_{h→0} (sin h)/h = 1 and lim_{h→0} (1 − cos h)/h = 0.
(b) Use part (a) to prove that d/dx sin x = cos x
[Picture: unit-circle diagram with segments labelled cos h, sin h, tan h, 1 and arc h]
10. (Hard) Use induction to prove the Leibniz rule (general product rule):
(fg)^{(n)} = Σ_{k=0}^{n} (n choose k) f^{(k)} g^{(n−k)}
Masochists' Corner (non-examinable)
We finish with two very hard bonus exercises, though the first is somewhat easier. If you want a challenge, give 'em a go!
The Exponential Function & the General Power Law
Consider the function exp(x) := Σ_{n=0}^∞ xⁿ/n!, which converges for all real x.
As we saw when discussing power series, this function satisfies the initial value problem
d/dx exp(x) = exp(x),   exp(0) = 1
Define e := exp(1). Certainly e^x makes sense whenever x ∈ ℚ. When x is irrational, define
e^x := sup{e^q : q ∈ ℚ, q < x}
Our primary goal is to prove that exp(x) = e^x. As a nice bonus we recover Bernoulli's limit identity e = lim_{n→∞} (1 + 1/n)ⁿ and obtain a complete proof of the power law.
(a) For all x, y ∈ ℝ, prove that exp(x + y) = exp(x) exp(y)
(Hint: use the binomial theorem and change the order of summation)
(b) Show that exp(x) is always positive, even when x < 0.
(c) Prove that exp : ℝ → (0, ∞) is bijective.
(Hint: x ≥ 0 ⟹ exp(x) ≥ 1 + x; take limits then apply part (a))
(d) Prove that e^x = exp(x). Do this in three stages:
• If x ∈ ℕ, use part (a). Now check for x ∈ ℤ⁻.
• If x = m/n ∈ ℚ, first compute (exp(m/n))ⁿ.
• If x is irrational, start with (qₙ) ⊆ ℚ such that qₙ < x and e^{qₙ} → e^x . . .
(e) Let ln : (0, ∞) → ℝ be the inverse function of exp. Prove the logarithm laws:
ln(xy) = ln x + ln y   and   ln x^r = r ln x
(Just do this when r ∈ ℕ; another argument like part (d) is required in general)
(f) We've already seen that d/dy ln y = 1/y. Use the fact that
d/dy ln y = lim_{h→0} (ln(y + h) − ln y)/h
to prove that exp(x) = lim_{n→∞} (1 + x/n)ⁿ, thus recovering Bernoulli's definition of e.
(g) For any r ∈ ℝ, define x^r := exp(r ln x). Hence obtain the power law for any exponent.
A Very Strange Function
Here is a classic example of a continuous but nowhere-differentiable function!
Let f be the sawtooth function defined by f(x) = |x| whenever x ∈ [−1, 1] and extending periodically to ℝ so that f(x + 2) = f(x). Now define g : ℝ → ℝ via
g(x) = Σ_{n=0}^∞ (3/4)ⁿ f(4ⁿ x)
[Figures: f(x) and iterations to n = 3; g(x) (really n = 6, but can you tell?!)]
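A sketch of the estimates proved below: truncating the series (25 terms here, an arbitrary cutoff) and forming difference quotients with the increments hₘ = 1/(2·4ᵐ) at the test point x = 0 shows the quotients growing without bound:

```python
# Partial sums of g(x) = sum (3/4)^n f(4^n x) with the 2-periodic sawtooth f.
# Difference quotients at x = 0 with h_m = 1/(2*4^m) grow without bound.

def sawtooth(x):
    x = (x + 1) % 2 - 1   # reduce to [-1, 1), where f(x) = |x|
    return abs(x)

def g(x, terms=25):
    return sum((0.75**n) * sawtooth(4**n * x) for n in range(terms))

for m in [1, 2, 3, 4]:
    h = 1 / (2 * 4**m)    # sign choice is irrelevant at x = 0
    q = abs((g(h) - g(0)) / h)
    print(m, q)           # 4, 13, 40, 121: at least (3^m + 1)/2, as in (c)
```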
(a) Prove that g is well-defined and continuous on ℝ.
(b) Let x ∈ ℝ and m ∈ ℕ be fixed. Define hₘ = ±1/(2·4ᵐ), where the ±-sign is chosen so that no integers lie strictly between 4ᵐx and 4ᵐ(x + hₘ) = 4ᵐx ± 1/2.
For each n ∈ ℕ₀, define
kₙ = ( f(4ⁿ(x + hₘ)) − f(4ⁿx) )/hₘ
Prove the following
i. |kₙ| ≤ 4ⁿ, with equality when n = m.
ii. n > m ⟹ kₙ = 0.
(Hint: |f(y) − f(z)| ≤ |y − z|: when is this an equality?)
(c) Use part (b) to prove that
| (g(x + hₘ) − g(x))/hₘ | ≥ (1/2)(3ᵐ + 1)
Hence conclude that g is nowhere differentiable.
29 The Mean Value Theorem
We now turn to one of the central results in calculus.
Theorem 3.13 (Mean Value Theorem/MVT). Let f be continuous on [a, b] and differentiable on (a, b). Then there exists ξ ∈ (a, b) such that f′(ξ) = (f(b) − f(a))/(b − a).
This follows easily from two lemmas.
Lemma 3.14. 1. (Critical Points) Suppose g is bounded on (a, b) and attains its maximum or minimum at ξ ∈ (a, b). If g is differentiable at ξ, then g′(ξ) = 0.
2. (Rolle's Theorem) Suppose g is continuous on [a, b], differentiable on (a, b), and that g(a) = g(b). Then there exists ξ ∈ (a, b) such that g′(ξ) = 0.
The main result follows by applying Rolle's theorem to
g(x) = f(x) − (f(b) − f(a))/(b − a) · (x − b)
and observing that g(a) = f(b) = g(b) and g′(x) = f′(x) − (f(b) − f(a))/(b − a).
[Figures: Critical Points/Rolle's Theorem; Mean Value Theorem]
In the pictures, the orange and green lines are parallel: the average slope over the interval [a, b] equals the gradient/derivative f′(ξ).
Proof of Lemma. 1. Suppose, for a contradiction, that
g′(ξ) = lim_{x→ξ} (g(x) − g(ξ))/(x − ξ) > 0
Let ϵ = g′(ξ) in the definition of limit: ∃δ > 0 such that
0 < |x − ξ| < δ ⟹ | (g(x) − g(ξ))/(x − ξ) − g′(ξ) | < g′(ξ) ⟹ 0 < (g(x) − g(ξ))/(x − ξ) < 2g′(ξ)
In particular, if x ∈ (ξ, ξ + δ), then g(x) > g(ξ), contradicting the maximality at ξ.
The argument when g′(ξ) < 0 is similar. Finally, apply to −g for the result at a minimum.
2. By the extreme value theorem, g is bounded and attains its bounds. If the maximum and minimum both occur at the endpoints a, b, then g is constant: any ξ ∈ (a, b) satisfies the result. Otherwise, at least one extreme value occurs at some ξ ∈ (a, b): part 1 says that g′(ξ) = 0.
Examples 3.15. 1. Let f(x) = (x − 1)²(4 − x) + x on [a, b] = [1, 4]: this is roughly the above picture illustrating the mean value theorem. We compute the average slope and the derivative:
(f(b) − f(a))/(b − a) = 1,   f′(x) = 2(x − 1)(4 − x) − (x − 1)² + 1 = −3x² + 12x − 8
and observe that
f′(ξ) = (f(b) − f(a))/(b − a) ⟺ 3ξ² − 12ξ + 9 = 0 ⟺ ξ = 1 or 3
Since only 3 lies in the interval (1, 4), this is the value ξ satisfying the mean value theorem.
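Both computations can be confirmed numerically (the difference step is an arbitrary choice):

```python
# Example 3.15.1: f(x) = (x - 1)^2 (4 - x) + x on [1, 4].
# The average slope is 1, and xi = 3 should satisfy f'(xi) = 1.

def f(x):
    return (x - 1)**2 * (4 - x) + x

avg_slope = (f(4) - f(1)) / (4 - 1)            # = 1
h = 1e-6
f_prime_at_3 = (f(3 + h) - f(3 - h)) / (2*h)   # central difference
assert abs(avg_slope - 1.0) < 1e-12
assert abs(f_prime_at_3 - avg_slope) < 1e-6
```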
2. We find the maximum and minimum values of g(x) = x⁴ − 14x² + 24x on the interval [0, 2]. The function is differentiable, with
g′(x) = 4x³ − 28x + 24 = 4(x − 2)(x − 1)(x + 3)
By the Lemma, the locations of the extrema are either the endpoints x = 0, 2 or locations with zero derivative (x = 1). Since
g(0) = 0,   g(1) = 11,   g(2) = 8
we conclude that max(g) = g(1) = 11 and min(g) = g(0) = 0.
[Figure: graph of g on [0, 2]]
Consequences of the Mean Value Theorem Several simple corollaries relate to monotonicity.
Definition 3.16. Suppose f : I → ℝ is defined on an interval I. We say that f is:
• Increasing (monotone-up) on I if x < y ⟹ f(x) ≤ f(y)
• Decreasing (monotone-down) on I if x < y ⟹ f(x) ≥ f(y)
We say strictly increasing/decreasing if the inequalities are strict.
Examples 3.17. 1. f : x ↦ x² is strictly increasing on [0, ∞) and strictly decreasing on (−∞, 0].
2. The floor function f : x ↦ ⌊x⌋ (the greatest integer less than or equal to x) is increasing, but not strictly, on ℝ.
Corollary 3.18. Suppose f is differentiable on an interval I, then
1. f′ ≥ 0 on I ⟺ f is increasing on I
2. f′ ≤ 0 on I ⟺ f is decreasing on I
3. f′ = 0 on I ⟺ f is constant on I
Proof. (⟹) Let x < y where x, y ∈ I. By the mean value theorem, ∃ξ ∈ (x, y) such that
(f(y) − f(x))/(y − x) = f′(ξ),   whence f′(ξ) ≥ 0 ⟹ f(y) ≥ f(x)
(⟸) For the converse, use the definition of derivative: f′(ξ) = lim_{x→ξ} (f(x) − f(ξ))/(x − ξ). If f is increasing, then
x > ξ ⟹ f(x) ≥ f(ξ) ⟹ f′(ξ) ≥ 0
Parts 2 and 3 are similar.
Corollary 3.18 yields a couple of flashbacks to elementary calculus.
Corollary 3.19. Let I be an interval.
1. (Anti-derivatives on an interval) If f′(x) = g′(x) on I, then ∃c such that g(x) = f(x) + c on I.
2. (First derivative test) Suppose f is continuous on I and differentiable except perhaps at ξ. If
f′(x) < 0 whenever x < ξ, and f′(x) > 0 whenever x > ξ,
then f has its minimum value at x = ξ. The statement for a maximum is similar.
Examples 3.20. 1. Since d/dx sin(3x² + x) = (6x + 1) cos(3x² + x) on (the interval) ℝ, all anti-derivatives of f(x) = (6x + 1) cos(3x² + x) are given by
∫ f(x) dx = ∫ (6x + 1) cos(3x² + x) dx = sin(3x² + x) + c
As is typical, we use the indefinite integral notation ∫ f(x) dx for anti-derivatives.
2. If f(x) = x^{2/3} e^{x/3}, then f′(x) = (1/3) x^{−1/3} (2 + x) e^{x/3}.
By Lemma 3.14, the only possible critical points are at x = 0 or −2. The sign of the derivative is also clear:
[Figure: graph of f on [−3, 1]]
By the 1st derivative test, f has a maximum at x = −2 and a minimum at x = 0.
We finish this section by tying together the mean and intermediate value theorems.
Theorem 3.21 (IVT for Derivatives). Suppose f is differentiable on an interval I containing a < b, and that L lies between f′(a) and f′(b). Then ∃ξ ∈ (a, b) such that f′(ξ) = L.
If f′(x) is continuous, this is just the intermediate value theorem applied to f′. A full proof is left to the exercises; surprisingly, continuity is not required. . .
Exercises 29 1. Determine whether the conclusion of the mean value theorem holds for each function on the given interval. If so, find a suitable point ξ. If not, state which hypothesis fails.
(a) x² on [−1, 2]   (b) sin x on [0, π]   (c) |x| on [−1, 2]
(d) 1/x on [−1, 1]   (e) 1/x on [1, 3]
2. Suppose f and g are differentiable on an open interval I, that a < b and f(a) = f(b) = 0. By considering h(x) = f(x)e^{g(x)}, prove that f′(ξ) + f(ξ)g′(ξ) = 0 for some ξ ∈ (a, b).
3. Use the Mean Value Theorem to prove the following:
(a) x < tan x for all x ∈ (0, π/2).
(b) x/sin x is a strictly increasing function on (0, π/2).
(c) x ≤ (π/2) sin x for all x ∈ [0, π/2].
4. Suppose that |f(x) − f(y)| ≤ (x − y)² for all x, y ∈ ℝ. Prove that f is a constant function.
5. (a) Prove that f′ > 0 on an interval I ⟹ f is strictly increasing on I.
(b) Show that the converse of part (a) is false.
(c) Carefully prove the first derivative test (Corollary 3.19).
6. If f is differentiable on an interval I such that f′(x) ≠ 0 for all x ∈ I, use the intermediate value theorem for derivatives to prove that f is either strictly increasing or strictly decreasing.
7. We prove the intermediate value theorem for derivatives. Let f, a, b and L be as in the Theorem, define g : I → ℝ by g(x) = f(x) − Lx, and let ξ ∈ [a, b] be such that
g(ξ) = min{g(x) : x ∈ [a, b]}
(a) Why can we be sure that ξ exists? If ξ ∈ (a, b), explain why f′(ξ) = L.
(b) Now assume WLOG that f′(a) < f′(b). Prove that g′(a) < 0 < g′(b). By considering lim_{x→a⁺} (g(x) − g(a))/(x − a), show that ∃x > a for which g(x) < g(a). Hence complete the proof.
8. Suppose f′ exists on (a, b), and is continuous except for a discontinuity at c ∈ (a, b).
(a) Obtain a contradiction if lim_{x→c⁺} f′(x) = L < f′(c). Hence argue that f′ cannot have a removable or a jump discontinuity at x = c.
(Hint: let ϵ = (f′(c) − L)/2 in the definition of limit then apply IVT for derivatives)
(b) Similarly, obtain a contradiction if lim_{x→c⁺} f′(x) = ∞ and conclude that f′ cannot have an infinite discontinuity at x = c.
(c) It remains to see that f′ can have an essential discontinuity. Recall (Exercise 28.7) that
f : ℝ → ℝ : x ↦ { x² sin(1/x) if x ≠ 0;  0 if x = 0 }
is differentiable on ℝ, but has discontinuous derivative at x = 0.
i. By considering xₙ = 1/(2nπ) and yₙ = 1/((2n + 1)π), show that f′ has an essential discontinuity at x = 0.
ii. Prove that if sₙ → 0 and f′(sₙ) converges to some M, then M ∈ [−1, 1].
iii. Use IVT for derivatives to show that for any L ∈ [−1, 1], ∃(tₙ) ⊆ ℝ \ {0} such that lim_{n→∞} f′(tₙ) = L.
30 L'Hôpital's Rule
We are often forced to consider limits known as indeterminate forms, which do not yield easily to the standard limit laws. For example, it is tempting to try to write
lim_{x→0} sin 2x/(e^{3x} − 1) "=" (lim_{x→0} sin 2x)/(lim_{x→0} (e^{3x} − 1)) = 0/0   (∗)
This is an incorrect application of the limit laws since the resulting quotient has no meaning.
Definition 3.22. An indeterminate form is a limit where a naïve application of the limit laws results in a meaningless expression: the primary types are
0/0,   ∞/∞,   ∞ − ∞,   0 · ∞,   0⁰,   ∞⁰,   and 1^∞.
Examples 3.23. 1. lim_{x→7⁺} (x − 7)^{x−7} is an indeterminate form of type 0⁰.
2. The above indeterminate form (∗) may be evaluated using the definition of the derivative
lim_{x→0} sin 2x/(e^{3x} − 1) = lim_{x→0} [ (sin 2x − 0)/(x − 0) ] · [ (x − 0)/(e^{3x} − 1) ] = [ d/dx|_{x=0} sin 2x ] / [ d/dx|_{x=0} (e^{3x} − 1) ] = 2/3
By considering lim_{x→0} 3a sin 2x/(2(e^{3x} − 1)), we see that an indeterminate form of type 0/0 can take any value a!
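Evaluating the quotient at points shrinking toward 0 corroborates the value 2/3:

```python
import math

# sin(2x)/(e^{3x} - 1) should approach 2/3 as x -> 0.
def q(x):
    return math.sin(2*x) / (math.exp(3*x) - 1)

for x in [0.1, 0.01, 0.001, 0.0001]:
    print(x, q(x))
assert abs(q(1e-4) - 2/3) < 1e-3
```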
This approach generalizes: if f(a) = 0 = g(a), we obtain the simplest version of l'Hôpital's rule;
lim_{x→a} f(x)/g(x) = lim_{x→a} (f(x) − f(a))/(x − a) · (x − a)/(g(x) − g(a)) = f′(a)/g′(a)
This obviously isn't rigorous. Our goal is to make it so and to extend to the following situations:
• Limits where a = ±∞.
• When the RHS cannot be cleanly evaluated: for instance g′(a) = 0 or if the original limit is ±∞.
Covering all cases makes the proof an absolute behemoth! Because of this, and because such limits
can often be evaluated more instructively using elementary methods, the rule is often discouraged
in Freshman calculus. To prepare for the upcoming monster, we first generalize the MVT.
Lemma 3.24 (Extended Mean Value Theorem). Fix a < b, suppose f, g are continuous on [a, b] and differentiable on (a, b). Then there exists ξ ∈ (a, b) such that
(f(b) − f(a)) g′(ξ) = (g(b) − g(a)) f′(ξ)
Proof. Simply apply the standard mean value theorem (really Rolle's Theorem) to
h(t) = (f(b) − f(a)) g(t) − (g(b) − g(a)) f(t)
which satisfies h(a) = h(b).
Theorem 3.25 (l'Hôpital's rule). Let a ∈ ℝ ∪ {±∞} and suppose functions f and g satisfy:
1. lim_{x→a} f′(x)/g′(x) = L for some L ∈ ℝ ∪ {±∞}
2. (a) lim_{x→a} f(x) = lim_{x→a} g(x) = 0, or (b) lim_{x→a} g(x) = ±∞ (no condition on f)
Then lim_{x→a} f(x)/g(x) = L. The same result holds for one-sided limits.
Examples 3.26. 1. If f(x) = e^{4x} and g(x) = 21x − 17, then
lim_{x→∞} f′(x)/g′(x) = lim_{x→∞} 4e^{4x}/21 = ∞   ⟹   lim_{x→∞} e^{4x}/(21x − 17) = ∞
This is an example of type ∞/∞.
2. For an example of type 0/0, consider f(x) = x² − 9 and g(x) = ln(4 − x):
lim_{x→3} f′(x)/g′(x) = lim_{x→3} 2x/(−1/(4 − x)) = lim_{x→3} 2x(x − 4) = −6   ⟹   lim_{x→3} (x² − 9)/ln(4 − x) = −6
3. One can apply the rule repeatedly: for example
lim_{x→0} (e^{4x} − 1 − 4x)/x² = lim_{x→0} (4e^{4x} − 4)/(2x) = lim_{x→0} 16e^{4x}/2 = 8
There is an abuse of protocol here, since the existence of the first limit is dependent on the last.
The approach is acceptable, though you should understand why it is an abuse. Indeed. . .
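A numerical check of the value 8 (the sample points are arbitrary):

```python
import math

# (e^{4x} - 1 - 4x)/x^2 should approach 8 as x -> 0.
def q(x):
    return (math.exp(4*x) - 1 - 4*x) / x**2

for x in [0.1, 0.01, 0.001]:
    print(x, q(x))
assert abs(q(1e-3) - 8) < 0.05
```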
4. It is important that the limit lim f′/g′ be seen to exist before applying l'Hôpital's rule! Consider f(x) = x + cos x and g(x) = x: certainly lim_{x→∞} f(x)/g(x) has type ∞/∞, however
lim_{x→∞} f′(x)/g′(x) = lim_{x→∞} (1 − sin x)
does not exist! In this case the rule is unnecessary, since
f(x)/g(x) = 1 + (cos x)/x → 1 as x → ∞
by the squeeze theorem.
5. Finally, a short example to explain why l'Hôpital's rule is often prohibited in Freshman calculus. Consider the calculation:
lim_{x→0} sin x/x = lim_{x→0} cos x/1 = 1
This appears to be a legitimate application of the rule. However, recall (Exercise 28.9) that one purpose of this limit is to demonstrate that d/dx sin x = cos x; to use this fact to calculate the limit on which it depends is the very definition of circular logic!
Other Indeterminate Forms
The remaining indeterminate forms listed in Definition 3.22 may be modified so that l'Hôpital's rule applies. Since you've likely seen several such examples in elementary calculus, we give just a couple.
Examples 3.27. 1. An indeterminate form of type ∞ − ∞ is transformed to one of type 0/0 before applying the rule (twice):
lim_{x→0⁺} [ 1/(e^x − 1) − 1/x ] = lim_{x→0⁺} (x + 1 − e^x)/(x(e^x − 1))   (type 0/0)
 = lim_{x→0⁺} (1 − e^x)/(e^x − 1 + xe^x)   (still type 0/0)
 = lim_{x→0⁺} −e^x/(2e^x + xe^x) = −1/2
2. For an indeterminate form of type 1^∞, we use the log laws & the continuity of the exponential:
lim_{x→0⁺} (1 + sin x)^{1/x} = exp( lim_{x→0⁺} (1/x) ln(1 + sin x) )   (type 0/0)
 = exp( lim_{x→0⁺} cos x/(1 + sin x) ) = e¹ = e
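Evaluating (1 + sin x)^{1/x} just to the right of 0 corroborates the limit e:

```python
import math

# (1 + sin x)^{1/x} should approach e as x -> 0+.
def q(x):
    return (1 + math.sin(x)) ** (1 / x)

for x in [0.1, 0.01, 0.001]:
    print(x, q(x))
assert abs(q(1e-4) - math.e) < 1e-3
```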
Proving l'Hôpital's Rule
The complete argument is very long; if you do nothing else, read the following proof of the simplest
case. Everything else is a modification.
Proof (type 0/0 with right limits). We prove first for right-limits x → a⁺. First observe that condition 1. forces the existence of an interval (a, b) on which f, g are differentiable and g′(x) ≠ 0.
Assume we have a form of type 0/0 (case 2. (a)) and assume additionally that a and L are finite. Everything follows from the definition of limit (condition 1.) and Lemma 3.24:
Given ϵ > 0, ∃δ ∈ (0, b − a) such that a < ξ < a + δ ⟹ | f′(ξ)/g′(ξ) − L | < ϵ/2   (∗)
a < y < x < a + δ ⟹ ∃ξ ∈ (y, x) such that (f(x) − f(y))/(g(x) − g(y)) = f′(ξ)/g′(ξ)   (†)
Since g′ ≠ 0, the usual mean value theorem says we never divide by zero in (†):
∃c ∈ (y, x) such that g(x) − g(y) = g′(c)(x − y) ≠ 0
Observe that | (f(x) − f(y))/(g(x) − g(y)) − L | = | f′(ξ)/g′(ξ) − L | < ϵ/2, let y → a⁺ and use 2. (a) to see that
∀x ∈ (a, a + δ),   | f(x)/g(x) − L | ≤ ϵ/2 < ϵ
which is the required result.
We now describe some modifications.
If a = −∞: Replace the blue part of (∗) as follows:
Given ϵ > 0, ∃m ≤ b such that ξ < m ⟹ | f′(ξ)/g′(ξ) − L | < ϵ/2
The rest of the proof goes through after replacing a with −∞ and a + δ with m.
If L = ∞: Replace the green parts of (∗) with 'Given M > 0' and 'f′(ξ)/g′(ξ) > 2M'. Fixing the rest of the proof is again straightforward.
If L = −∞: Replace the green parts of (∗) with 'Given M > 0' and 'f′(ξ)/g′(ξ) < −2M'.
Left-limits: If f, g are differentiable on (c, a), then the blue part may be replaced with either:
(a finite) ∃δ ∈ (0, a − c) such that a − δ < ξ < a
(a = ∞) ∃m ≥ c such that ξ > m
The blue and green parts of (∗) can be replaced independently. This completes the proof for all indeterminate forms of type 0/0.
Proof (case 2. (b) when lim g(x) = ∞). This requires a little more modification.³ Since g′ ≠ 0, and lim_{x→a⁺} g(x) = ∞, Exercise 29.6 says that g is strictly decreasing on (a, b). By replacing b by some b̃ ∈ (a, b), if necessary, we may assume that
a < y < x < b ⟹ 0 < g(x) < g(y)   (‡)
Assume a and L are finite, and obtain (∗) and (†) as before. Let x ∈ (a, a + δ) be fixed and multiply (†) by (g(y) − g(x))/g(y) (this is positive by (‡)): a little algebra and the triangle inequality tell us that
a < y < x ⟹ f(y)/g(y) = f′(ξ)/g′(ξ) + f(x)/g(y) − (g(x)/g(y)) · f′(ξ)/g′(ξ)
⟹ | f(y)/g(y) − L | ≤ | f′(ξ)/g′(ξ) − L | + (1/g(y)) ( |f(x)| + |g(x)| ( |L| + ϵ/2 ) )
Since lim_{y→a⁺} g(y) = ∞ and x is fixed, we see that there exists η ≤ x − a < δ such that
y ∈ (a, a + η) ⟹ (1/g(y)) ( |f(x)| + |g(x)| ( |L| + ϵ/2 ) ) < ϵ/2
Finally combine with (∗): given ϵ > 0, ∃η > 0 such that y ∈ (a, a + η) ⟹ | f(y)/g(y) − L | < ϵ.
The same modifications listed previously complete the proof.
³ Forms of type ∞/∞? Instead of assumption 2. (b), why not simply assume lim f = lim g = ∞ and write f/g = (1/g)/(1/f) to obtain a form of type 0/0? The problem is that the derivative of the 'new' denominator d/dx (1/f) = −f′/f² need not be non-zero on any interval (a, b) and so condition 1. need not hold. We could modify this, but it would make for a weaker theorem. Example 3.26.4 illustrates this: f′(x) = 1 − sin x has zeros on any unbounded interval.
After the 2. (b) case is proved and we know that lim f/g = L, it is then clear that lim f must also be infinite (unless L = 0, in which case lim f could be anything and need not exist). This situation therefore really does deal with forms of type ∞/∞.
Exercises 30 1. Evaluate the following limits, if they exist:
(a) lim_{x→0} x³/(sin x − x)   (b) lim_{x→π/2} tan² x/(π − 2x)   (c) lim_{x→0} (cos x)^{1/x²}
(d) lim_{x→0} (1 + 2x)^{1/x}   (e) lim_{x→∞} (e^x + x)^{1/x}
2. Let f be differentiable on (c, ∞) and suppose that lim_{x→∞} [f(x) + f′(x)] = L is finite.
(a) Prove that lim_{x→∞} f(x) = L and that lim_{x→∞} f′(x) = 0.
(Hint: write f(x) = f(x)e^x/e^x)
(b) Does anything change if L exists and is infinite?
3. If pₙ(x) is a polynomial of degree n, use induction to prove that lim_{x→∞} pₙ(x)e^{−x} = 0
4. Let f(x) = x + sin x cos x, g(x) = e^{sin x} f(x) and h(x) = 2 cos x/( e^{sin x}(f(x) + 2 cos x) )
(a) Prove that lim_{x→∞} f(x) = ∞ = lim_{x→∞} g(x) but that lim_{x→∞} f(x)/g(x) does not exist.
(b) If cos x ≠ 0, and x is large, show that f′(x)/g′(x) = h(x).
(c) Prove that lim_{x→∞} h(x) = 0. Explain why this does not contradict part (a)!
31 Taylor's Theorem
A primary goal of power series is the approximation of functions. As such, there are two natural
questions to ask of a given function f :
1. Given c ∈ dom(f), is there a series Σ aₙ(x − c)ⁿ which equals f(x) on an interval containing c?
2. If we take the first n terms of such a series, how accurate is this polynomial approximation?
Example 3.28. Recall the geometric series
$$f(x) = \frac{1}{1 - x} = \sum_{n=0}^\infty x^n \quad\text{whenever } -1 < x < 1$$
The polynomial approximation
$$p_n(x) = \sum_{k=0}^n x^k = 1 + x + \cdots + x^n = \frac{1 - x^{n+1}}{1 - x}$$
has error
$$R_n(x) = f(x) - p_n(x) = \frac{x^{n+1}}{1 - x}$$
[Figure: graphs of $f(x) = \frac{1}{1-x}$ and $p_3(x) = 1 + x + x^2 + x^3$ for $-1 \le x \le 1$]
If $x$ is close to 0, this is likely very small; for instance if $x \in \left[-\frac12, \frac12\right]$, then
$$|R_n(x)| \le \frac{1}{1 - \frac12}\left(\frac12\right)^{n+1} = 2^{-n}$$
However, when $x$ is close to 1, the error is unbounded!
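Example 3.28's bound is easy to check numerically. A small Python sketch (my own illustration; the helper names are mine):

```python
# Partial sums and remainders of the geometric series, per Example 3.28.
def p(n, x):
    """Partial sum 1 + x + ... + x^n."""
    return sum(x**k for k in range(n + 1))

def R(n, x):
    """Remainder f(x) - p_n(x) = x^{n+1}/(1-x)."""
    return 1 / (1 - x) - p(n, x)

print(abs(R(10, 0.5)))    # within the 2^{-n} bound at x = 1/2
print(abs(R(10, 0.99)))   # near x = 1 the error is enormous
```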
The behavior in the Example occurs in general: the truncated polynomial approximations are better
near the center of the series. To see this, we first need to consider higher-order derivatives.
Definition 3.29. We write $f''$ for the second derivative of $f$, namely the derivative of its derivative:
$$f''(a) = \lim_{x\to a} \frac{f'(x) - f'(a)}{x - a}$$
The existence of $f''(a)$ presupposes that $f'$ exists on an (open) interval containing $a$. We can similarly consider third, fourth, and higher-order derivatives. As a function, the $n$th derivative is written
$$f^{(n)}(x) = \frac{d^n f}{dx^n}$$
By convention, the zeroth derivative is the function itself: $f^{(0)}(x) = f(x)$. We say that $f$ is $n$ times differentiable at $a$ if $f^{(n)}(a)$ exists, and infinitely differentiable (or smooth) if derivatives of all orders exist.
Example 3.30. $f(x) = x^2|x|$ is twice differentiable, with $f''(x) = 6|x|$. It is smooth everywhere except at $x = 0$, where third (and higher-order) derivatives do not exist.
Definition 3.31. Suppose $f$ is $n$ times differentiable at $x = c$. The $n$th Taylor polynomial $p_n$ of $f$ centered at $c$ is
$$p_n(x) := \sum_{k=0}^n \frac{f^{(k)}(c)}{k!}(x - c)^k = f(c) + f'(c)(x - c) + \frac{f''(c)}{2}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n$$
The remainder $R_n(x)$ is the error in the polynomial approximation:
$$R_n(x) = f(x) - p_n(x) = f(x) - \sum_{k=0}^n \frac{f^{(k)}(c)}{k!}(x - c)^k$$
If $f$ is infinitely differentiable at $x = c$, then its Taylor series centered at $x = c$ is the power series
$$Tf(x) = \sum_{n=0}^\infty \frac{f^{(n)}(c)}{n!}(x - c)^n$$
When $c = 0$ this is known as a Maclaurin series.⁴
For simplicity we’ll most often work with Maclaurin series, with general cases hopefully being clear.
Examples 3.32. 1. If $f(x) = e^{3x}$, then $f^{(n)}(x) = 3^n e^{3x}$, from which the Maclaurin series is
$$Tf(x) = \sum_{n=0}^\infty \frac{3^n}{n!} x^n$$
2. If $g(x) = \sin 7x$, then the sequence of derivatives is
$$7\cos 7x,\ -7^2\sin 7x,\ -7^3\cos 7x,\ 7^4\sin 7x,\ 7^5\cos 7x,\ -7^6\sin 7x,\ \ldots$$
At $x = 0$, every even derivative is zero, while the odd derivatives alternate in sign; the Maclaurin series is easily seen to be
$$Tg(x) = \sum_{n=0}^\infty \frac{(-1)^n\, 7^{2n+1}}{(2n+1)!} x^{2n+1}$$
3. If $h(x) = \sqrt{x}$, then $h'(x) = \frac12 x^{-1/2}$, $h''(x) = -\frac{1}{2^2} x^{-3/2}$, and $h'''(x) = \frac{3}{2^3} x^{-5/2}$, from which the third Taylor polynomial centered at $c = 1$ is
$$p_3(x) = h(1) + h'(1)(x - 1) + \frac{h''(1)}{2}(x - 1)^2 + \frac{h'''(1)}{6}(x - 1)^3 = 1 + \frac12(x - 1) - \frac18(x - 1)^2 + \frac{1}{16}(x - 1)^3$$
Rather than compute more examples, we develop a little theory that makes verifying Taylor series
much easier.
4 Named for Englishman Brook Taylor (1685–1731) and Scotsman Colin Maclaurin (1698–1746). Taylor's general method expanded on examples discovered by James Gregory and Isaac Newton in the mid-to-late 1600s.
Differentiation of Taylor Polynomials and Series
Suppose $P(x) = \sum a_j x^j$ is a power series with radius of convergence $R > 0$. As we discovered previously, this is differentiable term-by-term on $(-R, R)$. Indeed
$$P'(x) = \sum_{j=1}^\infty a_j\, j x^{j-1} \implies P'(0) = a_1$$
$$P''(x) = \sum_{j=2}^\infty a_j\, j(j-1) x^{j-2} \implies P''(0) = 2a_2$$
$$P'''(x) = \sum_{j=3}^\infty a_j\, j(j-1)(j-2) x^{j-3} \implies P'''(0) = 3!\, a_3$$
$$\vdots$$
$$P^{(k)}(x) = \sum_{j=k}^\infty a_j\, j(j-1)\cdots(j-k+1) x^{j-k} = \sum_{j=k}^\infty \frac{j!\, a_j}{(j-k)!} x^{j-k} \implies P^{(k)}(0) = k!\, a_k$$
Otherwise said, $P$ is its own Maclaurin series! The same discussion holds for polynomials: indeed, if $P(x) = a_0 + a_1 x + \cdots + a_n x^n$ is a polynomial, then for all $k \le n$,
$$P^{(k)}(0) = f^{(k)}(0) \iff a_k = \frac{f^{(k)}(0)}{k!}$$
If this holds for all $k \le n$, then $P$ must be the Taylor polynomial of $f$! With a little modification, we've proved the following:
Theorem 3.33. 1. If $f(x) = \sum_{n=0}^\infty a_n(x - c)^n$ on a neighborhood of $c$, then $\sum_{n=0}^\infty a_n(x - c)^n$ is the Taylor series of $f$.
2. The $n$th Taylor polynomial of $f$ centered at $x = c$ is the unique polynomial $p_n$ of degree $\le n$ whose value and first $n$ derivatives agree with those of $f$ at $x = c$: that is,
$$\forall k \le n, \quad p_n^{(k)}(c) = f^{(k)}(c)$$
This answers our first motivating question: a function can equal at most one power series with a
given center. The second question requires a careful study of the remainder: we’ll do this shortly.
Examples 3.34 (Common Maclaurin Series). These should be familiar from elementary calculus.
Each of these functions equals the given series by our previous discussion of power series: by the
Theorem, each series is therefore the Maclaurin series of the given function with no requirement to
calculate directly!
$$e^x = \sum_{n=0}^\infty \frac{x^n}{n!}, \quad x \in \mathbb{R} \qquad\qquad \frac{1}{1 - x} = \sum_{n=0}^\infty x^n, \quad x \in (-1, 1)$$
$$\sin x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} x^{2n+1}, \quad x \in \mathbb{R} \qquad\qquad \ln(1 + x) = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} x^n, \quad x \in (-1, 1]$$
$$\cos x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} x^{2n}, \quad x \in \mathbb{R} \qquad\qquad \tan^{-1} x = \sum_{n=0}^\infty \frac{(-1)^n}{2n + 1} x^{2n+1}, \quad x \in [-1, 1]$$
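Partial sums of these series can be checked against the library functions. A Python sketch (my own illustration; the helper names and truncation points are mine):

```python
import math

# Partial sums of three of the tabulated Maclaurin series.
def exp_series(x, N=20):
    return sum(x**n / math.factorial(n) for n in range(N))

def sin_series(x, N=20):
    return sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1) for n in range(N))

def log1p_series(x, N=2000):   # converges slowly as x nears 1
    return sum((-1)**(n + 1) * x**n / n for n in range(1, N))

print(exp_series(1.0), math.e)
print(sin_series(2.0), math.sin(2.0))
print(log1p_series(0.5), math.log(1.5))
```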
Examples 3.35 (Modifying Maclaurin Series). By substituting for $x$ in a common series, we quickly obtain new ones.
1. Substitute $x \mapsto 7x$ in the Maclaurin series for $\sin x$ to recover our earlier example:
$$\sin 7x = \sum_{n=0}^\infty \frac{(-1)^n\, 7^{2n+1}}{(2n+1)!} x^{2n+1}, \quad x \in \mathbb{R}$$
Note how this requires almost no calculation: since the function equals a series, the Theorem says we have the Maclaurin series for $\sin 7x$!
2. Substitute $x \mapsto x^2$ in the Maclaurin series for $e^x$ to obtain
$$e^{x^2} = \exp(x^2) = \sum_{n=0}^\infty \frac{1}{n!} x^{2n}, \quad x \in \mathbb{R}$$
This would be disgusting to verify directly, given the difficulty of repeatedly differentiating $e^{x^2}$.
3. We find the Taylor series for $f(x) = \frac{1}{5 - x}$ centered at $x = 2$:
$$f(x) = \frac{1}{3 + 2 - x} = \frac{1}{3\left(1 - \frac{x - 2}{3}\right)} = \frac13 \sum_{n=0}^\infty \left(\frac{x - 2}{3}\right)^n$$
which is valid whenever $-1 < \frac{x - 2}{3} < 1 \iff -1 < x < 5$.
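The shifted geometric series just derived is easy to test at a few points of $(-1, 5)$. A Python sketch (my own illustration; the truncation at 200 terms is an arbitrary choice):

```python
# Partial sums of (1/3) * sum ((x-2)/3)^n versus 1/(5-x), per Example 3.35.3.
def series(x, N=200):
    return sum(((x - 2) / 3)**n for n in range(N)) / 3

for x in (0.0, 2.0, 4.5):
    print(x, series(x), 1 / (5 - x))   # agreement inside (-1, 5)
```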
4. Fix $c \in \mathbb{R}$ and observe that, for all $x \in \mathbb{R}$,
$$e^x = e^{c + x - c} = e^c e^{x - c} = \sum_{n=0}^\infty \frac{e^c}{n!}(x - c)^n$$
We conclude that the series is the Taylor series of $e^x$ centered at $x = c$. Of course this is easily verified using the definition, since $\frac{d^n}{dx^n}\Big|_{x=c} e^x = e^c$.
5. Combining the Theorem with the multiple-angle formula, we obtain the Taylor series for $\sin x$ centered at $x = c$:
$$\sin x = \sin(c + x - c) = \sin c \cos(x - c) + \cos c \sin(x - c)$$
$$= \sum_{n=0}^\infty \frac{(-1)^n \sin c}{(2n)!}(x - c)^{2n} + \sum_{n=0}^\infty \frac{(-1)^n \cos c}{(2n + 1)!}(x - c)^{2n+1}$$
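Summing the two series in Example 5 should recover $\sin x$ itself. A Python sketch (my own illustration; the helper name and the 25-term truncation are mine):

```python
import math

# sin x via its Taylor series centered at c, per Example 3.35.5.
def sin_about(x, c, N=25):
    even = sum((-1)**n * math.sin(c) * (x - c)**(2*n) / math.factorial(2*n)
               for n in range(N))
    odd = sum((-1)**n * math.cos(c) * (x - c)**(2*n + 1) / math.factorial(2*n + 1)
              for n in range(N))
    return even + odd

print(sin_about(4.0, 1.0), math.sin(4.0))   # agree to high precision
```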
Definition 3.36. A function is analytic on a domain if, for each $c$ in the domain, there exists a neighborhood of $c$ on which the function equals its Taylor series centered at $c$.
All the examples we've seen so far are analytic on their domains; indeed, the last two of Examples 3.35 prove this for the exponential and sine functions. Every analytic function is automatically smooth (infinitely differentiable); however, the converse is false: not every smooth function is analytic (see Exercise 10). Analyticity is of greater importance in complex analysis, where it is seen to be equivalent to complex-differentiability.
Accuracy of Taylor Approximations
Our final goal is to estimate the accuracy of a Taylor polynomial as an approximation to its generating function. Otherwise said, we want to estimate the size of the remainder $R_n(x) = f(x) - p_n(x)$.
Theorem 3.37 (Taylor's Theorem: Lagrange's form). Suppose $f$ is $n + 1$ times differentiable on an open interval $I$ containing $c$ and let $x \in I \setminus \{c\}$. Then there exists some $\xi$ between $c$ and $x$ for which the remainder centered at $c$ satisfies
$$R_n(x) = \frac{f^{(n+1)}(\xi)}{(n + 1)!}(x - c)^{n+1}$$
Proof. For simplicity let $c = 0$. Fix $x \neq 0$, and define a constant $M_x$ and a function $g : I \to \mathbb{R}$ by
$$R_n(x) = \frac{M_x}{(n + 1)!} x^{n+1} \quad\text{and}\quad g(t) = \frac{M_x}{(n + 1)!} t^{n+1} + p_n(t) - f(t) = \frac{M_x}{(n + 1)!} t^{n+1} - R_n(t)$$
Observe that
$$\forall k \le n + 1, \quad g^{(k)}(t) = \frac{M_x}{(n + 1 - k)!} t^{n+1-k} + p_n^{(k)}(t) - f^{(k)}(t) \qquad (*)$$
$$\implies g^{(k)}(0) = p_n^{(k)}(0) - f^{(k)}(0) = 0 \quad\text{if } k \le n$$
where we invoked Theorem 3.33.
Apply Rolle's Theorem repeatedly (WLOG assume $x > 0$):
Since $g(0) = 0 = g(x)$, $\exists \xi_1$ between $0$ and $x$ such that $g'(\xi_1) = 0$.
Since $g'(0) = 0 = g'(\xi_1)$, $\exists \xi_2$ between $0$ and $\xi_1$ such that $g''(\xi_2) = 0$, etc.
Iterate to obtain a sequence $(\xi_k)$ such that
$$0 < \xi_{n+1} < \xi_n < \cdots < \xi_1 < x \quad\text{and}\quad g^{(k)}(\xi_k) = 0$$
Take $\xi = \xi_{n+1}$ and consider $(*)$: since $\deg p_n \le n$, we see that
$$0 = g^{(n+1)}(\xi) = M_x - f^{(n+1)}(\xi) \implies R_n(x) = f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n + 1)!} x^{n+1}$$
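The theorem can be illustrated numerically: for $f = \exp$ with $c = 0$, we can solve $R_n(x) = \frac{e^\xi}{(n+1)!}x^{n+1}$ for $\xi$ and confirm it lands between $0$ and $x$. A Python sketch (my own illustration, not part of the proof):

```python
import math

# For f = exp, c = 0: solve R_n(x) = e^xi * x^{n+1}/(n+1)! for xi.
def remainder(n, x):
    p = sum(x**k / math.factorial(k) for k in range(n + 1))
    return math.exp(x) - p

n, x = 4, 2.0
xi = math.log(remainder(n, x) * math.factorial(n + 1) / x**(n + 1))
print(xi)   # a point strictly between 0 and 2, as the theorem promises
```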
Corollary 3.38. Suppose $f$ is smooth on an open interval $I$ containing $c$ and that its derivatives are uniformly bounded on $I$: there exists $K$ such that $|f^{(n)}(\xi)| \le K$ for all $n$ and all $\xi \in I$. Then $f$ equals its Taylor series (centered at $c$) on $I$.
Proof. For simplicity, let $c = 0$. Choose any $N > |x|$ and observe that
$$n > N \implies |R_n(x)| \le \frac{K|x|^{n+1}}{(n + 1)!} = \frac{K|x|^{n+1}}{N!(N + 1)\cdots(n + 1)} \le \frac{K|x|^N}{N!}\left(\frac{|x|}{N}\right)^{n+1-N} \xrightarrow[n\to\infty]{} 0$$
Examples 3.39. 1. The functions sine and cosine have derivatives bounded by 1 on $\mathbb{R}$, and thus both functions equal their Maclaurin series on $\mathbb{R}$. This removes the need to justify these facts, as we did previously, using the theory of differential equations.
2. The exponential function does not have uniformly bounded derivatives, but we can still apply Taylor's Theorem. For any fixed $x$, $\exists \xi$ between $0$ and $x$ such that
$$|R_n(x)| = \frac{e^\xi}{(n + 1)!}|x|^{n+1} \xrightarrow[n\to\infty]{} 0$$
by the same argument as in the Corollary. Thus $e^x$ equals its Maclaurin series on the real line.
3. Extending Example 3.32.3, we see that the function $h(x) = \sqrt{x}$ has the following linear approximation (1st Taylor polynomial) centered at $c = 9$:
$$p_1(x) = h(9) + h'(9)(x - 9) = 3 + \frac16(x - 9)$$
This yields the simple approximation
$$\sqrt{10} \approx p_1(10) = 3 + \frac16 = \frac{19}{6}$$
Taylor's Theorem can be used to estimate its accuracy (remember to shift the center to 9!):
$$R_1(10) = \frac{h''(\xi)}{2!}(10 - 9)^2 = -\frac{1}{2^2 \cdot 2!}\xi^{-3/2} = -\frac{1}{8\xi^{3/2}} \quad\text{for some } \xi \in (9, 10)$$
Certainly $\xi^{-3/2} < 9^{-3/2} = \frac{1}{27}$, whence
$$-\frac{1}{216} < R_1(10) < 0 \implies \frac{19}{6} - \frac{1}{216} = \frac{683}{216} < \sqrt{10} < \frac{684}{216} = \frac{19}{6}$$
$\frac{19}{6}$ is therefore an overestimate for $\sqrt{10}$, but is accurate to within $\frac{1}{216} < 0.005$.
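The bracket just obtained is easy to confirm. A one-line Python check (my own illustration):

```python
import math

# Verify the bracket 683/216 < sqrt(10) < 19/6 from Example 3.39.3.
lower, upper = 683 / 216, 19 / 6
print(lower, math.sqrt(10), upper)
print(upper - math.sqrt(10))   # an overestimate, within 1/216 < 0.005
```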
Alternative Versions of Taylor's Theorem
The two other common expressions for the remainder are typically less easy to use than Lagrange’s
form, but can sometimes provide sharper estimates for the remainder, particularly when x is far from
the center of the series.
Corollary 3.40. Suppose $f^{(n+1)}$ is continuous on an open interval $I$ containing $c$, let $x \in I \setminus \{c\}$, and let $R_n(x) = f(x) - p_n(x)$ be the remainder for the Taylor polynomial centered at $c$. Then:
1. (Integral Remainder) $R_n(x) = \displaystyle\int_c^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt$
2. (Cauchy's Form) $\exists \xi$ between $c$ and $x$ such that $R_n(x) = \dfrac{(x - \xi)^n}{n!}(x - c) f^{(n+1)}(\xi)$
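The integral form can be checked by direct quadrature. A Python sketch (my own illustration: I take $f = \sin$, $c = 0$, $n = 3$, and use a crude midpoint rule):

```python
import math

# Third Taylor polynomial of sin at 0.
def p3(x):
    return x - x**3 / 6

# R_3(x) = integral_0^x (x-t)^3/3! * f^{(4)}(t) dt, with f^{(4)} = sin.
def integral_remainder(x, steps=10000):
    h = x / steps
    return sum((x - (i + 0.5) * h)**3 / 6 * math.sin((i + 0.5) * h) * h
               for i in range(steps))

x = 1.2
print(integral_remainder(x), math.sin(x) - p3(x))   # the two agree
```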
Using these expressions it is possible to explicitly prove Newton's binomial series formula:
Theorem 3.41. If $\alpha \in \mathbb{R}$ and $|x| < 1$, then
$$(1 + x)^\alpha = 1 + \sum_{n=1}^\infty \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!} x^n = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!} x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{3!} x^3 + \frac{\alpha(\alpha - 1)(\alpha - 2)(\alpha - 3)}{4!} x^4 + \cdots$$
If $\alpha \in \mathbb{N}_0$, this is the usual binomial theorem. Otherwise it is more interesting; for instance,
$$\sqrt{1 + x} = (1 + x)^{1/2} = 1 + \frac12 x - \frac18 x^2 + \frac{1}{16} x^3 - \frac{5}{128} x^4 + \cdots$$
$$\frac{1}{(1 + x)^3} = 1 - 3x + 6x^2 - 10x^3 + 15x^4 - \cdots$$
Of course this last could easily be obtained from $\frac{1}{1 + x} = \sum (-1)^n x^n$ by differentiating twice!
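The binomial series converges quickly for small $|x|$ and is easy to sum via the coefficient recurrence $a_n = a_{n-1}\frac{\alpha - n + 1}{n}$. A Python sketch (my own illustration; the helper name and 60-term truncation are mine):

```python
import math

# Partial sums of Newton's binomial series (1+x)^alpha for |x| < 1.
def binom_series(alpha, x, N=60):
    total, coeff = 1.0, 1.0
    for n in range(1, N):
        coeff *= (alpha - n + 1) / n   # a_n = alpha(alpha-1)...(alpha-n+1)/n!
        total += coeff * x**n
    return total

print(binom_series(0.5, 0.3), math.sqrt(1.3))
print(binom_series(-3.0, 0.2), 1.2**-3)
```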
Exercises 31. 1. Compute the Maclaurin series for $\cos x$ directly from the definition and use Taylor's Theorem to show that it converges to $\cos x$ for all $x \in \mathbb{R}$.
2. Repeat the previous exercise for $\sinh x = \frac12(e^x - e^{-x})$ and $\cosh x = \frac12(e^x + e^{-x})$.
3. Find the Maclaurin series for the function $\sin(3x^2)$. How do you know you are correct?
4. Find the Taylor series of $f(x) = x^4 - 3x^2 + 2x - 5$ centered at $x = 2$ and show that $Tf(x) = f(x)$.
5. Find a rational approximation to $\sqrt[3]{9}$ using the first Taylor polynomial for $f(x) = \sqrt[3]{x}$. Now use Taylor's Theorem to estimate its accuracy.
6. If $c \neq 1$, use the fact that $1 - x = (1 - c)\left(1 - \frac{x - c}{1 - c}\right)$ to obtain the Taylor series of $\frac{1}{1 - x}$ centered at $c$. Hence conclude that $\frac{1}{1 - x}$ is analytic on its domain $\mathbb{R} \setminus \{1\}$.
7. We use Taylor's Theorem to prove that the Maclaurin series $\sum_{n=1}^\infty \frac{(-1)^{n+1}}{n} x^n$ converges to $\ln(1 + x)$ whenever $0 < x \le 1$.
(a) Explicitly compute $\frac{d^{n+1}}{dx^{n+1}} \ln(1 + x)$.
(b) Suppose $0 < x \le 1$. Using Taylor's Theorem, prove that $\lim\limits_{n\to\infty} R_n(x) = 0$.
(If $-1 < x < 0$, the argument is tougher, being similar to Exercise 11.)
8. Why can't we use Taylor's Theorem to approximate the error in $\frac{1}{1 - x} = 1 + x + R_1(x)$ when $x \ge 1$? Try it when $x = 2$: what happens? What about when $x = -2$?
9. Prove Taylor's Theorem with integral remainder when $c = 0$ by using the following as an induction step: for each $n \in \mathbb{N}$, define
$$A_n(x) = \int_0^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt$$
and use integration by parts to prove that $A_{n+1} = A_n - \frac{x^{n+1}}{(n + 1)!} f^{(n+1)}(0)$.
(The Cauchy form follows from the intermediate value theorem for integrals, which we'll see later.)
10. Consider the function
$$f(x) = \begin{cases} e^{-1/x} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
(a) Prove by induction that there exists a degree $2n$ polynomial $q_n$ for which
$$f^{(n)}(x) = q_n\!\left(\frac1x\right) e^{-1/x} \quad\text{whenever } x > 0$$
(b) Prove that $f$ is infinitely differentiable at $x = 0$ with $f^{(n)}(0) = 0$ (use Exercise 30.3).
The Maclaurin series of $f$ is identically zero! Moreover, $f$ is smooth (infinitely differentiable) on $\mathbb{R}$ but non-analytic at zero, since it does not equal its Taylor series on any open interval containing zero.
A modification allows us to create bump functions, which find wide use in analysis. If $a < b$, define
$$g_{a,b} : x \mapsto f(x - a) f(b - x)$$
This is smooth on $\mathbb{R}$ but non-zero only on the interval $(a, b)$. A further modification involving two such functions $g_{a,b}$ creates a smooth function on $\mathbb{R}$ which satisfies
$$h_{a,b,\epsilon}(x) = \begin{cases} 0 & \text{if } x \le a - \epsilon \text{ or } x \ge b + \epsilon \\ 1 & \text{if } a \le x \le b \end{cases}$$
This 'switches on' rapidly from 0 to 1 near $a$ and switches off similarly near $b$. By letting $\epsilon$ be small, we smoothly (but not uniformly) approximate the indicator function on $[a, b]$.
[Figure: graph of $h_{a,b,\epsilon}$, zero outside $(a - \epsilon, b + \epsilon)$ and equal to 1 on $[a, b]$]
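One standard way to realize such an $h_{a,b,\epsilon}$ (a sketch of my own; the text only says a "further modification" exists) builds a smooth step $s$ from $f$ and multiplies two shifted copies:

```python
import math

# The smooth-but-non-analytic function from Exercise 10.
def f(x):
    return math.exp(-1.0 / x) if x > 0 else 0.0

# Smooth step: 0 for x <= 0, 1 for x >= 1, strictly between otherwise.
def step(x):
    return f(x) / (f(x) + f(1.0 - x))

# Smooth 'switch': 0 outside (a - eps, b + eps), 1 on [a, b].
def h(x, a, b, eps):
    return step((x - a + eps) / eps) * step((b + eps - x) / eps)

print(h(-0.5, 0.0, 1.0, 0.25))   # outside the support
print(h(0.5, 0.0, 1.0, 0.25))    # on [a, b]
```

The denominator in `step` never vanishes since at least one of $f(x)$, $f(1-x)$ is positive, so `step` inherits the smoothness of $f$.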
11. (Hard) We prove the binomial series formula. Let $f(x) = (1 + x)^\alpha$ and $g(x) = 1 + \sum_{n=1}^\infty a_n x^n$, where $a_n = \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!}$. Our goal is to prove that $f = g$ on the interval $(-1, 1)$.
(a) Check that $f^{(n)}(0) = n!\, a_n$, so that $g$ really is the Maclaurin series of $f$.
(b) i. Prove that the radius of convergence of $g$ is 1.
ii. Prove that $\lim\limits_{n\to\infty} n a_n x^n = 0$ whenever $|x| < 1$.
iii. If $|x| < 1$ and $\xi$ lies between $0$ and $x$, prove that $\left|\frac{x - \xi}{1 + \xi}\right| \le |x|$.
(Hint: $\xi = tx$ for some $t \in (0, 1)$. . . )
(c) Use Taylor's Theorem with Cauchy remainder to prove that
$$|R_n(x)| < (n + 1)|a_{n+1}||x|^{n+1}(1 + \xi)^{\alpha - 1}$$
Hence conclude that $g = f$ whenever $|x| < 1$.
(d) Here is an alternative argument:
i. Show that $(n + 1)a_{n+1} + n a_n = \alpha a_n$.
ii. Differentiate term-by-term to prove directly that $g$ satisfies the differential equation $(1 + x)g'(x) = \alpha g(x)$. Solve this to show that $g = f$ whenever $|x| < 1$.