4 Integration

The theory of inﬁnite series addresses the problem of summing inﬁnitely many ﬁnite quantities. By

contrast, integration is the business of summing inﬁnitely many inﬁnitesimal quantities. Mathemati-

cians have attempted to do both for well over 2000 years, and the philosophical objections are just as

old.

The development and increased application of calculus from the late 1600s spurred mathemati-

cians to try to put the theory on a ﬁrmer footing, though from Newton and Leibniz it took another

150 years before Bernhard Riemann (1854) provided a thorough development of the integral.

32 The Riemann Integral

The basic idea behind Riemann integration is to approximate area using a sequence of rectangles

whose width tends to zero. The following discussion is hopefully familiar.

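Example 4.1. Consider $f(x) = x^2$ defined on $[0, 1]$. For each $n \in \mathbb{N}$, let $\Delta x = \frac{1}{n}$ and define $x_i = i\,\Delta x$. Above each subinterval $[x_{i-1}, x_i]$, raise a rectangle of height $f(x_i) = x_i^2$. The sum of the areas of these rectangles is the Riemann sum with right-endpoints

\[ R_n = \sum_{i=1}^n f(x_i)\,\Delta x = \frac{1}{n^3}\sum_{i=1}^n i^2 = \frac{n(n+1)(2n+1)}{6n^3} = \frac{1}{3} + \frac{3n+1}{6n^2} \]

The Riemann sum with left-endpoints is defined similarly:

\[ L_n = \sum_{i=1}^n f(x_{i-1})\,\Delta x = \frac{1}{n^3}\sum_{i=1}^n (i-1)^2 = \frac{1}{3} - \frac{3n-1}{6n^2} \]

Since $f$ is increasing, the area $A$ under the curve satisfies $L_n \le A \le R_n$, and the squeeze theorem allows us to conclude that $A = \frac{1}{3}$.

[Figure: right-endpoint and left-endpoint rectangles for $f(x) = x^2$ on $[0, 1]$, with $R_n = 0.365234$ and $L_n = 0.302734$; these values correspond to $n = 16$.]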
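The closed forms for $R_n$ and $L_n$ in Example 4.1 can be checked by brute force. The following is a minimal sketch using exact rational arithmetic; the function names `right_sum` and `left_sum` are illustrative and not part of the notes.

```python
from fractions import Fraction

def right_sum(n):
    """Right-endpoint Riemann sum R_n for f(x) = x^2 on [0, 1], computed exactly."""
    dx = Fraction(1, n)
    return sum((i * dx) ** 2 * dx for i in range(1, n + 1))

def left_sum(n):
    """Left-endpoint Riemann sum L_n for f(x) = x^2 on [0, 1], computed exactly."""
    dx = Fraction(1, n)
    return sum(((i - 1) * dx) ** 2 * dx for i in range(1, n + 1))

for n in (4, 16, 256):
    R, L = right_sum(n), left_sum(n)
    # Compare with the closed forms 1/3 + (3n+1)/(6n^2) and 1/3 - (3n-1)/(6n^2).
    assert R == Fraction(1, 3) + Fraction(3 * n + 1, 6 * n * n)
    assert L == Fraction(1, 3) - Fraction(3 * n - 1, 6 * n * n)
    print(n, float(L), float(R))
```

Both sums trap $\frac{1}{3}$ between them, and the gap $R_n - L_n = \frac{1}{n}$ shrinks as $n$ grows.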

The example contains the essential idea, but more ﬂexibility is needed. To get further, we must

properly deﬁne the concepts of partition and Riemann sum.

Two of Zeno’s ancient paradoxes are relevant here: Achilles and the Tortoise concerns a convergent inﬁnite series,

while the Arrow Paradox discusses a difﬁculty with integration by questioning whether time can be considered as a sum

of instants. Perhaps the most famous contemporary criticism comes from Bishop George Berkeley, who gave his name

to the Californian city and thus the ﬁrst UC campus: in The Analyst (1734), Berkeley savaged the foundations of calcu-

lus, describing the inﬁnitesimal increments required in Newton’s theory of ﬂuxions (derivatives) as merely the “ghosts of

departed quantities.”

Now is a good time to review some identities:

\[ \sum_{i=1}^n i = \frac{1}{2}n(n+1), \qquad \sum_{i=1}^n i^2 = \frac{1}{6}n(n+1)(2n+1), \qquad \sum_{i=1}^n i^3 = \frac{1}{4}n^2(n+1)^2 \]

Definition 4.2. A partition $P = \{x_0, \dots, x_n\}$ of an interval $[a, b]$ is a finite sequence such that

\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b \]

For each $1 \le i \le n$, define $\Delta x_i = x_i - x_{i-1}$. The mesh of the partition is $\operatorname{mesh}(P) := \max \Delta x_i$.

Choose a sample point $x_i^*$ in each subinterval $[x_{i-1}, x_i]$. If $f : [a, b] \to \mathbb{R}$, the Riemann sum

\[ \sum_{i=1}^n f(x_i^*)\,\Delta x_i \]

computes the area of a family of $n$ rectangles.

[Figure: rectangles of height $f(x_i^*)$ over the subintervals of a partition of $[a, b]$.]
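To make the double freedom of partition and sample points concrete, here is a minimal Python sketch of Definition 4.2. The helper names (`riemann_sum`, `mesh`, `random_partition`) and the choice of a seeded random partition are assumptions made for illustration only.

```python
import random

def riemann_sum(f, partition, samples):
    """Sum of f(x_i^*) * (x_i - x_{i-1}) over the subintervals of the partition."""
    total = 0.0
    for left, right, s in zip(partition, partition[1:], samples):
        assert left <= s <= right  # each sample point must lie in its subinterval
        total += f(s) * (right - left)
    return total

def mesh(partition):
    """Largest subinterval width of the partition."""
    return max(r - l for l, r in zip(partition, partition[1:]))

def random_partition(a, b, n, rng):
    """Partition of [a, b] into n subintervals with random interior points."""
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + interior + [b]

rng = random.Random(0)  # seeded so that runs are repeatable
P = random_partition(0.0, 1.0, 2000, rng)
samples = [rng.uniform(l, r) for l, r in zip(P, P[1:])]
approx = riemann_sum(lambda x: x * x, P, samples)
print(mesh(P), approx)
```

For the increasing function $f(x) = x^2$ on $[0, 1]$, any such Riemann sum differs from $\frac{1}{3}$ by at most $\operatorname{mesh}(P)\,(f(1) - f(0)) = \operatorname{mesh}(P)$, so a fine partition gives a good estimate whatever the sample points.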

In elementary calculus, one typically computes Riemann sums for equally-spaced partitions with left,

right or middle sample points. The double freedom of partition & sample points makes applying the

deﬁnition a challenge, so instead we consider two special families of rectangles.

Definition 4.3. Given a partition $P$ of $[a, b]$ and a bounded function $f$ on $[a, b]$, define

\[ M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \qquad U(f, P) = \sum_{i=1}^n M_i\,\Delta x_i \]

\[ m_i = \inf_{x \in [x_{i-1}, x_i]} f(x), \qquad L(f, P) = \sum_{i=1}^n m_i\,\Delta x_i \]

$U(f, P)$ and $L(f, P)$ are the upper and lower Darboux sums for $f$ with respect to $P$. The upper and lower Darboux integrals are

\[ U(f) = \inf_P U(f, P), \qquad L(f) = \sup_P L(f, P) \]

where the infimum and supremum are taken over all partitions of $[a, b]$.

We say that $f$ is (Riemann) integrable on $[a, b]$ if $U(f) = L(f)$, and denote this value by

\[ \int_a^b f \quad\text{or}\quad \int_a^b f(x)\,dx \]

If the interval is understood or irrelevant, it is common just to say that $f$ is integrable and write $\int f$.

[Figure: two panels showing the upper Darboux sum $U(f, P)$ and the lower Darboux sum $L(f, P)$ as rectangles over $[a, b]$.]

Intuitively, L( f , P) is the sum of the areas of rectangles built on P which just ﬁt under the graph of

f . It is also the inﬁmum of all Riemann sums on P. If f is discontinuous, then L( f , P) need not be a

Riemann sum; there might not be suitable sample points!

Examples 4.4. 1. We revisit Example 4.1 in this language.

Given a partition $Q = \{x_0, \dots, x_n\}$ of $[0, 1]$ and sample points $x_i^* \in [x_{i-1}, x_i]$, we compute the Riemann sum for $f(x) = x^2$:

\[ \sum_{i=1}^n f(x_i^*)\,\Delta x_i = \sum_{i=1}^n (x_i^*)^2 (x_i - x_{i-1}) \]

Since $f$ is increasing, we have $x_{i-1}^2 \le (x_i^*)^2 \le x_i^2$ on each interval, whence

\[ L(f, Q) = \sum_{i=1}^n x_{i-1}^2 (x_i - x_{i-1}) \le \sum_{i=1}^n (x_i^*)^2 (x_i - x_{i-1}) \le \sum_{i=1}^n x_i^2 (x_i - x_{i-1}) = U(f, Q) \]

The lower and upper Darboux sums are therefore the Riemann sums with left and right endpoints respectively.

If we take $Q_n$ to be the partition with subintervals of equal width $\Delta x = \frac{1}{n}$, then

\[ U(f) = \inf_P U(f, P) \le U(f, Q_n) = \sum_{i=1}^n \left(\frac{i}{n}\right)^2 \Delta x = R_n \]

is the right Riemann sum discussed originally. Similarly $L(f) \ge L_n$. Since $L_n$ and $R_n$ both converge to $\frac{1}{3}$ as $n \to \infty$, the squeeze theorem forces

\[ L_n \le L(f) \le U(f) \le R_n \implies L(f) = U(f) = \frac{1}{3} \]

whence $f$ is Riemann integrable on $[0, 1]$ with $\int_0^1 x^2\,dx = \frac{1}{3}$.

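2. Suppose $f(x) = kx + c$ on the interval $[a, b]$ where $k > 0$. Take the evenly spaced partition $P_n$ where $x_i = a + \frac{b-a}{n} i$ with $\Delta x = \frac{b-a}{n}$. Since $f$ is increasing, the upper Darboux sum is again the Riemann sum with right-endpoints:

\[ U(f, P_n) = R_n = \sum_{i=1}^n f(x_i)\,\Delta x = \frac{b-a}{n} \sum_{i=1}^n \left[ \frac{k(b-a)}{n}\, i + ak + c \right] \]

\[ \phantom{U(f, P_n)} = \frac{b-a}{n} \left[ \frac{k(b-a)}{2n}\, n(n+1) + (ak + c)n \right] \xrightarrow[n \to \infty]{} \frac{k(b-a)^2}{2} + (b-a)(ak + c) = \frac{k}{2}(b^2 - a^2) + c(b-a) \]

[Figure: the graph of $f(x) = kx + c$ on $[a, b]$, rising from $ak + c$ to $bk + c$.]

We similarly see that the lower Darboux sum is given by the Riemann sum with left endpoints, and that

\[ L(f, P_n) = L_n = \frac{b-a}{n} \left[ \frac{k(b-a)}{2n}\, n(n-1) + (ak + c)n \right] \xrightarrow[n \to \infty]{} \frac{k}{2}(b^2 - a^2) + c(b-a) \]

By the same argument as above, $L_n \le L(f) \le U(f) \le R_n$ and the squeeze theorem show that $f$ is integrable with

\[ \int_a^b f = \frac{k}{2}(b^2 - a^2) + c(b-a). \]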

Following the examples, a few remarks are in order.

Riemann versus Darboux  Definition 4.3 is really that of the Darboux integral. Riemann's definition is as follows: for $f : [a, b] \to \mathbb{R}$ to be integrable with integral $\int_a^b f$ means

\[ \forall \epsilon > 0,\ \exists \delta > 0 \text{ such that } \forall P, x_i^*, \quad \operatorname{mesh}(P) < \delta \implies \left| \sum_{i=1}^n f(x_i^*)\,\Delta x_i - \int_a^b f \right| < \epsilon \]

It can be shown that this is equivalent to the Darboux integral. We won't pursue Riemann's formulation further, except to observe that if a function is integrable and $\operatorname{mesh}(P_n) \to 0$, then $\int_a^b f = \lim_{n \to \infty} \sum_{i=1}^n f(x_i^*)\,\Delta x_i$: this allows us to approximate integrals using any sample points we choose, which is why right endpoints ($x_i^* = x_i$) are so common in freshman calculus.

Monotone Functions  Darboux sums are particularly easy to compute for monotone functions. As in the examples, if $f$ is increasing, then each $M_i = f(x_i)$, from which $U(f, P)$ is the Riemann sum with right-endpoints. Similarly, each $m_i = f(x_{i-1})$, so $L(f, P)$ is the Riemann sum with left-endpoints. The roles reverse if $f$ is decreasing.

Area  If $f$ is positive and continuous, the Riemann integral $\int_a^b f$ serves as a definition for the area under the curve $y = f(x)$. This should make intuitive sense:

1. In the second example where we have a straight line, we obtain the same value for the area by computing directly as the sum of a rectangle and a triangle!

2. If the area under the curve is to make sense, then, for any partition $P$, it plainly satisfies the inequalities

\[ L(f, P) \le \text{Area} \le U(f, P) \]

But these are exactly the same as those satisfied by the integral itself:

\[ L(f, P) \le L(f) = \int_a^b f = U(f) \le U(f, P) \]

In the examples we exhibited a sequence of partitions $(P_n)$ where $U(f, P_n)$ and $L(f, P_n)$ both converged to the same limit. The next results develop some basic properties of partitions and make this process rigorous.

Lemma 4.5. Suppose f : [a, b] → R is bounded and suppose P, Q are partitions of [a, b].

1. If Q is a reﬁnement of P, that is P ⊆ Q, then

L( f , P) ≤ L( f , Q) ≤ U( f , Q) ≤ U( f , P)

2. For any partitions P, Q, we have L( f , P) ≤ U( f , Q)

3. L( f ) ≤ U( f )

We’ll see later (Theorem 4.16) that every continuous function is integrable.

Proof. 1. We argue by induction. First suppose that $Q = P \cup \{t\}$ contains exactly one additional point $t \in (x_{k-1}, x_k)$. Write

\[ m_1 = \inf\{f(x) : x \in [x_{k-1}, t]\} \]
\[ m_2 = \inf\{f(x) : x \in [t, x_k]\} \]
\[ m = \inf\{f(x) : x \in [x_{k-1}, x_k]\} = \min\{m_1, m_2\} \]

The Darboux sums $L(f, P)$ and $L(f, Q)$ are identical except for the terms involving $t$; indeed

\[ L(f, Q) - L(f, P) = m_1(t - x_{k-1}) + m_2(x_k - t) - m(x_k - x_{k-1}) \]
\[ \phantom{L(f, Q) - L(f, P)} = (m_1 - m)(t - x_{k-1}) + (m_2 - m)(x_k - t) \ge 0 \]

[Figure: the extra area gained by the lower sum when the point $t$ is inserted into $[x_{k-1}, x_k]$.]

Since $Q$ is obtained from $P$ by adding finitely many points, one at a time, induction shows that $P \subseteq Q \implies L(f, P) \le L(f, Q)$.

The argument for $U(f, Q) \le U(f, P)$ is similar, and the middle inequality is trivial.

2. If $P$ and $Q$ are partitions, then $P \cup Q$ is a refinement of both $P$ and $Q$. By part 1,

\[ L(f, P) \le L(f, P \cup Q) \le U(f, P \cup Q) \le U(f, Q) \qquad (\ast) \]

3. We leave this as an exercise.
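The refinement inequalities of Lemma 4.5 are easy to observe numerically. The sketch below assumes an increasing function, so that (as noted under "Monotone Functions") the Darboux sums are just the left- and right-endpoint Riemann sums; the helper name `darboux_sums_increasing` and the particular partitions are illustrative choices.

```python
import math

def darboux_sums_increasing(f, partition):
    """(lower, upper) Darboux sums of an INCREASING function f on the partition.

    For increasing f the infimum on [l, r] is f(l) and the supremum is f(r).
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(l) * (r - l) for l, r in pairs)
    upper = sum(f(r) * (r - l) for l, r in pairs)
    return lower, upper

P = [0.0, 0.5, 1.0]
Q = sorted(set(P) | {0.25, 0.8})  # a refinement of P: two extra points

LP, UP = darboux_sums_increasing(math.exp, P)
LQ, UQ = darboux_sums_increasing(math.exp, Q)

# Part 1 of the lemma: refining raises the lower sum and lowers the upper sum.
assert LP <= LQ <= UQ <= UP
print(LP, LQ, UQ, UP)
```

Adding further points to `Q` tightens the bracket around $\int_0^1 e^x\,dx$ from both sides.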

Theorem 4.6 (Cauchy criterion for integrability). Suppose f : [a, b] → R is bounded.

1. f is integrable ⇐⇒ ∀ϵ > 0, ∃P such that U( f , P) − L( f , P) < ϵ

2. $f$ is integrable $\iff \exists (P_n)_{n \in \mathbb{N}}$ such that $U(f, P_n) - L(f, P_n) \to 0$. Moreover, in such a case both sequences $U(f, P_n)$ and $L(f, P_n)$ converge to $\int_a^b f$.

We call this a Cauchy criterion since integrability is demonstrated without mention of the integral!

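Proof. 1. ($\Rightarrow$) Suppose $f$ is integrable and let $\epsilon > 0$ be given. Since $\inf U(f, Q) = \int_a^b f = \sup L(f, R)$, $\exists Q, R$ such that

\[ U(f, Q) - \int_a^b f < \frac{\epsilon}{2} \quad\text{and}\quad \int_a^b f - L(f, R) < \frac{\epsilon}{2} \]

Let $P = Q \cup R$ and apply $(\ast)$: $L(f, R) \le L(f, P) \le U(f, P) \le U(f, Q)$. But then

\[ U(f, P) - L(f, P) \le U(f, Q) - L(f, R) = U(f, Q) - \int_a^b f + \int_a^b f - L(f, R) < \epsilon \]

($\Leftarrow$) Given $\epsilon > 0$, let $P$ be a partition with $U(f, P) - L(f, P) < \epsilon$. For every partition, $L(f, P) \le L(f) \le U(f) \le U(f, P)$. Thus

\[ 0 \le U(f) - L(f) \le U(f, P) - L(f, P) < \epsilon \]

Since this holds for all $\epsilon > 0$, we see that $U(f) = L(f)$.

2. This is an exercise.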

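Examples 4.7. 1. The freedom to choose a partition can be very useful. Consider $f(x) = \sqrt{x}$ on the interval $[0, b]$. We choose a partition that evaluates nicely when fed to this function:

\[ P_n = \{x_0, \dots, x_n\} \text{ where } x_i = \left(\frac{i}{n}\right)^2 b \implies \Delta x_i = x_i - x_{i-1} = \left[ i^2 - (i-1)^2 \right] \frac{b}{n^2} = \frac{(2i-1)b}{n^2} \]

Since $f$ is increasing on $[0, b]$, we see that

\[ U(f, P_n) = \sum_{i=1}^n f(x_i)\,\Delta x_i = \sum_{i=1}^n \frac{i}{n}\sqrt{b}\,\frac{(2i-1)b}{n^2} = \frac{b^{3/2}}{n^3} \sum_{i=1}^n (2i^2 - i) \]
\[ \phantom{U(f, P_n)} = \frac{b^{3/2}}{n^3} \left[ \frac{1}{3} n(n+1)(2n+1) - \frac{1}{2} n(n+1) \right] \xrightarrow[n \to \infty]{} \frac{2}{3} b^{3/2} \]

Similarly

\[ L(f, P_n) = \sum_{i=1}^n f(x_{i-1})\,\Delta x_i = \sum_{i=1}^n \frac{i-1}{n}\sqrt{b}\,\frac{(2i-1)b}{n^2} = \frac{b^{3/2}}{n^3} \sum_{i=1}^n (2i^2 - 3i + 1) \]
\[ \phantom{L(f, P_n)} = \frac{b^{3/2}}{n^3} \left[ \frac{1}{3} n(n+1)(2n+1) - \frac{3}{2} n(n+1) + n \right] \xrightarrow[n \to \infty]{} \frac{2}{3} b^{3/2} \]

Since these limits are equal, we conclude (Theorem 4.6, part 2) that $f$ is integrable and that

\[ \int_0^b \sqrt{x}\,dx = \frac{2}{3} b^{3/2} \]

[Figure: two panels showing the upper sum $U(f, P_n)$ and the lower sum $L(f, P_n)$ for $\sqrt{x}$ on $[0, b]$.]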

2. We finish this section with the classic example of a non-integrable function. Let $f : [a, b] \to \mathbb{R}$ be the indicator function of the irrational numbers,

\[ f(x) = \begin{cases} 1 & \text{if } x \notin \mathbb{Q} \\ 0 & \text{if } x \in \mathbb{Q} \end{cases} \]

Since any interval of positive length contains both rational and irrational numbers, we see that

\[ \sup\{ f(x) : x \in [x_{i-1}, x_i] \} = 1 \quad\text{and}\quad \inf\{ f(x) : x \in [x_{i-1}, x_i] \} = 0 \]

for any partition $P = \{x_0, \dots, x_n\}$. We conclude that

\[ U(f, P) = \sum_{i=1}^n (x_i - x_{i-1}) = b - a \implies U(f) = b - a \quad\text{and}\quad L(f, P) = 0 \implies L(f) = 0 \]

Since the upper and lower integrals are unequal, $f$ is not Riemann integrable.

As any freshman calculus student can attest, if you can ﬁnd an anti-derivative, then the fundamental

theorem of calculus (Section 34) makes evaluating integrals far easier. For instance, you are probably

desperate to write

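\[ \frac{d}{dx}\,\frac{2}{3} x^{3/2} = x^{1/2} \implies \int_0^b \sqrt{x}\,dx = \frac{2}{3} x^{3/2} \Big|_0^b = \frac{2}{3} b^{3/2} \]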

rather than computing Riemann/Darboux sums as in the previous example! In most practical cases,

however, no easy-to-compute anti-derivative exists, so the best we can do is approximate integrals

by evaluating Riemann sums for progressively ﬁner partitions. Thankfully computers excel at such

tedious work!
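As a small illustration of handing the tedious work to a computer, the following sketch evaluates the Darboux sums of Example 4.7.1 on the partition $x_i = (i/n)^2 b$ and compares them with $\frac{2}{3} b^{3/2}$. The function name `sqrt_darboux` and the choice $b = 2$ are assumptions made for the example.

```python
import math

def sqrt_darboux(b, n):
    """(lower, upper) Darboux sums of sqrt on [0, b] for the partition x_i = (i/n)^2 * b.

    sqrt is increasing, so the infimum and supremum on each subinterval
    are attained at its left and right endpoints respectively.
    """
    xs = [(i / n) ** 2 * b for i in range(n + 1)]
    pairs = list(zip(xs, xs[1:]))
    lower = sum(math.sqrt(l) * (r - l) for l, r in pairs)
    upper = sum(math.sqrt(r) * (r - l) for l, r in pairs)
    return lower, upper

b = 2.0
exact = 2.0 / 3.0 * b ** 1.5
for n in (10, 100, 1000):
    lower, upper = sqrt_darboux(b, n)
    print(n, lower, upper, exact)
```

On this partition the gap works out to $U(f, P_n) - L(f, P_n) = b^{3/2}/n$, so both columns close in on the exact value at rate $1/n$.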

Exercises 32 1. For each function on the given interval, use partitions to ﬁnd the upper and lower

Darboux integrals. Hence prove that the function is integrable and compute its integral.

(a) f (x) = x

on [0, b] for any b > 0.

(b) g(x) =

√

x on [0, b].

(Hint: mimic Example 4.7.1)

2. Repeat question 1 for the following two functions. You cannot simply compute Riemann sums

for left and right endpoints and take limits: why not?

(a) h(x) = x(2 − x) on [ 0, 2]

(Hint: choose a partition with 2n points such that x

= 1 and observe that h(2 − x) = h(x))

(b) k(x) =

(

2x if x ≤ 1

5 − x if x > 1

on [0, 3].

(Hint: this time try a partition with 3n points. . . )

3. Let f (x) = x for rational x and f (x) = 0 for irrational x.

(a) Calculate the upper and lower Darboux integrals for f on the interval [0, b].

(b) Is f integrable on [0, b]?

4. Prove part 3 of Lemma 4.5: L( f ) ≤ U( f ).

5. Prove part 2 of Theorem 4.6:

$f$ is integrable $\iff \exists (P_n)_{n \in \mathbb{N}}$ such that $U(f, P_n) - L(f, P_n) \to 0$.

Moreover, both $U(f, P_n)$ and $L(f, P_n)$ converge to $\int_a^b f$.

6. (a) Reread Deﬁnition 4.3. What happens if we allow f : [a, b] → R to be unbounded?

(b) Read “Riemann versus Darboux” on page 4. Explain why being Riemann integrable also

forces f to be bounded.

7. (If you like coding) Write a short program to estimate

f (x) dx using Riemann sums. This

can be very simple (equal partitions with right endpoints), or more complex (random partition

and sample points given a mesh). Apply your program to estimate

sin(x

−

√

) dx.

33 Properties of the Riemann Integral

The rough take-away of this long section is that everything you think is integrable probably is! There

will not be many examples since we have not established many explicit values for integrals.

Theorem 4.8 (Linearity). If $f, g$ are integrable and $k, l$ are constant, then $kf + lg$ is integrable and

\[ \int (kf + lg) = k \int f + l \int g \]

Example 4.9. Thanks to examples in the previous section, we can now calculate, for instance

\[ \int_0^2 5x^3 - 3\sqrt{x}\,dx = 5 \cdot \frac{1}{4} \cdot 2^4 - 3 \cdot \frac{2}{3} \cdot 2^{3/2} = 20 - 4\sqrt{2} \]

Proof. Suppose $\epsilon > 0$ is given. By Theorem 4.6 part 1, there exist partitions $R, S$ such that

\[ U(f, R) - L(f, R) < \frac{\epsilon}{2} \quad\text{and}\quad U(g, S) - L(g, S) < \frac{\epsilon}{2} \]

By Lemma 4.5 part 1, if $P := R \cup S$, then both inequalities are satisfied by $P$. On each subinterval,

\[ \inf f(x) + \inf g(x) \le \inf( f(x) + g(x) ) \quad\text{and}\quad \sup( f(x) + g(x) ) \le \sup f(x) + \sup g(x) \]

since the individual suprema/infima could be 'evaluated' at different places. Thus

\[ L(f, P) + L(g, P) \le L(f + g, P) \le U(f + g, P) \le U(f, P) + U(g, P) \]

whence $U(f + g, P) - L(f + g, P) < \epsilon$ and $f + g$ is integrable. Moreover,

\[ \int (f + g) - \int f - \int g \le U(f + g, P) - \int f - \int g \le \left[ U(f, P) - \int f \right] + \left[ U(g, P) - \int g \right] < \epsilon \]

Using lower Darboux sums similarly, we see that

\[ -\epsilon < \int (f + g) - \int f - \int g < \epsilon \]

Since this holds for all $\epsilon > 0$, we conclude that $\int (f + g) = \int f + \int g$.

That $kf$ is integrable with $\int kf = k \int f$ is an exercise. Put these together for the result.

Corollary 4.10 (Changing endvalues). Suppose $g$ is integrable on $[a, b]$ and that $f : [a, b] \to \mathbb{R}$ satisfies $f(x) = g(x)$ on $(a, b)$. Then $f$ is integrable on $[a, b]$ and $\int_a^b f = \int_a^b g$.

Definition 4.11 (Integration on an open interval). A bounded function $f : (a, b) \to \mathbb{R}$ is integrable if it has an integrable extension $g : [a, b] \to \mathbb{R}$ where $f(x) = g(x)$ on $(a, b)$. In such a case, we define $\int_a^b f := \int_a^b g$.

The Corollary (its proof is an exercise) shows that the choice of extension is irrelevant.

Theorem 4.12 (Basic Comparisons). Suppose $f$ and $g$ are integrable on $[a, b]$.

1. If $f(x) \le g(x)$, then $\int_a^b f \le \int_a^b g$.

2. If $m \le f(x) \le M$ then $m(b - a) \le \int_a^b f \le M(b - a)$.

3. $fg$ is integrable.

4. $|f|$ is integrable and $\left| \int_a^b f \right| \le \int_a^b |f|$.

5. $\max(f, g)$ and $\min(f, g)$ are integrable.

Part 3 is not integration by parts and does not tell us how $\int fg$ relates to $\int f$ and $\int g$.

Proof. 1. Since $g(x) - f(x) \ge 0$ is integrable, $L(g - f, P) \ge 0$ for all partitions $P$, and so

\[ 0 \le L(g - f) = \int_a^b (g - f) = \int_a^b g - \int_a^b f \]

2. Apply part 1 twice.

3. This is an exercise.

4. The integrability is an exercise. For the comparison, apply part 1 to $-|f| \le f \le |f|$.

5. $\max(f, g) = \frac{1}{2}(f + g) + \frac{1}{2}|f - g|$, etc.

Theorem 4.13 (Domain splitting). Suppose that $f : [a, b] \to \mathbb{R}$ and let $c \in (a, b)$. If $f$ is integrable on both $[a, c]$ and $[c, b]$, then it is integrable on $[a, b]$ and

\[ \int_a^b f = \int_a^c f + \int_c^b f \]

[Figure: the area under $f(x)$ over $[a, b]$, split at $c$.]

In light of this result, it is conventional to allow integral limits to be reversed:

\[ \int_b^a f := -\int_a^b f \quad\text{is consistent with}\quad \int_a^a f = 0 \]

Proof. Let $\epsilon > 0$ be given, then $\exists R, S$ partitions of $[a, c], [c, b]$ such that

\[ U(f, R) - L(f, R) < \frac{\epsilon}{2}, \qquad U(f, S) - L(f, S) < \frac{\epsilon}{2} \]

Choose $P = R \cup S$ to partition $[a, b]$, then

\[ U(f, P) - L(f, P) = U(f, R) + U(f, S) - L(f, R) - L(f, S) < \epsilon \]

Moreover

\[ \int_a^b f - \int_a^c f - \int_c^b f \le U(f, P) - L(f, R) - L(f, S) = U(f, P) - L(f, P) < \epsilon \]

The other side is similar.

Example 4.14. If $f(x) = \sqrt{x}$ on $[0, 1]$ and $f(x) = 1$ on $[1, 2]$, then

\[ \int_0^2 f = \int_0^1 \sqrt{x}\,dx + \int_1^2 1\,dx = \frac{2}{3} + 1 = \frac{5}{3} \]

Monotonic & Continuous Functions

We are now in a position to establish the integrability of two large classes of functions.

Deﬁnition 4.15. A function f : [a, b] → R is:

Monotonic if it is either increasing (x < y =⇒ f (x) ≤ f (y)) or decreasing.

Piecewise monotonic if there is a partition $P = \{x_0, \dots, x_n\}$ of $[a, b]$ such that $f$ is monotonic on each open subinterval $(x_{k-1}, x_k)$.

Piecewise continuous if there is a partition such that $f$ is uniformly continuous on each $(x_{k-1}, x_k)$.

Theorem 4.16. If f is monotonic or continuous on [a, b], then it is integrable.

Examples 4.17. 1. Since sine is continuous, we can approximate via a sequence of Riemann sums

\[ \int_0^\pi \sin x\,dx = \lim_{n \to \infty} \frac{\pi}{n} \sum_{i=1}^n \sin\frac{\pi i}{n} \]

Evaluating this limit is another matter entirely, one best handled in the next section...

2. Similarly, $e^{\sqrt{x}}$ can be integrated and therefore approximated via Riemann sums:

\[ \int_0^1 e^{\sqrt{x}}\,dx = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \exp\sqrt{\frac{i}{n}} = \lim_{n \to \infty} \sum_{j=1}^n \frac{2j - 1}{n^2} \exp\frac{j}{n} \]

Both sums use right endpoints; the first has equal subintervals and the second is analogous to Example 4.7.1. These limits would typically be estimated using a computer.

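Proof. Suppose $f : [a, b] \to \mathbb{R}$ is continuous. Since $[a, b]$ is closed and bounded, $f$ is uniformly continuous. Let $\epsilon > 0$ be given:

\[ \exists \delta > 0 \text{ such that } \forall x, y \in [a, b], \quad |x - y| < \delta \implies |f(x) - f(y)| < \frac{\epsilon}{b - a} \]

Let $P$ be a partition with $\operatorname{mesh} P < \delta$. Since $f$ attains its bounds on each $[x_{i-1}, x_i]$ (extreme value theorem), and any two points of a subinterval are less than $\delta$ apart,

\[ \exists x_i^*, y_i^* \in [x_{i-1}, x_i] \text{ such that } M_i - m_i = f(x_i^*) - f(y_i^*) < \frac{\epsilon}{b - a} \]

from which

\[ U(f, P) - L(f, P) < \sum_{i=1}^n \frac{\epsilon}{b - a} (x_i - x_{i-1}) = \epsilon \]

The monotonicity argument is an exercise.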

Corollary 4.18. Piecewise continuous and bounded piecewise monotonic functions are integrable.

Proof. If $f$ is piecewise continuous, then the restriction of $f$ to $(x_{k-1}, x_k)$ has a continuous extension $g_k : [x_{k-1}, x_k] \to \mathbb{R}$; integrable by Theorem 4.16. By Corollary 4.10, $f$ is integrable on $[x_{k-1}, x_k]$ with $\int_{x_{k-1}}^{x_k} f = \int_{x_{k-1}}^{x_k} g_k$. Several applications of Theorem 4.13 finish things off:

\[ \int_a^b f = \sum_{k=1}^n \int_{x_{k-1}}^{x_k} g_k \]

The argument for piecewise monotonicity is similar.

Example 4.19. The ‘fractional part’ function f (x) = x − ⌊x⌋

is both piecewise continuous and piecewise monotone on any

bounded interval. It is therefore integrable on any [a, b].


We ﬁnish with the ﬁnal incarnation of the intermediate value theorem.

Corollary 4.20 (IVT for integrals). If $f$ is continuous on $[a, b]$, then $\exists \xi \in (a, b)$ for which

\[ f(\xi) = \frac{1}{b - a} \int_a^b f \]

Proof. Since $f$ is continuous, it is integrable on $[a, b]$. By the extreme value theorem it is also bounded and attains its bounds: $\exists p, q \in [a, b]$ such that

\[ f(p) = \inf_{x \in [a, b]} f(x), \qquad f(q) = \sup_{x \in [a, b]} f(x) \]

Applying Theorem 4.12, part 2, with $m = f(p)$ and $M = f(q)$, we see that

\[ (b - a) f(p) \le \int_a^b f \le (b - a) f(q) \]

Now divide by $b - a$ and apply the usual intermediate value theorem for $f$ to see that the required $\xi$ exists between $p$ and $q$.

[Figure: a rectangle of height $f(\xi)$ over $[a, b]$ with the same area as the region under the curve; the points $p$, $q$ and $\xi$ are marked.]

In the picture, when $f$ is positive and continuous, the grey area equals that under the curve; imagine levelling off the blue hill with a bulldozer... The quantity $\frac{1}{b-a} \int_a^b f$ is the average value of $f$ on $[a, b]$: to see why this interpretation is sensible, approach $\int_a^b f$ via a sequence of Riemann sums on equally-spaced partitions $P_n$, so that $\Delta x = \frac{b-a}{n}$; then

\[ \frac{1}{b - a} \int_a^b f = \lim_{n \to \infty} \frac{1}{b - a} \sum_{i=1}^n f(x_i^*)\,\Delta x = \lim_{n \to \infty} \frac{f(x_1^*) + \cdots + f(x_n^*)}{n} \]

is the limit of a sequence of averages of equally-spaced samples $f(x_i^*)$.

What can/can’t be integrated? (non-examinable)

We now know a great many examples of integrable functions: essentially

• Piecewise continuous & monotonic functions are integrable.

• Linear combinations, products, absolute values, maximums and minimums of (already) inte-

grable functions.

After so many positive integrability conditions, it is reasonable to ask precisely which functions are

Riemann integrable. There is a precise answer, though it is quite tricky to understand.

Theorem 4.21 (Lebesgue). Suppose f : [a, b] → R is bounded. Then

f is Riemann integrable ⇐⇒ it is continuous except on a set of measure zero

Naïvely, the measure of a set is the sum of the lengths of its maximal subintervals; though unfortunately this doesn't make for a very useful definition.

Any countable subset has measure zero;

Lebesgue’s result is almost as if we can extend Corollary 4.18 to allow for inﬁnite sums. Indeed

you might have encountered a function which is continuous only on the irrationals; such a function

is Riemann integrable. There are also some uncountable sets with measure zero such as Cantor’s

middle-third set: if f is the indicator function of Cantor’s set

f (x) =

(

1 if x ∈ C

0 otherwise

then $f$ is continuous except on $C$, and is Riemann integrable with $\int_0^1 f(x)\,dx = 0$.

Exercises 33 1. Explain why

2π

sin

( e

) dx ≤

2. If f is integrable on [a, b] prove that it is integrable on any interval [c, d] ⊆ [a, b].

3. We complete the proof of Theorem 4.8 (linearity of integration).

(a) Suppose k > 0, let A ⊆ R and deﬁne kA := {kx : x ∈ A}. Prove that sup kA = k sup A

and inf kA = k inf A.

(b) If k > 0 prove that k f is integrable on any interval and that

k f = k

f .

4. Give an example of an integrable but discontinuous function on a closed bounded interval [a, b]

for which the conclusion of the Intermediate Value Theorem for Integrals is false.

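Formally, the length of an open interval $(a, b)$ is $b - a$ and a set $A \subseteq \mathbb{R}$ has measure zero if

\[ \forall \epsilon > 0,\ \exists \text{ open intervals } I_n \text{ such that } A \subseteq \bigcup_{n=1}^\infty I_n \text{ and } \sum_{n=1}^\infty \operatorname{length}(I_n) < \epsilon \]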

More generally, the measure of a set (subject to a technical condition) is the inﬁmum of the sum of the lengths of any

countable collection of open covering intervals. A rigorous discussion of measure theory is properly a matter for graduate

analysis. Somewhat surprisingly, there exist sets with positive measure that contain no subintervals, and even sets which

are non-measurable!

5. Explicitly compute the value of the integral $\int_{1/2}^{15/2} x - \lfloor x \rfloor\,dx$ (recall Example 4.19).

6. We prove and extend Corollary 4.10. Suppose f is integrable on [a, b].

(a) If g : [a, b] → R satisﬁes f (x) = g(x) for all x ∈ (a, b), prove that g is integrable and

g =

f .

(Hint: consider h = f − g and show that

h = 0)

(b) Now suppose g : [a, b] → R satisﬁes f (x) = g(x) for all x ∈ [ a, b] except at ﬁnitely many

points. Prove that g is integrable and

g =

f .

7. Show that an increasing function on [a, b] is integrable and thus complete Theorem 4.16.

(Hint: Choose a partition $P$ with $\operatorname{mesh} P < \frac{\epsilon}{f(b) - f(a)}$)

8. Suppose f and g are integrable on [a, b].

(a) Deﬁne h(x) = ( f (x))

. We know:

• f is bounded: ∃K such that

f (x)

≤ K on [a, b].

• Given ϵ > 0, ∃P such that U( f , P) − L( f , P) <

. For each subinterval [x

i−1

, x

], let

= sup f (x), m

= inf f (x), M

= sup h(x), m

= inf h(x)

Prove that M

−m

≤ 2(M

−m

)K and use this to conclude that h is integrable.

(b) Prove that $fg$ is integrable.

(Hint: $fg = \frac{1}{4}(f + g)^2 - \frac{1}{4}(f - g)^2$)

(c) Prove that $U(|f|, P) - L(|f|, P) \le U(f, P) - L(f, P)$ for any partition $P$. Hence conclude that $|f|$ is integrable.

(One can extend these arguments, though it's a bit harder, to show that if $j$ is continuous, then $j \circ f$ is integrable. Parts (a) and (c) correspond, respectively, to $j(x) = x^2$ and $j(x) = |x|$.)

9. (Hard) Let f (x) =











x if x = 0 and sin

> 0

−x if x = 0 and sin

< 0

0 if x = 0

(a) Show that f is not piecewise continuous on [0, 1].

(b) Show that f is not piecewise monotonic on [0, 1].

(Hint: given ϵ, hunt for a suitable partition to make U( f , P) − L( f , P) < ϵ by considering [0, x

]

differently to the other subintervals)

(d) Make a similar argument to show that g = sin

is integrable on ( 0, 1], where

g(x) =

(

sin

if x = 0

0 if x = 0

Note that neither argument evaluates the integrals!

34 The Fundamental Theorem of Calculus

The key result linking integration and differentiation is usually presented in two parts.

While there

are signiﬁcant subtleties, the rough statements are as follows:

Part I  Differentiation reverses integration: $\frac{d}{dx} \int_a^x f(t)\,dt = f(x)$

Part II  Integration reverses differentiation: $\int_a^b F'(x)\,dx = F(b) - F(a)$

These facts seemed intuitively obvious to early practitioners of calcu-

lus: indeed, given a continuous positive function f :

• Let F(x) denote the area under the curve between 0 and x.

• A small increase ∆x results in the area increasing by ∆F.

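• $\Delta F \approx f(x)\,\Delta x$ is approximately the area of a rectangle, whence $\frac{\Delta F}{\Delta x} \approx f(x)$. This is part I.

• $F(b) - F(a) \approx \sum \Delta F_i \approx \sum f(x_i)\,\Delta x_i$. Since $F' = f$, this is part II.

[Figure: the strip of area $\Delta F$ and width $\Delta x$ under the graph of $f(x)$.]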

In fact when Leibniz introduced the symbols $\int$ and $d$ in the late 1600s, it was partly to reflect the

fundamental theorem.

If you’re happy with non-rigorous notions of limit, rate of change, area, and

(inﬁnite) sums, the above is all you need!

Of course, we are very much concerned with the details: What must we assume regarding f and F,

and how are these properties used in the proof?

Theorem 4.22 (FTC, part I). Suppose $f$ is integrable on $[a, b]$. For any $x \in [a, b]$, define

\[ F(x) := \int_a^x f(t)\,dt \]

Then:

1. $F$ is uniformly continuous on $[a, b]$;

2. If $f$ is continuous at $c \in [a, b]$, then $F$ is differentiable at $c$ with $F'(c) = f(c)$.

As ever, the condition at $c = a$ should be right-continuous and the conclusion right-differentiable, etc.

Compare this with the naïve version above where we assumed $f$ was continuous. We now require only the integrability of $f$, and its continuity at one point for the full result.

We follow the traditional numbering; some authors reverse these.

$\int$ is a stylized S for sum, while $d$ stands for difference. Given a sequence $F = (F_0, F_1, \dots, F_n)$, construct a new sequence of differences

\[ dF = (F_1 - F_0, F_2 - F_1, \dots, F_n - F_{n-1}) \]

which can then be summed:

\[ \int dF = (F_1 - F_0) + (F_2 - F_1) + \cdots + (F_n - F_{n-1}) = F_n - F_0 \qquad (\ast) \]

Viewing a function as an 'infinite sequence' of values spaced along an interval, $dF$ becomes a sequence of infinitesimals and $(\ast)$ is essentially the fundamental theorem: $\int_a^b dF = F(b) - F(a)$. It is the conception of a function that is suspect here, not

the essential relationship between sums and differences.

Examples 4.23. You should have seen many examples in an elementary calculus course.

1. Since f (x) = sin

−7) is continuous on any bounded interval, we conclude that

sin

( t

−7) dt = sin

−7)

If one follows Theorem 4.13 and its resulting conventions, then this is valid for all x ∈ R.

2. The chain rule permits more complicated examples. For instance, since f (t) = sin

√

t is contin-

uous on its domain [0, ∞) and y(x) = x

+ 3 has range [3, ∞) ⊆ dom( f ), we have

sin

√

t dt =

sin

√

t dt = 2x sin

+ 3

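3. For a final positive example, observe that

\[ \frac{d}{dx} \int_{\sin x}^{e^x} \tan(t^2)\,dt = e^x \tan(e^{2x}) - \cos x \tan(\sin^2 x) \]

To evaluate this, one first chooses any constant $a$ and writes

\[ \int_{\sin x}^{e^x} = \int_a^{e^x} - \int_a^{\sin x} \]

before differentiating. This is valid provided $\sin x$, $e^x$ and $a$ all lie in the same subinterval of

\[ \operatorname{dom} \tan(t^2) = \mathbb{R} \setminus \left\{ \pm\sqrt{\tfrac{\pi}{2}}, \pm\sqrt{\tfrac{3\pi}{2}}, \pm\sqrt{\tfrac{5\pi}{2}}, \dots \right\} \]

Since $|\sin x| \le 1 < \sqrt{\frac{\pi}{2}}$, this requires

\[ e^x < \sqrt{\tfrac{\pi}{2}} \iff x < \tfrac{1}{2} \ln \tfrac{\pi}{2} \]

Choosing $a = 1$ would certainly suffice.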

4. Now consider why the theorem requires continuity. The piecewise continuous function

\[ f : [0, 2] \to \mathbb{R} : x \mapsto \begin{cases} 2x & \text{if } x \le 1 \\ \frac{1}{2} & \text{if } x > 1 \end{cases} \]

has a jump discontinuity at $x = 1$. We can still compute

\[ F(x) = \begin{cases} \int_0^x 2t\,dt = x^2 & \text{if } x \le 1 \\ \int_0^1 2t\,dt + \int_1^x \frac{1}{2}\,dt = \frac{1}{2}(x + 1) & \text{if } x > 1 \end{cases} \]

This is continuous, indeed uniformly so. However the discontinuity of $f$ results in $F$ having a corner and thus being non-differentiable at $x = 1$. Indeed, $F'(x) = f(x)$ for all $x \ne 1$; that is, at all values of $x$ where $f$ is continuous.

[Figure: graphs of $f(x)$ and $F(x)$ on $[0, 2]$.]
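The corner of $F$ at $x = 1$ in the last example can be observed numerically. The sketch below assumes that $f$ takes the value $\frac{1}{2}$ for $x > 1$, which is the value implied by the formula $F(x) = \frac{1}{2}(x + 1)$; the names `F_closed` and `F_numeric` and the step size `h` are illustrative.

```python
def f(t):
    """Integrand of the example; the value 1/2 for t > 1 is an inferred assumption."""
    return 2 * t if t <= 1 else 0.5

def F_closed(x):
    """Closed form of F(x), the integral of f from 0 to x, on [0, 2]."""
    return x * x if x <= 1 else 0.5 * (x + 1)

def F_numeric(x, n=100_000):
    """Midpoint Riemann sum approximating the integral of f over [0, x]."""
    dx = x / n
    return sum(f((i + 0.5) * dx) for i in range(n)) * dx

# The Riemann sums agree with the closed form on both sides of the jump.
for x in (0.5, 1.0, 1.5, 2.0):
    assert abs(F_numeric(x) - F_closed(x)) < 1e-3

# One-sided difference quotients of F at x = 1 approximate f(1-) = 2 and f(1+) = 1/2.
h = 1e-6
left_slope = (F_closed(1.0) - F_closed(1.0 - h)) / h
right_slope = (F_closed(1.0 + h) - F_closed(1.0)) / h
print(left_slope, right_slope)
```

The two one-sided slopes disagree, so $F$ is continuous but not differentiable at the point where $f$ jumps.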

Proving FTC I Neither half of the theorem is particularly difﬁcult once you write down what you

know and what you need to prove. Here are the key ingredients:

1. $F$ uniformly continuous means controlling the size of

\[ |F(y) - F(x)| = \left| \int_a^y f(t)\,dt - \int_a^x f(t)\,dt \right| = \left| \int_x^y f(t)\,dt \right| \le \int_x^y |f(t)|\,dt \]

But the boundedness of $f$ allows us to bound this last integral...

2. $F'(c) = f(c)$ means showing that $\lim_{x \to c} \frac{F(x) - F(c)}{x - c} = f(c)$, which means controlling the size of

\[ \left| \frac{F(x) - F(c)}{x - c} - f(c) \right| = \left| \frac{1}{x - c} \int_c^x f(t)\,dt - f(c) \right| \]

The trick here will be to bring the constant $f(c)$ inside the integral as $\frac{1}{x - c} \int_c^x f(c)\,dt$ so that what we really have to control is the size of $\frac{1}{x - c} \int_c^x |f(t) - f(c)|\,dt$. This is where the continuity of $f$ comes in...

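Proof. 1. Since $f$ is integrable, it is bounded: $\exists M > 0$ such that $|f(x)| \le M$ for all $x$.

Let $\epsilon > 0$ be given and define $\delta = \frac{\epsilon}{M}$. Then, for any $x, y \in [a, b]$,

\[ 0 < y - x < \delta \implies |F(y) - F(x)| = \left| \int_x^y f(t)\,dt \right| \le \int_x^y |f(t)|\,dt \quad \text{(Theorem 4.12, part 4)} \]
\[ \phantom{0 < y - x < \delta \implies |F(y) - F(x)|} \le M(y - x) \quad \text{(Theorem 4.12, part 2)} \]
\[ \phantom{0 < y - x < \delta \implies |F(y) - F(x)|} < M\delta = \epsilon \]

We conclude that $F$ is uniformly continuous on $[a, b]$.

2. Let $\epsilon > 0$ be given. Since $f$ is continuous at $c$, $\exists \delta > 0$ such that, for all $t \in [a, b]$,

\[ |t - c| < \delta \implies |f(t) - f(c)| < \frac{\epsilon}{2} \]

Now for all $x \in [a, b]$ (except $c$),

\[ 0 < |x - c| < \delta \implies \left| \frac{F(x) - F(c)}{x - c} - f(c) \right| = \frac{1}{|x - c|} \left| \int_c^x f(t) - f(c)\,dt \right| \quad \text{(Theorem 4.8)} \]
\[ \phantom{0 < |x - c| < \delta \implies} \le \frac{1}{|x - c|} \left| \int_c^x |f(t) - f(c)|\,dt \right| \quad \text{(Theorem 4.12)} \]
\[ \phantom{0 < |x - c| < \delta \implies} \le \frac{1}{|x - c|} \cdot \frac{\epsilon}{2}\,|x - c| < \epsilon \]

Thus $\lim_{x \to c} \frac{F(x) - F(c)}{x - c} = f(c)$, and so $F$ is differentiable at $c$ with $F'(c) = f(c)$.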

The Fundamental Theorem, part II As with part I, the formulaic part of the result should be familiar,

though we are more interested in the assumptions and where they are needed.

Theorem 4.24 (FTC, part II). Suppose $g$ is continuous on $[a, b]$, differentiable on $(a, b)$, and that $g'$ is integrable on $(a, b)$. Then,

\[ \int_a^b g' = g(b) - g(a) \]

Part II is often expressed in terms of anti-derivatives: F being an anti-derivative of f if F

′

= f . Com-

bined with FTC I, we recover the familiar ‘+c’ result and a simpler version of the fundamental theo-

rem often seen in elementary calculus.

Corollary 4.25. Let f be continuous on [a, b].

• If $F$ is an anti-derivative of $f$, then $\int_a^b f = F(b) - F(a)$.

• Every anti-derivative has the form $F(x) = \int_a^x f(t)\,dt + c$ for some constant $c$.

Examples 4.26. Again, basic examples should be familiar.

1. Plainly $g(x) = x^2 + 2x^{3/2}$ is continuous on $[1, 4]$ and differentiable on $(1, 4)$ with derivative $g'(x) = 2x + 3\sqrt{x}$; this last is continuous (and thus integrable) on $(1, 4)$. We conclude that

\[ \int_1^4 2x + 3\sqrt{x}\,dx = x^2 + 2x^{3/2} \Big|_1^4 = (16 + 16) - (1 + 2) = 29 \]

2. If $g(x) = \sin(3x^2)$, then $g'(x) = 6x \cos(3x^2)$. Certainly $g$ satisfies the hypotheses of the theorem on any bounded interval $[a, b]$. We conclude

\[ \int_a^b 6x \cos(3x^2)\,dx = \sin(3b^2) - \sin(3a^2) \]

Moreover, every anti-derivative of $f(x) = 6x \cos(3x^2)$ has the form $F(x) = \sin(3x^2) + c$.

3. Recall Example 4.23.4 where we saw that the discontinuity of $f$ led to the non-differentiability of $F(x) = \int_0^x f(t)\,dt$ at $x = 1$. The function $F$ therefore fails the hypotheses of FTC II on the interval $[0, 2]$.

However, except at $x = 1$, $F$ is an anti-derivative of $f$ and moreover

\[ \int_0^2 f(x)\,dx = F(2) - F(0), \]

so we appear to have the formulaic conclusion of FTC II, though this is tautological given the definition of $F$!

The way out of this conundrum is to note that other anti-derivatives $\hat{F}$ of $f$ exist (except at $x = 1$) which fail to satisfy the conclusion. For instance

\[ \hat{F}(x) = \begin{cases} x^2 & \text{if } x < 1 \\ \frac{1}{2} x & \text{if } x > 1 \end{cases} \implies \hat{F}(2) - \hat{F}(0) = 1 \ne \int_0^2 f(x)\,dx \]

See Deﬁnition 4.11 if you’re unsure what it means for g

′

to be integrable on a bounded open interval.

Proving FTC II See Exercise 10 for a relatively easy proof when g

′

= f is continuous. For the real

McCoy, we can only rely on the integrability of g

′

: the trick is to use the mean value theorem to write

g(b) − g(a) as a Riemann sum over a suitable partition.

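Proof. Let $\epsilon > 0$ be given and choose a partition $P = \{x_0, \ldots, x_n\}$ such that $U(g', P) - L(g', P) < \epsilon$. Since $g$ satisfies the mean value theorem on each subinterval of the partition $P$, we see that

\[ \exists\, \xi_i \in (x_{i-1}, x_i) \ \text{such that}\ g'(\xi_i) = \frac{g(x_i) - g(x_{i-1})}{x_i - x_{i-1}} \]

from which

\[ g(b) - g(a) = \sum_{i=1}^n \big[ g(x_i) - g(x_{i-1}) \big] = \sum_{i=1}^n g'(\xi_i)(x_i - x_{i-1}) \]

This is a Riemann sum for $g'$ associated to the partition $P$, hence

\[ L(g', P) \le g(b) - g(a) \le U(g', P) \]

However we also have $L(g', P) \le \int_a^b g' \le U(g', P)$. Both $g(b) - g(a)$ and $\int_a^b g'$ therefore lie in the same interval of length less than $\epsilon$, so that $\big| \int_a^b g' - \big(g(b) - g(a)\big) \big| < \epsilon$. Since this holds for all $\epsilon > 0$, the proof is complete.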
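The telescoping step of the proof can be made concrete. Below is a small sketch under stated assumptions: the function $g(x) = x^2$, the interval $[0, 3]$ and the uniform partition are illustrative choices, not taken from the notes. For this particular $g$ the mean value theorem point $\xi_i$ is exactly the midpoint of each subinterval, so the Riemann sum of $g'$ reproduces $g(b) - g(a)$ and sits between the lower and upper sums.

```python
def g(x):
    return x * x

def g_prime(x):
    return 2 * x

a, b, n = 0.0, 3.0, 12
xs = [a + (b - a) * i / n for i in range(n + 1)]   # uniform partition P (assumed)

# For g(x) = x^2 we have g(x_i) - g(x_{i-1}) = (x_i + x_{i-1})(x_i - x_{i-1}),
# so the mean value theorem point xi_i is the midpoint of [x_{i-1}, x_i].
xis = [(xs[i - 1] + xs[i]) / 2 for i in range(1, n + 1)]

riemann = sum(g_prime(xi) * (xs[i] - xs[i - 1]) for i, xi in enumerate(xis, start=1))

# g' is increasing on [0, 3], so its inf/sup on each subinterval occur at the endpoints.
lower = sum(g_prime(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
upper = sum(g_prime(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, n + 1))

print(lower, riemann, g(b) - g(a), upper)
```

Increasing `n` shrinks `upper - lower`, which is the role played by $\epsilon$ in the proof.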

While we certainly used the integrability of $g'$ in the proof, it might seem strange that we assumed it at all: shouldn’t every derivative be integrable? Perhaps surprisingly, the answer is no! If you want a challenge, look up the Volterra function, which is differentiable everywhere, but whose derivative is non-integrable (on, for instance, $[0, 1]$)!

The Rules of Integration

If one wants to evaluate an integral, rather than merely show it exists, there are really only two options:

1. Evaluate Riemann sums and take limits: often difﬁcult if not impossible to do explicitly.

2. Use FTC II. The problem now becomes the ﬁnding of anti-derivatives, for which the core method

is essentially guess and differentiate. To obtain general rules, we attempt to reverse the rules of

differentiation.

Integration by Parts  First consider the product rule: the product $g = uv$ of two differentiable functions is differentiable with $g' = u'v + uv'$. Now apply Theorems 4.8, 4.12 and FTC II.

Corollary 4.27 (Integration by Parts). Suppose $u, v$ are continuous on $[a, b]$, differentiable on $(a, b)$, and that $u', v'$ are integrable on $(a, b)$. Then

\[ \int_a^b u'(x)v(x)\,dx = u(b)v(b) - u(a)v(a) - \int_a^b u(x)v'(x)\,dx \]

This is signiﬁcantly less useful than the product rule since it is only capable of transforming the

integral of a product into another such integral.

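Examples 4.28. You should have seen myriad examples in a previous course. With practice, there is no need to explicitly state $u$ and $v$.

1. Let $u(x) = x$ and $v'(x) = \cos x$. Then $u'(x) = 1$ and $v(x) = \sin x$, whence

\[ \int_0^{\pi/2} x\cos x\,dx = \big[ x\sin x \big]_0^{\pi/2} - \int_0^{\pi/2} \sin x\,dx = \frac{\pi}{2}\sin\frac{\pi}{2} - 0 - \big[ -\cos x \big]_0^{\pi/2} = \frac{\pi}{2} + \cos\frac{\pi}{2} - \cos 0 = \frac{\pi}{2} - 1 \]

2. Let $u(x) = \ln x$ and $v'(x) = 1$. Then $u'(x) = \frac{1}{x}$ and $v(x) = x$, whence

\[ \int_e^{e^2} \ln x\,dx = \big[ x\ln x \big]_e^{e^2} - \int_e^{e^2} dx = e^2\ln e^2 - e\ln e - \big[ x \big]_e^{e^2} = 2e^2 - e - e^2 + e = e^2 \]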
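Both values can be sanity-checked numerically. The sketch below uses composite Simpson's rule, a standard quadrature method that is not discussed in these notes and is swapped in here only for accuracy; the helper name `simpson` is an assumption for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even (illustrative helper)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Example 1: integral of x cos x over [0, pi/2] should be pi/2 - 1
lhs_1 = simpson(lambda x: x * math.cos(x), 0.0, math.pi / 2)
rhs_1 = math.pi / 2 - 1

# Example 2: integral of ln x over [e, e^2] should be e^2
lhs_2 = simpson(math.log, math.e, math.e**2)
rhs_2 = math.e**2

print(lhs_1, rhs_1)
print(lhs_2, rhs_2)
```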

Change of Variables/Substitution  We now turn our attention to the chain rule. If $g(x) = F\big(u(x)\big)$ where $F$ and $u$ are differentiable, then $g$ is differentiable with

\[ g'(x) = \frac{d}{dx} F\big(u(x)\big) = F'\big(u(x)\big)\,u'(x) \]

Now integrate both sides; the only issue is what assumptions are needed to invoke FTC II.

Theorem 4.29 (Substitution Rule). Suppose we have two continuous functions: $u : [a, b] \to \mathbb{R}$ and $f : \operatorname{range}(u) \to \mathbb{R}$. Suppose also that $u$ is differentiable on $(a, b)$ with integrable derivative $u'$. Then

\[ \int_a^b f\big(u(x)\big)\,u'(x)\,dx = \int_{u(a)}^{u(b)} f(u)\,du \]

This is the famous ‘u-sub’ formula from elementary calculus.

Proof. We leave as an exercise the verification that both integrals exist. We may also assume that $\operatorname{range}(u)$ is an interval of positive length, for otherwise both integrals are trivially zero.

Choose any $c \in \operatorname{range}(u)$ and define

\[ F : \operatorname{range}(u) \to \mathbb{R} \ \text{by}\ F(v) := \int_c^v f(t)\,dt \]

Since $f$ is continuous, by FTC I we see that $F$ is differentiable with $F'(u) = f(u)$. But now

\begin{align*}
\int_a^b f\big(u(x)\big)\,u'(x)\,dx &= \int_a^b \frac{d}{dx} F\big(u(x)\big)\,dx && \text{(chain rule)} \\
&= F\big(u(b)\big) - F\big(u(a)\big) && \text{(FTC II)} \\
&= \int_{u(a)}^{u(b)} f(u)\,du
\end{align*}

By the intermediate and extreme value theorems, $\operatorname{range}(u)$ is already a closed bounded interval.

Examples 4.30. Reading the theorem is bad enough; its application often requires significant creativity in order to recognize a suitable substitution.

1. To evaluate the integral $\int_0^{\sqrt{\pi}} 2x\sin x^2\,dx$, consider the substitution $u(x) = x^2$ defined on $[0, \sqrt{\pi}]$. Certainly $u$ is continuous, and its derivative $u'(x) = 2x$ is integrable on $(0, \sqrt{\pi})$. Finally $f(u) = \sin u$ is continuous on $\operatorname{range}(u) = [0, \pi]$. The hypotheses are satisfied, whence

\[ \int_0^{\sqrt{\pi}} 2x\sin x^2\,dx = \int_0^{\sqrt{\pi}} f\big(u(x)\big)\,u'(x)\,dx = \int_0^{\pi} \sin u\,du = -\cos u\,\Big|_0^{\pi} = 2 \]

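2. For the following integral with $f(u) = \frac{1}{u^2 + 1}$, we make the substitution $u(x) = x^2 - 2$. Note that $u : [\sqrt{2}, \sqrt{3}] \to [0, 1]$ and that $u'(x) = 2x$ is integrable; moreover, $f(u)$ is continuous on $\operatorname{range}(u) = [0, 1]$. We conclude that

\[ \int_{\sqrt{2}}^{\sqrt{3}} \frac{2x}{x^4 - 4x^2 + 5}\,dx = \int_{\sqrt{2}}^{\sqrt{3}} \frac{2x}{(x^2 - 2)^2 + 1}\,dx = \int_0^1 \frac{1}{u^2 + 1}\,du = \arctan u\,\Big|_0^1 = \frac{\pi}{4} \]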

3. The hypotheses on $u$ really are all that is necessary. In particular, $u$ doesn’t need to be left-/right-differentiable at the endpoints of $[a, b]$! For instance, with $f(u) = u^2$ and $u(x) = \sqrt{x}$ on $[0, 4]$, we easily verify

\[ \int_0^4 \frac{1}{2}\sqrt{x}\,dx = \int_0^4 \frac{x}{2\sqrt{x}}\,dx = \int_0^4 f\big(u(x)\big)\,u'(x)\,dx = \int_0^2 f(u)\,du = \int_0^2 u^2\,du = \frac{8}{3} \]

4. Sloppy use of the substitution rule might lead to utter nonsense. For instance, consider the ‘substitution’ $u = x^2$ in the following:

\[ \int_{-1}^{2} \frac{1}{x}\,dx = \int_{-1}^{2} \frac{1}{2x^2}\,2x\,dx = \int_1^4 \frac{1}{2u}\,du = \frac{1}{2}(\ln 4 - \ln 1) = \ln 2 \]

Of course the left hand integral does not exist since $\frac{1}{x}$ is undefined at $0 \in (-1, 2)$, so the conclusion is false. In the language of the substitution rule, $f(u) = \frac{1}{2u}$ is not continuous on $\operatorname{range}(u) = [0, 4]$: it is not even defined at $u = 0$! You are very unlikely to make precisely this mistake since the first integral is so clearly undefined, but for more complicated functions...
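When the hypotheses do hold, both sides of the substitution rule can be compared numerically. The following sketch is illustrative only: the helper `midpoint_sum` is an assumption, and the integrands are those of the first two examples, with $f(u) = \sin u$, $u(x) = x^2$ and $f(u) = 1/(u^2 + 1)$, $u(x) = x^2 - 2$ respectively.

```python
import math

def midpoint_sum(f, a, b, n=20_000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals (illustrative helper)."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# First example: u(x) = x^2 on [0, sqrt(pi)], f(u) = sin u on [0, pi]
left_1 = midpoint_sum(lambda x: 2 * x * math.sin(x**2), 0.0, math.sqrt(math.pi))
right_1 = midpoint_sum(math.sin, 0.0, math.pi)

# Second example: u(x) = x^2 - 2 on [sqrt 2, sqrt 3], f(u) = 1/(u^2 + 1) on [0, 1]
left_2 = midpoint_sum(lambda x: 2 * x / (x**4 - 4 * x**2 + 5), math.sqrt(2), math.sqrt(3))
right_2 = midpoint_sum(lambda u: 1 / (u**2 + 1), 0.0, 1.0)

print(left_1, right_1)            # both approximate 2
print(left_2, right_2, math.pi / 4)
```

No such check is attempted for the 'sloppy' fourth example: a naive numerical sum for $\int_{-1}^{2} \frac{1}{x}\,dx$ can return a finite number purely because of cancellation, which is exactly the kind of nonsense the hypotheses are there to prevent.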

Hence the old adage, “Differentiation is a science; integration an art.” To illustrate via an example, consider the function $f(x) = \tan^{-1}\big(e^x\cos(3x^2) + 4x^3\big)$. The product and chain rules allow one to explicitly compute the derivative

\[ \frac{df}{dx} = \frac{1}{1 + \big(e^x\cos(3x^2) + 4x^3\big)^2} \Big[ e^x\cos(3x^2) - 6xe^x\sin(3x^2) + 12x^2 \Big] \]

By contrast, the integration analogues (integration by parts/substitution) are essentially useless in attempting to find an explicit anti-derivative facilitating the integration of the same function via FTC II; for instance, the integral

\[ \int \tan^{-1}\big(e^x\cos(3x^2) + 4x^3\big)\,dx \]

is likely impossible to evaluate explicitly and can only be approximated (e.g. via Riemann sums).

Exercises 34 1. Calculate the following limits:

(a) lim

x→0

dt (b) lim

h→0

3+h

2. Let $f(t) = \begin{cases} 0 & \text{if } t < 0 \\ t & \text{if } 0 \le t \le 1 \\ 4 & \text{if } t > 1 \end{cases}$

(a) Determine the function $F(x) = \int_0^x f(t)\,dt$.

(b) Sketch $F$. Where is $F$ continuous?

(c) Where is $F$ differentiable? Calculate $F'$ at the points of differentiability.

3. Let $f$ be a continuous function on $\mathbb{R}$.

(a) Define $F(x) = \int_{x-1}^{x+1} f(t)\,dt$. Carefully show that $F$ is differentiable on $\mathbb{R}$ and compute $F'$.

(b) Repeat for the function $G(x) = \int_0^{\sin x} f(t)\,dt$.

4. Recall Examples 4.23.4 and 4.26.3. Find all anti-derivatives $F$ of $f$ on $[0, 1) \cup (1, 2]$. How many satisfy $\int_0^2 f(x)\,dx = F(2) - F(0)$?

5. Consider integration by parts. Plainly $\int_a^x u'(t)v(t)\,dt$ is an anti-derivative of $u'(x)v(x)$ by FTC I: what does integration by parts say is another?

6. Use change of variables to integrate

√

1 − x

7. Use integration by parts and the substitution rule to evaluate $\int_0^b \arcsin x\,dx$ for any $b < 1$.

8. Use integration by parts to evaluate $\int_0^b x\arctan x\,dx$ for any $b > 0$.

9. Check that the assumptions of the substitution rule (Theorem 4.29) guarantee that both integrals are well-defined (i.e. that $(f \circ u)u'$ and $f$ are integrable on the required intervals).

10. We prove a simpler version of the fundamental theorem of calculus.

(a) Suppose $f$ is continuous on $[a, b]$ and define $F(x) = \int_a^x f(t)\,dt$. For any $c, x \in [a, b]$ where $c \ne x$, prove that

\[ m \le \frac{F(x) - F(c)}{x - c} \le M \]

where $m, M$ are the minimum and maximum values of $f(t)$ on the closed interval bounded by $c, x$. Make sure to explain why $m, M$ exist, and use this to deduce that $F'(c) = f(c)$.

(b) Suppose $f$ is continuous on $[a, b]$ and that $F$ is any anti-derivative of $f$ on $[a, b]$ (that is, $F' = f$). Use part (a) and the mean value theorem to prove that

\[ \int_a^b f(t)\,dt = F(b) - F(a). \]

36 Improper Integrals

The Riemann integral has several limitations. Even allowing for functions to be integrable on open intervals (Exercise 32.6), the definition of $\int_a^b f(x)\,dx$ requires the following:

• That (a, b) be a bounded interval.

• That f be bounded on (a, b).

There is a natural way to extend the Riemann integral to unbounded intervals and functions: limits.

Definition 4.31. Suppose $f : [a, b) \to \mathbb{R}$ satisfies the following properties:

• $f$ is integrable on every closed bounded subinterval $[a, t] \subseteq [a, b)$.

• Either $b = \infty$, or $b$ is finite and $f$ is unbounded at $b$.

The improper integral of $f$ on $[a, b)$ is defined to be

\[ \int_a^b f(x)\,dx := \lim_{t \to b^-} \int_a^t f(x)\,dx \]

This is convergent or divergent in the same manner as the limit.

If an integral is improper at its lower limit then

\[ \int_a^b f(x)\,dx := \lim_{s \to a^+} \int_s^b f(x)\,dx. \]

If an integral is improper at both ends, choose any $c \in (a, b)$ and define

\[ \int_a^b f(x)\,dx = \lim_{s \to a^+} \int_s^c f(x)\,dx + \lim_{t \to b^-} \int_c^t f(x)\,dx \]

provided both one-sided improper integrals exist and their sum makes sense (for instance, it is not of the form $\infty - \infty$).

Theorem 4.13 says that the choice of $c$ for a doubly-improper integral is irrelevant.

Many properties of the Riemann integral transfer to improper integrals, though not all. For example,

part 1 of Theorem 4.12 extends:

Theorem 4.32. If $0 \le f(x) \le g(x)$ on $[a, b)$, then $\int_a^b f \le \int_a^b g$, whenever the integrals exist (standard or improper). In particular:

• $\int_a^b f = \infty \implies \int_a^b g = \infty$

• $\int_a^b g$ converges $\implies \int_a^b f$ converges to a value $\le \int_a^b g$

We leave some of the detail to Exercise 36.7.

Examples 4.33. 1.

dx =

for any t > 0. Clearly

∞

dx = lim

t→∞

= ∞

More formally, the improper integral

∞

dx diverges to inﬁnity.

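2. With $f(x) = x^{-4/3}$ defined on $[1, \infty)$,

\[ \int_1^\infty x^{-4/3}\,dx = \lim_{t \to \infty} \int_1^t x^{-4/3}\,dx = \lim_{t \to \infty} -3x^{-1/3}\,\Big|_1^t = \lim_{t \to \infty} \big( 3 - 3t^{-1/3} \big) = 3 \]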

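3. Consider $f(x) = |x|\,e^{-x^2/2}$ on $(-\infty, \infty)$. On a bounded interval $[0, t)$, we have

\[ \int_0^t f(x)\,dx = \int_0^t x e^{-x^2/2}\,dx = -e^{-x^2/2}\,\Big|_0^t = 1 - e^{-t^2/2} \xrightarrow[t \to \infty]{} 1 \]

By symmetry, we conclude that

\[ \int_{-\infty}^{\infty} |x|\,e^{-x^2/2}\,dx = 1 + 1 = 2 \]

This example is important in probability: multiplying by $\frac{1}{\sqrt{2\pi}}$, we have computed the expectation of $|X|$ when $X$ is a normally-distributed random variable

\[ E(|X|) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,|x|\,e^{-x^2/2}\,dx = \frac{2}{\sqrt{2\pi}} = \sqrt{\frac{2}{\pi}} \]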

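4. If $t \in [0, 1)$, we can use our knowledge of derivatives, namely $\frac{d}{dx}\sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}$, to evaluate

\[ \int_0^1 \frac{1}{\sqrt{1 - x^2}}\,dx = \lim_{t \to 1^-} \int_0^t \frac{1}{\sqrt{1 - x^2}}\,dx = \lim_{t \to 1^-} \sin^{-1} t = \frac{\pi}{2} \]

Moreover, $\int_{-1}^{1} \frac{1}{\sqrt{1 - x^2}}\,dx = \pi$. By comparison, we see that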

√

1 − x

≤

√

1 − x

=⇒

−1

√

1 − x

dx ≤

−1

√

1 − x

dx = π

5. Improper integrals need not exist. For instance,

\[ \lim_{t \to \infty} \int_0^t \sin x\,dx = \lim_{t \to \infty} (1 - \cos t) \]

diverges by oscillation.
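The contrast between a convergent and an oscillating improper integral is easy to see by tabulating the proper integrals for growing $t$. This is a minimal sketch using the closed forms $3 - 3t^{-1/3}$ and $1 - \cos t$ computed above; the function names `F` and `S` are assumptions chosen for illustration.

```python
import math

def F(t):
    """Proper integral of x**(-4/3) over [1, t], in closed form."""
    return 3 - 3 * t ** (-1 / 3)

def S(t):
    """Proper integral of sin x over [0, t], in closed form."""
    return 1 - math.cos(t)

# F(t) increases towards 3 as t grows: the improper integral converges.
for t in (1e3, 1e6, 1e9, 1e12):
    print(f"t = {t:.0e}   F(t) = {F(t):.6f}")

# S(t) keeps returning to 0 and 2 along two different sequences t -> infinity,
# so the limit cannot exist.
for k in range(1, 5):
    print(S(2 * k * math.pi), S((2 * k + 1) * math.pi))
```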

Exercises 36  1. Use your answers from the previous section to decide whether the improper integrals $\int_0^1 \arcsin x\,dx$ and $\int_0^\infty x\arctan x\,dx$ exist. If so, what are their values?

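2. Let $p$ be a positive constant. Prove the following:

\[ \int_0^1 \frac{1}{x^p}\,dx = \begin{cases} \frac{1}{1-p} & \text{if } p < 1 \\ \infty & \text{if } p \ge 1 \end{cases} \qquad \int_1^\infty \frac{1}{x^p}\,dx = \begin{cases} \frac{1}{p-1} & \text{if } p > 1 \\ \infty & \text{if } p \le 1 \end{cases} \]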

3. Explain why $\int_a^b f(x)\,dx = \lim_{t \to b^-} \int_a^t f(x)\,dx$ holds, even when $f$ is integrable on $[a, b]$.

4. State a version of integration by parts modified for when $\int_a^b u'(x)v(x)\,dx$ is an improper integral. Now evaluate

∞

−4x

dx.

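5. What is wrong with the following calculation?

\[ \int_{-\infty}^{\infty} x\,dx = \lim_{t \to \infty} \frac{1}{2}x^2\,\Big|_{-t}^{t} = \lim_{t \to \infty} \frac{1}{2}(t^2 - t^2) = \lim_{t \to \infty} 0 = 0 \]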

6. Prove or disprove: if $\int_a^b f$ and $\int_a^b g$ are convergent improper integrals, so is $\int_a^b fg$.

7. Prove part of Theorem 4.32. Suppose $0 \le f(x) \le g(x)$ for all $x \in [a, b)$, and that $\int_a^b g$ is a convergent improper integral. Prove that $\int_a^b f$ converges and that $\int_a^b f \le \int_a^b g$.

Generalizing the Riemann Integral (non-examinable)

In the 1890s, Thomas Stieltjes offered a generalization of the Riemann integral.

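Definition 4.34. Let $\alpha$ be a monotonically increasing function on an interval $[a, b]$. Given a partition $P = \{x_0, \ldots, x_n\}$ of $[a, b]$ and a function $f$, define the differences

\[ \Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}) \]

The upper/lower Darboux–Stieltjes sums/integrals are defined analogously to the pure Riemann case:

\[ U(f, P, \alpha) = \sum_{i=1}^n M_i\,\Delta\alpha_i \qquad L(f, P, \alpha) = \sum_{i=1}^n m_i\,\Delta\alpha_i \]

\[ U(f, \alpha) = \inf U(f, P, \alpha) \qquad L(f, \alpha) = \sup L(f, P, \alpha) \]

where, as in the Riemann case, $M_i$ and $m_i$ denote the supremum and infimum of $f$ on $[x_{i-1}, x_i]$. $f$ is Riemann–Stieltjes integrable of class $R(\alpha)$ if $U(f, \alpha) = L(f, \alpha)$: we denote this value $\int_a^b f(x)\,d\alpha$.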

The standard Riemann integral corresponds to α(x) = x. It is the ability to choose other functions α

that makes the Riemann–Stieltjes integral both powerful and applicable.

Standard Properties  Most results in sections 32 and 33 hold with suitable modifications, as does the discussion of improper integrals. For instance,

\[ f \in R(\alpha) \iff \forall \epsilon > 0,\ \exists P \ \text{such that}\ U(f, P, \alpha) - L(f, P, \alpha) < \epsilon \]

The result regarding piecewise continuity of $f$ is a notable exception: if $f$ and $\alpha$ are simultaneously piecewise continuous then $f$ might not lie in $R(\alpha)$.

Weighted integrals  If $\alpha$ is differentiable, then we obtain a standard Riemann integral

\[ \int_a^b f(x)\,d\alpha = \int_a^b f(x)\,\alpha'(x)\,dx \]

weighted so that $f(x)$ contributes more when $\alpha$ is increasing rapidly.
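The weighted-integral formula can be illustrated with explicit Darboux–Stieltjes sums. The sketch below is a simplification under a stated assumption: it only handles increasing $f$, so that the infimum and supremum on each subinterval sit at the endpoints. The function name `stieltjes_sums` and the choices $f(x) = x$, $\alpha(x) = x^2$ on $[0, 1]$ are illustrative; for them the formula predicts $\int_0^1 x \cdot 2x\,dx = \frac{2}{3}$.

```python
def stieltjes_sums(f, alpha, a, b, n):
    """Lower and upper Darboux-Stieltjes sums on a uniform partition.

    Assumes f is increasing on [a, b], so that on each subinterval the
    infimum and supremum of f occur at the left and right endpoints.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lower = upper = 0.0
    for i in range(1, n + 1):
        d_alpha = alpha(xs[i]) - alpha(xs[i - 1])   # Delta alpha_i
        lower += f(xs[i - 1]) * d_alpha             # m_i * Delta alpha_i
        upper += f(xs[i]) * d_alpha                 # M_i * Delta alpha_i
    return lower, upper

f = lambda x: x
alpha = lambda x: x * x

for n in (10, 100, 1000):
    print(n, stieltjes_sums(f, alpha, 0.0, 1.0, n))
```

The lower and upper sums squeeze towards $\frac{2}{3}$ as the partition is refined.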

Probability  If $\alpha(a) = 0$ and $\alpha(b) = 1$, then $\alpha$ may be viewed as a probability distribution function. Its derivative $\alpha'$ is the corresponding probability density function. For example:

1. The uniform distribution on $[a, b]$ has $\alpha(x) = \frac{1}{b-a}(x - a)$ so that

\[ \int_a^b f(x)\,d\alpha = \frac{1}{b - a} \int_a^b f(x)\,dx \]

Since $\alpha'$ is constant, the integral weighs all values of $x$ uniformly.

2. The standard normal distribution has $\alpha(x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt$. The fact that $\alpha'(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$ is maximal when $x = 0$ reflects the fact that a normally distributed variable is clustered near its mean.

In all cases, $\int_a^b f(x)\,d\alpha = E(f(X))$ computes an expectation (see, for instance, Example 4.33.3).

Stieltjes was Dutch; for the pronunciation try ‘steelchez.’

Non-differentiable α  A major flexibility comes when we allow $\alpha$ to be non-differentiable, or even discontinuous! For example, given a partition $Q = \{s_0, \ldots, s_n\}$ of $[a, b]$, and a positive sequence $(c_k)_{k=1}^n$, define

\[ \alpha(x) = \begin{cases} 0 & \text{if } x = a \\ \sum_{i=1}^{k} c_i & \text{if } x \in (s_{k-1}, s_k] \end{cases} \]

This is an increasing step function on $[a, b]$. The Riemann–Stieltjes integral becomes a weighted sum

\[ \int_a^b f(x)\,d\alpha = \sum_{i=1}^n c_i f(s_i) \]

Taking instead an infinite sequence $(s_n) \subseteq [a, b]$ results in an infinite series, which helps explain why so many results for series and integrals look similar!

This also touches on probability. For example, let $p \in [0, 1]$, $n \in \mathbb{N}$, and $s_k = k$ on the interval $[0, n]$. If $c_k = \binom{n}{k} p^k (1 - p)^{n-k}$, then

\[ \int_0^n f(x)\,d\alpha = \sum_{k=0}^n \binom{n}{k} p^k (1 - p)^{n-k} f(k) = E(f(X)) \]

is the expectation of $f(X)$ when $X \sim B(n, p)$ is a binomially distributed random variable.
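The weighted sum on the right is straightforward to evaluate directly. Here is a short sketch; the function name `binomial_expectation` and the parameters $n = 10$, $p = 0.3$ are assumptions for illustration. It checks the sum against the well-known mean $np$ and second moment $np(1-p) + (np)^2$ of a binomial random variable.

```python
from math import comb

def binomial_expectation(f, n, p):
    """Weighted sum over k = 0..n of C(n, k) p^k (1 - p)^(n - k) f(k)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) * f(k) for k in range(n + 1))

n, p = 10, 0.3
total_mass = binomial_expectation(lambda k: 1, n, p)         # weights sum to 1
mean = binomial_expectation(lambda k: k, n, p)                # E(X) = n p
second_moment = binomial_expectation(lambda k: k * k, n, p)   # E(X^2)

print(total_mass, mean, second_moment)
```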

Integrals and Convergence

The Lebesgue integral is another common generalization. Its main purpose is to permit the transfer

of integrability to the limit of a sequence of integrable functions.

To see the problem, consider the sequence

\[ f_n : [0, 1] \to \mathbb{R} : x \mapsto \begin{cases} 1 & \text{if } x = \frac{p}{q} \in \mathbb{Q} \text{ with } q \le n \\ 0 & \text{otherwise} \end{cases} \]

Each $f_n$ is piecewise continuous and thus Riemann integrable with $\int_0^1 f_n(x)\,dx = 0$. However, the pointwise limit of $(f_n)$ is the function

\[ f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \notin \mathbb{Q} \end{cases} \]

which is not Riemann integrable. In the Lebesgue theory, the limit $f$ turns out to be integrable with integral 0, so that

\[ \lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \int_0^1 \lim_{n \to \infty} f_n(x)\,dx \]

Recall that the interchange of limits and integrals would be automatic if the convergence $f_n \to f$ were uniform: of course the convergence isn’t uniform here.
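To see concretely why each $f_n$ has integral zero, note that $f_n$ is non-zero at only finitely many points, so its upper Darboux sum on a fine partition is small, while every lower sum is zero. The sketch below is illustrative; the helper names `support` and `upper_sum` are assumptions. It uses exact rational arithmetic to count the subintervals of a uniform partition that contain a point where $f_n = 1$.

```python
import math
from fractions import Fraction

def support(n):
    """Points x = p/q in [0, 1] with denominator q <= n, i.e. where f_n(x) = 1."""
    return sorted({Fraction(p, q) for q in range(1, n + 1) for p in range(q + 1)})

def upper_sum(n, N):
    """Upper Darboux sum of f_n on the uniform partition of [0, 1] into N pieces."""
    hit = set()                       # indices k of pieces [k/N, (k+1)/N] meeting the support
    for r in support(n):
        k = math.floor(r * N)
        if k < N:
            hit.add(k)                # r lies in [k/N, (k+1)/N]
        if r * N == k and k > 0:
            hit.add(k - 1)            # r is also the right endpoint of the previous piece
    return Fraction(len(hit), N)

print(len(support(3)), support(3))
for N in (10, 1000, 10**6):
    print(N, upper_sum(5, N))
```

For fixed $n$ the upper sums tend to zero as $N$ grows. For fixed $N$, however, they approach 1 as $n$ grows, which reflects the failure of uniform convergence.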

Recall how uniform convergence does this for continuity.