Many important phenomena do not take place in a flat space. Just think
of taking a hike to your favorite mountain or of the relativistic
universe Einstein gave us. While, eventually, we would like to be
able to perform analysis on a nonlinear object, the first step is to
understand simpler matters such as, say, convergence in such nonlinear
spaces. In $\mathbb{R}$ we measure the distance between two points $x$
and $y$ as the "space" in between them, that is,
$d(x,y)=|y-x|$. This obviously requires access to addition. Think now
of two points $A$ and $B$ on a donut.
How distant are they?
While not the only one, a "natural way" to go about defining a
distance would be to take the length of the shortest path from $A$ to
$B$. Something like what is depicted in the figure.
Keeping this example in mind, let us start with a set $X$ and see what
a distance function $d$ should look like.
Where should the function be defined? What values should it take on?
Well, since it needs to deliver the distance between two points
$x,y\in X$ and since we think of lengths as nonnegative, we need to
have
$$
d:X\times X\to [0,\infty).
$$
We also would like that there be no distance from a point to itself
$$
d(x,x)=0,\: x\in X
$$
since we can always stay put as a way to go from $x$ to $x$. We would,
however, like to have that
$$
d(x,y)>0\text{ if }x\neq y,
$$
since we expect there to be some distance to cover between any two distinct
points. Finally, our intuition above also tells us to require that
$$
d(x,z)\leq d(x,y)+d(y,z)\text{ for }x,y,z\in X.
$$
This follows from the fact that we can obtain a path from $x$ to $z$
by concatenating the shortest paths from $x$ to $y$ and from $y$ to
$z$. The concatenated path might, however, not be the shortest one
from $x$ to $z$, hence the inequality. These properties are precisely
what will yield the definition of a metric. As an exercise, show by a
concrete example using the minimal path intuition that the triangle
inequality above can hold with a strict inequality.
Metric Spaces
Definitions (Distance Functions and Metric Spaces)
(i) A distance function $d$ on a non-empty set $X$ is
a function $d: X\times X\to [0, \infty)$ satisfying the following
three properties
$\quad$ 1. (symmetry) $d(x, y)=d(y,x)$ for all $x, y\in X$.
$\quad$ 2. (positivity) $d(x,y)\geq 0$ for any $x, y\in X$ and
$d(x,y)=0\iff x=y$.
$\quad$ 3. (triangle inequality) $d(x, z)\le d(x, y)+d(y,z)$ for all
$x, y, z\in X$.
(ii) A metric space $(X, d)$ consists of a set $X$ and
a distance function $d$ on $X$.
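The three properties can be tested mechanically on a finite sample of points. The following is a minimal sketch, not a proof: the helper name `is_metric_on_sample` and the sample points are illustrative assumptions, and passing the check on a sample says nothing about the points that were not sampled.

```python
from itertools import product

def is_metric_on_sample(d, points, tol=1e-12):
    """Check symmetry, positivity and the triangle inequality on a finite sample.

    Passing is only evidence: the axioms must hold for all points of X.
    """
    for x, y in product(points, repeat=2):
        if abs(d(x, y) - d(y, x)) > tol:        # symmetry
            return False
        if d(x, y) < 0:                          # d takes values in [0, infinity)
            return False
        if (d(x, y) <= tol) != (x == y):         # d(x, y) = 0 iff x = y
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:    # triangle inequality
            return False
    return True

sample = [-2.0, -0.5, 0.0, 1.0, 3.5]
absolute = lambda x, y: abs(x - y)     # the usual distance on the real line
squared = lambda x, y: (x - y) ** 2    # symmetric and positive, but no triangle inequality

print(is_metric_on_sample(absolute, sample))   # True
print(is_metric_on_sample(squared, sample))    # False
```

The squared difference fails because, for instance, $(3.5-(-2))^2=30.25$ exceeds $(0-(-2))^2+(3.5-0)^2=16.25$.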
Question
What distance functions or metric spaces have you ever seen?
The following are standard examples of simple metric spaces.
Example
On $\mathbb{R}=(-\infty, \infty)$, the distance between two numbers
$x, y\in \mathbb{R}$ is defined by
$$
d(x, y)=|x-y|.
$$
It is easy to verify $d$ satisfies the three conditions
of a distance function on $\mathbb{R}$.
Example
On $\mathbb{R}^n$, the distance between two points $x=(x_1,\dots,
x_n), y=(y_1,\dots, y_n)\in \mathbb{R}^n$ is
defined by
$$
d_2(x, y)=\|x-y\|_2=\Big(\sum_{j=1}^n (x_j-y_j)^2\Big)^{1/2}
$$
It is easy to verify that, for any $x, y\in \mathbb{R}^n$, one has
$d_2(x,y)=d_2(y,x)$ and that $d_2(x,y)=0$ if and only if $x=y$. As for
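the triangle inequality, since $d_2(x,z)=\|(x-y)+(y-z)\|_2$, it suffices to prove that
$$
\|x+y\|_2\le \|x\|_2+\|y\|_2,\quad x, y\in \mathbb{R}^n.
$$
Squaring both sides and using $\|x+y\|_2^2=\|x\|_2^2+2x\cdot y +\|y\|_2^2$,
this is equivalent to
$$
\|x\|_2^2+2x\cdot y +\|y\|_2^2\le (\|x\|_2+\|y\|_2)^2,
$$
which, after cancelling the common terms, is the same as Schwarz's inequality
$$
x\cdot y\le \|x\|_2\|y\|_2.
$$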
Question
What other distance functions on $\mathbb{R}^n$ have you seen?
Example
For any $1\le p < \infty$, the function
$$
d_p(x, y)=\|x-y\|_p=\Big(\sum_{j=1}^n |x_j-y_j|^p \Big)^{1/p}
$$
is a distance function on $\mathbb{R}^n$.
For any two points $x, y\in \mathbb{R}^n$, it is easy to see that
$d_p(x,y)=d_p(y,x)$, that $d_p(x, y)\ge 0$ and that $d_p(x, y)=0$ if
and only if $x=y$. Since $d_p(x,z)=\|(x-y)+(y-z)\|_p$, the triangle
inequality follows once we prove
$$
\|x+y\|_p\le \|x\|_p+\|y\|_p,\quad x, y\in \mathbb{R}^n.
$$
For general $p$ this is Minkowski's inequality. We prove it only in the
case $p=1$, where the triangle inequality in $\mathbb{R}$ gives
$$
\|x+y\|_1=\sum_{j=1}^n |x_j+y_j|\le \sum_{j=1}^n \bigl(|x_j|+|y_j|\bigr)=\|x\|_1+\|y\|_1.
$$
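As a sanity check, one can evaluate $d_p$ numerically and test the triangle inequality on random triples. This is only a sketch: the function name `d_p`, the dimension, the sampled exponents and the rounding tolerance are choices made here for illustration. The last lines show that the same formula with $p=1/2$ violates the triangle inequality, which is why $p\ge 1$ is required.

```python
import random

def d_p(x, y, p):
    """The distance d_p on R^n, intended for 1 <= p < infinity."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

random.seed(0)
n = 4
worst_gap = float("inf")
for _ in range(1000):
    x, y, z = ([random.uniform(-1, 1) for _ in range(n)] for _ in range(3))
    for p in (1, 1.5, 2, 3):
        # gap >= 0 is exactly the triangle inequality d_p(x,z) <= d_p(x,y) + d_p(y,z)
        gap = d_p(x, y, p) + d_p(y, z, p) - d_p(x, z, p)
        worst_gap = min(worst_gap, gap)

print(worst_gap >= -1e-12)   # True, up to floating point rounding

# With p = 1/2 the formula is no longer a distance function.
x, y, z = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
print(d_p(x, z, 0.5), d_p(x, y, 0.5) + d_p(y, z, 0.5))   # 4.0 2.0
```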
Example
The function
$$
d_\infty(x, y)=\max \{|x_j-y_j|: 1\le j\le n\}
$$
is a distance function on $\mathbb{R}^n$.
For any two points $x, y\in \mathbb{R}^n$, it is easy to see that
$d_\infty (x, y)=d_\infty(y,x)$, $d_\infty(x, y)\ge 0$ and that
$d_\infty(x, y)=0$ if and only if $x=y$. Next we show that $d_\infty$
satisfies the triangle inequality, i.e. that
$$
d_\infty(x,z)\le d_\infty(x,y)+d_\infty(y,z).
$$
Writing $\|x\|_\infty=\max\{|x_j|: 1\le j\le n\}$, so that
$d_\infty(x,y)=\|x-y\|_\infty$, we may assume without loss of generality that
$\|z-x\|_\infty=|z_1-x_1|$. Then
\begin{equation*}
\|z-x\|_\infty=|z_1-x_1|\le |z_1-y_1|+|y_1-x_1|
\le\|z-y\|_\infty+\|y-x\|_\infty.
\end{equation*}
Exercise
For any $x, y\in \mathbb{R}^n$, prove
$$
\lim_{p\to\infty }d_p(x, y)=d_{\infty}(x, y).
$$
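Before attempting the proof, it may help to watch the limit numerically. The sketch below uses two arbitrarily chosen points of $\mathbb{R}^3$, and the helper names `d_p` and `d_inf` are illustrative. It is an experiment, not an argument.

```python
def d_p(x, y, p):
    """The distance d_p on R^n for 1 <= p < infinity."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def d_inf(x, y):
    """The distance d_infinity on R^n."""
    return max(abs(a - b) for a, b in zip(x, y))

# Coordinate differences are 1, 3 and 0.5, so d_inf(x, y) = 3.
x, y = (1.0, -2.0, 0.5), (0.0, 1.0, 0.0)
for p in (1, 2, 4, 16, 64):
    print(p, d_p(x, y, p))
print("inf", d_inf(x, y))
```

The printed values decrease towards $3$, the largest coordinate difference, as $p$ grows.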
Exercise
Let $A=[a_{jk}]$ be an $n\times n$ positive definite symmetric matrix
with real entries. Show that the function
$$
\|x\|_A=\Big(\sum_{j, k=1}^n a_{jk} x_j x_k\Big)^{1/2}
$$
is a norm on $\mathbb{R}^n$ and that it induces a distance
function $d_A(x, y)=\|x-y\|_A$ on $\mathbb{R}^n$.
Question
Given a metric space $(X, d)$ is it possible to construct a bounded
and equivalent distance function $\tilde{d}$ on $X$?
Example
Let $(X, d)$ be a metric space. Then
$$
\tilde{d}(x, y)={d(x, y)\over 1+d(x, y)},\: x,y\in X,
$$
is a metric on $X$. It is clearly bounded by $1$.
For any $x, y\in X$, it is clear that
$\tilde{d}(x,y)=\tilde{d}(y,x)$ and $\tilde{d}(x, y)=0$ if and only if
$x=y$. For the proof of the triangle inequality, let
$$
g(t)=\frac{t}{1+t},\quad t\in [0,\infty),
$$
then
$$
g'(t)={1\over (1+t)^2}>0,\quad t\in [0, \infty).
$$
This implies that $g$ is an increasing function on
$[0,\infty)$. Therefore, for any $x, y, z\in X$, one has
\begin{eqnarray*}
\tilde{d}(x,z)&=&g(d(x,z))
\le g(d(x,y)+d(y, z))\\
&=& {d(x,y)\over 1+d(x,y)+d(y, z)}+{d(y, z)\over 1+d(x,y)+d(y, z)}\\
&\le& {d(x,y)\over 1+d(x,y)}+{d(y, z)\over 1+d(y, z)}
=\tilde{d}(x,y)+\tilde{d}(y,z).
\end{eqnarray*}
Here the first inequality uses $d(x,z)\le d(x,y)+d(y,z)$ together with
the monotonicity of $g$, and the last one holds because dropping a
nonnegative term from a denominator can only increase the fraction.
The proof is complete.
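The effect of passing from $d$ to $\tilde{d}$ can be seen in a small computation. This is a sketch in which $d$ is taken to be the usual distance on $\mathbb{R}$, and the wrapper name `bounded` is an illustrative choice.

```python
def bounded(d):
    """Return the function d / (1 + d) built from a distance function d."""
    return lambda x, y: d(x, y) / (1.0 + d(x, y))

d = lambda x, y: abs(x - y)     # the usual distance on the real line
d_tilde = bounded(d)

# Small distances are almost unchanged, large ones are squeezed below 1.
for x, y in [(0.0, 0.001), (0.0, 1.0), (0.0, 1e6)]:
    print(d(x, y), d_tilde(x, y))
```

Since small distances are nearly preserved, the balls of small radius for $d$ and $\tilde{d}$ are closely related, which is the observation relevant to the question about equivalence.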
Example
For any non-empty set $X$, the discrete metric $d_0$ on $X$ is defined
by
$$
d_0(x,y):=\begin{cases}
1\, ,&\text{if } x\ne y\, ,\\
0\, ,&\text{if } x=y\, .
\end{cases}
$$
$(X, d_0)$ is called a discrete metric space.
For any $x, y\in X$, it is easy to see that $d_0(x, y)=d_0(y,x)$ and
$d_0(x, y)=0 $ if and only if $x=y$. Let $x,y, z\in X$. If $x=z$, then
$d_0(x,z)=0$ and the triangle inequality holds trivially. If $x\ne z$,
then either $x\ne y$ or $z\ne y$, so that at least one of $d_0(x,y)$
and $d_0(y,z)$ equals $1$. Thus
$$
d_0(x,z)=1\le d_0(x,y)+d_0(y,z).
$$
By definition $(X, d_0)$ forms a metric space.
Topology of Metric Spaces
Given a metric space $(X, d)$, one may use the distance function $d$
on $X$ to define balls in $X$ and then, in turn, use balls to define
open sets. The collection of all open sets in $X$ gives the topology
for the metric space $(X,d)$.
Definitions (Balls and Open Sets)
Let $(X, d)$ be a metric space. Then
(i) A ball centered at $x_0\in X$ with radius $r>0$ is
defined by
$$
B(x_0, r)=\{x\in X: d(x, x_0)< r\}.
$$
(ii) Let $A$ be a subset of $X$. A point $x_0\in A$ is said to be an
interior point of $A$ if there is $r>0$ such that
$B(x_0, r)\subset A$.
(iii) We say that $A$ is open if every point of $A$ is
an interior point of $A$.
Proposition
Let $(X, d)$ be a metric space. Then
(i) The ball $B(x_0, r)$ is open for any $x_0\in X$ and $r>0$.
(ii) If $\overset{\circ}{A}$ denotes the subset of $A$ consisting of
all its interior points, then $\overset{\circ}{A}$ is open.
(i) To prove that $B(x_0, r)$ is open, by definition, one needs to
show that every point $y\in B(x_0, r)$ is an interior point of
$B(x_0, r)$. Let $r(y)=r-d(y, x_0)$, which is positive since
$d(y,x_0)<r$. We claim that $B(y, r(y))\subset B(x_0, r)$. In fact,
for any $x\in B(y, r(y))$, by the triangle inequality, one has
$$
d(x, x_0)\le d(x, y)+d(y, x_0) < r(y)+d(y,x_0)=r.
$$
This implies $x\in B(x_0, r)$ and $B(y, r(y))\subset B(x_0, r)$. The
claim is proved and $y$ is an interior point. Therefore, $B(x_0, r)$
is open.
(ii) For any $x_0\in\overset{\circ}{A}$ there is a ball $B(x_0,
r)\subset A$. By (i) the ball $B(x_0, r)$ is open, so it coincides
with its own interior, and the next exercise then gives $B(x_0,
r)\subset\overset{\circ}{A}$. Hence every point of
$\overset{\circ}{A}$ is an interior point of $\overset{\circ}{A}$,
that is, $\overset{\circ}{A}$ is open.
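The radius $r(y)=r-d(y,x_0)$ from part (i) can be tried out in $(\mathbb{R}^2, d_2)$. The sketch below picks one ball and one point $y$ in it (both arbitrary choices), samples points of $B(y, r(y))$ at random and checks that they all lie in $B(x_0, r)$. The small shrink factor only guards against floating point rounding at the boundary.

```python
import math
import random

def d2(x, y):
    """Euclidean distance in the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

random.seed(1)
x0, r = (0.0, 0.0), 1.0
y = (0.6, 0.5)                  # d2(y, x0) is about 0.781, so y lies in B(x0, r)
r_y = r - d2(y, x0)             # the radius used in the proof of (i)

inside = True
for _ in range(10_000):
    # Draw a point x with d2(x, y) < r_y using polar coordinates around y.
    angle = random.uniform(0.0, 2.0 * math.pi)
    rho = 0.999999 * r_y * random.random()
    x = (y[0] + rho * math.cos(angle), y[1] + rho * math.sin(angle))
    inside = inside and d2(x, x0) < r

print(inside)   # True
```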
Exercise
Let $(X, d)$ be a metric space and let $A, B\subset X$. Show that if
$A\subset B$, then $\overset{\circ}{A}\subset \overset{\circ}{B}$.
Complete Metric Spaces
Definition (Cauchy Sequences and Completeness)
Let $(X, d)$ be a metric space.
(i) We say that a sequence $\{x_n\}_{n=1}^\infty$ converges to a limit
$x\in X$ iff
$$
\lim_{n \to \infty} d(x_n,x) = 0.
$$
(ii) We say that a sequence $\{x_n\}_{n=1}^\infty$ is a Cauchy
sequence if, for any $\epsilon>0$, there exists $N$ such that if $m,
n\ge N$ then
$$
d(x_n, x_m)\leq\epsilon.
$$
(iii) The metric space $(X, d)$ is called complete if every Cauchy
sequence converges in $X$.
Example
(a) $(\mathbb{R}^n, d_2)$ is a complete metric space.
(b) $(\mathbb{Q}^n, d_2)$ is not a complete metric space.
(c) Determine whether $(\mathbb{Z},d_0)$ is complete or not.
(a) Let $\{x^m=(x_1^m,\cdots, x_n^m)\}_{m=1}^\infty$ be a
Cauchy sequence in $(\mathbb{R}^n, d_2)$. Since $|x_j^m-x_j^k|\le
d_2(x^m, x^k)$, for each $1\le j\le n$ the sequence
$\{x_j^m\}_{m=1}^\infty$ is a Cauchy sequence in $\mathbb{R}$. By the
completeness of $\mathbb{R}$, there
is an $x_j\in \mathbb{R}$ such that $\lim_{m\to\infty} x_j^m=x_j$. Let
$x=(x_1,\cdots, x_n)$. Then
$$
d_2(x^m, x)=\|x^m-x\|_2\le \sum_{j=1}^n |x^m_j-x_j|\to 0 \quad \text{ as } m\to \infty.
$$
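(b) Let $q_m=(1+\frac{1}{m})^m\in\mathbb{Q}$ and
$x^m=(q_m,0,\dots,0)\in\mathbb{Q}^n$ for $m\ge 1$. Then
$$
\lim_{m\to \infty} q_m=e\in \mathbb{R}\setminus\mathbb{Q},
$$
so $\{x^m\}_{m=1}^\infty$ converges to $(e,0,\dots,0)$ in
$(\mathbb{R}^n, d_2)$. In particular, $\{x^m\}_{m=1}^\infty$ is a
Cauchy sequence in $(\mathbb{Q}^n, d_2)$. Since limits are unique and
$e$ is irrational, it does not have
a limit in $(\mathbb{Q}^n, d_2)$. So, $(\mathbb{Q}^n, d_2)$ is not
complete.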
(c) $(\mathbb{Z},d_0)$ is a complete metric space. Indeed, for
any Cauchy sequence $\{x_n\}_{n=1}^\infty$, there is $N$ such that if
$m, n\ge N$, one has
$$
d_0(x_m, x_n)\le 1/2.
$$
Since $d_0$ only takes the values $0$ and $1$, this implies that
$x_m=x_n=x_N$ if $m,n\ge N$. Therefore, $x_n\to x_N$
as $n\to\infty$.
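Part (b) rests on the fact that every number $(1+\frac{1}{k})^k$ is rational while their limit $e$ is not. The sketch below computes a few of these terms exactly with Python's `fractions.Fraction` and prints their distance to $e$. The helper name `term` is illustrative, and the floating point value `math.e` is of course only an approximation of $e$.

```python
from fractions import Fraction
import math

def term(k):
    """The number (1 + 1/k)^k, computed exactly as a rational number."""
    return Fraction(k + 1, k) ** k

for k in (1, 10, 100, 1000):
    q = term(k)
    # q is an exact fraction; float(q) is only used for display.
    print(k, float(q), math.e - float(q))
```

Every term is a genuine element of $\mathbb{Q}$, yet the gap to $e$ shrinks steadily, so the sequence is Cauchy in $\mathbb{Q}$ without having a rational limit.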
Definition (Contractive Sequences)
A sequence $\{x_n\}_{n=1}^\infty$ in $X$ is said to be
contractive iff there is a constant
$c\in [0, 1)$ such that
$$
d(x_n, x_{n+1}) \le cd (x_{n-1}, x_n)\quad \text{ for }n=2,3,\dots.
$$
Theorem
Every contractive sequence in a metric space $(X,d)$ is a Cauchy
sequence.
Let $m>n\ge 2$. Using the triangle inequality and the inequality defining
contractivity we see that
\begin{eqnarray*}
d(x_m,x_n)&\le& d(x_m, x_{n+1}) + d(x_{n+1}, x_n)\\
&\le&\sum_{j=1}^{m-n} d(x_{n+j}, x_{n+j-1})
\le \sum_{j=1}^{m-n} c^j d(x_n, x_{n-1})\\
&\le & \sum_{j=1}^{m-n} c^j c^{n-2} d(x_2, x_1)
=\dfrac{1-c^{m-n}}{1-c} c^{n-1} d(x_2, x_1)
\le \frac{ c^{n-1}}{(1-c) } d(x_2, x_1).
\end{eqnarray*}
Given $\epsilon>0$ and observing that
$$
\frac{c^{n-1}}{1-c} d(x_2, x_1) \to 0\text{ as }n \to \infty\, ,
$$
we can find $N\in \mathbb{N}$, $N\ge 2$, such that
$$
\frac{c^{N-1}}{1-c} d(x_2, x_1) < \epsilon\, .
$$
Since $c^{n-1}\le c^{N-1}$ for $n\ge N$, it follows that
$d(x_m, x_n)\leq\epsilon$ whenever $m\geq n \ge N$. This
proves the theorem.
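The estimate $d(x_m,x_n)\le \frac{c^{n-1}}{1-c}\,d(x_2,x_1)$ obtained in the proof can be checked on a concrete contractive sequence. The sketch below uses the recursion $x_{n+1}=x_n/2+1$ with $x_1=0$ in $\mathbb{R}$, an example chosen here for illustration. It satisfies $|x_{n+1}-x_n|=\frac12|x_n-x_{n-1}|$, so $c=1/2$.

```python
c = 0.5
xs = [0.0]                              # x_1 = 0
for _ in range(29):
    xs.append(0.5 * xs[-1] + 1.0)       # x_{n+1} = x_n / 2 + 1

def x(n):
    """The n-th term, indexed from 1 as in the text."""
    return xs[n - 1]

d21 = abs(x(2) - x(1))                  # d(x_2, x_1) = 1

# Verify d(x_m, x_n) <= c^(n-1) / (1 - c) * d(x_2, x_1) for all 2 <= n < m <= 30.
ok = all(
    abs(x(m) - x(n)) <= c ** (n - 1) / (1 - c) * d21
    for n in range(2, 30)
    for m in range(n + 1, 31)
)
print(ok)   # True
```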
Corollary
In a complete metric space $(X, d)$, every contractive sequence has a
limit in $X$.
Definition (Contractive Maps)
Let $(X, d)$ be a metric space. A map $f: X\to X$ is said to be a
contractive map if there is $c\in (0,1)$ such that
$$
d\bigl(f(x), f(y)\bigr)\le c\, d(x, y),\quad x, y\in X.
$$
Theorem (Banach Fixed Point Theorem)
In a complete metric space $(X, d)$, every contractive map $f:X\to X$ has a unique
fixed point in $X$.
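For the uniqueness, if $x$ and $y$ are fixed points of $f$, then
$d(x,y)=d\bigl(f(x),f(y)\bigr)\le c\, d(x,y)$ with $c<1$, which forces
$d(x,y)=0$, that is, $x=y$. To prove the existence, let $x_0\in X$ be arbitrary and define
$$
x_n=f(x_{n-1})\quad \text{ for } \: n=1,2,3,\dots.
$$
Then $d(x_n,x_{n+1})=d\bigl(f(x_{n-1}),f(x_n)\bigr)\le c\,
d(x_{n-1},x_n)$, so $\{x_n\}_{n=1}^\infty$ is a contractive sequence
and therefore has a limit $x_\infty\in X$ by the corollary above.
Since $f$ is continuous, letting $n\to\infty$ in $x_n=f(x_{n-1})$
gives $x_\infty=f(x_\infty)$.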
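The existence proof is constructive: iterating $f$ from any starting point approximates the fixed point. The sketch below is one possible implementation, with the function name `fixed_point`, the tolerance and the stopping rule chosen here for illustration. It is applied to $f=\cos$ on the complete metric space $X=[0,1]$ with the usual distance. There $\cos$ maps $[0,1]$ into itself and $|\cos x-\cos y|\le \sin(1)\,|x-y|$ with $\sin(1)<1$, so the theorem applies.

```python
import math

def fixed_point(f, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_n = f(x_{n-1}) until two successive iterates differ by at most tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

x_star, steps = fixed_point(math.cos, 0.0)
print(x_star, steps)
print(abs(math.cos(x_star) - x_star))   # residual of the fixed point equation
```

Stopping when successive iterates are close is justified by the contraction property: for a contractive map with constant $c$, the distance from $x_n$ to the fixed point is at most $\frac{c}{1-c}\,d(x_{n-1},x_n)$.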
Remark
A uniform contraction constant $c<1$ is necessary for the validity of
Banach's fixed point theorem. If $f$ only satisfies
$d \bigl( f(x),f(y)\bigr)\leq d(x,y)$ for $x,y\in X$, or even
$d \bigl( f(x),f(y)\bigr)< d(x,y)$ for $x\ne y$, it need not possess a
fixed point.
Let $f(x) = x + \frac{1}{1+e^x}$ and $X=\mathbb{R}$. Then
$$
f'(x) = 1 -\frac{e^x}{(1+e^x)^2} \in (0,1),
$$
and so, by the mean value theorem, for $x\ne y$ there is $\xi$ between
$x$ and $y$ with
$$
|f(x) - f(y)| = |f'(\xi)||x-y| =\Big |1 - \frac{e^\xi}{(1+e^\xi)^2}
\Big||x-y| < |x-y|.
$$
But $f(x)=x$ has no solution in $\mathbb{R}$, since
$f(x)-x=\frac{1}{1+e^x}>0$ for every $x\in\mathbb{R}$.
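Iterating this $f$ shows what goes wrong. In the sketch below (the starting point $0$ and the number of iterations are arbitrary choices) the steps $f(x_n)-x_n$ become very small, yet the iterates keep increasing and never settle.

```python
import math

def f(x):
    return x + 1.0 / (1.0 + math.exp(x))

x = 0.0
for n in range(1, 100_001):
    x = f(x)
    if n in (1, 10, 100, 1_000, 10_000, 100_000):
        # Print the iterate and the size of the next step f(x) - x > 0.
        print(n, x, f(x) - x)
```

The sequence $\{x_n\}$ here is not contractive in the sense defined above: the ratios $d(x_n,x_{n+1})/d(x_{n-1},x_n)$ stay below $1$ but approach $1$, so no single constant $c<1$ works.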
Proposition
If $\{x_n\}_{n=1}^\infty$ and $\{y_n\}_{n=1}^\infty$ are Cauchy
sequences in a metric space $(X,d)$, then $\bigl(d(x_n, y_n)\bigr)_{n\in
\mathbb{N} }$ is a convergent sequence in $\mathbb{R}$.
Observe that
\begin{align*}
d(x_n, y_n) - d(x_m, y_m)& \le d(x_n, x_m) + d(x_m, y_n) - d(x_m, y_m)\\
& \le d(x_n, x_m) + d(x_m, y_m) +d(y_m, y_n) - d(x_m, y_m) \\
& = d(x_n, x_m) + d(y_m, y_n)
\end{align*}
for all $m,n\in \mathbb{N} $. By interchanging the roles of $m$ and $n$ it
follows that
$$
|d(x_m, y_m) - d(x_n, y_n) |\le d(x_m, x_n) + d(y_m, y_n).
$$
Given $\epsilon>0$, by assumption there is $N\in \mathbb{N} $ such that
$$
d(x_m, x_n)\leq\epsilon/2\text{ and }d(y_m, y_n)\leq\epsilon/2\text{
if }m,n\geq N.
$$
Then
$$
|d(x_n, y_n) - d(x_m, y_m) | \le \epsilon/2 + \epsilon/2 =
\epsilon\text{ when }m,n \ge N.
$$
Therefore, $\bigl(d(x_n, y_n)\bigr)_{n\in \mathbb{N} }$ is a Cauchy
sequence in $\mathbb{R}$ and, by the completeness of $\mathbb{R}$, it
possesses a limit.
Union and intersection of sets
Let $(X,d)$ be a metric space and let $E \subset X$ be a set in
$X$. We denote the empty set by $\emptyset$ and by $E^c = \{x \in X: x
\not \in E\} = X \setminus E$ the complement of $E$ in $X$. Given
subsets $A$ and $B$ of $X$ we denote by
$$
A \cup B = \{ x \in X: x\in A \mbox{ or } x \in B\}
$$
the union and by
$$
A \cap B = \{ x \in X: x\in A\mbox{ and } x\in B\}
$$
the intersection of the subsets $A$ and $B$.
Proposition
(i) Let $(A_\alpha)_{\alpha \in I}$ be a family (finite or infinite)
of subsets of $X$. Then
$$
\left ( \bigcup_{\alpha \in I} A_\alpha \right )^c = \bigcap_{\alpha
\in I} A_\alpha^c
$$
(ii) Let $(A_\alpha)_{\alpha \in I}$ be a family of open sets in a
metric space $(X, d)$. Then $\cup_{\alpha\in I} A_\alpha $ is open.
(iii) If $A_1,\cdots, A_m$ are open in a metric space $(X, d)$, then
$\cap_{j=1}^m A_j$ is also open.
(i) It suffices to notice that
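\begin{eqnarray*}
x \in \left ( \bigcup_{\alpha \in I} A_\alpha \right ) ^c &\iff& x \not \in
\bigcup_{\alpha \in I} A_\alpha\iff x \not \in A_\alpha \mbox{ for
every } \alpha \in I\\
&\iff& x \in A_\alpha^c\mbox{ for all } \alpha \in I \iff x \in
\bigcap_{\alpha\in I} A_\alpha^c\, .
\end{eqnarray*}
(ii) If $x_0\in \cup_{\alpha\in I} A_\alpha$, then $x_0\in A_\beta$
for some $\beta\in I$. Since $A_\beta$ is open, there is $r>0$ such
that $B(x_0, r)\subset A_\beta\subset \cup_{\alpha\in I} A_\alpha$, so
$x_0$ is an interior point of the union.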
(iii) For any $x_0\in \cap_{j=1}^m A_j$, there are $r_j>0$ such that
$B(x_0, r_j)\subset A_j$ for $1\le j\le m.$ Let
$$
r_0=\min\{ r_j: 1\leq j\leq m \}.
$$
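Then $r_0>0$, because the minimum is taken over finitely many positive
numbers, and $B(x_0, r_0)\subset B(x_0, r_j)\subset A_j$ for each $j$,
so that $B(x_0, r_0)\subset \cap _{j=1}^m A_j$. Therefore, $\cap_{j=1}^m
A_j$ is open.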
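The finiteness of the family in (iii) matters because the minimum of the radii must stay positive. The sketch below works in $\mathbb{R}$ with the open intervals $A_j=(-\frac1j,\frac1j)$ and the point $x_0=0$, an example chosen here for illustration, and computes $r_0$ for the first $m$ intervals. The helper name `interior_radius` is hypothetical.

```python
def interior_radius(x0, interval):
    """Largest r such that (x0 - r, x0 + r) is contained in the open interval (a, b)."""
    a, b = interval
    return min(x0 - a, b - x0)

x0 = 0.0
for m in (1, 10, 100, 1000):
    family = [(-1.0 / j, 1.0 / j) for j in range(1, m + 1)]   # A_j = (-1/j, 1/j)
    r0 = min(interior_radius(x0, A) for A in family)
    print(m, r0)
```

For every finite $m$ the radius $r_0=1/m$ is positive, but it tends to $0$ as $m$ grows. Indeed, the intersection of all the $A_j$ is $\{0\}$, which is not open in $\mathbb{R}$, so (iii) does not extend to infinite families.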