Lecture 9. Maxima and Minima

Motivation

Taylor's Theorem in $\mathbb{R}^n$

Multi-indices are useful for dealing with higher-order partial derivatives. For $\alpha =(\alpha_1,...,\alpha_n)\in (\mathbb{N}\cup\{0\})^n$ set $|\alpha|=\sum_{j=1}^n \alpha_j$ and define $$ \frac{\partial^{|\alpha|}f}{\partial x^\alpha}=\frac{\partial^{|\alpha|}f }{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}\, . $$
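For instance, in $\mathbb{R}^3$ the multi-index $\alpha=(1,0,2)$ has $|\alpha|=3$ and $$ \frac{\partial^{|\alpha|}f}{\partial x^\alpha}=\frac{\partial^{3}f}{\partial x_1\,\partial x_3^{2}}\, . $$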

Theorem (Taylor's theorem)

Let $U\subset \mathbb{R}^n$ be open and convex and $\bar x\in U$. Let $f\in \operatorname{C}^{m+1} (U)$, that is, $(m+1)$ times continuously differentiable. Then, for every $x\in U$, $$ f(x) = \sum_{|\alpha|=0}^m {1\over \alpha!} {\partial^{|\alpha|} f(\bar x) \over \partial x^\alpha } (x-\bar x)^\alpha +\sum_{|\alpha| = m+1} {1\over \alpha!} {\partial^{|\alpha|} f(\xi)\over \partial x^\alpha } (x-\bar x)^{\alpha} $$ for some $\xi$ on the line segment connecting $\bar x$ and $x$, where $\alpha! =\prod _{j=1}^n\alpha_j!$ and $(x-\bar x)^\alpha =\prod_{j=1}^n (x_j-\bar x_j)^{\alpha_j}$.
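For instance, writing the sums out for $m=1$ yields the second-order expansion $$ f(x) = f(\bar x) + \sum_{j=1}^n \frac{\partial f(\bar x)}{\partial x_j}\,(x_j-\bar x_j) + \frac{1}{2}\sum_{i,j=1}^n \frac{\partial^2 f(\xi)}{\partial x_i\partial x_j}\,(x_i-\bar x_i)(x_j-\bar x_j)\, , $$ which is the form underlying the second derivative test below.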

Proof

Extremal problems

Definition (Extrema)

Let $U$ be a domain in $\mathbb{R}^n$, $f:U\to \mathbb{R}$ be continuous, and $x_0\in U$. Then we say that
(i) $f$ attains a local maximum [minimum] at $x_0$ if there is a $\delta>0$ such that $$ f(x_0)\geq f(x)\quad [ f(x_0)\leq\, f(x) ]\, ,\:x\in \mathbb{B}(x_0, \delta)\, . $$
(ii) $f$ attains a global maximum [minimum] on $U$ at $x_0$ if $$ f(x_0)\geq f(x)\quad[ f(x_0)\leq \, f(x) ]\, ,\:x\in U\, . $$ In these cases it is also said that $x_0$ is a (local) maximizer/minimizer.

A pervasive question in mathematics is how to find the maxima (or maximizers) and minima of $f$ in $U$, provided they exist.

Proposition

Let $f\in \operatorname{C}^1(U)$. If $x_0\in U$ is a local maximizer or a local minimizer of $f$ in $U$, then $\nabla f(x_0) = 0$.

Proof

Definition (Critical points)

A point $x_0\in U$ is called a critical point of $f$ in $U$ if either $\nabla f(x_0) =0$ or $\nabla f(x_0)$ does not exist.
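For example, the second alternative occurs for the Euclidean norm: if $f(x)=|x|=\sqrt{x_1^2+\cdots+x_n^2}$, then the partial derivatives of $f$ do not exist at the origin, so $0$ is a critical point of $f$ in $\mathbb{R}^n$; it is in fact the global minimizer.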

Now, if $x_0$ is a critical point for $f$ in $U$, how can it be determined whether $x_0$ is a local maximizer, minimizer, or a saddle point?

Definition (Positive definite matrix)

Let $A$ be an $n\times n$ symmetric matrix. Then
(i) $A$ is positive [negative] definite if and only if there is a constant $\lambda>0$ such that $$ \langle A x, x\rangle \ge \lambda |x|^2 \quad [ \langle A x, x\rangle \le -\lambda |x|^2 ],\quad x \in \mathbb{R}^n . $$
(ii) $A$ is positive semi-definite if and only if $$ \langle Ax, x\rangle \ge 0,\quad x\in\mathbb{R}^n\, . $$
(iii) $A$ is indefinite if and only if $\langle Ax, x\rangle$ takes both positive and negative values as $x$ ranges over $\mathbb{R}^n$.
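Since $A$ is symmetric, these conditions can be checked via the eigenvalues (a standard fact from linear algebra, recalled here for convenience): $A$ is positive definite exactly when all eigenvalues of $A$ are positive, in which case $\lambda$ in (i) can be taken to be the smallest eigenvalue; it is negative definite when all eigenvalues are negative, and indefinite when eigenvalues of both signs occur. For example, $$ A=\begin{bmatrix} 2 & 1 \\ 1 & 2\end{bmatrix} $$ has eigenvalues $1$ and $3$ and is therefore positive definite, with $\langle Ax,x\rangle\ge |x|^2$ for all $x\in\mathbb{R}^2$.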

Theorem (Second derivatives test)

Let $f\in \operatorname{C}^2(U)$ and $x_0\in U$ be a critical point of $f$. Defining the Hessian $H_f(x_0)$ of $f$ at $x_0$ by $$ H_f(x_0) =\begin{bmatrix} \frac{\partial ^2f}{\partial x_1^2} & \cdots & \frac{\partial ^2f}{\partial x_1\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial^2f}{\partial x_n \partial x_1} & \cdots & \frac{\partial ^2f}{\partial x_n^2}\end{bmatrix}(x_0)\, , $$ the following statements hold
(i) If $H_f(x_0)$ is positive definite, then $x_0$ is a local minimizer.
(ii) If $H_f(x_0)$ is negative definite, then $x_0$ is a local maximizer.
(iii) If $H_f(x_0)$ is indefinite, then $x_0$ is a saddle point.

Example

Let $f(x_1,x_2) = x_1^2 + x_2^2$ for $x\in \mathbb{R} ^2$. Then $(0,0)$ is a critical point and $$ H_f(0,0) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} $$ is positive definite. Therefore $(0,0)$ is a local minimizer.

Example

Let $f(x_1,x_2) = -(x_1^2 + x_2^2)$ for $x\in \mathbb{R} ^2$. Then $(0,0)$ is a critical point and $$ H_f(0,0) = \begin{bmatrix}-2 & 0 \\ 0 & -2 \end{bmatrix} $$ is negative definite. Therefore $(0,0)$ is a local maximizer.

Example

Let $f(x_1,x_2) = x_1^2 - x_2^2$ for $x\in \mathbb{R} ^2$. Then $(0,0)$ is a critical point and $$ H_f(0,0) =\begin{bmatrix} 2 & 0 \\0 & -2 \end{bmatrix} $$ is indefinite. Therefore $(0,0)$ is a saddle point.

Example

Let $f(x,y) = x^4 + y^4$ for $(x,y)\in \mathbb{R} ^2$. Then $(0,0)$ is a critical point and $$ H_f(x,y) = \begin{bmatrix}12x^2 & 0 \\ 0 & 12y^2 \end{bmatrix},\quad H_f(0,0) = \begin{bmatrix}0 & 0 \\ 0 & 0 \end{bmatrix}\, . $$ The second derivative test is therefore inconclusive. However, $(0,0)$ is the global minimizer for $f$ in $\mathbb{R}^2$, since $f(x,y)=x^4+y^4\ge 0=f(0,0)$ for all $(x,y)\in\mathbb{R}^2$.

Theorem

Let $U$ be convex. A function $f\in \operatorname{C}^2(U)$ is convex if and only if $H_f(x)$ is positive semidefinite for every $x\in U$, i.e. if and only if $$ \sum_{i,j=1}^n {\partial^2 f(x) \over \partial x_i \partial x_j} \xi_i \xi_j \ge 0,\: \xi\in\mathbb{R}^n, \quad x\in U. $$

Proof

Theorem

If $f\in \operatorname{C}^2(U)$ is convex, every critical point of $f$ is a global minimizer.

Proof
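
Example

As an illustration of how the two preceding theorems combine, reconsider $f(x,y)=x^4+y^4$, for which the second derivative test was inconclusive at the critical point $(0,0)$: since $$ H_f(x,y)=\begin{bmatrix} 12x^2 & 0 \\ 0 & 12y^2\end{bmatrix} $$ is positive semidefinite at every $(x,y)\in\mathbb{R}^2$, $f$ is convex, and hence $(0,0)$ is a global minimizer.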

Example

Find all global minimizers of $f:\mathbb{R}^2\to \mathbb{R}$ where $$ f(x, y)=x^4+y^4-32x-2y^2,\: (x,y)\in \mathbb{R} ^2\, . $$

Discussion
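
One possible argument, in outline: the gradient $$ \nabla f(x,y)=\big(4x^3-32,\; 4y^3-4y\big) $$ vanishes exactly for $x=2$ and $y\in\{-1,0,1\}$, so the critical points are $(2,-1)$, $(2,0)$, and $(2,1)$. Since $f(x,y)\to\infty$ as $|(x,y)|\to\infty$, $f$ attains a global minimum on $\mathbb{R}^2$, and by the proposition above the minimizer must be a critical point. Comparing values, $$ f(2,0)=-48\, ,\qquad f(2,\pm 1)=-49\, , $$ so the global minimizers are $(2,1)$ and $(2,-1)$, with minimum value $-49$. Note that $f$ is not convex, since $H_f(x,y)=\begin{bmatrix} 12x^2 & 0 \\ 0 & 12y^2-4\end{bmatrix}$ fails to be positive semidefinite wherever $12y^2<4$, so the preceding theorem cannot be applied directly.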

Lagrange Multipliers

Often, extrema of functions need to be found subject to additional constraints, for instance when extrema are sought only within the zero set of some function. For given $f,g:\mathbb{R}^n\to \mathbb{R}$, a typical example is $$ \hbox{Min} \Big\{f(x): x\in[g=c] \Big\} , $$ which is interpreted as the problem of finding extremal points of $f$ on the level set of $g$, $$ [g=c]:=\big\{ x\in \mathbb{R}^n\, :\, g(x)=c\big\}\, , $$ for a given constant $c$ (which can w.l.o.g. be taken to vanish). Thinking geometrically, it is easy to see that, if $x_0$ is an extremal point of $f$ on $[g=c]$, then $\nabla f(x_0)$ has to point in a direction orthogonal to $[g=c]$ at the point $x_0$ (otherwise the function value could be increased or decreased by moving within $[g=c]$ in a direction having a nonzero component along $\nabla f(x_0)$). Since $\nabla g(x_0)$ is perpendicular to $[g=c]$ at $x_0$, it follows that $$\begin{cases} \nabla f(x_0) =\lambda \nabla g(x_0)&\\ g(x_0) =c&\end{cases} $$ for some $\lambda\in \mathbb{R}$. The parameter $\lambda$ is called a Lagrange multiplier for the problem. This is, of course, provided the functions involved are smooth and $x_0$ is not a critical point of $g$.

Example

Find the maximum and minimum of $f(x,y,z) = x+y$ on the unit sphere $S^2=[x^2 + y^2 + z^2=1]$.

Discussion
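
A sketch of the computation: with $g(x,y,z)=x^2+y^2+z^2$ one has $\nabla g=(2x,2y,2z)\neq 0$ on $S^2$, so every extremal point satisfies the Lagrange system $$ 1=2\lambda x\, ,\quad 1=2\lambda y\, ,\quad 0=2\lambda z\, ,\quad x^2+y^2+z^2=1\, . $$ The first equation forces $\lambda\neq 0$, hence $z=0$ and $x=y=\tfrac{1}{2\lambda}$; the constraint then gives $2x^2=1$, i.e. $x=y=\pm\tfrac{1}{\sqrt 2}$. Since $S^2$ is compact and $f$ is continuous, the maximum and minimum exist and must occur among these two candidates. Therefore the maximum of $f$ on $S^2$ is $f\big(\tfrac{1}{\sqrt 2},\tfrac{1}{\sqrt 2},0\big)=\sqrt 2$ and the minimum is $f\big(-\tfrac{1}{\sqrt 2},-\tfrac{1}{\sqrt 2},0\big)=-\sqrt 2$.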