$
\def\P{\mathsf{P}}
\def\R{\mathbb{R}}
\def\defeq{\stackrel{\tiny\text{def}}{=}}
\def\c{\,|\,}
\def\bb{\boldsymbol}
\def\diag{\operatorname{\sf diag}}
$

As described in Chapter A, the Euclidean space $\R^d=\R\times\cdots\times\R$ is the set of all ordered $d$-tuples, or vectors, over the real numbers $\R$ (when $d=1$ we refer to the vectors as scalars). We denote a vector in $\R^d$ when $d>1$ in bold and refer to its scalar components via subscripts. For example, the vector $\bb x=(x_1,\ldots,x_d)$ has $d$ scalar components $x_i\in\R$, $i=1,\ldots,d$. Together with the Euclidean distance \[ d(\bb x,\bb y) = \sqrt{\sum_{i=1}^d (x_i-y_i)^2},\] the Euclidean space is a metric space $(\R^d,d)$ (we prove later in this chapter that the Euclidean distance above is a valid distance function).
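The Euclidean distance formula above translates directly into R. The following is a minimal sketch; the function name `euclidean_dist` and the sample vectors are chosen here for illustration only.

```r
# Euclidean distance between two vectors of equal length
# (euclidean_dist is an illustrative name, not part of base R)
euclidean_dist <- function(x, y) {
  stopifnot(length(x) == length(y))
  sqrt(sum((x - y)^2))
}

x <- c(1, 2, 3)
y <- c(4, 6, 3)
euclidean_dist(x, y)            # sqrt(9 + 16 + 0) = 5

# base R's dist() uses the Euclidean distance by default
as.numeric(dist(rbind(x, y)))
```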

- symmetry: $g( \bb v,\bb u) = g(\bb u,\bb v)$
- bi-linearity: $g(\alpha \bb u+\beta \bb v,\bb w) =\alpha g(\bb u,\bb w) + \beta g( \bb v,\bb w)$
- positivity: $g(\bb v,\bb v)\geq 0$.

- non-negativity: $h(\bb x)\geq 0$
- positivity: $h(\bb x)=0$ if and only if $\bb x=\bb 0$
- homogeneity: $h(c \,\bb x)=|c|\, h(\bb x)$
- triangle inequality: $h(\bb x+\bb y) \leq h(\bb x) + h(\bb y)$.

The Euclidean norm is the most popular norm. The more general $L_p$ norm \begin{align*} \|\bb x\|_p=\left(\sum_{i=1}^d |x_i|^p\right)^{1/p}, \qquad p\geq 1, \end{align*} has the following special cases \begin{align*} \|\bb x\|_2 &=\|\bb x\| \qquad \text{ (the Euclidean norm)}\\ \|\bb x \|_1 &= \sum_{i=1}^d |x_i|\\ \|\bb x\|_{\infty} &= \max\{|x_1|,\ldots,|x_d|\} \qquad \text{ (achieved by letting } p\to\infty). \end{align*}
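The special cases above can be checked numerically. The sketch below assumes a helper named `lp_norm` (an illustrative name) that treats `p = Inf` as the maximum absolute component.

```r
# L_p norm of a numeric vector, p >= 1; p = Inf gives the max norm
# (lp_norm is an illustrative helper, not a base R function)
lp_norm <- function(x, p = 2) {
  stopifnot(p >= 1)
  if (is.infinite(p)) return(max(abs(x)))
  sum(abs(x)^p)^(1 / p)
}

x <- c(3, -4)
lp_norm(x, 1)     # |3| + |-4| = 7
lp_norm(x, 2)     # sqrt(9 + 16) = 5
lp_norm(x, Inf)   # max(3, 4) = 4
lp_norm(x, 50)    # already very close to the p = Inf value
```

Evaluating the norm at a large finite $p$ illustrates how the $L_p$ norm approaches the $L_{\infty}$ norm as $p\to\infty$.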

The $L_p$ norm can be further generalized as follows.

Weighted norms are convenient for emphasizing some dimensions over others. For example, if $\diag(\bb w)$ is an all zero matrix, except for its diagonal \[ \diag(\bb w) = \begin{pmatrix} w_1 &0 &\cdots& 0\\ 0 & w_2&\cdots&0\\ \vdots & \vdots & \ddots &\vdots\\ 0 & \cdots & 0& w_d \end{pmatrix}, \] the corresponding weighted $L_2$ norm is \[ \|\bb x\|_{2,\diag(\bb w)} = \sqrt{\sum_{i=1}^d w_i^2 x_i^2}.\]
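As a sketch of the diagonal case, the code below assumes that the weighted norm is obtained by applying the $L_p$ norm to $W\bb x$, which is consistent with the displayed formula for $\|\bb x\|_{2,\diag(\bb w)}$. The function name `weighted_lp_norm` is illustrative.

```r
# Weighted L_p norm, assuming ||x||_{p,W} = ||W x||_p
# (weighted_lp_norm is an illustrative helper)
weighted_lp_norm <- function(x, W, p = 2) {
  z <- as.vector(W %*% x)
  if (is.infinite(p)) return(max(abs(z)))
  sum(abs(z)^p)^(1 / p)
}

w <- c(2, 1)
x <- c(3, 4)
weighted_lp_norm(x, diag(w), 2)   # sqrt(2^2 * 3^2 + 1^2 * 4^2) = sqrt(52)
sqrt(sum(w^2 * x^2))              # same value, from the closed-form expression
```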

The propositions above confirm that the Euclidean space $(\R^d,d)$, where \[ d(\bb x,\bb y) = \sqrt{\sum_{i=1}^d (x_i-y_i)^2},\] is a metric space.

Figure B.4.1 below shows contour lines of four $L_p$ norms in the left column and four $L_{p,W}$ norms in the right column, where $W=\diag(2,1)$. A non-diagonal matrix $W$ would result in rotated versions of the figures in the right column.

The R code below generates the left and right columns of the figure.

    library(ggplot2)

    s = seq(-1, 1, length.out = 50)
    R = expand.grid(x1 = s, x2 = s)  # 50 x 50 grid, 2500 points

    # generate left column (Lp norms)
    D = rbind(R, R, R, R)
    D$Norm[1:2500] = abs(D$x1[1:2500]) + abs(D$x2[1:2500])
    D$Norm[2501:5000] = ((abs(D$x1[2501:5000]))^1.5 +
        (abs(D$x2[2501:5000]))^1.5)^(2/3)
    D$Norm[5001:7500] = ((abs(D$x1[5001:7500]))^2 +
        (abs(D$x2[5001:7500]))^2)^0.5
    D$Norm[7501:10000] = pmax(abs(D$x1[7501:10000]),
        abs(D$x2[7501:10000]))
    D$p = c(rep("Lp norm, p=1", 2500), rep("Lp norm, p=1.5", 2500),
        rep("Lp norm, p=2", 2500), rep("Lp norm, p = inf", 2500))
    ggplot(D, aes(x1, x2, z = Norm)) + facet_grid(p ~ .) +
        stat_contour(bins = 4)

    # generate right column (weighted Lp norms)
    D = rbind(R, R, R, R)
    D$Norm[1:2500] = abs(2 * D$x1[1:2500]) + abs(D$x2[1:2500])
    D$Norm[2501:5000] = ((2 * abs(D$x1[2501:5000]))^1.5 +
        (abs(D$x2[2501:5000]))^1.5)^(2/3)
    D$Norm[5001:7500] = ((2 * abs(D$x1[5001:7500]))^2 +
        (abs(D$x2[5001:7500]))^2)^(1/2)
    D$Norm[7501:10000] = pmax(abs(2 * D$x1[7501:10000]),
        abs(D$x2[7501:10000]))
    D$p = c(rep("weighted Lp norm, p=1", 2500),
        rep("weighted Lp norm, p=1.5", 2500),
        rep("weighted Lp norm, p=2", 2500),
        rep("weighted Lp norm, p=inf", 2500))
    ggplot(D, aes(x1, x2, z = Norm)) + facet_grid(p ~ .) +
        stat_contour(bins = 4)

Figure B.4.1:
Equal height contours of the $L_p$ norm (left column) and weighted $L_{p,W}$ norm with $W=\diag(2,1)$ (right column), in the two-dimensional case $d=2$. Each row corresponds to a different value of $p$. As $p$ increases from 1 to $\infty$, the contours change shape from a diamond to a square.

If $\bb x\in B_{\epsilon}(\bb 0)$ for a given $\epsilon$ then for all $n\in\mathbb{N}$ we have $\min(|x_n|,1) < n\epsilon$. There exists some $N$ such that $n>N$ corresponds to $n\epsilon > 1$, implying that the components $x_{N+1}, x_{N+2},\ldots$ of vectors $\bb x\in B_{\epsilon}(\bb 0)$ are unrestricted. Similarly, for $\bb x\in B_{\epsilon}(\bb 0)$ the components $x_n$, where $n \leq N$, satisfy $|x_n| < n\epsilon$. It follows that $B_{\epsilon}(\bb 0)$ is an intersection of a finite number of sets of the form \begin{align} \tag{*} \R\times \cdots \times \R\times (a,b) \times \R\times\cdots. \end{align} (We refer to sets of the form (*) as simple cylinders.) Since $\bar d(\bb x+\bb c,\bb z+\bb c)=\bar d(\bb x,\bb z)$, the ball $B_{\epsilon}(\bb y)$ is a translation of $B_{\epsilon}(\bb 0)$ and is therefore also an intersection of a finite number of simple cylinders.
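As a concrete illustration (the value $\epsilon=0.3$ is chosen here only for the sake of the example), the condition $\min(|x_n|,1) < n\epsilon$ reads $|x_1|<0.3$, $|x_2|<0.6$, and $|x_3|<0.9$, while for $n\geq 4$ we have $n\epsilon\geq 1.2>1$, so the condition holds for every value of $x_n$. In this case $N=3$ and \[ B_{0.3}(\bb 0) = (-0.3,0.3)\times(-0.6,0.6)\times(-0.9,0.9)\times\R\times\R\times\cdots, \] which is the intersection of three simple cylinders.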

On the other hand, let $A$ be an intersection of a finite number of simple cylinders. Then for each $\bb x\in A$ we can construct $B_{\epsilon}(\bb y)$ such that $\bb x\in B_{\epsilon}(\bb y)\subset A$ (taking $\epsilon$ to be sufficiently small). This implies that $A$ is a union of open balls and is therefore an open set. A union of intersections of a finite number of simple cylinders is a union of open sets and therefore is open also.

In summary, we have thus demonstrated that in the space $(\R^{\infty},\bar d)$, the set of open sets is equivalent to the set of unions of intersections of a finite number of simple cylinders.

The convergence problems mentioned in Example B.4.3 lead to the common practice of defining the metric structure on $\R^{\infty}$ using the distance function $\bar d$ in Example B.4.4 rather than the Euclidean distance. In fact, whenever we refer to the metric structure of $\R^{\infty}$ we will assume the metric structure of $\bar d$ derived above. This metric structure is commonly referred to in the literature as the product topology of $\R^{\infty}$.

The metric structure of the Euclidean space simplifies some of the properties described in the previous chapter.

Let $G$ be an open set and ${\bb x}\in G$. By Proposition B.1.2 there exists $r>0$ for which $B_r({\bb x})\subset G$. Since for every real number there is a rational number that is arbitrarily close, we can select $B_{r'}({\bb x}')$, where $r'$ is rational and ${\bb x}'$ has rational components, such that ${\bb x}\in B_{r'}({\bb x}') \subset B_r({\bb x})\subset G$. Repeating this for every ${\bb x}\in G$ and taking the union of the resulting rational balls expresses $G$ as a union of rational balls, which completes the proof.
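To illustrate the selection of a rational ball in the case $d=1$ (the numbers below are chosen only for the example), take $x=\sqrt{2}\approx 1.4142$ and $r=0.1$, so that $B_r(x)\approx(1.3142,1.5142)$. Choosing the rational center $x'=1.41$ and the rational radius $r'=0.05$ gives \[ |x-x'|\approx 0.0042 < 0.05 \qquad\text{and}\qquad B_{r'}(x')=(1.36,1.46)\subset (1.3142,1.5142), \] so that $x\in B_{r'}(x')\subset B_r(x)$, as required.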

In the case of $\R^{\infty}$, second countability is demonstrated by taking all simple cylinders (see Example B.4.4) whose base $(a,b)$ has rational endpoints and noting that a countable union of sets that are countably infinite is countably infinite.

Note that above we are denoting the function using bold-face $\bb f$ to indicate that $\bb f(\bb x)$ is a vector.

Conversely, if $\bb f$ is continuous, for all $\epsilon>0$ there exists $\delta>0$ such that whenever $\|\bb x-\bb y\|^2\leq \delta^2$ we have \[\|\bb f(\bb x)-\bb f(\bb y)\|^2=\sum_{j=1}^k |f_j(\bb x)-f_j(\bb y)|^2 < \epsilon^2.\] Since every term of the sum is non-negative, this implies $|f_j(\bb x)-f_j(\bb y)|^2 < \epsilon^2$, and hence $|f_j(\bb x)-f_j(\bb y)| < \epsilon$, for each $j=1,\ldots,k$, which establishes the continuity of $f_1,\ldots,f_k$.

Note that in the first and third cases above, we have $k=1$, and in the last case above, we have $k=d=1$.

To prove the second assertion note that whenever $\|(\bb u,\bb w)-(\bb x,\bb y)\|<\epsilon/\sqrt{2d}$, then for all $j=1,\ldots,d$, $|x_j-u_j|< \epsilon/\sqrt{2d}$ and $|y_j-w_j|<\epsilon/\sqrt{2d}$. Using the triangle inequality property of the Euclidean norm and the inequality $a+b\leq\sqrt{2(a^2+b^2)}$, we have \begin{align*} \|(\bb x+\bb y)-(\bb u+\bb w)\| &= \|(\bb x-\bb u)+(\bb y-\bb w)\| \\ &\leq \|\bb x-\bb u\|+\|\bb y-\bb w\| \\ &\leq \sqrt{2}\,\sqrt{\|\bb x-\bb u\|^2+\|\bb y-\bb w\|^2} = \sqrt{2}\,\|(\bb u,\bb w)-(\bb x,\bb y)\| \\ &< \sqrt{2}\,\epsilon/\sqrt{2d}=\epsilon/\sqrt{d}\leq\epsilon. \end{align*} The proofs of the other propositions are similar.
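The bound for vector addition can be sanity-checked numerically. The sketch below is only an illustration under assumed values ($d=3$, $\epsilon=0.1$, a fixed random seed, and a perturbation scaled to norm $0.9\,\delta$); it is not a substitute for the proof.

```r
set.seed(1)
d     <- 3
eps   <- 0.1
delta <- eps / sqrt(2 * d)

x <- rnorm(d)
y <- rnorm(d)

# perturb the stacked vector (x, y) by a vector of norm 0.9 * delta
h <- rnorm(2 * d)
h <- h / sqrt(sum(h^2)) * 0.9 * delta
u <- x + h[1:d]
w <- y + h[(d + 1):(2 * d)]

sqrt(sum(c(u - x, w - y)^2)) < delta      # TRUE: (u, w) is within delta of (x, y)
sqrt(sum(((x + y) - (u + w))^2)) < eps    # TRUE: the sums are within eps
```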

The concept of compactness (Definition B.1.6) has some important consequences (Propositions B.3.5 and B.3.6). The general definition of a compact space is hard to verify, but the following simple condition is very useful for verifying compactness in $\R^d$. A proof is available in (Rudin, 1976).