$
\def\P{\mathsf{\sf P}}
\def\E{\mathsf{\sf E}}
\def\Var{\mathsf{\sf Var}}
\def\Cov{\mathsf{\sf Cov}}
\def\std{\mathsf{\sf std}}
\def\Cor{\mathsf{\sf Cor}}
\def\R{\mathbb{R}}
\def\c{\,|\,}
\def\bb{\boldsymbol}
\def\diag{\mathsf{\sf diag}}
\def\defeq{\stackrel{\tiny\text{def}}{=}}
$

When indexing vectors we sometimes omit the redundant index. For example, if $\bb v$ is a column vector we write $[\bb v]_{j1}$ as simply $v_j$. We similarly consider $1\times 1$ matrices as scalars and refer to them without redundant indices.

We follow convention and write vectors in bold lowercase, for example $\bb x,\bb y$ and scalars in non-bold lowercase, for example $x,y$. We denote matrices in non-bold uppercase, such as $A, B$. Important exceptions are random variables, denoted by non-bold uppercase, for example $X,Y$ and random vectors, denoted by bold uppercase, for example $\bb X,\bb Y$.

Note that while matrix multiplication by a scalar and matrix addition are commutative, the matrix product is not: $AB\neq BA$ in general. All three operations are associative: $A+(B+C)=(A+B)+C$, $c(AB)=(cA)B$, and $A(BC)=(AB)C$. This implies we can omit parentheses and write $A+B+C$, $cAB$, and $ABC$ without any ambiguity.
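These algebraic rules can be checked numerically. The sketch below (using numpy, with arbitrarily chosen small matrices) verifies non-commutativity of the matrix product and the three associativity identities.

```python
import numpy as np

# Arbitrary example matrices and a scalar, chosen for illustration.
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1.], [1., 0.]])
C = np.array([[2., 0.], [0., 3.]])
c = 5.0

# The matrix product is generally not commutative: AB != BA.
assert not np.allclose(A @ B, B @ A)

# All three operations are associative.
assert np.allclose(A + (B + C), (A + B) + C)   # addition
assert np.allclose(c * (A @ B), (c * A) @ B)   # scalar times product
assert np.allclose(A @ (B @ C), (A @ B) @ C)   # matrix product
```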

When we use the matrix product notation $AB$ we assume implicitly that it is defined, implying that $A$ has the same number of columns as $B$ has rows. Repeated multiplication is denoted using an exponent, for example $AA=A^2$ and $AA^k=A^{k+1}$.
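The exponent notation corresponds to repeated products; a brief sketch, assuming numpy, where `np.linalg.matrix_power` computes $A^k$ directly.

```python
import numpy as np

# An arbitrary example matrix.
A = np.array([[1., 1.], [0., 1.]])

A2 = A @ A    # A^2
A3 = A @ A2   # A * A^2 = A^3

# numpy's matrix_power performs the same repeated multiplication.
assert np.allclose(A2, np.linalg.matrix_power(A, 2))
assert np.allclose(A3, np.linalg.matrix_power(A, 3))
```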

Examining Definition C.1.5 we see that multiplying a matrix by a column vector, $A\bb v$, yields a column vector that is a linear combination of the columns of $A$, with coefficients given by the entries of $\bb v$. Similarly, the matrix product $AB$ is a matrix whose columns are each a linear combination of the columns of $A$. We will use these important observations in several proofs later on.
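This column-combination view can be verified directly; the matrices and vector below are arbitrary examples chosen for illustration.

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.], [5., 6.]])
v = np.array([10., -1.])

# A @ v equals v_1 * (first column of A) + v_2 * (second column of A).
combo = v[0] * A[:, 0] + v[1] * A[:, 1]
assert np.allclose(A @ v, combo)

# Each column of AB is A times the corresponding column of B,
# i.e. a linear combination of the columns of A.
B = np.array([[1., 0.], [2., 3.]])
for j in range(B.shape[1]):
    assert np.allclose((A @ B)[:, j], A @ B[:, j])
```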

The proposition below helps to explain why matrices and linear algebra are so important.

One of the applications of matrix inversion is in solving linear systems of equations. Given the system of equations $A\bb x=\bb y$, where $A$ and $\bb y$ are known and $\bb x$ is unknown, and assuming $A$ is invertible, we can solve for $\bb x$ by multiplying both sides by the inverse \[\bb x=A^{-1}A\bb x=A^{-1}\bb y.\]
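A minimal numerical sketch of this, assuming numpy and a hypothetical $2\times 2$ system. Note that in numerical practice `np.linalg.solve` is preferred over forming $A^{-1}$ explicitly, though both agree here.

```python
import numpy as np

# Hypothetical system: 3x + y = 9, x + 2y = 8.
A = np.array([[3., 1.], [1., 2.]])
y = np.array([9., 8.])

# Solving via explicit inversion, as in the derivation above.
x_inv = np.linalg.inv(A) @ y

# The numerically preferred route avoids forming the inverse.
x = np.linalg.solve(A, y)

assert np.allclose(x, x_inv)
assert np.allclose(A @ x, y)   # the solution satisfies the system
```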

Thus, $A^{-1}\bb y$ is the vector of coefficients of the expansion of $\bb y$ as a linear combination of the columns of $A$.

As a consequence of the above proposition, mapping vectors by multiplying with an orthogonal matrix preserves the angle between two vectors (see Section B.4). Moreover, norm preservation implies that multiplying a vector by an orthogonal matrix preserves the distance between the vector and the origin. We thus interpret the mapping $\bb x\mapsto A\bb x$ for an orthogonal $A$ as a rotation or reflection in $n$ dimensions.
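A numerical illustration of these preservation properties, assuming numpy: a $2\times 2$ rotation matrix (an orthogonal matrix) preserves norms and inner products, and hence angles.

```python
import numpy as np

# A 2-d rotation matrix by an arbitrary angle theta; it is orthogonal.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(Q.T @ Q, np.eye(2))   # Q^T Q = I (orthogonality)

u = np.array([1., 2.])
v = np.array([-3., 0.5])

# Norm (distance to the origin) is preserved.
assert np.isclose(np.linalg.norm(Q @ u), np.linalg.norm(u))

# The inner product, and hence the angle between vectors, is preserved.
assert np.isclose((Q @ u) @ (Q @ v), u @ v)
```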

Note that the Kronecker product is consistent with the earlier definition of the outer product of two vectors $\bb u\otimes \bb v$: \[\bb u\otimes \bb v = \bb u {\bb v}^{\top}.\]
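This consistency can be checked with numpy's `kron` and `outer` on two arbitrary vectors: treating $\bb u$ as a column and $\bb v^{\top}$ as a row, their Kronecker product equals the outer product matrix.

```python
import numpy as np

u = np.array([1., 2., 3.])
v = np.array([4., 5.])

# Outer product u v^T as a matrix.
outer = np.outer(u, v)

# Kronecker product of the column vector u and the row vector v^T.
kron = np.kron(u.reshape(-1, 1), v.reshape(1, -1))
assert np.allclose(kron, outer)

# On flat 1-d arrays, np.kron yields the same entries, flattened.
assert np.allclose(np.kron(u, v), outer.ravel())
```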