$
\def\P{\mathsf{\sf P}}
\def\E{\mathsf{\sf E}}
\def\Var{\mathsf{\sf Var}}
\def\Cov{\mathsf{\sf Cov}}
\def\std{\mathsf{\sf std}}
\def\Cor{\mathsf{\sf Cor}}
\def\R{\mathbb{R}}
\def\c{\,|\,}
\def\bb{\boldsymbol}
\def\diag{\mathsf{\sf diag}}
\def\col{\mathsf{\sf col}}
\def\row{\mathsf{\sf row}}
\def\rank{\mathsf{\sf rank}}
\def\S{\mathfrak{S}}
\def\trace{\mathsf{\sf trace}}
\def\defeq{\stackrel{\tiny\text{def}}{=}}
\def\nulll{\operatorname{\sf null}}
$

We make the following observations.

- The above definition implies that the eigenvectors and eigenvalues of $A$ are solutions of the vector equation $(A-\lambda I)\bb v=0$.
- The vector $\bb v$ can only be a nonzero solution of $(A-\lambda I)\bb v=0$ if $\dim\nulll(A-\lambda I)\geq 1$, implying that $(A-\lambda I)$ is a singular matrix.
- If $\bb v$ is an eigenvector of $A$, then so is $c\bb v$ for any scalar $c\neq 0$ (with the same eigenvalue).
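
These observations can be verified numerically. The sketch below (using NumPy; the specific matrix is an illustrative choice, not from the text) checks the eigenvalue equation, the singularity of $A-\lambda I$, and the scaling invariance of eigenvectors:

```python
import numpy as np

# A small symmetric matrix with known eigenstructure (illustrative choice).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# v = (1, 1) is an eigenvector of A with eigenvalue 3: A v = 3 v.
v = np.array([1.0, 1.0])
lam = 3.0
assert np.allclose(A @ v, lam * v)

# (A - lambda I) v = 0 has a nonzero solution, so A - lambda I is singular.
assert np.isclose(np.linalg.det(A - lam * np.eye(2)), 0.0)

# Any nonzero scalar multiple c v is also an eigenvector with eigenvalue 3.
c = -2.5
assert np.allclose(A @ (c * v), lam * (c * v))
```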

The definition above states that the determinant is a sum of many terms, each a product of $n$ matrix elements, one from each row and no two from the same column. The sum alternates between adding and subtracting these products, depending on the parity of the permutation.
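
The permutation-sum definition can be implemented directly, though it is exponential in $n$ and useful only for small matrices. The sketch below (a straightforward implementation, not an efficient one) computes the parity of each permutation by counting inversions and compares the result against NumPy's determinant:

```python
import numpy as np
from itertools import permutations

def det_by_permutations(A):
    """Determinant via the permutation-sum formula:
    det A = sum over permutations sigma of sign(sigma) * prod_i A[i, sigma(i)]."""
    n = A.shape[0]
    total = 0.0
    for sigma in permutations(range(n)):
        # Parity: count inversions; even permutations get +1, odd get -1.
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if sigma[i] > sigma[j])
        sign = -1.0 if inversions % 2 else 1.0
        prod = 1.0
        for i in range(n):
            prod *= A[i, sigma[i]]  # one element from each row, distinct columns
        total += sign * prod
    return total

M = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])
assert np.isclose(det_by_permutations(M), np.linalg.det(M))
```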

Note that the above proposition applies in particular to diagonal matrices.

- $\det I=1$.
- $\det A$ is a linear function of the $j$-th column vector $\bb v=(A_{1j},\ldots,A_{nj})$, assuming the other columns are held fixed.
- If $A'$ is obtained from $A$ by interchanging two columns then $\det A=-\det A'$.
- If $A$ has two equal columns then $\det A=0$.
- $\det (A)=\det( A^{\top})$.
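
Each of the properties above can be checked on a concrete matrix. The sketch below (using a random matrix as an illustrative choice) verifies them numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# det I = 1.
assert np.isclose(np.linalg.det(np.eye(4)), 1.0)

# Linearity in a single column: scaling column 0 by c scales det by c.
C = A.copy()
C[:, 0] *= 3.0
assert np.isclose(np.linalg.det(C), 3.0 * np.linalg.det(A))

# Interchanging two columns flips the sign of the determinant.
A_swapped = A[:, [1, 0, 2, 3]]
assert np.isclose(np.linalg.det(A_swapped), -np.linalg.det(A))

# Two equal columns force a zero determinant.
B = A.copy()
B[:, 1] = B[:, 0]
assert np.isclose(np.linalg.det(B), 0.0)

# det A = det A^T.
assert np.isclose(np.linalg.det(A), np.linalg.det(A.T))
```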

Substituting the definition of the determinant in the equation above, we see that $f(\lambda)$ is indeed a polynomial function in $\lambda$.

Recall from the previous section that for $(\lambda,\bb v)$ to be an eigenvalue-eigenvector pair of $A$, the matrix $(A-\lambda I)$ must be singular. Combining this with the proposition above, we get that the eigenvalues are the roots of the characteristic polynomial: \[f(\lambda)=\det(\lambda I-A)=0.\] This observation leads to a simple procedure for finding the eigenvalues of a given square matrix $A$ by finding the roots of $f(\lambda)$ (either analytically or numerically). Once the eigenvalues $\lambda_1,\ldots,\lambda_k$ are known, we can obtain the eigenvectors by solving the linear equations \[(A-\lambda_i I){\bb v}^{(i)}=\bb 0,\qquad i=1,\ldots,k.\]
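
This procedure can be sketched in code: obtain the coefficients of the characteristic polynomial, find its roots, then solve $(A-\lambda_i I)\bb v^{(i)}=\bb 0$ for each root by extracting a null-space vector (here via the SVD; the matrix is an illustrative choice):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.poly on a square matrix returns the coefficients of det(lambda I - A);
# here f(lambda) = lambda^2 - 4 lambda + 3, with roots 3 and 1.
coeffs = np.poly(A)
eigenvalues = np.roots(coeffs)

# For each eigenvalue, solve (A - lambda_i I) v = 0 for a nonzero v by
# taking the right singular vector of the (near-)zero singular value.
for lam in eigenvalues:
    M = A - lam * np.eye(2)
    _, _, Vt = np.linalg.svd(M)
    v = Vt[-1]
    assert np.allclose(A @ v, lam * v)
```

Note that root-finding on the characteristic polynomial is numerically fragile for large $n$; production code uses dedicated eigenvalue routines such as `np.linalg.eig` instead.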

Since the eigenvalues $\lambda_1,\ldots,\lambda_n$ are the roots of the characteristic polynomial $f(\lambda)$, we can write it as the following product \begin{align} f(\lambda) = \det(\lambda I-A) = \prod_{i=1}^{n} (\lambda-\lambda_i). \end{align} Such a factorization holds for any monic polynomial $f(x)=\prod_i(x-x_i)$ with roots $x_1,\ldots,x_n$. For example, $f(x)=x^2-3x+2=(x-1)(x-2)$ has roots $1$ and $2$.
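
The factorization can be spot-checked numerically: evaluating $\det(\lambda I-A)$ and $\prod_i(\lambda-\lambda_i)$ at arbitrary test points should give the same value (a random matrix and test points chosen for illustration; the eigenvalues of a general real matrix may be complex, so the product is taken over complex values):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
eigs = np.linalg.eigvals(A)  # possibly complex, in conjugate pairs

# Check det(lambda I - A) = prod_i (lambda - lambda_i) at a few test points.
for lam in [0.0, 1.5, -2.0]:
    lhs = np.linalg.det(lam * np.eye(3) - A)
    rhs = np.prod(lam - eigs)
    assert np.isclose(lhs, rhs)
```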

Since the $i$-th diagonal element of $AA^{\top}$ is the sum of squares of the $i$-th row of $A$, and the $j$-th diagonal element of $A^{\top}A$ is the sum of squares of the $j$-th column of $A$, we have \[ \|A\|_F = \sqrt{\trace(A^{\top}A)}=\sqrt{\trace(AA^{\top})}.\]
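
Both trace expressions agree with the entrywise definition of the Frobenius norm, including for rectangular matrices; the sketch below checks this on a random $3\times 5$ matrix (an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))  # works for rectangular matrices too

# Definition: square root of the sum of squared entries.
frob = np.sqrt((A ** 2).sum())

assert np.isclose(frob, np.sqrt(np.trace(A.T @ A)))  # trace of 5x5
assert np.isclose(frob, np.sqrt(np.trace(A @ A.T)))  # trace of 3x3
assert np.isclose(frob, np.linalg.norm(A, 'fro'))
```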

Substituting the definition of the determinant in $\det(\lambda I-A)$, we see that the only terms of power $\lambda^{n-1}$ arise from the product of the diagonal terms $\prod_{i=1}^{n} (\lambda-A_{ii})$. More specifically, there are $n$ terms containing the power $\lambda^{n-1}$ in the determinant expansion: $-A_{11}\lambda^{n-1},\ldots,-A_{nn}\lambda^{n-1}$. Collecting these terms, we get that the coefficient associated with $\lambda^{n-1}$ in the characteristic polynomial is $-\trace(A)=-\sum_{i=1}^{n} A_{ii}$. Comparing this to the coefficient of $\lambda^{n-1}$ in the equation above, which is $-\sum_i\lambda_i$, we get that $\trace (A)=\sum_i\lambda_i$.
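
The identity $\trace(A)=\sum_i\lambda_i$ is easy to verify numerically; for a general real matrix the eigenvalues may be complex, but they come in conjugate pairs, so their sum is real (random matrix chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))

# trace(A) equals the sum of the eigenvalues; complex eigenvalues of a
# real matrix occur in conjugate pairs, so the sum has zero imaginary part.
eigs = np.linalg.eigvals(A)
assert np.isclose(np.trace(A), eigs.sum().real)
```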

The proposition below is one of the central results in linear algebra. A proof is available in most linear algebra textbooks.

Recall that every linear transformation $T$ is realized by a matrix multiplication operation: $T(\bb x)=A\bb x$. If $A$ is orthogonal, the mapping $\bb x\mapsto A\bb x$ may be interpreted as a geometric rotation and/or reflection, and the mapping $\bb x\mapsto A^{\top}\bb x= A^{-1}\bb x$ is the inverse rotation and/or reflection. If $A$ is diagonal, the mapping $\bb x\mapsto A\bb x$ may be interpreted as stretching some dimensions and compressing others. Applying the spectral decomposition to a symmetric $A$, we get a decomposition of $A$ as a product of three matrices $U \diag(\bb\lambda) U^{\top}$. This implies that the linear transformation $T(\bb x)=A\bb x$ can be viewed as a sequence of three linear transformations: the first being a rotation or reflection, the second being a scaling of the dimensions, and the third being another rotation or reflection.
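
The rotate-scale-rotate interpretation can be demonstrated directly: compute the spectral decomposition of a symmetric matrix (here via `np.linalg.eigh`; the matrix is an illustrative choice) and apply the three factors one at a time:

```python
import numpy as np

# A symmetric matrix (illustrative choice).
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# Spectral decomposition: A = U diag(lambda) U^T with orthogonal U.
lam, U = np.linalg.eigh(A)
assert np.allclose(U @ np.diag(lam) @ U.T, A)
assert np.allclose(U.T @ U, np.eye(2))  # U is orthogonal

# Applying A to x = rotate/reflect (U^T x), scale each coordinate (lam * .),
# then rotate/reflect back (U @ .).
x = np.array([1.0, -2.0])
assert np.allclose(A @ x, U @ (lam * (U.T @ x)))
```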