Probability
The Analysis of Data, volume 1
Linear Algebra: Rank
$
\def\P{\mathsf{\sf P}}
\def\E{\mathsf{\sf E}}
\def\Var{\mathsf{\sf Var}}
\def\Cov{\mathsf{\sf Cov}}
\def\std{\mathsf{\sf std}}
\def\Cor{\mathsf{\sf Cor}}
\def\R{\mathbb{R}}
\def\c{\,|\,}
\def\defeq{\stackrel{\tiny\text{def}}{=}}
\def\bb{\boldsymbol}
\def\diag{\mathsf{\sf diag}}
\def\col{\mathsf{\sf col}}
\def\row{\mathsf{\sf row}}
\def\rank{\mathsf{\sf rank}}
\def\nulll{\mathsf{\sf null}}
$
C.2. Rank
Definition C.2.1.
The linear space spanned by the vectors $S=\{{\bb v}^{(1)},\ldots, {\bb v}^{(n)}\}$ is the set of all linear combinations of vectors in $S$. A basis of a linear space $S$ is any set of linearly independent vectors that spans it. The dimension $\dim S$ of a linear space $S$ is the number of vectors in a basis of $S$ (all bases of $S$ have the same size, so the dimension is well defined).
Example C.2.1.
The space $\R^n$ is spanned by the standard basis ${\bb e}^{(i)}, i=1,\ldots,n$ from Example C.1.4. Since the standard basis vectors are linearly independent, they are a basis for $\R^n$ in the sense of the previous definition. Another possible basis for $\R^n$ is $\{{\bb u}^{(i)}: i=1,\ldots,n\}$, where ${\bb u}^{(i)}$ is defined by $u^{(i)}_j=1$ if $j\leq i$ and 0 otherwise.
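As a quick numerical sanity check (a sketch, not part of the text; `rank` below is a hypothetical helper that counts pivots under exact Gaussian elimination), one can verify for $n=3$ that the vectors ${\bb u}^{(i)}$ are linearly independent and hence form a basis of $\R^3$:

```python
from fractions import Fraction

def rank(M):
    """Count pivots of M (a list of integer rows) via exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

# Columns are u^(1), u^(2), u^(3), where u^(i)_j = 1 if j <= i and 0 otherwise.
U = [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]]
assert rank(U) == 3  # three pivots: the three columns are linearly independent
```

Since $U$ is triangular with a nonzero diagonal, the three pivots are immediate, confirming linear independence.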
Definition C.2.2.
The column space of a matrix $A\in\R^{n\times m}$ is the space $\col(A)$ spanned by the columns of $A$ or equivalently \[\col(A)=\{A\bb v: \bb v\in\R^{m\times 1}\}\subset \R^n.\] We refer to $\dim\, \col(A)$ as the rank of the column space of $A$. The row space of $A\in\R^{n\times m}$ is the space $\row(A)$ spanned by the rows of $A$. The null space of $A\in\R^{n\times m}$ is \[\nulll(A)=\{\bb v:A\bb v=\bb 0\}\subset \R^m.\]
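To make the definitions concrete, here is a small illustrative sketch (the `rank` helper is an assumption, not from the text; row reduction counts pivots and so computes $\dim\row(A)$, which by the proposition below also equals $\dim\col(A)$):

```python
from fractions import Fraction

def rank(M):
    """Count pivots of M (a list of integer rows) via exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

A = [[1, 2, 3],
     [2, 4, 6],   # twice the first row, so the rows span a 2-dimensional space
     [1, 0, 1]]
assert rank(A) == 2

# v lies in null(A): A v = 0, i.e., every row of A is orthogonal to v.
v = [-1, -1, 1]
assert all(sum(a_ij * v_j for a_ij, v_j in zip(row, v)) == 0 for row in A)
```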
Proposition C.2.1.
For any matrix $A$, \[ \dim\col(A)=\dim\row(A).\]
We denote that number as $\rank(A)$.
Proof.
Consider a matrix $A$ with $\dim\col(A)=r$ and arrange $r$ vectors spanning $\col(A)$ as the columns of a matrix $C$. Since each column of $A$ is a linear combination of the columns of $C$, we have $A=CR$ for some matrix $R$ with $r$ rows. Every row of $A=CR$ is a linear combination of the rows of $R$, so the row space of $A$ is contained in the row space of $R$, whose dimension is at most $r$. This shows $\dim\row(A)\leq\dim\col(A)$; the reverse inequality is obtained by applying the same argument to $A^{\top}$.
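The proposition can be checked numerically on a small example (a sketch; `rank` is an assumed pivot-counting helper, not from the text): row-reducing $A$ measures $\dim\row(A)$, while row-reducing $A^{\top}$ measures $\dim\col(A)$.

```python
from fractions import Fraction

def rank(M):
    """Count pivots of M (a list of integer rows) via exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def T(M):
    """Transpose: rows of T(M) are the columns of M."""
    return [list(col) for col in zip(*M)]

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
assert rank(A) == rank(T(A)) == 2  # dim row(A) = dim col(A)
```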
Definition C.2.3.
A matrix $A\in\R^{n\times m}$ has full rank if $\rank(A)=\min(n,m)$. Otherwise, the matrix has low rank.
Proposition C.2.2.
\[ \rank(AB)\leq \min (\rank A,\rank B).\]
Proof.
Since the columns of $AB$ are linear combinations of the columns of $A$, $\rank(AB)\leq \rank(A)$. Similarly, since the rows of $AB$ are linear combinations of the rows of $B$, $\rank(AB)\leq \rank(B)$.
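A numerical illustration of the inequality (a sketch; `rank` and `matmul` are assumed helpers, not from the text): multiplying two rank-2 matrices below produces a product whose rank cannot exceed 2.

```python
from fractions import Fraction

def rank(M):
    """Count pivots of M (a list of integer rows) via exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def matmul(A, B):
    """Matrix product: entry (i,j) is row i of A dotted with column j of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 0, 0],
     [0, 1, 0],
     [1, 1, 0]]          # rank 2: third row is the sum of the first two
B = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]          # rank 2: row3 - row2 = row2 - row1
AB = matmul(A, B)
assert rank(AB) <= min(rank(A), rank(B))
```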
Corollary C.2.1.
For non-singular matrices $P,Q$ we have
\[ \rank(PAQ) = \rank(A).\]
Proof.
Applying the above proposition, we have
\begin{align*}
\rank(PAQ) &\leq \rank(AQ)\leq \rank (A),\\
\rank(A) &= \rank(IAI)=\rank(P^{-1}PAQQ^{-1})\leq \rank(PAQ).
\end{align*}
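The corollary can be illustrated numerically (a sketch; `rank` and `matmul` are assumed helpers, not from the text): multiplying a rank-one $A$ on both sides by invertible matrices leaves the rank unchanged.

```python
from fractions import Fraction

def rank(M):
    """Count pivots of M (a list of integer rows) via exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def matmul(A, B):
    """Matrix product: entry (i,j) is row i of A dotted with column j of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

P = [[1, 1], [0, 1]]   # non-singular (determinant 1)
Q = [[2, 0], [1, 1]]   # non-singular (determinant 2)
A = [[1, 2], [2, 4]]   # rank 1: second row is twice the first
PAQ = matmul(matmul(P, A), Q)
assert rank(PAQ) == rank(A) == 1
```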
Proposition C.2.3.
For a matrix $A\in\mathbb{R}^{m\times n}$,
\[\rank(A)+\dim(\nulll(A))=n.\]
Proof.
For $s=\dim(\nulll(A))$, let ${\bb \alpha}^{(i)}, i=1,\ldots,s$ be a basis for $\nulll(A)$, and extend it with vectors ${\bb\beta}^{(i)}, i=1,\ldots,n-s$, so that the combined set forms a basis of $\R^n$. Given $\bb v\in\col(A)$, we have
\begin{align*}
\bb v&=A\bb x= A \left(\sum_{i=1}^s a_i {\bb\alpha}^{(i)} + \sum_{i=1}^{n-s} b_i {\bb\beta}^{(i)}\right)=\sum_{i=1}^{n-s} b_i A {\bb \beta}^{(i)}.
\end{align*}
(The second equality above follows from representing $\bb x$ as a linear combination of the basis composed by ${\bb \alpha}^{(i)}, i=1,\ldots,s$ and ${\bb\beta}^{(i)}$ $i=1,\ldots,n-s$.)
We thus have that every vector in $\col(A)$ can be written as a linear combination of the $n-s$ vectors ${\bb\gamma}^{(i)}=A{\bb\beta}^{(i)}$, which shows that $\rank(A)\leq n-\dim(\nulll(A))$. If $\sum_{i=1}^{n-s} c_i {\bb\gamma}^{(i)}=\bb 0$, then $A\sum_{i=1}^{n-s} c_i {\bb\beta}^{(i)}=\bb 0$, implying $\sum_{i=1}^{n-s} c_i {\bb\beta}^{(i)}\in \nulll(A)$. Since every vector in $\nulll(A)$ is a linear combination of the ${\bb\alpha}^{(i)}$, a nonzero choice of $c_1,\ldots,c_{n-s}$ would contradict the linear independence of the combined basis; hence $c_1=\cdots=c_{n-s}=0$, and the vectors ${\bb\gamma}^{(i)}$, $i=1,\ldots,n-s$ are linearly independent. Since $\col(A)$ is spanned by $n-s$ linearly independent vectors, $\rank(A)=n-s$.
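The rank-nullity relation can be illustrated on a small example (a sketch using an assumed pivot-counting `rank` helper): for the $3\times 3$ matrix below, $\rank(A)=2$ and $\nulll(A)$ is spanned by a single vector, so $\rank(A)+\dim(\nulll(A))=2+1=3=n$.

```python
from fractions import Fraction

def rank(M):
    """Count pivots of M (a list of integer rows) via exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

n = 3
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
v = [-1, -1, 1]  # A v = 0, so v is in null(A)
assert all(sum(a_ij * v_j for a_ij, v_j in zip(row, v)) == 0 for row in A)
assert rank(A) + 1 == n  # dim null(A) = n - rank(A) = 1, spanned by v
```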
Proposition C.2.4.
\[\rank(A)=\rank(A^{\top})=\rank(AA^{\top})=\rank(A^{\top}A).\]
Proof.
Since $A\bb x=\bb 0$ implies $A^{\top}A\bb x=\bb 0$, and since $A^{\top}A\bb x=\bb 0$ implies $\bb x^{\top}A^{\top}A\bb x=\|A\bb x\|^2=0$ and therefore $A\bb x=\bb 0$, we have $\nulll(A)=\nulll(A^{\top}A)$. Since both matrices have the same number of columns, the previous proposition implies that $\rank(A)=\rank(A^{\top}A)$. Applying the same argument to $A^{\top}$ shows that $\rank(A^{\top})=\rank(AA^{\top})$. We conclude the proof by noting that $\rank(A)=\rank(A^{\top})$, as implied by Proposition C.2.1.
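Proposition C.2.4 is also easy to check numerically (a sketch; `rank`, `matmul`, and `T` are assumed helpers, not from the text): all four ranks agree on a sample rank-2 matrix.

```python
from fractions import Fraction

def rank(M):
    """Count pivots of M (a list of integer rows) via exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        piv = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if piv is None:
            continue  # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def matmul(A, B):
    """Matrix product: entry (i,j) is row i of A dotted with column j of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def T(M):
    """Transpose: rows of T(M) are the columns of M."""
    return [list(col) for col in zip(*M)]

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
ranks = [rank(A), rank(T(A)), rank(matmul(A, T(A))), rank(matmul(T(A), A))]
assert ranks == [2, 2, 2, 2]  # rank(A) = rank(A^T) = rank(A A^T) = rank(A^T A)
```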