Linear Algebra: Rank

Probability

The Analysis of Data, volume 1

0
- Front Matter
- 0.1: Contents
- 0.2: Preface
1
2
- Random Variables
- 2.1: Basic Definitions
- 2.2: Functions of RVs
- 2.3: Expectation and Variance
- 2.4: Moments and MGF
- 2.5: RVs and Measure Theory
- 2.6: Notes
- 2.7: Exercises
3
4
5
- Important Vectors
- 5.1: Multinomial Vectors
- 5.2: Gaussian Vectors
- 5.3: Dirichlet Vectors
- 5.4: Mixture Vectors
- 5.5: Exponential Family
- 5.6: Notes
- 5.7: Exercises
6
- Random Processes
- 6.1: Basic Definitions
- 6.2: Marginals
- 6.3: Moments
- 6.4: Random Walk
- 6.5: Processes and Measure
- 6.6: Borell-Cantelli and Zero-One
- 6.7: Notes
- 6.8: Exercises
7
- Important RPs
- 7.1: Markov Chains
- 7.2: Poisson Process
- 7.3: Gaussian Process
- 7.4: Notes
- 7.5: Exercises
8
A
- Set Theory
- A.1: Basic Definition
- A.2: Functions
- A.3: Cardinality
- A.4: Limits of Sets
- A.5: Notes
- A.6: Exercises
B
- Metric Spaces
- B.1: Basic Definitions
- B.2: Limits
- B.3: Continuity
- B.4: Euclidean Space
- B.5: Growth of Functions
- B.6: Notes
- B.7: Exercises
C
- Linear Algebra
- C.1: Basic Definitions
- C.2: Rank
- C.3: Eigenvalues and Determinant
- C.4: Semidefinite Matrices
- C.5: SVD
- C.6: Notes
- C.7: Exercises
D
- Differentiation
- D.1: Scalar Differentiation
- D.2: Power and Taylor Series
- D.3: Notes
- D.4: Exercises
E
- Measure Theory
- E.1: Sigma Algebras
- E.2: Measure Function
- E.3: Extension Theorem
- E.4: Independence
- E.5: Important Measures
- E.6: Measurable Functions
- E.7: Notes
F

$ \def\P{\mathsf{\sf P}} \def\E{\mathsf{\sf E}} \def\Var{\mathsf{\sf Var}} \def\Cov{\mathsf{\sf Cov}} \def\std{\mathsf{\sf std}} \def\Cor{\mathsf{\sf Cor}} \def\R{\mathbb{R}} \def\c{\,|\,} \def\defeq{\stackrel{\tiny\text{def}}{=}} \def\bb{\boldsymbol} \def\diag{\mathsf{\sf diag}} \def\col{\mathsf{\sf col}} \def\row{\mathsf{\sf row}} \def\rank{\mathsf{\sf rank}} \def\nulll{\mathsf{\sf null}} $

C.2. Rank

Definition C.2.1. The linear space spanned by the vectors $S=\{{\bb v}^{(1)},\ldots, {\bb v}^{(n)}\}$ is the set of all linear combinations of vectors in $S$. A basis of a linear space $S$ is any set of linearly independent vectors that span it. The dimension $\dim S$ of a linear space $S$ is the size of its basis.

Example C.2.1. The space $\R^n$ is spanned by the standard basis ${\bb e}^{(i)}, i=1,\ldots,n$ from Example C.1.4. Since the standard basis vectors are linearly independent, they are a basis for $\R^n$ in the sense of the previous definition. Another possible basis for $\R^n$ is $\{{\bb u}^{(i)}: i=1,\ldots,n\}$, where ${\bb u}^{(i)}$ is defined by $u^{(i)}_j=1$ if $j\leq i$ and 0 otherwise.

Definition C.2.2. The column space of a matrix $A\in\R^{n\times m}$ is the space $\col(A)$ spanned by the columns of $A$ or equivalently \[\col(A)=\{A\bb v: \bb v\in\R^{m\times 1}\}\subset \R^n.\] We refer to $\dim\, \col(A)$ as the rank of the column space of $A$. The row space of $A\in\R^{n\times m}$ is the space $\row(A)$ spanned by the rows of $A$. The null space of $A\in\R^{n\times m}$ is \[\nulll(A)=\{\bb v:A\bb v=\bb 0\}\subset \R^m.\]

Proposition C.2.1. For any matrix $A$, \[ \dim\col(A)=\dim\row(A).\] We denote that number as $\rank(A)$.

Proof. Consider a matrix $A$ with $\dim\col(A)=r$ and arrange the $r$ vectors spanning $\col(A)$ as columns of a matrix $C$. Since the columns of $A$ are linear combination of the columns of $C$ we have $A=CR$ for some matrix $R$ with $r$ rows. Every row of $A=CR$ is a linear combination of the rows of $R$ and thus the row space of $A$ is a subset of the row space of $R$ whose dimension is bounded by $r$. The reverse inequality is obtained by applying the same argument to $A^{\top}$.

Definition C.2.3. A matrix $A\in\R^{n\times m}$ has full rank is a matrix for which $\rank(A)=\min(n,m)$. Otherwise the matrix has low rank.

Proposition C.2.2. \[ \rank(AB)\leq \min (\rank A,\rank B).\]

Proof. Since the columns of $AB$ are linear combinations of the columns of $A$, $\rank(AB)\leq \rank(A)$. Similarly, since the rows of $AB$ are linear combinations of the rows of $B$, $\rank(AB)\leq \rank(B)$.

Proof. For non-singular matrices $P,Q$ we have \[ \rank(PAQ) = \rank(A).\]

Proof. Applying the above proposition, we have \begin{align*} \rank(PAQ) &\leq \rank(AQ)\leq \rank (A),\\ \rank(A) &= \rank(IAI)=\rank(P^{-1}PAQQ^{-1})\leq \rank(PAQ). \end{align*}

Proposition C.2.3. For a matrix $A\in\mathbb{R}^{m\times n}$, \[\rank(A)+\dim(\nulll(A))=n.\]

Proof. For $s=\dim(\nulll(A))$, let ${\bb \alpha}^{(i)}, i=1,\ldots,s$ be a basis for $\nulll(A)$, and extend it with the vectors ${\bb\beta}^{(i)}$ $i=1,\ldots,n-s$ that together span $\R^n$. Given $\bb v\in\col(A)$, we have \begin{align*} \bb v&=A\bb x= A \left(\sum_{i=1}^s a_i {\bb\alpha}^{(i)} + \sum_{i=1}^{n-s} b_i {\bb\beta}^{(i)}\right)=\sum_{i=1}^{n-s} b_i A {\bb \beta}^{(i)}. \end{align*} (The second equality above follows from representing $\bb x$ as a linear combination of the basis composed by ${\bb \alpha}^{(i)}, i=1,\ldots,s$ and ${\bb\beta}^{(i)}$ $i=1,\ldots,n-s$.) We thus have that every vector in $\col(A)$ can be written as a linear combination of $n-s$ vectors ${\bb\gamma}^{(i)}=A{\bb\beta}^{(i)}$. This shows that $\rank(A)\leq n-\dim(\nulll(A))$. Assuming that $\sum_{i=1}^{n-s} c_i {\bb\gamma}^{(i)}=0$ we have $A\sum_{i=1}^{n-s} c_i {\bb\beta}^{(i)}=0$ implying $\sum_{i=1}^{n-s} c_i {\bb\beta}^{(i)}\in \nulll(A).$ But by construction of ${\bb\beta}^{(i)}$ this is possible only if $c_1=\cdots=c_{n-s}=0$ which implies that ${\bb\gamma}^{(i)}$, $i=1,\ldots,n-s$ are linearly independent. Since the column space of $A$ is spanned by $n-s$ linearly independent vectors, $\rank(A)=n-s$.

Proposition C.2.4. \[\rank(A)=\rank(A^{\top})=\rank(AA^{\top})=\rank(A^{\top}A).\]

Proof. Since $A\bb x=0$ implies $A^{\top}A\bb x=0$, and since $A^{\top}A\bb x$ implies $\bb xA^{\top}A\bb x=0$, which implies $A\bb x=0$, we have $\nulll(A)=\nulll(A^{\top}A)$. Since both matrices have the same number of columns, the previous proposition implies that $\rank(A)=\rank(A^{\top}A)$. Applying the same argument to $A^{\top}$ shows that $\rank(A^{\top})=\rank(AA^{\top})$. We conclude the proof by noting that $\rank(A)=\rank(A^{\top})$ as implied by Proposition C.2.1.