Important Random Variables: The t Distribution

Probability

The Analysis of Data, volume 1


$ \def\P{\mathsf{\sf P}} \def\E{\mathsf{\sf E}} \def\Var{\mathsf{\sf Var}} \def\Cov{\mathsf{\sf Cov}} \def\std{\mathsf{\sf std}} \def\Cor{\mathsf{\sf Cor}} \def\R{\mathbb{R}} \def\c{\,|\,} \def\bb{\boldsymbol} \def\diag{\mathsf{\sf diag}} $

3.11. The t Distribution

A $t_{\nu}$ RV, where $\nu > 0$ is a parameter called the degrees of freedom, has the following pdf \begin{align*} f_X(x) &= \frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} \left(1+\frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad x\in\R, \nu>0. \end{align*}

It can be shown that if $Z\sim N(0,1)$ and $W\sim \chi^2_{\nu}$ are two independent RVs, then \[\frac{Z}{\sqrt{W/\nu}} \sim t_{\nu}.\]
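The representation above can be checked by simulation. The following is a minimal sketch rather than a proof: the choice $\nu=5$, the sample size, the seed, and the variable names are all illustrative assumptions. It draws independent $Z$ and $W$, forms the ratio, and compares a few empirical quantiles with the exact $t_{\nu}$ quantiles returned by `qt`.

```r
set.seed(1)                      # arbitrary seed, for reproducibility
nu = 5                           # degrees of freedom (illustrative choice)
n = 1e5                          # number of simulated draws

Z = rnorm(n)                     # Z ~ N(0,1)
W = rchisq(n, df = nu)           # W ~ chi-squared with nu dof, drawn independently of Z
X = Z / sqrt(W / nu)             # the ratio, which should follow t with nu dof

# Compare empirical quantiles of X with the exact t quantiles from qt()
probs = c(0.1, 0.25, 0.5, 0.75, 0.9)
comparison = data.frame(prob = probs,
                        simulated = unname(quantile(X, probs)),
                        exact = qt(probs, df = nu))
print(comparison)
```

The two quantile columns should agree up to simulation noise, which shrinks as `n` grows.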

The $t$ distribution with $\nu=1$ is called the Cauchy distribution. Its pdf is \[ f_X(x) = \frac{1}{\pi(1+x^2)}. \] The $t$ distribution with $\nu=2$ has the following pdf \[ f_X(x) = \frac{1}{(2+x^2)^{3/2}}.\] As $\nu$ increases, the polynomial decay of the tails becomes faster. In the limit $\nu\to\infty$ the $t$ distribution pdf converges to the pdf of a $N(0,1)$ RV.
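The Cauchy special case and the Gaussian limit can be verified numerically. The sketch below is one possible check, not part of the original text: the helper `t_pdf`, the grid, and the list of $\nu$ values are hypothetical choices. It evaluates the general pdf formula on the log scale (using `lgamma` to avoid overflow for large $\nu$), compares it with the Cauchy pdf at $\nu=1$, and measures the largest gap from the $N(0,1)$ pdf as $\nu$ grows.

```r
# General t pdf written directly from the formula; lgamma() avoids overflow for large nu
t_pdf = function(x, nu) {
    log_const = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    exp(log_const - (nu + 1) / 2 * log1p(x^2 / nu))
}

x = seq(-6, 6, length.out = 200)

# nu = 1: the formula should reduce to the Cauchy pdf 1 / (pi * (1 + x^2))
cauchy_gap = max(abs(t_pdf(x, 1) - 1 / (pi * (1 + x^2))))

# Largest gap between the t pdf and the N(0,1) pdf on the grid, for growing nu
nus = c(1, 10, 100, 1000)
normal_gap = sapply(nus, function(nu) max(abs(t_pdf(x, nu) - dnorm(x))))

print(cauchy_gap)
print(data.frame(nu = nus, normal_gap = normal_gap))
```

The first number should be zero up to rounding error, and the `normal_gap` column should shrink as `nu` increases.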

The main use of the $t$ distribution is in statistical tests and confidence intervals. It is also used to model heavy-tailed distributions. The pdf of a $t_{\nu}$ RV decays polynomially, and therefore qualitatively slower than the pdf of any Gaussian RV, regardless of the $\nu$ parameter. In fact, the decay of the $t_{\nu}$ pdf is so slow that for $\nu \leq 1$ the expectation does not exist (the integral diverges). Similarly, the variance of $t_{\nu}$ is $\infty$ for $1 < \nu\leq 2$ and is undefined for $\nu\leq 1$. For $\nu > 2$ the variance is finite and equals $\nu/(\nu-2)$.
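The thresholds on $\nu$ can be traced to the tail of the pdf. The following is a sketch of the argument rather than a full proof; the symbols $c_{\nu}$, $C_{\nu}$, and $a$ are shorthand introduced here. Writing $c_{\nu}$ for the normalizing constant of the pdf, for large $|x|$ we have \[ f_X(x) = c_{\nu}\left(1+\frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}} \approx c_{\nu}\,\nu^{\frac{\nu+1}{2}}\, |x|^{-(\nu+1)}. \] Setting $C_{\nu} = c_{\nu}\,\nu^{(\nu+1)/2}$, the tail contributions to the first and second moments behave, for a large cutoff $a>0$, as \begin{align*} \int_a^{\infty} x\, f_X(x)\, dx &\approx C_{\nu} \int_a^{\infty} x^{-\nu}\, dx, \\ \int_a^{\infty} x^2 f_X(x)\, dx &\approx C_{\nu} \int_a^{\infty} x^{1-\nu}\, dx. \end{align*} The first integral is finite if and only if $\nu > 1$, and the second if and only if $\nu > 2$, which matches the conditions for the existence of the expectation and for a finite variance.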

The R code below graphs the pdfs of a Gaussian RV and a $t$ RV. Both have symmetric bell-shaped pdfs, but the $t$ pdf decays more slowly in the tails and is lower at the center.

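library(ggplot2)  # provides ggplot(), geom_area(), geom_line()

# Evaluate the N(0,1) pdf and the t pdf (1.5 degrees of freedom) on a common grid
x = seq(-6, 6, length.out = 200)
R = data.frame(x = x, density = dnorm(x, 0, 1), tdensity = dt(x, 1.5))

# Gaussian pdf as a shaded area, t pdf as a thick line
P = ggplot(R, aes(x = x, y = density)) + geom_area(fill = "grey") +
    geom_line(aes(y = tdensity), lwd = 2) +
    labs(x = expression(x), y = expression(f[X](x)))
# ggtitle() replaces opts(title = ...), which was removed from ggplot2
P + ggtitle("Gaussian (shaded) and t-distribution (dof=1.5) densities")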
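To put numbers on the slower tail decay, one option is to compare two-sided tail probabilities. This is a minimal sketch using the base R functions `pnorm` and `pt`; the cutoffs and the column names are illustrative assumptions.

```r
# Two-sided tail probabilities P(|X| > cutoff) for N(0,1) and for t with 1.5 dof
cutoffs = c(2, 4, 6)
tails = data.frame(
    cutoff = cutoffs,
    gaussian = 2 * pnorm(-cutoffs),
    t_dist = 2 * pt(-cutoffs, df = 1.5)
)
# Ratio of t tail mass to Gaussian tail mass at each cutoff
tails$ratio = tails$t_dist / tails$gaussian
print(tails)
```

The ratio column grows with the cutoff, reflecting polynomial decay of the $t$ tails against the much faster decay of the Gaussian tails.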