Important Random Variables: The Gaussian Distribution

Probability

The Analysis of Data, volume 1


$ \def\P{\mathsf{\sf P}} \def\E{\mathsf{\sf E}} \def\Var{\mathsf{\sf Var}} \def\Cov{\mathsf{\sf Cov}} \def\std{\mathsf{\sf std}} \def\Cor{\mathsf{\sf Cor}} \def\R{\mathbb{R}} \def\c{\,|\,} \def\bb{\boldsymbol} \def\diag{\mathsf{\sf diag}} $

3.9. The Gaussian Distribution

The Gaussian or normal RV, $X\sim N(\mu,\sigma^2)$, where $\mu\in\mathbb{R}, \sigma^2 > 0$, has the pdf \[f_X(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).\] If $\mu=0,\sigma^2=1$, we refer to this RV as the standard normal or standard Gaussian RV. We sometimes attach the parameters $\mu,\sigma^2$ as subscripts, for example $f_{0,1}$ and $F_{0,1}$ denote the pdf and cdf of $X_{0,1}$, the standard normal RV. A common notation for $F_{0,1}(x)$ is $\Phi(x)$.
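As a quick sanity check, the pdf formula can be coded directly and compared with R's built-in dnorm. This is only a sketch: the helper name gaussian_pdf and the parameter values are illustrative choices, not part of the text. Note that dnorm and pnorm are parameterized by the standard deviation $\sigma$, not the variance $\sigma^2$.

```r
# Gaussian pdf written directly from the formula (hypothetical helper name)
gaussian_pdf = function(x, mu, sigma2) {
    exp(-(x - mu)^2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)
}

x = seq(-3, 5, length.out = 9)
mu = 1        # illustrative mean
sigma2 = 4    # illustrative variance

# dnorm expects the standard deviation, so pass sqrt(sigma2)
max_diff = max(abs(gaussian_pdf(x, mu, sigma2) - dnorm(x, mu, sqrt(sigma2))))

# Phi(x) = F_{0,1}(x) is available as pnorm(x); by symmetry Phi(0) = 1/2
phi_at_zero = pnorm(0)
```

Here max_diff should be zero up to floating-point rounding.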

To verify that the pdf integrates to 1, we use the change of variable $(x-\mu)/\sigma \mapsto y$ to get \begin{align*} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\exp\left(-y^2/2\right) \, dy \\ &= \frac{1}{\sqrt{2\pi}} \sqrt{2\pi}=1 \end{align*} where the first equality follows from Example F.2.3 and the second from Example F.6.1.
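The normalization can also be checked numerically. The sketch below uses R's integrate function with illustrative values $\mu=1$, $\sigma=2$ (assumptions made for the example only). A numerical check of this kind complements the analytic argument but does not replace it.

```r
mu = 1       # illustrative mean
sigma = 2    # illustrative standard deviation

# The N(mu, sigma^2) pdf, written as in the integral above
f = function(x) exp(-(x - mu)^2 / (2 * sigma^2)) / (sqrt(2 * pi) * sigma)

# Integral of the pdf over the real line; should be close to 1
total = integrate(f, -Inf, Inf)$value

# The Gaussian integral after the change of variable; should be close to sqrt(2 * pi)
gauss_integral = integrate(function(y) exp(-y^2 / 2), -Inf, Inf)$value
```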

Example F.2.3 (substitute $a=-\infty$ and $b=x$) shows the following relations between the cdfs and pdfs of the Gaussian distribution: \begin{align} \tag{*} F_{\mu,\sigma^2}(x) &= F_{0,1}\left(\frac{x-\mu}{\sigma}\right)=\Phi\left(\frac{x-\mu}{\sigma}\right),\\ f_{\mu,\sigma^2}(x) &= \frac{d}{d x} F_{\mu,\sigma^2}(x)=\frac{d}{dx} F_{0,1}\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{\sigma} f_{0,1}\left(\frac{x-\mu}{\sigma}\right).\notag \end{align} (The last equality follows from the chain rule (Proposition D.1.3).)
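Both relations can be checked against R's pnorm and dnorm. The values of $\mu$, $\sigma$, and the grid x below are arbitrary choices for illustration.

```r
mu = 1       # illustrative mean
sigma = 2    # illustrative standard deviation
x = seq(-5, 7, length.out = 25)

# F_{mu,sigma^2}(x) versus Phi((x - mu) / sigma)
cdf_diff = max(abs(pnorm(x, mu, sigma) - pnorm((x - mu) / sigma)))

# f_{mu,sigma^2}(x) versus f_{0,1}((x - mu) / sigma) / sigma
pdf_diff = max(abs(dnorm(x, mu, sigma) - dnorm((x - mu) / sigma) / sigma))
```

Both differences should vanish up to floating-point rounding.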

Proposition 3.9.1. \begin{align*} (X_{\mu,\sigma^2}-\mu)/\sigma & \sim N(0,1)\\ \sigma X_{0,1}+\mu & \sim N(\mu,\sigma^2). \end{align*}

Proof. Using Equation (*), we have $\P(X_{\mu,\sigma^2}\leq x)=\P(X_{0,1}\leq (x-\mu)/\sigma)=\P(\sigma X_{0,1}+\mu\leq x)$ for all $x$, so $\sigma X_{0,1}+\mu$ has the cdf of $N(\mu,\sigma^2)$, which proves the second statement. The first statement follows similarly.
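A small simulation illustrates both statements of the proposition. The seed, sample size, and parameter values are arbitrary assumptions for this sketch. Sample moments only suggest, and do not prove, the distributional claim.

```r
set.seed(1)
mu = 3       # illustrative mean
sigma = 2    # illustrative standard deviation
n = 1e5      # number of simulated draws

# Scale and shift standard normal draws: should behave like N(mu, sigma^2)
z = rnorm(n)
x = sigma * z + mu
sample_mean = mean(x)
sample_var = var(x)

# Standardize draws from N(mu, sigma^2): should behave like N(0, 1)
w = (rnorm(n, mu, sigma) - mu) / sigma
w_mean = mean(w)
w_sd = sd(w)
```

With this many draws, sample_mean and sample_var should be close to $\mu$ and $\sigma^2$, and w_mean and w_sd close to 0 and 1.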

Proposition 3.9.2. The mgf of $X\sim N(\mu,\sigma^2)$ is \[m(t)=\exp(\mu t+t^2\sigma^2/2).\]

Proof. Using the change of variables $u=r-\mu$, \begin{align*} m(t) &= \frac{1}{\sqrt{2\pi\sigma^2}} \int_{\mathbb{R}} e^{tr} e^{-(r-\mu)^2/(2\sigma^2)} \, dr = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{\mathbb{R}} e^{t(u+\mu)} e^{-u^2/(2\sigma^2)} \, du \\ &= \frac{1}{\sqrt{2\pi\sigma^2}} e^{t\mu} \int_{\mathbb{R}} e^{tu} e^{-u^2/(2\sigma^2)} \, du = \frac{1}{\sqrt{2\pi\sigma^2}} e^{t\mu} \int_{\mathbb{R}} e^{-(u^2-2\sigma^2 tu)/(2\sigma^2)} \, du\\ &= \frac{1}{\sqrt{2\pi\sigma^2}} e^{t\mu}e^{t^2\sigma^2/2} \int_{\mathbb{R}} e^{-(u^2-2\sigma^2 tu+t^2\sigma^4)/(2\sigma^2)} \, du\\ &= \frac{1}{\sqrt{2\pi\sigma^2}} e^{t\mu} e^{t^2\sigma^2/2} \int_{\mathbb{R}} e^{-(u-t\sigma^2)^2/(2\sigma^2)} \, du =e^{t\mu+t^2\sigma^2/2}, \end{align*} where the third line completes the square in the exponent and the last equality follows from the fact that the Gaussian density integrates to 1.
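The closed form can be compared with a direct numerical evaluation of $\E(e^{tX})$. The function names mgf_closed and mgf_numeric, the parameter values, and the grid of $t$ values are assumptions made for this sketch.

```r
mu = 1       # illustrative mean
sigma = 2    # illustrative standard deviation

# Closed-form mgf of N(mu, sigma^2)
mgf_closed = function(t) exp(mu * t + t^2 * sigma^2 / 2)

# E(exp(tX)) computed by numerical integration against the pdf
mgf_numeric = function(t) {
    integrate(function(r) exp(t * r) * dnorm(r, mu, sigma), -Inf, Inf)$value
}

t_values = c(-0.5, 0, 0.25, 0.5)
# Relative error between the numerical and the closed-form values
rel_err = sapply(t_values, function(t) abs(mgf_numeric(t) / mgf_closed(t) - 1))
```

The entries of rel_err should be small, limited by the accuracy of the numerical integration.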

Differentiating the mgf $m(t)=\exp(\mu t+t^2\sigma^2/2)$, we have for $X\sim N(\mu,\sigma^2)$, \begin{align*} \E(X) & = m'(0) = (\mu+\sigma^2\cdot 0)\cdot 1=\mu\\ \Var(X) &= \E(X^2)-(\E (X))^2 = m''(0)-\mu^2= \sigma^2\cdot 1 + (\mu+0)^2\cdot 1 - \mu^2 = \sigma^2. \end{align*}
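For reference, the values at $t=0$ used above come from the first two derivatives of the mgf $m(t)=\exp(\mu t+t^2\sigma^2/2)$, obtained with the chain and product rules: \begin{align*} m'(t) &= (\mu+\sigma^2 t)\, m(t),\\ m''(t) &= \sigma^2\, m(t) + (\mu+\sigma^2 t)^2\, m(t). \end{align*} Substituting $t=0$ and $m(0)=1$ gives $m'(0)=\mu$ and $m''(0)=\sigma^2+\mu^2$.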

The Gaussian distribution is one of the most important distributions. The central limit theorem (see Section 8.9 for details) informally states that a suitably normalized sum of many independent random variables is approximately Gaussian. As a consequence, quantities that are sums of a large number of independent random factors are approximately Gaussian. Examples include performance measures like IQ test results and physiological measurements like height.
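The informal statement can be illustrated with a short simulation. The sketch below sums independent uniform $(0,1)$ RVs, standardizes the sums, and compares the fraction falling within one standard deviation of the mean with the Gaussian value $\Phi(1)-\Phi(-1)$. The choice of uniform summands, the seed, and the sample sizes are assumptions for illustration only.

```r
set.seed(1)
n_terms = 50    # number of summands in each sum
n_rep = 1e4     # number of simulated sums

# Each row holds n_terms independent uniform(0,1) draws; sum across each row
sums = rowSums(matrix(runif(n_terms * n_rep), nrow = n_rep))

# A uniform(0,1) RV has mean 1/2 and variance 1/12, so standardize accordingly
z = (sums - n_terms / 2) / sqrt(n_terms / 12)

# Fraction of standardized sums in [-1, 1] versus the Gaussian probability
frac_within_1 = mean(abs(z) <= 1)
target = pnorm(1) - pnorm(-1)
```

The two quantities should agree up to simulation noise.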

The R code below graphs the pdf of the Gaussian distribution for different values of $\sigma$ (with $\mu=0$), followed by the pdf and cdf for several combinations of $\mu$ and $\sigma$.

library(ggplot2)  # provides qplot
x = seq(-8, 8, length.out = 100)
# pdf of N(0, s^2), where s is the standard deviation
gf = function(x, s) exp(-x^2/(2 * s^2))/(sqrt(2 * pi) * s)
# stack() returns one row per (value, curve label) pair
R = stack(list(`$\\sigma=1$` = gf(x, 1), `$\\sigma=2$` = gf(x, 2),
    `$\\sigma=3$` = gf(x, 3), `$\\sigma=4$` = gf(x, 4)))
names(R) = c("y", "sigma")
# below, x is recycled four times
R$x = x
qplot(x, y, color = sigma, main = "Gaussian pdf functions",
    lty = sigma, lwd = I(2), geom = "line", xlab = "$x$",
    ylab = "$f_X(x)$", data = R)

x = seq(-4, 4, length.out = 100)
# dnorm and pnorm take the standard deviation as their third argument,
# so dnorm(x, 0, 2) is the pdf of N(0, 4) in the N(mu, sigma^2) notation
y1 = dnorm(x, 0, 1)
y2 = dnorm(x, 1, 1)
y3 = dnorm(x, 0, 2)
y4 = pnorm(x, 0, 1)
y5 = pnorm(x, 1, 1)
y6 = pnorm(x, 0, 2)
# rows 1-300 hold the three pdfs, rows 301-600 the three cdfs
D = data.frame(probability = c(y1, y2, y3, y4, y5,
    y6))
# below, x is recycled six times
D$x = x
D$parameter[1:100] = "$N(0,1)$"
D$parameter[301:400] = "$N(0,1)$"
D$parameter[101:200] = "$N(1,1)$"
D$parameter[401:500] = "$N(1,1)$"
D$parameter[201:300] = "$N(0,4)$"
D$parameter[501:600] = "$N(0,4)$"
D$type[1:300] = "$f_X(x)$"
D$type[301:600] = "$F_X(x)$"
qplot(x, probability, data = D, main = "Gaussian pdf and cdf functions",
    geom = "area", facets = parameter ~ type, xlab = "$x$",
    ylab = "")