7.3. Gaussian Processes

Definition 7.3.1. A Gaussian process is a continuous-time continuous-state RP \{X_{\bb t}:\bb t\in J\subset \R^d\} in which all finite-dimensional marginal distributions follow a multivariate Gaussian distribution \bb {X}=(X_{{\bb t}_1},\ldots,X_{{\bb t}_k}) \sim N((m({\bb t}_1),\ldots,m({\bb t}_k)), C({\bb t}_1,\ldots,{\bb t}_k)), \qquad {\bb t}_1,\ldots,{\bb t}_k\in\R^d.

In order to invoke Kolmogorov's extension theorem and ensure the above definition is rigorous, we need the covariance function C to (i) produce legitimate covariance matrices (symmetric positive definite matrices), and (ii) produce consistent finite dimensional marginals, in the sense of Definition 6.2.1. The second requirement of consistency of the finite dimensional marginals can be ensured by defining the covariance function to be [C({\bb t}_1,\ldots,{\bb t}_k)]_{ij} = C'({\bb t}_i,{\bb t}_j) for some function C':\R^d\times \R^d\to\R.

Example 7.3.1. Denoting by T the matrix whose rows are formed by {\bb t}_i, i=1,\ldots,k, we consider a Gaussian process defined by the following covariance function \begin{align*} [C({\bb t}_1,\ldots,{\bb t}_k)]_{ij} &= \alpha \langle {\bb t}_i,{\bb t}_j\rangle + \beta\delta_{ij}, \end{align*} where \alpha,\beta>0, and \delta_{ij}=1 if i=j and 0 otherwise, or in matrix notation \begin{align*} C({\bb t}_1,\ldots,{\bb t}_k) &= \alpha TT^{\top}+\beta I. \end{align*} The matrix C({\bb t}_1,\ldots,{\bb t}_k) is symmetric, and is also positive definite: for all \bb v\neq \bb 0, \begin{align*} {\bb v}^{\top}C({\bb t}_1,\ldots,{\bb t}_k)\bb v&=\alpha {\bb v}^{\top} T T^{\top}\bb v + \beta{\bb v}^{\top} I{\bb v} = \alpha \|T^{\top}\bb v\|_2^2 + \beta \|\bb v\|_2^2 > 0, \end{align*} which makes it a suitable covariance function.
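
As an informal numerical check (a sketch only; the specific points forming the rows of T and the values of \alpha,\beta below are arbitrary choices for illustration), the R code below forms the matrix \alpha TT^{\top}+\beta I for a few random points in \R^2 and verifies that it is symmetric with strictly positive eigenvalues.

set.seed(1)
alpha = 2
beta = 0.5
k = 5
Tmat = matrix(rnorm(k * 2), nrow = k)          # rows of Tmat are t_1, ..., t_k in R^2
C = alpha * Tmat %*% t(Tmat) + beta * diag(k)  # covariance matrix alpha T T' + beta I
isSymmetric(C)                                 # TRUE
eigen(C, symmetric = TRUE)$values              # all eigenvalues are strictly positive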

As \alpha,\beta increase, the variance increases, making the process more likely to vary further from the expectation function m. If \beta \ll \alpha, the first term in the covariance function is dominant, implying that the covariance behaves like \alpha\langle {\bb t}_1, {\bb t}_2\rangle. If d=1, this means that X_t,X_s are positively correlated if t, s have the same sign and negatively correlated if t, s have opposing signs. Furthermore, the degree of correlation in absolute value increases as |t|,|s| increase. If \alpha \ll \beta, the second term in the covariance function is dominant, implying that the process values at two distinct times {\bb t}_1\neq {\bb t}_2 are uncorrelated and therefore (being jointly Gaussian) independent. The resulting process has marginals X_{\bb t}, \bb t \in \R^d, that are independent N(m(\bb t), \beta) random variables.

The R code below graphs samples from this random process in four cases: first term dominant with low variance, second term dominant with low variance, first term dominant with high variance, and second term dominant with high variance. In all cases the expectation function is a sinusoidal curve m(t)=\sin(t). Note that in general, the sample paths are not smooth curves.

library(mvtnorm)  # provides rmvnorm for sampling from a multivariate Gaussian
X = seq(-3, 3, length.out = 100)     # grid of times t
m = sin(X)                           # expectation function m(t) = sin(t)
n = 30                               # number of sample paths per plot
I = diag(1, nrow = 100, ncol = 100)
# alpha = 1/100, beta = 1/1000: first term dominant, low variance
Y1 = rmvnorm(n, m, X %o% X/100 + I/1000)
par(cex.main = 1.5, cex.axis = 1.2, cex.lab = 1.5)
plot(X, Y1[1, ], type = "n", xlab = "$t$", ylab = "$X_t$",
    main = "first term dominant with low variance")
for (s in 1:n) lines(X, Y1[s, ], col = rgb(0, 0, 0,
    alpha = 0.5))
# alpha = 1/1000, beta = 1/100: second term dominant, low variance
Y2 = rmvnorm(n, m, X %o% X/1000 + I/100)
plot(X, Y2[1, ], type = "n", xlab = "$t$", ylab = "$X_t$",
    main = "second term dominant with low variance")
for (s in 1:n) lines(X, Y2[s, ], col = rgb(0, 0, 0,
    alpha = 0.5))
# alpha = 1, beta = 1/10: first term dominant, high variance
Y3 = rmvnorm(n, m, X %o% X + I/10)
plot(X, Y3[1, ], type = "n", xlab = "$t$", ylab = "$X_t$",
    main = "first term dominant with high variance")
for (s in 1:n) lines(X, Y3[s, ], col = rgb(0, 0, 0,
    alpha = 0.5))
# alpha = 1, beta = 10: second term dominant, high variance
Y4 = rmvnorm(n, m, X %o% X + I * 10)
plot(X, Y4[1, ], type = "n", xlab = "$t$", ylab = "$X_t$",
    main = "second term dominant with high variance")
for (s in 1:n) lines(X, Y4[s, ], col = rgb(0, 0, 0,
    alpha = 0.5))

The Wiener Process

An interesting special case of the Gaussian process is the Wiener process.

Definition 7.3.2. The Wiener process \mathcal{Z}=\{Z_t,t\geq 0\} is a Gaussian process with \begin{align*} Z_0&=0, & \text{with probability }1,\\ m_{\mathcal{Z}}(t)&=0, & \forall t\geq 0,\\ C_{\mathcal{Z}}(t,s)&=\alpha\min(t,s), & \alpha>0, \quad \forall s,t\geq 0. \end{align*}
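
Since the definition specifies the expectation and covariance functions directly, sample paths can be graphed with the same approach as in the earlier R code (a sketch only, assuming the mvtnorm package; the time grid below starts slightly above 0 so that the covariance matrix is nonsingular, and \alpha=1 is an arbitrary choice).

library(mvtnorm)
alpha = 1
t = seq(0.01, 3, length.out = 200)    # time grid (t = 0 excluded to keep C nonsingular)
C = alpha * outer(t, t, pmin)         # C(t, s) = alpha * min(t, s)
n = 30
Z = rmvnorm(n, rep(0, length(t)), C)  # each row is one discretized sample path
plot(t, Z[1, ], type = "n", xlab = "$t$", ylab = "$Z_t$",
    main = "Wiener process sample paths")
for (s in 1:n) lines(t, Z[s, ], col = rgb(0, 0, 0,
    alpha = 0.5))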

We motivate the Wiener process by deriving it informally as the limit of a discrete-time discrete-valued random walk process with step size h.

Let Z_t=\sum_{i=1}^n h Y_i, for some h>0, where Y_i=2 X_i-1 and X_i\iid \text{Ber}(1/2), i\in \mathbb{N}. Since \E(Y_i)=\E(2X_i-1) = 2\cdot 1/2-1=0 and \Var(Y_i)=\Var(2X_i-1)=4\Var(X_i)=4(1/2)(1-1/2)=1, we have \E(hY_i) = 0 and \Var(hY_i) = h^2.

We set h=\sqrt{\alpha \delta} and let \delta\to 0, h\to 0, or alternatively h=\sqrt{\alpha}\sqrt{t/n} and let n\to\infty. This yields \begin{align*} Z_t & = \lim_{n\to\infty} \sqrt{\alpha}\sqrt{t/n} \sum_{i=1}^n Y_i = \sqrt{\alpha t} \lim_{n\to\infty} \frac{\sum_{i=1}^n Y_i}{\sqrt{n}}. \end{align*} Intuitively, the segment [0,t] is divided into n steps of size \delta=t/n each, and Z_t measures the position of the symmetric random walk at time t, or equivalently after n steps of size \delta. By the central limit theorem (see Section 8.9), \lim_{n\to\infty} \sum_{i=1}^n Y_i / \sqrt{n} approaches a Gaussian RV with mean zero and variance one. Since Z_t approaches \sqrt{\alpha t} times a N(0,1) RV, we have as n\to\infty that Z_t\sim N(0,\alpha t).
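
The R code below gives an informal simulation of this limit (a sketch only; \alpha=1, t=1, and the numbers of steps and replications are arbitrary choices). It draws many realizations of the scaled random walk and compares the empirical mean and variance of Z_t with the limiting N(0,\alpha t) values.

set.seed(1)
alpha = 1
t = 1
n = 1000                                  # number of random walk steps
nsim = 5000                               # number of simulated walks
h = sqrt(alpha) * sqrt(t / n)             # step size h = sqrt(alpha * delta)
Y = matrix(2 * rbinom(nsim * n, 1, 0.5) - 1, nrow = nsim)  # Y_i = 2 X_i - 1, X_i ~ Ber(1/2)
Zt = h * rowSums(Y)                       # Z_t for each simulated walk
c(mean(Zt), var(Zt))                      # approximately 0 and alpha * t = 1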

To show that the limit of the random walk above corresponds to Definition 7.3.2, it remains to show that (i) \mathcal{Z} is a Gaussian process and (ii) it has a zero expectation function and its auto-covariance function is C_{\mathcal{Z}}(t,s)=\alpha\min(t,s).

The increments of \mathcal{Z} over disjoint intervals are independent, and increments over intervals of equal length have the same distribution. For example, Z_5-Z_3 is independent of Z_2-Z_0=Z_2 and has the same distribution. Since f_{Z_{t_1},\ldots,Z_{t_k}}(\bb{z})= f_{Z_{t_1}}(z_1) f_{Z_{t_2}-Z_{t_1}}(z_2-z_1)\cdots f_{Z_{t_{k}}-Z_{t_{k-1}}}(z_k-z_{k-1}), the pdf of a finite dimensional marginal f_{Z_{t_1},\ldots,Z_{t_k}} is a product of independent univariate Gaussian pdfs in the increment variables, and therefore (Z_{t_1},\ldots,Z_{t_k}), being a linear transformation of independent Gaussian increments, has a multivariate Gaussian distribution.

The mean function is m_{\mathcal{Z}}(t)=0 since Z_t is a sum of zero mean random variables. The auto-covariance function is \begin{align*} C_{\mathcal{Z}}(t,s) &=\E\left(\left(\lim \sum_{i=1}^{ t/\delta } h Y_i\right)\left( \lim \sum_{j=1}^{s/\delta } h Y_j\right)\right) = \lim h^2 \E\left(\sum_{i=1}^{ t/\delta }\sum_{j=1}^{ s/\delta }Y_iY_j \right) \\ &= \lim h^2 \sum_{i=1}^{ t/\delta }\sum_{j=1}^{ s/\delta } \E(Y_iY_j) = \lim h^2 \sum_{i=1}^{\min(s,t)/\delta } \E(Y_i^2)\\ & =\lim h^2 \min(s,t)\Var(Y_1)/\delta = \lim \alpha\delta\min(s,t)/\delta \\ &=\alpha \min(s,t), \end{align*} where in the third equality above we used the fact that \E(Y_iY_j)=0 for i\neq j (since Y_i,Y_j are independent RVs with mean 0).
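
As an informal check of this computation (a sketch only; the values of \alpha, \delta, and the times t=0.5, s=1.5 below are arbitrary choices), the R code below simulates many random walk paths and compares the empirical covariance of (Z_t, Z_s) with \alpha\min(t,s).

set.seed(1)
alpha = 1
delta = 0.001                             # time step
h = sqrt(alpha * delta)                   # step size
nsim = 5000                               # number of simulated paths
nt = 500                                  # t = nt * delta = 0.5
ns = 1500                                 # s = ns * delta = 1.5
Y = matrix(2 * rbinom(nsim * ns, 1, 0.5) - 1, nrow = nsim)  # steps Y_i for each path
Zt = h * rowSums(Y[, 1:nt])               # Z_t: sum of the first nt steps
Zs = h * rowSums(Y)                       # Z_s: sum of all ns steps
cov(Zt, Zs)                               # approximately alpha * min(t, s) = 0.5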