To use the change of variables technique, we need to compute the determinant of the matrix above. Recalling that adding one of the rows of the matrix to another row does not change the determinant, we can add the first k−1 rows to the last row, and then compute the determinant of the resulting matrix. That matrix is an upper diagonal matrix (all elements below the diagonal are zero) with diagonal elements Z,Z,⋯,Z,1. Since the determinant of an upper diagonal matrix is the product of the diagonal terms (all other terms in the product defining the determinant expression zero out), the determinant is Zk−1. It follows that the pdf of the transformed variables (X1,…,Xk−1,Z) is fX1,…,Xk−1,Z(x1,…,xk−1,z)=zk−1k∏i=1e−xiz(xiz)αi−1/Γ(αi)=z∑αi−1e−1⋅zk∏i=1xαi−1i/Γ(αi).
We integrate the above expression with respect to z in order to obtain the pdf of X1,…,Xk−1. fX1,…,Xk−1(x1,…,xk−1)=k∏i=1xαi−1i/Γ(αi)⋅∫∞0z∑αi−1e−1⋅zdz=k∏i=1xαi−1i/Γ(αi)⋅Γ(k∑i=1αi) where the last equation follows from the fact that Gamma pdf integrates to one (see Section 3.10). Since ∑ki=1Xi=1, the RV Xk must equal 1−∑k−1i=1Xi. The pdf above therefore remains the same if we add the RV Xk taking value 1−∑k−1i=1Xi (the pdf becomes zero for all other values of Xk).
Recall that the simplex Pk represents the set of all probability functions on a sample space Ω of size k. The Dirichlet distribution may thus be viewed as a distribution over distributions over a finite sample space.
It is straightforward to generalize the proposition above and show that whenever we aggregate some of the Gamma Wi RVs in categories, and then form the normalization construction that defines the Dirichlet random vector, we obtain a Dirichlet random vector with a parameter vector that is similarly aggregated.
The R code below graphs the pdf of several Dirichlet distributions. Dark shades correspond to high probability and light shades correspond to low probabilities. Note that the pdf assigns non-zero values only to the simplex Pk (in this case k=2).
library(MCMCpack) trellis.par.set(list(fontsize = list(text = 18))) R = expand.grid(seq(0.001, 0.999, length.out = 80), seq(0.001, 0.999, length.out = 80)) R[, 3] = 1 - R[, 1] - R[, 2] names(R) = c("x1", "x2", "x3") R = R[R[, 3] > 0, ] for (i in 1:nrow(R)) R$p[i] = ddirichlet(R[i, 1:3], alpha = c(3, 3, 3)) cloud(x3 ~ x1 * x2, data = R, col = gray(1 - R$p/(max(R$p) + 3) - 0.1), pch = 16, cex = 1, main = "Dirichlet (3,3,3) pdf ", scales = list(arrows = FALSE), xlab = "$x_1$", ylab = "$x_2$", zlab = "$x_3$")
for (i in 1:nrow(R)) R$p[i] = ddirichlet(R[i, 1:3], alpha = c(10, 10, 10)) cloud(x3 ~ x1 * x2, data = R, col = gray(1 - R$p/(max(R$p) + 3) - 0.1), pch = 16, cex = 0.8, main = "Dirichlet (10,10,10) pdf ", scales = list(arrows = FALSE), xlab = "$x_1$", ylab = "$x_2$", zlab = "$x_3$")
for (i in 1:nrow(R)) R$p[i] = ddirichlet(R[i, 1:3], alpha = c(1, 3, 6)) cloud(x3 ~ x1 * x2, data = R, col = gray(1 - R$p/(max(R$p) + 3) - 0.1), pch = 16, cex = 0.8, main = "Dirichlet (1,3,6) pdf", scales = list(arrows = FALSE), xlab = "$x_1$", ylab = "$x_2$", zlab = "$x_3$")