
Probability

The Analysis of Data, volume 1


8.2. Relationships Between the Modes of Convergences

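Proposition 8.2.1. $\mathbf{X}^{(n)}\xrightarrow{\text{as}}\mathbf{X}$ if and only if $P(\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon \text{ i.o.})=0$ for all $\epsilon>0$.
Proof. The event $\{\mathbf{X}^{(n)}\to\mathbf{X}\}^c$ equals the event $\bigcup_{\epsilon>0}\{\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon \text{ i.o.}\}$. Since these events grow as $\epsilon$ decreases, the union may be taken over the countable set $\epsilon=1/k$, $k=1,2,\ldots$, and therefore it has probability zero if and only if each event in it has probability zero. It follows that $\mathbf{X}^{(n)}\xrightarrow{\text{as}}\mathbf{X}$ is equivalent to $P(\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon \text{ i.o.})=0$ for all $\epsilon>0$.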
See Section A.4 for the definition of i.o. or infinitely often.
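Proposition 8.2.2.
\begin{align*}
\mathbf{X}^{(n)}\xrightarrow{\text{as}}\mathbf{X} \quad &\text{implies} \quad \mathbf{X}^{(n)}\xrightarrow{p}\mathbf{X},\\
\mathbf{X}^{(n)}\xrightarrow{p}\mathbf{X} \quad &\text{implies} \quad \mathbf{X}^{(n)}\rightsquigarrow\mathbf{X}.
\end{align*}
Proof. We first show that convergence with probability 1 implies convergence in probability. Since $\{\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon \text{ i.o.}\}=\limsup_n\{\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon\}$, the convergence $\mathbf{X}^{(n)}\xrightarrow{\text{as}}\mathbf{X}$ implies
\begin{align*}
0\leq\limsup_{n} P(\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon)\leq P\Big(\limsup_{n}\{\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon\}\Big)=0,
\end{align*}
so that $\lim_{n} P(\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon)$ exists and equals 0. The inequality $\limsup_n P(A_n)\leq P(\limsup_n A_n)$ follows from Fatou's lemma (see Chapter F), in its limsup form for functions dominated by an integrable function, applied to the sequence of indicator functions $f_n=I_{A_n}$ (which are bounded by 1) and the measure $\mu=P$. The last equality follows from the previous proposition.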

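We next show that convergence in probability implies convergence in distribution. Denoting $\mathbf{1}=(1,\ldots,1)$, we have that if $\mathbf{X}^{(n)}\leq\mathbf{x}$ then either $\mathbf{X}\leq\mathbf{x}+\epsilon\mathbf{1}$, or $\|\mathbf{X}-\mathbf{X}^{(n)}\|>\epsilon$, or both (we interpret an inequality between two vectors as a sequence of inequalities for the corresponding components: $\mathbf{u}\leq\mathbf{v}$ means $u_i\leq v_i$, $i=1,\ldots,d$). Similarly, if $\mathbf{X}\leq\mathbf{x}-\epsilon\mathbf{1}$ then either $\mathbf{X}^{(n)}\leq\mathbf{x}$ or $\|\mathbf{X}-\mathbf{X}^{(n)}\|>\epsilon$, or both. This implies that for all $n$,
\begin{align*}
F_{\mathbf{X}^{(n)}}(\mathbf{x}) &\leq P(\mathbf{X}\leq\mathbf{x}+\epsilon\mathbf{1})+P(\|\mathbf{X}-\mathbf{X}^{(n)}\|>\epsilon) = F_{\mathbf{X}}(\mathbf{x}+\epsilon\mathbf{1})+P(\|\mathbf{X}-\mathbf{X}^{(n)}\|>\epsilon),\\
F_{\mathbf{X}}(\mathbf{x}-\epsilon\mathbf{1}) &\leq P(\mathbf{X}^{(n)}\leq\mathbf{x})+P(\|\mathbf{X}-\mathbf{X}^{(n)}\|>\epsilon) = F_{\mathbf{X}^{(n)}}(\mathbf{x})+P(\|\mathbf{X}-\mathbf{X}^{(n)}\|>\epsilon).
\end{align*}
Since $\mathbf{X}^{(n)}\xrightarrow{p}\mathbf{X}$, we have $P(\|\mathbf{X}-\mathbf{X}^{(n)}\|>\epsilon)\to 0$, and letting $n\to\infty$ in the two inequalities above we get
\begin{align*}
F_{\mathbf{X}}(\mathbf{x}-\epsilon\mathbf{1})\leq\liminf_{n} F_{\mathbf{X}^{(n)}}(\mathbf{x})\leq\limsup_{n} F_{\mathbf{X}^{(n)}}(\mathbf{x})\leq F_{\mathbf{X}}(\mathbf{x}+\epsilon\mathbf{1}).
\end{align*}
The left hand side and the right hand side converge to $F_{\mathbf{X}}(\mathbf{x})$ as $\epsilon\to 0$ at points $\mathbf{x}$ where $F_{\mathbf{X}}$ is continuous, implying that $F_{\mathbf{X}^{(n)}}(\mathbf{x})\to F_{\mathbf{X}}(\mathbf{x})$ at such points.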
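
The two bounds used in this argument hold for any joint distribution of the pair, including the empirical distribution of a finite sample, so they can be checked numerically. The following is a minimal sketch in one dimension. The function name `sandwich`, the choice $X^{(n)}=X+Z/n$ with standard normal $X$ and $Z$, and the values of `x0` and `eps` are illustrative assumptions, not part of the text. Relative frequencies are kept as exact fractions so that the comparison is not affected by rounding of the probabilities themselves.

```python
import random
from fractions import Fraction

def sandwich(xs, xns, x0, eps):
    """Return (lower, middle, upper) for the two bounds at the point x0.

    xs and xns are paired draws of X and X^(n); every probability is a
    relative frequency over the pairs, stored as an exact Fraction.
    """
    m = len(xs)
    cdf = lambda sample, t: Fraction(sum(v <= t for v in sample), m)
    tail = Fraction(sum(abs(a - b) > eps for a, b in zip(xs, xns)), m)
    lower = cdf(xs, x0 - eps) - tail   # F_X(x - eps) - P(|X - X^(n)| > eps)
    middle = cdf(xns, x0)              # F_{X^(n)}(x)
    upper = cdf(xs, x0 + eps) + tail   # F_X(x + eps) + P(|X - X^(n)| > eps)
    return lower, middle, upper

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(2000)]
for n in (1, 10, 100):
    # Hypothetical sequence converging in probability: X^(n) = X + Z/n.
    xns = [x + random.gauss(0, 1) / n for x in xs]
    lower, middle, upper = sandwich(xs, xns, x0=0.3, eps=0.1)
    print(n, float(lower), float(middle), float(upper))
```

As $n$ grows, the tail term shrinks and the middle value is squeezed between the empirical cdf of $X$ at `x0 - eps` and at `x0 + eps`, which mirrors the limiting step of the proof.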

The following example shows that convergence in probability may occur even if convergence with probability one does not occur.

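Example 8.2.1. Consider $\Omega=[0,1]$ with $P$ being the uniform distribution over $\Omega$. The sequence of random variables $X^{(1)}=I_{(0,1/2]}$, $X^{(2)}=I_{(1/2,1]}$, $X^{(3)}=I_{(0,1/4]}$, $X^{(4)}=I_{(1/4,1/2]}$, $X^{(5)}=I_{(1/2,3/4]}$, and so on, does not converge with probability one to any limit (for every $\omega\in(0,1]$, $X^{(n)}(\omega)$ is a divergent sequence, since it takes each of the values 0 and 1 infinitely often). On the other hand, $X^{(n)}\xrightarrow{p}0$ since $P(|X^{(n)}|\geq\epsilon)\to 0$ for all $\epsilon>0$.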
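
The behavior in this example can be traced with a short script. The sketch below assumes that the sequence is indexed consecutively, $n=1,2,3,\ldots$, and continues block by block, with block $k$ consisting of the $2^k$ indicators of the intervals of length $2^{-k}$ taken from left to right. The function names `interval` and `X` are illustrative.

```python
from fractions import Fraction

def interval(n):
    """Return (a, b) such that X^(n) is the indicator of (a, b], for n >= 1."""
    k = 1       # block k holds the 2**k intervals of length 2**-k
    first = 1   # index of the first variable in block k
    while n >= first + 2**k:
        first += 2**k
        k += 1
    j = n - first  # position inside block k, 0 <= j < 2**k
    width = Fraction(1, 2**k)
    return j * width, (j + 1) * width

def X(n, omega):
    """Value of X^(n) at the sample point omega."""
    a, b = interval(n)
    return 1 if a < omega <= b else 0

omega = Fraction(1, 3)
# P(X^(n) = 1) is the interval length, which shrinks to zero.
for n in (1, 10, 100, 1000):
    a, b = interval(n)
    print(n, b - a)
# For a fixed omega, each block contributes exactly one index n with X^(n)(omega) = 1.
print(sum(X(n, omega) for n in range(1, 2047)))  # indices 1..2046 cover blocks 1..10
```

The interval lengths go to zero, which is the convergence in probability, while the count of ones along a fixed $\omega$ keeps growing by one per block, which is why the sequence $X^{(n)}(\omega)$ has no limit.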
Proposition 8.2.3. If $\mathbf{c}\in\mathbb{R}^d$ then $\mathbf{X}^{(n)}\rightsquigarrow\mathbf{c}$ if and only if $\mathbf{X}^{(n)}\xrightarrow{p}\mathbf{c}$.
Proof. It suffices to prove that convergence in distribution to a constant vector implies convergence in probability (convergence in probability always implies convergence in distribution). We prove the result below for two dimensions, $d=2$. The cases of other dimensions are similar.

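We have (see Figure 8.2.1 below for an illustration)
\begin{align*}
P(\|\mathbf{X}^{(n)}-\mathbf{c}\|\leq\sqrt{2}\,\epsilon) &= P(\|\mathbf{X}^{(n)}-\mathbf{c}\|^2\leq 2\epsilon^2)\\
&\geq P(\mathbf{c}-\epsilon(1,1)<\mathbf{X}^{(n)}\leq\mathbf{c}+\epsilon(1,1))\\
&= P(\mathbf{X}^{(n)}\leq\mathbf{c}+\epsilon(1,1)) - P(\mathbf{X}^{(n)}\leq\mathbf{c}+\epsilon(-1,1))\\
&\quad - P(\mathbf{X}^{(n)}\leq\mathbf{c}+\epsilon(1,-1)) + P(\mathbf{X}^{(n)}\leq\mathbf{c}+\epsilon(-1,-1)).
\end{align*}
In the inequality above, we used the fact that if $|a_1-b_1|\leq\epsilon$ and $|a_2-b_2|\leq\epsilon$ then $\|\mathbf{a}-\mathbf{b}\|^2\leq 2\epsilon^2$. In the second equality we used the principle of inclusion-exclusion (see Figure 8.2.1 for an illustration). If we have convergence in distribution $\mathbf{X}^{(n)}\rightsquigarrow\mathbf{c}$, then, since each of the four points $\mathbf{c}+\epsilon(\pm 1,\pm 1)$ is a continuity point of the cdf of the constant $\mathbf{c}$, the last expression converges to $1-0-0+0=1$. Hence $P(\|\mathbf{X}^{(n)}-\mathbf{c}\|\leq\sqrt{2}\,\epsilon)\to 1$ for all $\epsilon>0$, which implies convergence in probability $\mathbf{X}^{(n)}\xrightarrow{p}\mathbf{c}$.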


Figure 8.2.1: This figure illustrates the proof of Proposition 8.2.3. The white square region may be expressed as the region contained within the dashed rectangle minus the two shaded rectangles plus the intersection of the two shaded rectangles (since it was subtracted twice).
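
The inclusion-exclusion step in the proof of Proposition 8.2.3 can be checked on the empirical distribution of a finite set of points, for which the identity holds exactly. The sketch below is illustrative only: the function names `ecdf2` and `square_prob`, the center `c`, and the sample are assumptions made for the example.

```python
import random
from fractions import Fraction

def ecdf2(points, x):
    """Empirical joint cdf: fraction of points p with p <= x componentwise."""
    hits = sum(p[0] <= x[0] and p[1] <= x[1] for p in points)
    return Fraction(hits, len(points))

def square_prob(points, c, eps):
    """Return P(c - eps*(1,1) < X <= c + eps*(1,1)) computed in two ways."""
    lo = (c[0] - eps, c[1] - eps)
    hi = (c[0] + eps, c[1] + eps)
    # Direct count of the points falling in the half-open square.
    direct = Fraction(
        sum(lo[0] < p[0] <= hi[0] and lo[1] < p[1] <= hi[1] for p in points),
        len(points),
    )
    # Inclusion-exclusion over the four corners c + eps*(+-1, +-1).
    via_cdf = (ecdf2(points, (hi[0], hi[1])) - ecdf2(points, (lo[0], hi[1]))
               - ecdf2(points, (hi[0], lo[1])) + ecdf2(points, (lo[0], lo[1])))
    return direct, via_cdf

random.seed(0)
c = (1.0, -2.0)
for n in (1, 10, 100):
    # Hypothetical sequence concentrating at c: X^(n) = c + Z/n.
    pts = [(c[0] + random.gauss(0, 1) / n, c[1] + random.gauss(0, 1) / n)
           for _ in range(1000)]
    direct, via_cdf = square_prob(pts, c, eps=0.1)
    print(n, float(direct), float(via_cdf))
```

The two computed values agree for every sample, and as the points concentrate around `c` the mass of the square approaches 1, matching the limit used in the proof.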

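Proposition 8.2.4. The convergence $\mathbf{X}^{(n)}\xrightarrow{p}\mathbf{X}$ occurs if and only if every increasing sequence of natural numbers $n_1,n_2,\ldots\in\mathbb{N}$ has a subsequence $r_1,r_2,\ldots\in\{n_1,n_2,\ldots\}$ such that $\mathbf{X}^{(r_k)}\xrightarrow{\text{as}}\mathbf{X}$ as $k\to\infty$.
Proof. We assume that $\mathbf{X}^{(n)}\xrightarrow{p}\mathbf{X}$ and consider a sequence of positive numbers $\epsilon_i$ such that $\sum_i\epsilon_i<\infty$. For each $\epsilon_i$, we can find a natural number $n_i$ such that $P(\|\mathbf{X}^{(n)}-\mathbf{X}\|\geq\epsilon_i)<\epsilon_i$ for all $n\geq n_i$. We can assume without loss of generality that $n_1<n_2<n_3<\cdots$ (otherwise replace $n_i$ with $\max(n_1,n_2,\ldots,n_i)+i$).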

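Defining $A_i$ to be the event $\{\|\mathbf{X}^{(n_i)}-\mathbf{X}\|\geq\epsilon_i\}$, we have $\sum_i P(A_i)\leq\sum_i\epsilon_i<\infty$ and by the first Borel-Cantelli Lemma (Proposition 6.6.1) we have $P(A_i \text{ i.o.})=0$. Since $\lim_{k\to\infty}\epsilon_k=0$, this implies that for all $\epsilon>0$, $P(\{\|\mathbf{X}^{(n_i)}-\mathbf{X}\|\geq\epsilon\} \text{ i.o.})=0$, which by Proposition 8.2.1 implies that $\mathbf{X}^{(n_i)}\xrightarrow{\text{as}}\mathbf{X}$ as $i\to\infty$. We have thus shown that there exists a subsequence $n_1,n_2,\ldots$ of $1,2,\ldots$ along which convergence with probability 1 occurs.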

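Considering now an arbitrary increasing sequence $n_1,n_2,\ldots$ of natural numbers, we have $\mathbf{X}^{(n_i)}\xrightarrow{p}\mathbf{X}$ as $i\to\infty$, and repeating the above argument with $n_1,n_2,n_3,\ldots$ replacing $1,2,3,\ldots$ we can find a subsequence $r_1,r_2,r_3,\ldots$ of $n_1,n_2,n_3,\ldots$ along which $\mathbf{X}^{(r_i)}\xrightarrow{\text{as}}\mathbf{X}$ as $i\to\infty$.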

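To show the converse, we assume that $\mathbf{X}^{(n)}$ does not converge in probability to $\mathbf{X}$. Then there exist $\epsilon>0$ and $\delta>0$ such that $P(\|\mathbf{X}^{(k)}-\mathbf{X}\|\geq\epsilon)>\delta$ for infinitely many $k$, which we denote $n_1<n_2<\cdots$. Along any subsequence of $n_1,n_2,\ldots$ these probabilities remain above $\delta$, so no such subsequence converges in probability to $\mathbf{X}$. By Proposition 8.2.2, this implies that there exists no subsequence $r_1,r_2,\ldots$ of $n_1,n_2,\ldots$ along which $\mathbf{X}^{(r_i)}\xrightarrow{\text{as}}\mathbf{X}$.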
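
Proposition 8.2.4 can be illustrated with the sequence of Example 8.2.1, which converges to 0 in probability but not with probability one. The sketch below assumes the indexing in which the variable at index $n_k=2^k-1$ is $I_{(0,2^{-k}]}$ (consistent with $X^{(1)}=I_{(0,1/2]}$ and $X^{(3)}=I_{(0,1/4]}$), and the function names `x_sub` and `tail_mass` are illustrative. Along this subsequence the probabilities $P(|X^{(n_k)}|\geq\epsilon)=2^{-k}$ are summable, as in the Borel-Cantelli step of the proof.

```python
from fractions import Fraction

def x_sub(k, omega):
    """Value of X^(n_k)(omega), with n_k = 2**k - 1 and X^(n_k) = I_(0, 2**-k]."""
    return 1 if 0 < omega <= Fraction(1, 2**k) else 0

def tail_mass(k_max):
    """Sum over k = 1..k_max of P(|X^(n_k)| >= eps) = 2**-k, for 0 < eps <= 1."""
    return sum(Fraction(1, 2**k) for k in range(1, k_max + 1))

omega = Fraction(1, 3)
# Along the subsequence, the values at a fixed omega > 0 are eventually 0.
print([x_sub(k, omega) for k in range(1, 9)])
# The partial sums of the probabilities stay bounded by 1.
print(float(tail_mass(30)))
```

For any fixed $\omega>0$ the value is 0 as soon as $2^{-k}<\omega$, so the subsequence converges to 0 for every $\omega\in(0,1]$, that is, with probability one.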