This post is a set of reading notes on *Introduction to Probability*.
Convergence in Probability
- We can interpret the weak law of large numbers as stating that "$M_n$ converges to $\mu$". However, since $M_1, M_2, \ldots$ is a sequence of random variables, not a sequence of numbers, the meaning of convergence has to be made precise.
- Convergence in probability: let $Y_1, Y_2, \ldots$ be a sequence of random variables (not necessarily independent), and let $a$ be a real number. We say that the sequence $Y_n$ converges to $a$ in probability, if for every $\epsilon > 0$, we have
$$\lim_{n\rightarrow\infty}P(|Y_n-a|\geq\epsilon)=0$$
- Given this definition, the weak law of large numbers simply states that the sample mean converges in probability to the true mean $\mu$.
- More generally, the Chebyshev inequality implies that if all $Y_n$ have the same mean $\mu$, and $\mathrm{var}(Y_n)$ converges to 0, then $Y_n$ converges to $\mu$ in probability.
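- To spell out this step: fix $\epsilon>0$ and apply the Chebyshev inequality to $Y_n$,
$$P(|Y_n-\mu|\geq\epsilon)\leq\frac{\mathrm{var}(Y_n)}{\epsilon^2}\rightarrow0\quad\text{as }n\rightarrow\infty$$
which is exactly the definition of convergence in probability.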
- If the random variables $Y_1, Y_2, \ldots$ have a PMF or a PDF and converge in probability to $a$, then according to the above definition, almost all of the PMF or PDF of $Y_n$ is concentrated within $\epsilon$ of $a$ for large values of $n$. It is also instructive to rephrase the above definition as follows: for every $\epsilon > 0$, and for every $\delta > 0$, there exists some $n_0$ such that
$$P(|Y_n-a|\geq\epsilon)\leq\delta\qquad\text{for all }n\geq n_0$$
- If we refer to $\epsilon$ as the accuracy level, and $\delta$ as the confidence level, the definition takes the following intuitive form: for any given levels of accuracy and confidence, $Y_n$ will be equal to $a$, within these levels of accuracy and confidence, provided that $n$ is large enough.
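- As a concrete how-to for the sample mean $M_n$ of $n$ i.i.d. samples with variance $\sigma^2$: Chebyshev gives $P(|M_n-\mu|\geq\epsilon)\leq\sigma^2/(n\epsilon^2)$, so any $n_0\geq\sigma^2/(\delta\epsilon^2)$ meets both levels. The sketch below is my own illustration (the helper name `chebyshev_n0` and the example numbers are not from the book):

```python
import math

def chebyshev_n0(sigma2: float, eps: float, delta: float) -> int:
    """Smallest n guaranteed by Chebyshev so that P(|M_n - mu| >= eps) <= delta,
    where M_n is the sample mean of n i.i.d. variables with variance sigma2."""
    # Chebyshev: P(|M_n - mu| >= eps) <= sigma2 / (n * eps**2); force this <= delta.
    return math.ceil(sigma2 / (delta * eps**2))

# e.g. Uniform[0, 1] samples (sigma2 = 1/12), accuracy 0.1, confidence 0.05
print(chebyshev_n0(1 / 12, eps=0.1, delta=0.05))  # prints 167
```
- Note that this $n_0$ is only the Chebyshev guarantee; the true smallest $n_0$ can be much smaller.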
Example 5.6.
- Consider a sequence of independent random variables $X_n$ that are uniformly distributed in the interval $[0, 1]$, and let
$$Y_n=\min\{X_1,\ldots,X_n\}$$
- The sequence $Y_n$ is nonincreasing, and for every $\epsilon\in(0,1)$ we have
$$P(|Y_n-0|\geq\epsilon)=P(X_1\geq\epsilon,\ldots,X_n\geq\epsilon)=(1-\epsilon)^n$$
- In particular,
$$\lim_{n\rightarrow\infty}P(|Y_n-0|\geq\epsilon)=\lim_{n\rightarrow\infty}(1-\epsilon)^n=0$$
Since this is true for every $\epsilon > 0$, we conclude that $Y_n$ converges to zero, in probability.
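- A quick Monte Carlo check (a minimal sketch using NumPy; the seed, sample count, and choice of $\epsilon$ are arbitrary) matches the closed form $(1-\epsilon)^n$:

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.1
for n in [1, 10, 50]:
    # 100,000 independent realizations of Y_n = min(X_1, ..., X_n)
    y = rng.uniform(0.0, 1.0, size=(100_000, n)).min(axis=1)
    p_hat = np.mean(y >= eps)  # empirical P(|Y_n - 0| >= eps)
    print(f"n={n:3d}  estimate={p_hat:.4f}  exact={(1 - eps) ** n:.4f}")
```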
- One might be tempted to believe that if a sequence $Y_n$ converges to a number $a$, then $E[Y_n]$ must converge to $a$.
- The following example shows that this need not be the case, and illustrates some of the limitations of the notion of convergence in probability.
Example 5.8.
- Consider a sequence of discrete random variables $Y_n$ with the following distribution:
$$P(Y_n=y)=\begin{cases}1-\dfrac{1}{n}, & \text{if }y=0\\ \dfrac{1}{n}, & \text{if }y=n^2\\ 0, & \text{otherwise}\end{cases}$$
- For every $\epsilon > 0$, we have
$$\lim_{n\rightarrow\infty}P(|Y_n-0|\geq\epsilon)=\lim_{n\rightarrow\infty}\frac{1}{n}=0$$
and $Y_n$ converges to zero in probability.
- On the other hand,
$$E[Y_n] = n^2\cdot\frac{1}{n} = n$$
which goes to infinity as $n$ increases.
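- A simulation (my own sketch; the seed and sample count are arbitrary) makes the tension concrete: the frequency of $Y_n\neq0$ shrinks like $1/n$, while the sample mean stays near $n$:

```python
import numpy as np

rng = np.random.default_rng(1)
for n in [10, 100, 1000]:
    # Y_n equals n^2 with probability 1/n, and 0 otherwise
    y = np.where(rng.random(1_000_000) < 1 / n, n**2, 0)
    print(f"n={n:5d}  P(Y_n != 0) ~ {np.mean(y != 0):.4f}  sample mean ~ {y.mean():.1f}")
```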
Problem 5.
Let $X_1, X_2, \ldots$ be independent random variables that are uniformly distributed over $[-1, 1]$, and let
$$Y_n=X_1 X_2\cdots X_n$$
Show that the sequence $Y_1, Y_2, \ldots$ converges in probability to some limit, and identify the limit.
SOLUTION
- We have
$$E[Y_n] = E[X_1]\cdots E[X_n] = 0$$
Also,
$$\mathrm{var}(Y_n) = E[Y_n^2]= E[X_1^2]\cdots E[X_n^2]= \big(\mathrm{var}(X_1)\big)^n =\left(\frac{4}{12}\right)^n$$
(each $X_i$ has zero mean, so $E[X_i^2]=\mathrm{var}(X_i)=\frac{(1-(-1))^2}{12}=\frac{1}{3}$), and therefore $\mathrm{var}(Y_n)\rightarrow 0$. Since all $Y_n$ have 0 as a common mean, from Chebyshev's inequality it follows that $Y_n$ converges to 0 in probability.
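- A simulation of the product (my own sketch; the seed, sample count, and $\epsilon$ are arbitrary) shows the collapse toward 0:

```python
import numpy as np

rng = np.random.default_rng(2)
eps = 0.05
for n in [5, 20, 50]:
    # 100,000 realizations of Y_n = X_1 * X_2 * ... * X_n with X_i ~ Uniform[-1, 1]
    y = rng.uniform(-1.0, 1.0, size=(100_000, n)).prod(axis=1)
    print(f"n={n:3d}  P(|Y_n| >= {eps}) ~ {np.mean(np.abs(y) >= eps):.4f}")
```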
Problem 6.
Consider two sequences of random variables $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$, which converge in probability to some constants. Let $c$ be another constant. Show that $cX_n$, $X_n + Y_n$, $\max\{0, X_n\}$, $|X_n|$, and $X_nY_n$ all converge in probability to corresponding limits.
SOLUTION
- Let $x$ and $y$ be the limits of $X_n$ and $Y_n$, respectively. Fix some $\epsilon > 0$ and a constant $c$. If $c = 0$, then $cX_n$ equals zero for all $n$, and convergence trivially holds. If $c\neq0$, we observe that $P(|cX_n-cx|\geq \epsilon)=P(|X_n-x|\geq \epsilon/|c|)$, which converges to zero, thus establishing convergence in probability of $cX_n$.
- We note that
$$P(|X_n + Y_n-x-y|\geq\epsilon)\leq P(|X_n-x|\geq \epsilon/2)+P(|Y_n-y|\geq \epsilon/2)$$
because, by the triangle inequality, the event on the left implies that $|X_n-x|\geq\epsilon/2$ or $|Y_n-y|\geq\epsilon/2$. Therefore,
$$\lim_{n\rightarrow\infty}P(|X_n + Y_n-x-y|\geq\epsilon)\leq \lim_{n\rightarrow\infty}P(|X_n-x|\geq \epsilon/2)+\lim_{n\rightarrow\infty}P(|Y_n-y|\geq \epsilon/2)=0$$
- For $\max\{0, X_n\}$, we have
$$P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon)\leq P(|X_n-x|\geq\epsilon)$$
because $|\max\{0,a\}-\max\{0,b\}|\leq|a-b|$ for any scalars $a$ and $b$ (verify by checking the cases where $a$ and $b$ are both nonnegative, both nonpositive, or of opposite signs). Thus, we have
$$\lim_{n\rightarrow\infty}P(|\max\{0,X_n\}-\max\{0,x\}|\geq\epsilon)\leq\lim_{n\rightarrow\infty}P(|X_n-x|\geq\epsilon)=0$$
and $\max\{0,X_n\}$ converges to $\max\{0,x\}$ in probability.
- We have $|X_n| = \max\{0, X_n\}+\max\{0, -X_n\}$. Since $\max\{0, X_n\}$ and $\max\{0, -X_n\}$ converge in probability (the latter by combining the results for $cX_n$ with $c=-1$ and for $\max\{0,X_n\}$), it follows that their sum, $|X_n|$, converges to $\max\{0, x\}+\max\{0, -x\}=|x|$ in probability.
- Finally, we have
$$\begin{aligned}P(|X_nY_n-xy|\geq\epsilon)&=P(|(X_n-x)(Y_n-y)+xY_n+yX_n-2xy|\geq\epsilon) \\&\leq P(|(X_n-x)(Y_n-y)|\geq\epsilon/2)+P(|xY_n+yX_n-2xy|\geq\epsilon/2) \\&\leq P(|X_n-x|\geq\sqrt{\epsilon/2})+P(|Y_n-y|\geq\sqrt{\epsilon/2})+P(|xY_n+yX_n-2xy|\geq\epsilon/2)\end{aligned}$$
where the last step uses the union bound: if $|(X_n-x)(Y_n-y)|\geq\epsilon/2$, then $|X_n-x|\geq\sqrt{\epsilon/2}$ or $|Y_n-y|\geq\sqrt{\epsilon/2}$. The first two probabilities converge to 0 by assumption. Moreover, $xY_n$ and $yX_n$ converge to $xy$ in probability (by the result for $cX_n$), so their sum converges to $2xy$, and the last probability also converges to 0. We conclude that
$$\lim_{n\rightarrow\infty}P(|X_nY_n-xy|\geq\epsilon)=0$$
so $X_nY_n$ converges to $xy$ in probability.
Problem 7.
A sequence $X_n$ of random variables is said to converge to a number $c$ in the mean square, if
$$\lim_{n\rightarrow\infty}E[(X_n-c)^2]=0$$
- (a) Show that convergence in the mean square implies convergence in probability.
- (b) Give an example that shows that convergence in probability does not imply convergence in the mean square.
SOLUTION
- (a) Suppose that $X_n$ converges to $c$ in the mean square. Using the Markov inequality, we have
$$P(|X_n-c|\geq\epsilon)=P((X_n-c)^2\geq\epsilon^2)\leq\frac{E[(X_n-c)^2]}{\epsilon^2}$$
Taking the limit as $n\rightarrow\infty$, we obtain
$$\lim_{n\rightarrow\infty}P(|X_n-c|\geq\epsilon)=0$$
- (b) In Example 5.8, we have convergence in probability to 0, but $E[Y_n^2] = (n^2)^2\cdot\frac{1}{n} = n^3$, which diverges to infinity, so $Y_n$ does not converge in the mean square.