This post is a set of reading notes for *Introduction to Probability*.
The Weak Law of Large Numbers
- The weak law of large numbers asserts that the sample mean of a large number of independent identically distributed random variables is very likely to be close to the true mean.
- We consider a sequence $X_1, X_2, \ldots$ of independent identically distributed random variables with mean $\mu$ and variance $\sigma^2$, and define the sample mean by
$$M_n=\frac{X_1+\cdots+X_n}{n}$$
We have
$$E[M_n]=\mu,\qquad \mathrm{var}(M_n)=\frac{\sigma^2}{n}$$
- We apply the Chebyshev inequality and obtain
$$P(|M_n-\mu|\geq\epsilon)\leq\frac{\sigma^2}{n\epsilon^2},\qquad\text{for any }\epsilon>0$$
We observe that for any fixed $\epsilon>0$, the right-hand side of this inequality goes to zero as $n$ increases; a small simulation sketch of this convergence is given after this list.
- It turns out that this law remains true even if the $X_i$ have infinite variance, but a much more elaborate argument is needed, which we omit. The only assumption needed is that $E[X_i]$ is well-defined.
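As a quick check of this convergence, here is a minimal simulation sketch (not from the book). It assumes NumPy is available; the Uniform(0, 1) distribution, the deviation level $\epsilon=0.05$, the number of trials, and the sample sizes are all arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# i.i.d. X_i ~ Uniform(0, 1): mu = 1/2, sigma^2 = 1/12 (illustrative choice).
mu, sigma2 = 0.5, 1.0 / 12.0
eps = 0.05
num_trials = 1_000

for n in (10, 100, 1_000, 10_000):
    # Draw num_trials independent copies of the sample mean M_n.
    samples = rng.uniform(0.0, 1.0, size=(num_trials, n))
    M_n = samples.mean(axis=1)

    # Empirical frequency of a deviation |M_n - mu| >= eps ...
    empirical = np.mean(np.abs(M_n - mu) >= eps)
    # ... versus the Chebyshev upper bound sigma^2 / (n * eps^2), capped at 1.
    chebyshev = min(1.0, sigma2 / (n * eps**2))

    print(f"n={n:6d}  P(|M_n - mu| >= {eps}) ~ {empirical:.4f}  <=  bound {chebyshev:.4f}")
```

The empirical deviation frequencies shrink toward zero as $n$ grows and stay below the Chebyshev bound, which is typically far from tight.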
Example 5.5. Polling (the election polling problem).
- Let $p$ be the fraction of voters who support a particular candidate for office. We interview $n$ “randomly selected” voters and record $M_n$, the fraction of them that support the candidate. We view $M_n$ as our estimate of $p$ and would like to investigate its properties.
- We interpret “randomly selected” to mean that the $n$ voters are chosen independently and uniformly from the given population. Thus, the reply of each person interviewed can be viewed as an independent Bernoulli random variable $X_i$ with success probability $p$ and variance $\sigma^2=p(1-p)$. The Chebyshev inequality yields
$$P(|M_n-p|\geq\epsilon)\leq\frac{p(1-p)}{n\epsilon^2}$$
The true value of the parameter $p$ is assumed to be unknown. On the other hand, it may be verified that $\sigma^2=p(1-p)\leq 1/4$ (cf. Example 5.3), which yields
$$P(|M_n-p|\geq\epsilon)\leq\frac{1}{4n\epsilon^2}$$
- Suppose now that we impose some tight specifications on our poll. We would like to have high confidence (probability at least 95%) that our estimate will be very accurate (within 0.01 of $p$). How many voters should be sampled?
- The only guarantee that we have at this point is the inequality
$$P(|M_n-p|\geq 0.01)\leq\frac{1}{4n(0.01)^2}$$
We will be sure to satisfy the above specifications if we choose $n$ large enough so that
$$\frac{1}{4n(0.01)^2}\leq 0.05$$
which yields $n\geq 50{,}000$.
- This turns out to be fairly conservative, because it is based on the rather loose Chebyshev inequality; a numerical sketch follows below.
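To make the sample-size calculation concrete, the snippet below is a small numerical sketch (again not from the book, and assuming NumPy). It recomputes the Chebyshev-based requirement $n\geq 1/(4\delta\epsilon^2)=50{,}000$ for $\epsilon=0.01$ and $\delta=0.05$, and then simulates polls with an arbitrarily chosen true $p=0.5$ and a smaller sample size $n=15{,}000$ to suggest how conservative that requirement is.

```python
import math

import numpy as np

# Poll specifications from Example 5.5: accuracy eps, confidence 1 - delta.
eps, delta = 0.01, 0.05

# Chebyshev-based requirement: 1 / (4 n eps^2) <= delta  =>  n >= 1 / (4 delta eps^2).
n_chebyshev = math.ceil(1.0 / (4.0 * delta * eps**2))
print("Chebyshev-based sample size:", n_chebyshev)  # 50000

# Simulated deviation probability with a much smaller n (true p chosen arbitrarily).
rng = np.random.default_rng(1)
p, n, num_polls = 0.5, 15_000, 20_000
M_n = rng.binomial(n, p, size=num_polls) / n  # fraction of sampled voters supporting the candidate
print(f"Estimated P(|M_n - p| >= {eps}) at n={n}:", np.mean(np.abs(M_n - p) >= eps))
```

The simulated probability at $n=15{,}000$ already comes out well below the 5% target, consistent with the remark above that the Chebyshev-based figure of 50,000 is conservative.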