Chapter 2 (Discrete Random Variables): Expectation, Mean, and Variance

These are reading notes on *Introduction to Probability*.

Expectation

  • Expectation of X: a weighted (in proportion to probabilities) average of the possible values of X.
    E[X]=\sum_xxp_X(x)
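The definition translates directly into a probability-weighted sum over the PMF. A minimal sketch (the fair six-sided die PMF below is an assumed example, not from the book):

```python
# Expectation as a probability-weighted average over the PMF.
# The fair six-sided die PMF is an assumed example.
pmf = {x: 1/6 for x in range(1, 7)}

def expectation(pmf):
    """E[X] = sum over x of x * p_X(x)."""
    return sum(x * p for x, p in pmf.items())

print(expectation(pmf))  # ≈ 3.5
```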

Variance, Moments, and the Expected Value Rule


Moments

  • We define the nth moment as E[X^n]. With this terminology, the 1st moment of X is just the mean.

Variance
var(X)=E[(X-E[X])^2]

  • The variance provides a measure of dispersion of X around its mean.

Standard Deviation (标准差)

  • Another measure of dispersion is the standard deviation of X:
    \sigma_X=\sqrt{var(X)}
  • The standard deviation is often easier to interpret because it has the same units as X.
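Both quantities can be checked on a small PMF. A sketch with an assumed two-point distribution:

```python
import math

# Two-point PMF, assumed for illustration: X is 1 or 3 with equal probability.
pmf = {1: 0.5, 3: 0.5}

mu = sum(x * p for x, p in pmf.items())               # E[X] = 2.0
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - E[X])^2] = 1.0
sigma = math.sqrt(var)                                # same units as X

print(mu, var, sigma)  # 2.0 1.0 1.0
```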

Expected Value Rule

  • Let X be a random variable with PMF p_X, and let g(X) be a function of X. Then the expected value of the random variable g(X) is given by
    E[g(X)]=\sum_xg(x)p_X(x)

Proof

  • Let Y=g(X), and group the terms of \sum_yyp_Y(y) according to the value of g(x):
    E[Y]=\sum_yyp_Y(y)=\sum_yy\sum_{\{x|g(x)=y\}}p_X(x)=\sum_y\sum_{\{x|g(x)=y\}}g(x)p_X(x)=\sum_xg(x)p_X(x)


  • Using the expected value rule, we can write the variance of X as
    var(X)=E[(X-E[X])^2]=\sum_x(x-E[X])^2p_X(x)
  • Similarly, the nth moment is given by
    E[X^n]=\sum_xx^np_X(x)
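The expected value rule is a one-liner in code. A sketch, again using an assumed fair-die PMF:

```python
# Expected value rule: E[g(X)] = sum over x of g(x) * p_X(x).
def expected_value(g, pmf):
    return sum(g(x) * p for x, p in pmf.items())

pmf = {x: 1/6 for x in range(1, 7)}  # assumed fair-die PMF

mean = expected_value(lambda x: x, pmf)              # first moment
second_moment = expected_value(lambda x: x ** 2, pmf)  # second moment
var = expected_value(lambda x: (x - mean) ** 2, pmf)   # variance

print(mean, second_moment, var)  # ≈ 3.5, 91/6, 35/12
```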

Properties of Mean and Variance

  • We will now use the expected value rule in order to derive some important properties of the mean and the variance.

Mean and Variance of a Linear Function of a Random Variable

  • We start with a random variable X and define a new random variable Y=aX+b, where a and b are given scalars. Let us derive the mean and the variance of the linear function Y. We have
    E[Y]=\sum_x(ax+b)p_X(x)=a\sum_xxp_X(x)+b\sum_xp_X(x)=aE[X]+b
    var(Y)=\sum_x(ax+b-E[Y])^2p_X(x)=a^2\sum_x(x-E[X])^2p_X(x)=a^2var(X)
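The two identities can be checked numerically on any PMF. A sketch with an assumed three-point PMF and arbitrary scalars a, b:

```python
# Check E[aX + b] = aE[X] + b and var(aX + b) = a^2 var(X) on a small PMF.
pmf = {0: 0.2, 1: 0.5, 4: 0.3}  # assumed PMF for X
a, b = 3.0, -2.0

mean_x = sum(x * p for x, p in pmf.items())
var_x = sum((x - mean_x) ** 2 * p for x, p in pmf.items())

# Y = aX + b takes value a*x + b with the same probability as x.
mean_y = sum((a * x + b) * p for x, p in pmf.items())
var_y = sum((a * x + b - mean_y) ** 2 * p for x, p in pmf.items())

print(abs(mean_y - (a * mean_x + b)) < 1e-9)  # True
print(abs(var_y - a ** 2 * var_x) < 1e-9)     # True
```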

Variance in Terms of Moments
var(X)=E[X^2]-(E[X])^2
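This moments form of the variance is often the quickest to compute. A sketch on an assumed Bernoulli(0.6) PMF:

```python
# var(X) = E[X^2] - (E[X])^2, checked on a Bernoulli(0.6) PMF (assumed example).
pmf = {0: 0.4, 1: 0.6}

m1 = sum(x * p for x, p in pmf.items())       # E[X] = 0.6
m2 = sum(x ** 2 * p for x, p in pmf.items())  # E[X^2] = 0.6

print(m2 - m1 ** 2)  # ≈ 0.24 = p(1 - p) for p = 0.6
```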


We finally illustrate by example a common pitfall:

  • Unless g(X) is a linear function, it is not generally true that E[g(X)] is equal to g(E[X]).

Example 2.4. Average Speed Versus Average Time.
If the weather is good (which happens with probability 0.6), Alice walks the 2 miles to class at a speed of V=5 miles per hour, and otherwise rides her motorcycle at a speed of V=30 miles per hour. What is the mean of the time T to get to class?

  • A correct way to solve the problem is to first derive the PMF of T and then calculate its mean by
    E[T]=0.6\cdot\frac{2}{5}+0.4\cdot\frac{2}{30}=\frac{4}{15}\ hours
  • However, it is wrong to calculate the mean of the speed V,
    E[V]=0.6\cdot5+0.4\cdot30=15\ miles/hour
    and then claim that the mean of the time T is
    \frac{2}{E[V]}=\frac{2}{15}\ hours
  • To summarize, in this example we have
    T=\frac{2}{V},\quad and\quad E[T]=E\Big[\frac{2}{V}\Big]\neq\frac{2}{E[V]}
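The pitfall is easy to see numerically using the speed PMF from the example:

```python
# Example 2.4 numerically: E[2/V] differs from 2/E[V].
pmf_v = {5: 0.6, 30: 0.4}  # speed PMF from the example

e_t = sum(p * 2 / v for v, p in pmf_v.items())  # E[T] = E[2/V]
e_v = sum(p * v for v, p in pmf_v.items())      # E[V] = 15

print(e_t)      # ≈ 4/15 ≈ 0.2667 hours, the correct mean time
print(2 / e_v)  # ≈ 2/15 ≈ 0.1333 hours, the wrong answer
```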

Mean and Variance of Some Common Random Variables

Bernoulli Random Variable

p_X(k)=\begin{cases}p &\text{if }k=1\\1-p &\text{if }k=0\end{cases}

  • The mean, second moment, and variance of X are given by the following calculations:
    E[X]=1\cdot p+0\cdot(1-p)=p
    E[X^2]=1^2\cdot p+0^2\cdot(1-p)=p
    var(X)=E[X^2]-(E[X])^2=p-p^2=p(1-p)

Geometric Random Variable

E[X]=\frac{1}{p},\quad var(X)=\frac{1-p}{p^2}
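Both formulas can be confirmed by summing the geometric PMF p_X(k)=(1-p)^{k-1}p directly, truncating the (negligible) tail. A sketch with p=0.25 chosen for illustration:

```python
# Numerically confirm E[X] = 1/p and var(X) = (1-p)/p^2 for a geometric PMF
# p_X(k) = (1-p)^(k-1) * p, using a truncated sum (the tail is negligible).
p = 0.25
ks = range(1, 2000)
probs = [(1 - p) ** (k - 1) * p for k in ks]

mean = sum(k * q for k, q in zip(ks, probs))
second = sum(k ** 2 * q for k, q in zip(ks, probs))

print(mean)              # ≈ 1/p = 4
print(second - mean**2)  # ≈ (1-p)/p^2 = 12
```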


Problem 23.

  • (a) A fair coin is tossed repeatedly and independently until two consecutive heads or two consecutive tails appear. Find the PMF, the expected value, and the variance of the number of tosses.
  • (b) Assume now that the coin is tossed until we obtain a tail that is immediately preceded by a head. Find the PMF and the expected value of the number of tosses.

SOLUTION

  • Let X X X be the total number of tosses.
  • (a) The random variable X is of the form X=Y+1, where Y is a geometric random variable with parameter p=1/2. It follows that
    p_X(k)=\Big(\frac{1}{2}\Big)^{k-1},\quad k=2,3,\dots
    and
    E[X]=E[Y]+1=\frac{1}{p}+1=3
    var(X)=var(Y)=\frac{1-p}{p^2}=2
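A quick Monte Carlo check of part (a), assuming a simulated fair coin:

```python
import random

# Monte Carlo check of part (a): toss a fair coin until two consecutive
# tosses agree; the average number of tosses should be close to E[X] = 3.
def tosses_until_double(rng):
    prev = rng.choice('HT')
    count = 1
    while True:
        cur = rng.choice('HT')
        count += 1
        if cur == prev:
            return count
        prev = cur

rng = random.Random(0)  # fixed seed for reproducibility
n = 100_000
avg = sum(tosses_until_double(rng) for _ in range(n)) / n
print(avg)  # ≈ 3
```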
  • (b) If k>2, there are k-1 sequences that lead to the event \{X=k\}. One such sequence is H\ ...\ H\ T, where k-1 heads are followed by a tail. The other k-2 possible sequences are of the form T\ ...\ T\ H\ ...\ H\ T, for various lengths of the initial T\ ...\ T segment. For the case where k=2, there is only one (hence k-1) possible sequence that leads to the event \{X=k\}, namely the sequence H\ T. Therefore, for any k\geq2,
    P(X=k)=(k-1)(1/2)^k
    It follows that
    \sum_{k=2}^\infty P(X=k)=\sum_{k=2}^\infty(k-1)(1/2)^k=1
    and
    E[X]=\sum_{k=2}^\infty k(k-1)(1/2)^k=\sum_{k=1}^\infty k^2(1/2)^k-\sum_{k=1}^\infty k(1/2)^k=6-2=4
    We have used here the equalities
    \sum_{k=1}^\infty k(1/2)^k=E[Y]=2
    and
    \sum_{k=1}^\infty k^2(1/2)^k=E[Y^2]=var(Y)+(E[Y])^2=2+4=6
    where Y is a geometric random variable with parameter p=1/2.
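Part (b) can also be checked by summing the PMF P(X=k)=(k-1)(1/2)^k over a truncated range:

```python
# Check part (b) by truncated summation of the PMF P(X = k) = (k-1)(1/2)^k.
pmf = {k: (k - 1) * 0.5 ** k for k in range(2, 200)}

print(sum(pmf.values()))                   # ≈ 1 (a valid PMF)
print(sum(k * q for k, q in pmf.items()))  # ≈ E[X] = 4
```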

Discrete Uniform Random Variable


Discrete uniformly distributed random variable (or discrete uniform for short)

  • A discrete uniform random variable takes one out of a range of contiguous integer values, with equal probability:
    p_X(k)=\begin{cases}\frac{1}{b-a+1} &\text{if }k=a,a+1,\dots,b\\0 &\text{otherwise}\end{cases}
    where a and b are two integers with a<b.
    E[X]=\frac{a+b}{2}
  • To calculate the variance of X, we first consider the simpler case where a=1 and b=n. It can be verified by induction on n that
    E[X^2]=\frac{1}{n}\sum_{k=1}^nk^2=\frac{1}{6}(n+1)(2n+1)
    The variance can now be obtained in terms of the first and second moments:
    var(X)=E[X^2]-(E[X])^2=\frac{n^2-1}{12}
  • For the case of general integers a and b, we note that a random variable which is uniformly distributed over the interval [a,b] has the same variance as one which is uniformly distributed over [1,b-a+1]. Therefore, the desired variance is given by the above formula with n=b-a+1, which yields
    var(X)=\frac{(b-a+1)^2-1}{12}=\frac{(b-a)(b-a+2)}{12}
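A direct check of both formulas on an assumed range [a, b] = [3, 9]:

```python
# Mean and variance of a discrete uniform over [a, b], checked directly.
a, b = 3, 9  # assumed example range
n = b - a + 1
pmf = {k: 1 / n for k in range(a, b + 1)}

mean = sum(k * p for k, p in pmf.items())
var = sum((k - mean) ** 2 * p for k, p in pmf.items())

print(mean)  # ≈ (a + b)/2 = 6
print(var)   # ≈ (b - a)(b - a + 2)/12 = 4
```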

Poisson Random Variable

\begin{aligned}
E[X]&=\sum_{k=0}^\infty ke^{-\lambda}\frac{\lambda^k}{k!}\\
&=\sum_{k=1}^\infty ke^{-\lambda}\frac{\lambda^k}{k!}\qquad(\text{the }k=0\text{ term is zero})\\
&=\lambda\sum_{k=1}^\infty e^{-\lambda}\frac{\lambda^{k-1}}{(k-1)!}\\
&=\lambda\sum_{m=0}^\infty e^{-\lambda}\frac{\lambda^{m}}{m!}\qquad(\text{let }m=k-1)\\
&=\lambda
\end{aligned}

  • A similar calculation shows that the variance of a Poisson random variable is also λ \lambda λ.
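Both facts can be confirmed by truncated summation of the Poisson PMF, with an arbitrary assumed rate:

```python
import math

# Truncated-sum check that a Poisson(lam) PMF has mean and variance lam.
lam = 2.5  # assumed rate for illustration
pmf = {k: math.exp(-lam) * lam ** k / math.factorial(k) for k in range(150)}

mean = sum(k * p for k, p in pmf.items())
var = sum((k - mean) ** 2 * p for k, p in pmf.items())

print(mean)  # ≈ 2.5
print(var)   # ≈ 2.5
```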

Decision Making Using Expected Values

Example 2.8. The Quiz Problem.
Consider a quiz game where a person is given two questions and must decide which one to answer first. Question 1 will be answered correctly with probability p_1, and the person will then receive as prize v_1, while question 2 will be answered correctly with probability p_2, and the person will then receive as prize v_2. If the first question attempted is answered incorrectly, the quiz terminates. If the first question is answered correctly, the person is allowed to attempt the second question. Which question should be answered first to maximize the expected value of the total prize money received?

  • If question 1 is answered first, we have
    E[X]=p_1(1-p_2)v_1+p_1p_2(v_1+v_2)=p_1v_1+p_1p_2v_2
    while if question 2 is answered first, we have
    E[X]=p_2(1-p_1)v_2+p_2p_1(v_2+v_1)=p_2v_2+p_2p_1v_1
  • It is thus optimal to answer question 1 first if and only if
    p_1v_1+p_1p_2v_2\geq p_2v_2+p_2p_1v_1
    or equivalently, if
    \frac{p_1v_1}{1-p_1}\geq\frac{p_2v_2}{1-p_2}
    Therefore, it is optimal to order the questions in decreasing value of the expression pv/(1-p), which provides a convenient index of quality for a question with probability of correct answer p and value v.
  • Interestingly, this rule generalizes to the case of more than two questions.
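The index rule can be sketched in code. The (p, v) pairs below are hypothetical, chosen only to exercise the comparison:

```python
# Sketch of the pv/(1-p) index rule with hypothetical (p, v) pairs.
def expected_prize(order, questions):
    """Expected total prize when questions are attempted in the given order;
    the quiz stops at the first wrong answer."""
    total, prob_alive = 0.0, 1.0
    for q in order:
        p, v = questions[q]
        total += prob_alive * p * v  # v is collected only if still playing and correct
        prob_alive *= p
    return total

questions = {1: (0.8, 100.0), 2: (0.4, 400.0)}  # hypothetical (p, v) pairs

# Order by decreasing index pv/(1-p).
by_index = sorted(questions, reverse=True,
                  key=lambda q: questions[q][0] * questions[q][1] / (1 - questions[q][0]))

print(by_index)                                   # order suggested by the index
print(expected_prize(by_index, questions))        # at least as large as ...
print(expected_prize(by_index[::-1], questions))  # ... the reverse order
```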


Reprinted from blog.csdn.net/weixin_42437114/article/details/113363103