These are reading notes on Introduction to Probability.
Continuous Random Variables and PDFs
- A random variable $X$ is called continuous if there is a nonnegative function $f_X$, called the probability density function of $X$, or PDF for short, such that
$$P(X\in B)=\int_B f_X(x)\,dx$$
for every subset $B$ of the real line. Note that to qualify as a PDF, a function $f_X$ must be nonnegative, and must also have the normalization property
$$\int_{-\infty}^\infty f_X(x)\,dx=P(-\infty<X<\infty)=1$$
- In particular, the probability that the value of $X$ falls within an interval is
$$P(a\leq X\leq b)=\int_a^b f_X(x)\,dx$$
For any single value $a$, we have
$$P(X=a)=\int_a^a f_X(x)\,dx=0$$
For this reason, including or excluding the endpoints of an interval has no effect on its probability:
$$P(a\leq X\leq b)=P(a<X<b)=P(a\leq X<b)=P(a<X\leq b)$$
- To interpret the PDF, note that for an interval $[x,x+\delta]$ with a very small length $\delta$, we have
$$P([x,x+\delta])=\int_x^{x+\delta}f_X(t)\,dt\approx f_X(x)\cdot\delta$$
so we can view $f_X(x)$ as the "probability mass per unit length" near $x$. It is important to realize that even though a PDF is used to calculate event probabilities, $f_X(x)$ is not the probability of any particular event. In particular, it is not restricted to be less than or equal to one.
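The approximation $P([x,x+\delta])\approx f_X(x)\cdot\delta$ can be checked numerically. The sketch below assumes an illustrative density $f_X(x)=2x$ on $[0,1]$, which is not taken from the book; the names `pdf` and `interval_probability` are likewise made up for this example. It also shows a density value larger than one.

```python
def pdf(x):
    """Illustrative density f(x) = 2x on [0, 1] (an assumption for this sketch)."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def interval_probability(x, delta):
    """Exact P(x <= X <= x + delta) when [x, x + delta] lies inside [0, 1].

    The integral of 2t from x to x + delta is (x + delta)^2 - x^2.
    """
    return (x + delta) ** 2 - x ** 2


x = 0.9  # here pdf(x) = 1.8, a density value above one
for delta in (0.1, 0.01, 0.001):
    exact = interval_probability(x, delta)
    approx = pdf(x) * delta
    # The gap between the two is delta^2, so it shrinks faster than delta.
    print(f"delta={delta}: exact={exact:.6f}, f(x)*delta={approx:.6f}")
```

For this particular density the gap between the exact probability and $f_X(x)\cdot\delta$ is $\delta^2$, so the approximation improves quickly as the interval shrinks.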
Example 3.2. Piecewise Constant PDF.
Alvin’s driving time to work is between 15 and 20 minutes if the day is sunny, and between 20 and 25 minutes if the day is rainy, with all times being equally likely in each case. Assume that a day is sunny with probability $2/3$ and rainy with probability $1/3$. What is the PDF of the driving time, viewed as a random variable $X$?
SOLUTION
- Since all times are equally likely in each case, the PDF is constant on each of the two intervals:
$$f_X(x)=\begin{cases}c_1, & \text{if } 15\leq x<20,\\ c_2, & \text{if } 20\leq x\leq 25,\\ 0, & \text{otherwise,}\end{cases}$$
where $c_1$ and $c_2$ are some constants. We can determine these constants by using the given probabilities of a sunny and of a rainy day:
$$\begin{aligned}\frac{2}{3}&=P(\text{sunny day})=\int_{15}^{20}f_X(x)\,dx=5c_1, &c_1&=\frac{2}{15}\\ \frac{1}{3}&=P(\text{rainy day})=\int_{20}^{25}f_X(x)\,dx=5c_2, &c_2&=\frac{1}{15}\end{aligned}$$
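As a quick sanity check, the constants can be recomputed with exact fractions. This is only a sketch: the function name `driving_time_pdf` and the choice to place the boundary point $x=20$ in the rainy interval are assumptions made here (a single point has probability zero, so the choice does not affect any probability).

```python
from fractions import Fraction

# Constants from the example: 5 * c1 = 2/3 and 5 * c2 = 1/3.
C1 = Fraction(2, 3) / 5  # density on [15, 20)
C2 = Fraction(1, 3) / 5  # density on [20, 25]


def driving_time_pdf(x):
    """Piecewise constant PDF of the driving time, in minutes."""
    if 15 <= x < 20:
        return C1
    if 20 <= x <= 25:
        return C2
    return Fraction(0)


# Each piece is constant, so its integral is height times width.
p_sunny = C1 * (20 - 15)
p_rainy = C2 * (25 - 20)
total = p_sunny + p_rainy
print(C1, C2, total)
```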
Example 3.3. A PDF Can Take Arbitrarily Large Values.
- Consider a random variable $X$ with PDF
$$f_X(x)=\begin{cases}\dfrac{1}{2\sqrt{x}}, & \text{if } 0<x\leq 1,\\ 0, & \text{otherwise.}\end{cases}$$
- Even though $f_X(x)$ becomes infinitely large as $x$ approaches zero, this is still a valid PDF, because
$$\int_{-\infty}^\infty f_X(x)\,dx=\int_0^1\frac{1}{2\sqrt{x}}\,dx=\sqrt{x}\,\Big|_0^1=1$$
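The normalization can also be checked numerically. The sketch below uses a plain midpoint rule (the helper `midpoint_integral` is written here for illustration, not taken from any library); because the integrand blows up at zero, the rule converges slowly, so only rough agreement should be expected.

```python
import math


def pdf(x):
    """PDF of Example 3.3: 1 / (2 sqrt(x)) on (0, 1], zero elsewhere."""
    return 1.0 / (2.0 * math.sqrt(x)) if 0.0 < x <= 1.0 else 0.0


def cdf(x):
    """P(X <= x), using the antiderivative sqrt(t) of the density on (0, 1]."""
    if x <= 0.0:
        return 0.0
    return math.sqrt(min(x, 1.0))


def midpoint_integral(f, lo, hi, n):
    """Approximate the integral of f over [lo, hi] with n midpoint samples."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


approx_total = midpoint_integral(pdf, 0.0, 1.0, 200_000)
print(pdf(1e-6))        # a very large density value near zero
print(cdf(1e-6))        # yet the probability of (0, 1e-6] is small
print(approx_total)     # close to 1, up to the slow convergence near zero
```

The density near zero is large while the probability of a small interval next to zero stays small, which is the point of the example.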
Expectation
- The expected value or expectation or mean of a continuous random variable $X$ is defined by
$$E[X]=\int_{-\infty}^\infty x f_X(x)\,dx$$
- If $X$ is a continuous random variable with a given PDF, any real-valued function $Y=g(X)$ of $X$ is also a random variable. Its mean satisfies the expected value rule
$$E[g(X)]=\int_{-\infty}^\infty g(x)f_X(x)\,dx$$
- The variance of $X$ is defined by
$$\begin{aligned}\mathrm{var}(X)&=E\big[(X-E[X])^2\big]=\int_{-\infty}^\infty (x-E[X])^2f_X(x)\,dx\\&=E[X^2]-(E[X])^2\end{aligned}$$
- If $Y=aX+b$, where $a$ and $b$ are given scalars, then
$$E[Y]=aE[X]+b,\qquad \mathrm{var}(Y)=a^2\,\mathrm{var}(X)$$
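The two identities for $Y=aX+b$ follow from the expected value rule with $g(x)=ax+b$ together with the normalization property. A short derivation, writing $\mu=E[X]$ (a symbol introduced here only for brevity):

$$\begin{aligned}
E[aX+b]&=\int_{-\infty}^\infty (ax+b)f_X(x)\,dx=a\int_{-\infty}^\infty xf_X(x)\,dx+b\int_{-\infty}^\infty f_X(x)\,dx=a\mu+b,\\
\mathrm{var}(aX+b)&=E\big[(aX+b-a\mu-b)^2\big]=E\big[a^2(X-\mu)^2\big]=a^2\,\mathrm{var}(X).
\end{aligned}$$

The shift $b$ moves the mean but cancels inside the variance, while the scale $a$ enters the variance squared.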
One has to deal with the possibility that the integral $\int_{-\infty}^\infty xf_X(x)\,dx$ is infinite or undefined. More concretely, we will say that the expectation is well-defined if $\int_{-\infty}^\infty |x|f_X(x)\,dx<\infty$. In that case, it is known that the integral $\int_{-\infty}^\infty xf_X(x)\,dx$ takes a finite and unambiguous value. Throughout this book, in the absence of an indication to the contrary, we implicitly assume that the expected value of any random variable of interest is well-defined.
Example 3.4. Mean and Variance of the Uniform Random Variable.
- We can consider a random variable $X$ that takes values in an interval $[a,b]$, and again assume that any two subintervals of the same length have the same probability. We refer to this type of random variable as uniform or uniformly distributed. Its PDF has the form
$$f_X(x)=\begin{cases}\dfrac{1}{b-a}, & \text{if } a\leq x\leq b,\\ 0, & \text{otherwise.}\end{cases}$$
- The mean and the variance follow by direct integration:
$$\begin{aligned}E[X]&=\int_{-\infty}^\infty xf_X(x)\,dx=\int_a^b\frac{x}{b-a}\,dx=\frac{a+b}{2}\\ E[X^2]&=\int_a^b\frac{x^2}{b-a}\,dx=\frac{a^2+ab+b^2}{3}\\ \mathrm{var}(X)&=E[X^2]-(E[X])^2=\frac{(b-a)^2}{12}\end{aligned}$$
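The closed forms $(a+b)/2$ and $(b-a)^2/12$ can be compared against a direct numerical integration. This is a minimal sketch: the endpoints `a = 2` and `b = 5` are arbitrary values chosen for illustration, and `midpoint_integral` is a helper defined here rather than a library function.

```python
def midpoint_integral(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


a, b = 2.0, 5.0  # hypothetical endpoints of the interval


def uniform_pdf(x):
    """PDF of the uniform random variable on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


# E[X] and E[X^2] via the expected value rule, then var(X) = E[X^2] - (E[X])^2.
mean = midpoint_integral(lambda x: x * uniform_pdf(x), a, b)
second_moment = midpoint_integral(lambda x: x * x * uniform_pdf(x), a, b)
variance = second_moment - mean ** 2

print(mean, (a + b) / 2)
print(variance, (b - a) ** 2 / 12)
```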
Problem 3.
Show that the expected value of a discrete or continuous random variable $X$ satisfies
$$E[X]=\int_0^\infty P(X>x)\,dx-\int_0^\infty P(X<-x)\,dx$$
SOLUTION
- Suppose that $X$ is continuous. We then have
$$\begin{aligned}\int_0^\infty P(X>x)\,dx&=\int_0^\infty\left(\int_x^\infty f_X(y)\,dy\right)dx\\&=\int_0^\infty\left(\int_0^y f_X(y)\,dx\right)dy\\&=\int_0^\infty f_X(y)\left(\int_0^y dx\right)dy\\&=\int_0^\infty yf_X(y)\,dy,\end{aligned}$$
where for the second equality we have reversed the order of integration by writing the set $\{(x,y)\mid 0\leq x<\infty,\ x\leq y<\infty\}$ as $\{(x,y)\mid 0\leq x\leq y,\ 0\leq y<\infty\}$. Similarly, we can show that
$$\int_0^\infty P(X<-x)\,dx=-\int_{-\infty}^0 yf_X(y)\,dy$$
Combining the two relations above, we obtain the desired result.
- If $X$ is discrete, we have
$$\begin{aligned}\int_0^\infty P(X>x)\,dx&=\int_0^\infty\Big(\sum_{y>x}p_X(y)\Big)dx\\&=\sum_{y>0}\left(\int_0^y p_X(y)\,dx\right)\\&=\sum_{y>0}p_X(y)\left(\int_0^y dx\right)\\&=\sum_{y>0}p_X(y)\,y,\end{aligned}$$
and the rest of the argument is similar to the continuous case.
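The identity can be verified exactly on a small discrete example. In the sketch below the PMF (values $-2$, $1$, $3$) is made up for illustration, and the function names are hypothetical. Because $P(X>x)$ is a step function of $x$, its integral is a finite sum of rectangle areas.

```python
from fractions import Fraction


def upper_tail_integral(pmf):
    """Exact integral over x in [0, inf) of P(X > x) for a finite PMF."""
    points = sorted(y for y in pmf if y > 0)
    total = Fraction(0)
    left = 0
    for right in points:
        # For x in [left, right), the event X > x holds exactly when X >= right.
        tail = sum(p for y, p in pmf.items() if y >= right)
        total += tail * (right - left)
        left = right
    return total


def expectation_via_tails(pmf):
    """E[X] as the difference of the two tail integrals in Problem 3."""
    # P(X < -x) equals P(-X > x), so reuse the helper on the PMF of -X.
    negated = {-y: p for y, p in pmf.items()}
    return upper_tail_integral(pmf) - upper_tail_integral(negated)


# Hypothetical PMF with both negative and positive values.
pmf = {-2: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
direct = sum(y * p for y, p in pmf.items())
print(direct, expectation_via_tails(pmf))
```

Here the first tail integral is $5/4$ and the second is $1/2$, and their difference matches the directly computed mean $3/4$.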
Problem 4.
Establish the validity of the expected value rule
$$E[g(X)]=\int_{-\infty}^\infty g(x)f_X(x)\,dx$$
where $X$ is a continuous random variable with PDF $f_X$.
SOLUTION
- Let us express the function $g$ as the difference of two nonnegative functions,
$$g(x)=g^+(x)-g^-(x)$$
where $g^+(x)=\max\{g(x),0\}$ and $g^-(x)=\max\{-g(x),0\}$. We will use the result
$$E[g(X)]=\int_0^\infty P(g(X)>t)\,dt-\int_0^\infty P(g(X)<-t)\,dt$$
from the preceding problem. The first term on the right-hand side is equal to
$$\int_0^\infty\int_{\{x\mid g(x)>t\}}f_X(x)\,dx\,dt=\int_{-\infty}^\infty\int_{\{t\mid 0\leq t<g(x)\}}f_X(x)\,dt\,dx=\int_{-\infty}^\infty f_X(x)g^+(x)\,dx$$
By a symmetrical argument, the second term on the right-hand side is given by
$$\int_{-\infty}^\infty f_X(x)g^-(x)\,dx$$
- Combining the above equalities, we obtain
$$E[g(X)]=\int_{-\infty}^\infty f_X(x)g^+(x)\,dx-\int_{-\infty}^\infty f_X(x)g^-(x)\,dx=\int_{-\infty}^\infty f_X(x)g(x)\,dx$$
Exponential Random Variable
- An exponential random variable has a PDF of the form
$$f_X(x)=\begin{cases}\lambda e^{-\lambda x}, & \text{if } x\geq 0,\\ 0, & \text{otherwise,}\end{cases}$$
where $\lambda$ is a positive parameter characterizing the PDF. This is a legitimate PDF because
$$\int_{-\infty}^\infty f_X(x)\,dx=\int_0^\infty\lambda e^{-\lambda x}\,dx=-e^{-\lambda x}\,\Big|_0^\infty=1$$
- Note that the probability that $X$ exceeds a certain value decreases exponentially. Indeed, for any $a\geq 0$, we have
$$P(X\geq a)=\int_a^\infty\lambda e^{-\lambda x}\,dx=e^{-\lambda a}$$
It follows that, for $0\leq a\leq b$,
$$P(a\leq X\leq b)=P(X\geq a)-P(X\geq b)=e^{-\lambda a}-e^{-\lambda b}$$
- The mean and the variance can be calculated to be
$$E[X]=\frac{1}{\lambda},\qquad \mathrm{var}(X)=\frac{1}{\lambda^2}$$
Both follow from integration by parts:
$$\begin{aligned}E[X]&=\int_0^\infty x\lambda e^{-\lambda x}\,dx\\&=\left(-xe^{-\lambda x}\right)\Big|_0^\infty+\int_0^\infty e^{-\lambda x}\,dx\\&=0-\frac{e^{-\lambda x}}{\lambda}\Big|_0^\infty\\&=\frac{1}{\lambda}\\E[X^2]&=\int_0^\infty x^2\lambda e^{-\lambda x}\,dx\\&=\left(-x^2e^{-\lambda x}\right)\Big|_0^\infty+\int_0^\infty 2xe^{-\lambda x}\,dx\\&=0+\frac{2}{\lambda}E[X]\\&=\frac{2}{\lambda^2}\\\mathrm{var}(X)&=E[X^2]-(E[X])^2=\frac{1}{\lambda^2}\end{aligned}$$
- An exponential random variable can, for example, be a good model for the amount of time until an incident of interest takes place. We will see that it is closely connected to the geometric random variable, which also relates to the (discrete) time that will elapse until an incident of interest takes place.
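The tail formula and the moments can be compared against simulated draws. The sketch below uses the standard library's `random.expovariate`; the rate `lam = 2.0`, the seed, and the sample size are arbitrary choices for illustration, and the helper names are hypothetical. Simulated values only approximate the exact ones.

```python
import math
import random


def exp_tail(lam, a):
    """P(X >= a) = exp(-lam * a) for a >= 0."""
    return math.exp(-lam * a)


def exp_interval(lam, a, b):
    """P(a <= X <= b) for 0 <= a <= b, as a difference of two tails."""
    return exp_tail(lam, a) - exp_tail(lam, b)


lam = 2.0                # hypothetical rate parameter
rng = random.Random(0)   # fixed seed so the run is reproducible
samples = [rng.expovariate(lam) for _ in range(200_000)]
n = len(samples)

sample_mean = sum(samples) / n
sample_var = sum((x - sample_mean) ** 2 for x in samples) / n
empirical_tail = sum(x >= 1.0 for x in samples) / n

print(sample_mean, 1 / lam)                # both near 0.5
print(sample_var, 1 / lam ** 2)            # both near 0.25
print(empirical_tail, exp_tail(lam, 1.0))  # both near exp(-2)
```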