Chapter 4 (Further Topics on Random Variables): Transforms (Moment Generating Functions)

This post is a set of reading notes on Introduction to Probability.

Transform

  • The transform associated with a random variable $X$ (also referred to as the associated moment generating function) is a function $M_X(s)$ of a scalar parameter $s$, defined by
    $$M_X(s)=E[e^{sX}]$$

The simpler notation $M(s)$ can also be used whenever the underlying random variable $X$ is clear from the context.

  • It is important to realize that the transform is not a number but rather a *function* of a parameter $s$.
  • Strictly speaking, $M(s)$ is only defined for those values of $s$ for which $E[e^{sX}]$ is finite.

Two useful and generic properties of transforms.

  • For any random variable $X$, we have
    $$M_X(0)=E[e^{0\cdot X}]=E[1]=1$$
    and if $X$ takes only nonnegative integer values, then
    $$\lim_{s\rightarrow-\infty}M_X(s)=P(X=0)$$
    (both properties are checked numerically in the sketch below).
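A minimal numerical sketch, assuming $X$ is Poisson with a hypothetical rate $\lambda = 1.5$ (any nonnegative-integer-valued distribution would do):

```python
import math

lam = 1.5  # hypothetical rate parameter, not from the text

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def mgf(s, lam, kmax=60):
    # M_X(s) = E[e^{sX}], computed directly from the PMF (sum truncated at kmax)
    return sum(math.exp(s * k) * poisson_pmf(k, lam) for k in range(kmax))

print(mgf(0.0, lam))      # ~1.0, since M_X(0) = E[1] = 1
print(mgf(-30.0, lam))    # approaches P(X = 0) = e^{-lam} ~ 0.2231
print(math.exp(-lam))     # 0.2231...
```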

Example 4.25. The Transform Associated with a Linear Function of a Random Variable.

  • Let $M_X(s)$ be the transform associated with a random variable $X$. Consider a new random variable $Y = aX + b$. We then have
    $$M_Y(s) = E[e^{s(aX+b)}] = e^{sb} E[e^{saX}] = e^{sb}M_X(sa)$$
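A quick Monte Carlo sanity check of this identity, with illustrative choices not taken from the text ($X$ exponential with parameter $\lambda = 2$, $Y = 3X + 1$, $s = 0.2$), using the exponential transform $\lambda/(\lambda - s)$ derived in Example 4.24 below:

```python
import random, math

random.seed(0)
lam, a, b, s = 2.0, 3.0, 1.0, 0.2   # need s*a < lam for M_X(sa) to be finite

samples = [random.expovariate(lam) for _ in range(200_000)]
empirical = sum(math.exp(s * (a * x + b)) for x in samples) / len(samples)
analytic = math.exp(s * b) * lam / (lam - s * a)   # e^{sb} * M_X(sa)

print(empirical, analytic)   # both close to 1.74
```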

Transforms for Common Random Variables

For reference, the transform pairs derived in the examples that follow are:

  • Bernoulli($p$): $M(s) = 1 - p + pe^s$
  • Binomial($n, p$): $M(s) = (1 - p + pe^s)^n$
  • Geometric($p$): $M(s) = \dfrac{pe^s}{1-(1-p)e^s}$
  • Poisson($\lambda$): $M(s) = e^{\lambda(e^s-1)}$
  • Exponential($\lambda$): $M(s) = \dfrac{\lambda}{\lambda - s}$, for $s < \lambda$
  • Normal($\mu, \sigma^2$): $M(s) = e^{(\sigma^2 s^2/2) + \mu s}$


Example 4.23. The Transform Associated with a Poisson Random Variable.

  • Let $X$ be a Poisson random variable with parameter $\lambda$. The corresponding transform is
    $$M(s)=\sum_{x=0}^\infty e^{sx}\frac{\lambda^x e^{-\lambda}}{x!}$$
  • We let $a = e^s\lambda$ and obtain
    $$M(s)=e^{-\lambda}\sum_{x=0}^\infty \frac{a^x}{x!}=e^{-\lambda}e^a=e^{\lambda(e^s-1)}$$

Example 4.24. The Transform Associated with an Exponential Random Variable.

  • Let $X$ be an exponential random variable with parameter $\lambda$. Then
    $$\begin{aligned}M(s)&=\lambda\int_0^\infty e^{sx}e^{-\lambda x}\,dx=\lambda\int_0^\infty e^{(s-\lambda)x}\,dx \\&=\lambda\,\frac{e^{(s-\lambda)x}}{s-\lambda}\bigg|^\infty_0 \qquad (\text{if } s<\lambda) \\&=\frac{\lambda}{\lambda-s} \qquad (\text{if } s<\lambda)\end{aligned}$$

Example 4.26. The Transform Associated with a Normal Random Variable.

  • Let $X$ be a normal random variable with mean $\mu$ and variance $\sigma^2$.
  • To calculate the corresponding transform, we first consider the special case of the standard normal random variable $Y$, and then use the formula of Example 4.25 for linear functions of a random variable.
    $$\begin{aligned} M_Y(s)&=\int_{-\infty}^\infty\frac{1}{\sqrt{2\pi}}e^{-y^2/2}e^{sy}\,dy \\&=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-y^2/2+sy}\,dy \\&=e^{s^2/2}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-(y-s)^2/2}\,dy \\&=e^{s^2/2} \end{aligned}$$
    where the third equality follows by completing the square ($-y^2/2+sy=-(y-s)^2/2+s^2/2$), and the last integral equals 1 because it is the integral of the PDF of a normal random variable with mean $s$ and unit variance.
  • A general normal random variable with mean $\mu$ and variance $\sigma^2$ is obtained from the standard normal via the linear transformation
    $$X=\sigma Y+\mu$$
    Then
    $$M_X(s)=e^{s\mu}M_Y(s\sigma)=e^{(\sigma^2s^2/2)+\mu s}$$
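A Monte Carlo sketch of the final formula, with illustrative values $\mu = 1$, $\sigma = 0.5$, and $s = 0.7$ (assumed here, not from the text):

```python
import random, math

random.seed(1)
mu, sigma, s = 1.0, 0.5, 0.7

samples = [random.gauss(mu, sigma) for _ in range(200_000)]
empirical = sum(math.exp(s * x) for x in samples) / len(samples)
analytic = math.exp(sigma**2 * s**2 / 2 + mu * s)

print(empirical, analytic)   # both close to 2.14
```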

From Transforms to Moments

Moment generating functions

  • The reason behind the alternative name “moment generating function” is that the moments of a random variable are easily computed once a formula for the associated transform is available.
  • To see this, let us consider a continuous random variable $X$, and let us take the derivative of both sides of the definition with respect to $s$. We obtain
    $$\frac{d}{ds}M(s)=\frac{d}{ds}\int_{-\infty}^\infty e^{sx}f_X(x)\,dx=\int_{-\infty}^\infty xe^{sx}f_X(x)\,dx$$
  • This equality holds for all values of $s$. By considering the special case where $s = 0$, we obtain
    $$\frac{d}{ds}M(s)\bigg|_{s=0}=\int_{-\infty}^\infty xf_X(x)\,dx=E[X]$$
    More generally, if we differentiate the function $M(s)$ $n$ times with respect to $s$, a similar calculation yields
    $$\frac{d^n}{ds^n}M(s)\bigg|_{s=0}=\int_{-\infty}^\infty x^nf_X(x)\,dx=E[X^n]$$

Example 4.27.

  • For an exponential random variable with PDF
    $$f_X(x) =\lambda e^{-\lambda x},\qquad x\geq 0$$
    we have
    $$M(s)=\frac{\lambda}{\lambda -s}$$
  • Thus,
    $$\frac{d}{ds}M(s)=\frac{\lambda}{(\lambda-s)^2},\qquad \frac{d^2}{ds^2}M(s)=\frac{2\lambda}{(\lambda-s)^3}$$
    By setting $s = 0$, we obtain
    $$E[X]=\frac{1}{\lambda},\qquad E[X^2]=\frac{2}{\lambda^2}$$
    (reproduced symbolically in the sketch below).
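A small symbolic check, assuming the sympy library is available:

```python
import sympy as sp

s, lam = sp.symbols('s lam', positive=True)
M = lam / (lam - s)   # transform of the exponential distribution

EX  = sp.simplify(sp.diff(M, s, 1).subs(s, 0))   # first derivative at s = 0
EX2 = sp.simplify(sp.diff(M, s, 2).subs(s, 0))   # second derivative at s = 0
print(EX, EX2)   # 1/lam, 2/lam**2
```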

Inversion of Transforms

Inversion of the moment generating function

Inversion Property

  • The transform $M_X(s)$ associated with a random variable $X$ uniquely determines the CDF of $X$, assuming that $M_X(s)$ is finite for all $s$ in some interval $[-a,a]$, where $a$ is a positive number.

Its proof is beyond our scope.


  • There exist explicit formulas that allow us to recover the PMF or PDF of a random variable starting from the associated transform, but they are quite difficult to use. In practice, transforms are usually inverted by “pattern matching,” based on tables of known distribution-transform pairs.

Example 4.28

  • We are told that the transform associated with a random variable $X$ is
    $$M(s) =\frac{1}{4}e^{-s}+\frac{1}{2}+\frac{1}{8}e^{4s}+\frac{1}{8}e^{5s}$$
  • Since $M(s)$ is a sum of terms of the form $e^{sx}$, we can compare with the general formula
    $$M(s) = \sum_x e^{sx}p_X(x)$$
    and infer that $X$ is a discrete random variable. The different values that $X$ can take are $-1$, $0$, $4$, and $5$, with probabilities
    $$P(X=-1)=\frac{1}{4},\quad P(X=0)=\frac{1}{2},\quad P(X=4)=\frac{1}{8},\quad P(X=5)=\frac{1}{8}$$
  • Generalizing from the last example, the distribution of a finite-valued discrete random variable can always be found by inspection of the corresponding transform.

  • The same procedure also works for discrete random variables with an infinite range, as in the example that follows.

Example 4.29. The Transform Associated with a Geometric Random Variable.

  • We are told that the transform associated with a random variable $X$ is of the form
    $$M(s)=\frac{pe^s}{1-(1-p)e^s}$$
    where $p$ is a constant in the range $0 < p<1$. We wish to find the distribution of $X$. We recall the formula for the geometric series:
    $$\frac{1}{1-\alpha} =1+\alpha+\alpha^2+\cdots$$
    which is valid whenever $|\alpha|< 1$. We use this formula with $\alpha = (1 - p)e^s$, and for $s$ sufficiently close to zero so that $(1 - p)e^s< 1$. We obtain
    $$M(s)=pe^s\bigl(1+(1 - p)e^s+(1 - p)^2e^{2s}+\cdots\bigr)$$
  • As in the previous example, we infer that this is a discrete random variable that takes positive integer values. The probability $P(X = k)$ is found by reading off the coefficient of the term $e^{ks}$. In particular,
    $$P(X = k) = p(1 - p)^{k -1},\qquad k = 1, 2, \ldots$$
    We recognize this as the geometric distribution with parameter $p$ (the expansion is reproduced symbolically below).
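The sketch, assuming sympy is available, writes $t = e^s$ so that the coefficient of $t^k$ is $P(X=k)$:

```python
import sympy as sp

t, p = sp.symbols('t p', positive=True)
M = p * t / (1 - (1 - p) * t)   # geometric transform, written in t = e^s

expansion = sp.series(M, t, 0, 6).removeO()
for k in range(1, 6):
    # each coefficient equals p*(1 - p)**(k - 1) (sympy may print an equivalent form)
    print(k, sp.simplify(expansion.coeff(t, k)))
```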

Example 4.30. The Transform Associated with a Mixture of Two Distributions.

  • Let $X_1 , \ldots , X_n$ be continuous random variables with PDFs $f_{X_1}, \ldots , f_{X_n}$. The value $y$ of a random variable $Y$ is generated as follows: an index $i$ is chosen with a corresponding probability $p_i$, and $y$ is taken to be equal to the value of $X_i$. Then,
    $$f_Y(y) = p_1f_{X_1}(y) +\cdots+ p_n f_{X_n}(y)$$
    and
    $$M_Y(s) =\int_{-\infty}^\infty e^{sy}f_Y(y)\,dy= p_1 M_{X_1}(s)+\cdots+p_n M_{X_n}(s)$$
  • The steps can be reversed. For example, we may be given that the transform associated with a random variable $Y$ is of the form
    $$\frac{1}{2}\cdot\frac{1}{2-s}+\frac{3}{4}\cdot\frac{1}{1-s}$$
    We can then rewrite it as
    $$\frac{1}{4}\cdot\frac{2}{2-s}+\frac{3}{4}\cdot\frac{1}{1-s}$$
    and recognize that $Y$ is the mixture of two exponential random variables with parameters 2 and 1, which are selected with probabilities $1/4$ and $3/4$, respectively.
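A Monte Carlo sketch of this particular mixture (with $s = 0.4$ chosen arbitrarily): draw from an exponential with parameter 2 with probability $1/4$, otherwise from an exponential with parameter 1, and compare the empirical $E[e^{sY}]$ with the transform above.

```python
import random, math

random.seed(2)
s = 0.4   # s < 1 is required for the transform to exist; keep it well below 1 for a stable estimate

trials = 200_000
empirical = 0.0
for _ in range(trials):
    lam = 2.0 if random.random() < 0.25 else 1.0   # pick a mixture component
    empirical += math.exp(s * random.expovariate(lam))
empirical /= trials

analytic = 0.25 * 2 / (2 - s) + 0.75 * 1 / (1 - s)
print(empirical, analytic)   # both close to 1.56
```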

Problem 37.
A pizza parlor serves $n$ different types of pizza, and is visited by a number $K$ of customers in a given period of time, where $K$ is a nonnegative integer random variable with a known associated transform $M_K(s) = E[e^{sK}]$. Each customer orders a single pizza, with all types of pizza being equally likely, independent of the number of other customers and the types of pizza they order. Give a formula, in terms of $M_K(\cdot)$, for the expected number of different types of pizzas ordered.

SOLUTION

  • Let $X$ be the number of different types of pizza ordered, and let $X_i$ be the indicator random variable
    $$X_i = \begin{cases} 1, & \text{if at least one pizza of type } i \text{ is ordered},\\ 0, & \text{otherwise.}\end{cases}$$
  • We have $X = X_1 +\cdots + X_n$, and by the law of iterated expectations,
    $$E[X] = E\bigl[E[X \mid K]\bigr]=E\bigl[E[X_1 +\cdots + X_n\mid K]\bigr]=nE\bigl[E[X_1\mid K]\bigr]$$
  • Furthermore, since the probability that a customer does not order a pizza of type 1 is $(n - 1)/n$, we have
    $$E[X_1\mid K = k] = 1-\Bigl(\frac{n-1}{n}\Bigr)^k$$
    so that
    $$E[X_1\mid K] = 1-\Bigl(\frac{n-1}{n}\Bigr)^K$$
    Thus, denoting
    $$p =\frac{n- 1}{n}$$
    we have
    $$E[X]=nE[1-p^K]=n-nE[p^K]=n-nE[e^{K\log p}]=n-nM_K(\log p)$$
    (a simulation check is given below).
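The sketch uses assumed, purely illustrative choices: $K$ Poisson with mean 5 and $n = 10$ pizza types, so that $M_K(s) = e^{5(e^s - 1)}$ by Example 4.23.

```python
import random, math

random.seed(3)
n, mean_k = 10, 5.0

def sample_poisson(lam):
    # Knuth's method: multiply uniforms until the product drops below e^{-lam}
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

trials = 100_000
total = 0
for _ in range(trials):
    K = sample_poisson(mean_k)
    total += len({random.randrange(n) for _ in range(K)})   # distinct types ordered
simulated = total / trials

# For a Poisson K, M_K(s) = exp(mean_k*(e^s - 1)), so M_K(log((n-1)/n)) = exp(mean_k*((n-1)/n - 1)).
analytic = n - n * math.exp(mean_k * ((n - 1) / n - 1))
print(simulated, analytic)   # both close to 3.93
```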

Problem 40.
Suppose that the transform associated with a discrete random variable $X$ has the form
$$M(s)=\frac{A(e^s)}{B(e^s)}$$
where $A(t)$ and $B(t)$ are polynomials of the generic variable $t$. Assume that $A(t)$ and $B(t)$ have no common roots and that the degree of $A(t)$ is smaller than the degree of $B(t)$. Assume also that $B(t)$ has distinct, real, and nonzero roots that have absolute value greater than $1$. Then it can be seen that $M(s)$ can be written in the form
$$M(s)=\frac{a_1}{1-r_1e^s}+\cdots+\frac{a_m}{1-r_me^s}$$
where $1/r_1, \ldots, 1/r_m$ are the roots of $B(t)$ and the $a_i$ are constants that are equal to $\lim_{e^s\rightarrow 1/r_i}(1 -r_ie^s)M(s)$, $i = 1, \ldots , m$.

  • (a) Show that the PMF of $X$ has the form
    $$P(X=k)=\begin{cases}\displaystyle\sum_{i=1}^m a_i r_i^k, & k\geq 0,\\ 0, & \text{otherwise.}\end{cases}$$
    [Note: For large $k$, the PMF of $X$ can be approximated by $a_{\bar i}\, r_{\bar i}^{\,k}$, where $\bar i$ is the index corresponding to the largest $|r_i|$ (assuming $\bar i$ is unique).]
  • (b) Extend the result of part (a) to the case where $M(s) = e^{bs}A(e^s)/B(e^s)$ and $b$ is an integer.

SOLUTION

  • (a) We have, for all $s$ such that $|r_i|e^s<1$,
    $$\frac{1}{1-r_ie^s}=1+r_ie^s+r_i^2e^{2s}+\cdots$$
    Therefore,
    $$M(s)=\sum_{i=1}^m a_i+\Bigl(\sum_{i=1}^m a_ir_i\Bigr)e^s+\Bigl(\sum_{i=1}^m a_ir_i^2\Bigr)e^{2s}+\cdots$$
    We see that
    $$P(X=k)=\sum_{i=1}^m a_ir_i^k$$
    for $k\geq 0$, and $P(X = k) = 0$ for $k < 0$.
  • (b) In this case, $M(s)$ corresponds to the translation by $b$ of a random variable whose transform is $A(e^s)/B(e^s)$, so we have
    $$P(X=k)=\begin{cases}\displaystyle\sum_{i=1}^m a_i r_i^{\,k-b}, & k\geq b,\\ 0, & \text{otherwise.}\end{cases}$$
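To make part (a) concrete, here is a symbolic sketch (sympy assumed) on a made-up transform: a mixture of two distributions with PMFs $(1-r)r^k$, $k \ge 0$, using $r_1 = 1/2$, $r_2 = 1/3$ and mixture weights $0.4$ and $0.6$, so that $a_1 = 0.2$ and $a_2 = 0.4$.

```python
import sympy as sp

t = sp.symbols('t')   # t plays the role of e^s
a1, r1 = sp.Rational(2, 10), sp.Rational(1, 2)
a2, r2 = sp.Rational(4, 10), sp.Rational(1, 3)

M = sp.together(a1 / (1 - r1 * t) + a2 / (1 - r2 * t))
print(M)                 # a single ratio A(t)/B(t)
print(sp.apart(M, t))    # partial fractions recover the a_i/(1 - r_i t) terms

expansion = sp.series(M, t, 0, 5).removeO()
for k in range(5):
    lhs = expansion.coeff(t, k)
    rhs = a1 * r1**k + a2 * r2**k     # the claimed PMF: sum_i a_i r_i^k
    print(k, lhs, rhs, sp.simplify(lhs - rhs) == 0)
```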

Sums of Independent Random Variables

  • Transform methods are particularly convenient when dealing with a sum of random variables. The reason is that addition of independent random variables corresponds to multiplication of transforms. This provides an often convenient alternative to the convolution formula.

  • If $X_1 , \ldots , X_n$ is a collection of independent random variables and
    $$Z = X_1+\cdots + X_n$$
    then the transform associated with $Z$ is, by definition,
    $$M_Z(s)=E[e^{sZ}]=E[e^{sX_1}e^{sX_2}\cdots e^{sX_n}]$$
    Since $X_i$ and $X_j$ are independent ($i\neq j$), $e^{sX_i}$ and $e^{sX_j}$ are independent random variables for any fixed value of $s$. Hence, the expectation of their product is the product of the expectations, and
    $$M_Z(s)=E[e^{sX_1}]E[e^{sX_2}]\cdots E[e^{sX_n}]=M_{X_1}(s)M_{X_2}(s)\cdots M_{X_n}(s)$$

Example 4.31. The Transform Associated with the Binomial.

  • Let $X_1, \ldots , X_n$ be independent Bernoulli random variables with a common parameter $p$. Then,
    $$M_{X_i}(s) = (1 - p)e^{0\cdot s} + pe^{1\cdot s} = 1 - p + pe^s,\qquad \text{for all } i$$
  • The random variable $Z = X_1 +\cdots+ X_n$ is binomial with parameters $n$ and $p$. The corresponding transform is given by
    $$M_Z(s)=(1 - p + pe^s)^n$$
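A symbolic sketch (sympy assumed) that expands this transform for $n = 4$ in powers of $t = e^s$ and reads off the binomial PMF from the coefficients:

```python
import sympy as sp

t, p = sp.symbols('t p')
n = 4
M = sp.expand((1 - p + p * t)**n)   # binomial transform, written in t = e^s

for k in range(n + 1):
    # the coefficient of t^k is P(Z = k); e.g. for k = 2 it factors as 6*p**2*(p - 1)**2,
    # i.e. binomial(4, 2) * p**2 * (1 - p)**2
    print(k, sp.factor(M.coeff(t, k)))
```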

Example 4.32. The Sum of Independent Poisson Random Variables is Poisson.

  • Let $X$ and $Y$ be independent Poisson random variables with means $\lambda$ and $\mu$, respectively, and let $Z = X + Y$. Then,
    $$M_X(s)=e^{\lambda(e^s-1)},\qquad M_Y(s)=e^{\mu(e^s-1)}$$
    and
    $$M_Z(s)=M_X(s)M_Y(s)=e^{(\lambda+\mu)(e^s-1)}$$
    Thus, the transform associated with $Z$ is the same as the transform associated with a Poisson random variable with mean $\lambda+\mu$. By the uniqueness property of transforms, $Z$ is Poisson with mean $\lambda+\mu$.

Similarly, the sum of independent normal random variables is normal: the product $e^{(\sigma_1^2 s^2/2)+\mu_1 s}\, e^{(\sigma_2^2 s^2/2)+\mu_2 s} = e^{((\sigma_1^2+\sigma_2^2)s^2/2)+(\mu_1+\mu_2)s}$ is the transform of a normal random variable with mean $\mu_1+\mu_2$ and variance $\sigma_1^2+\sigma_2^2$.

Transforms Associated with Joint Distributions

  • Consider $n$ random variables $X_1, \ldots , X_n$ related to the same experiment. Let $s_1,\ldots,s_n$ be scalar free parameters. The associated multivariate transform is a function of these $n$ parameters and is defined by
    $$M_{X_1,\ldots,X_n}(s_1,\ldots,s_n)=E[e^{s_1X_1+\cdots+s_nX_n}]$$
  • The inversion property of transforms discussed earlier extends to the multivariate case. In particular, if $Y_1, \ldots, Y_n$ is another set of random variables and if $M_{X_1,\ldots,X_n}(s_1,\ldots,s_n) = M_{Y_1,\ldots,Y_n}(s_1,\ldots,s_n)$ for all $(s_1,\ldots,s_n)$ belonging to some $n$-dimensional cube with positive volume, then the joint distribution of $X_1,\ldots,X_n$ is the same as the joint distribution of $Y_1,\ldots,Y_n$.
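A small Monte Carlo sketch of a bivariate transform under assumed distributions: $X_1 \sim N(0,1)$ and $X_2 = X_1 + W$ with $W \sim N(0,1)$ independent of $X_1$. Then $s_1 X_1 + s_2 X_2 = (s_1+s_2)X_1 + s_2 W$ is normal with mean 0 and variance $(s_1+s_2)^2 + s_2^2$, so $M_{X_1,X_2}(s_1,s_2)=\exp\bigl(((s_1+s_2)^2 + s_2^2)/2\bigr)$.

```python
import random, math

random.seed(4)
s1, s2 = 0.3, 0.4
trials = 200_000

empirical = 0.0
for _ in range(trials):
    x1 = random.gauss(0.0, 1.0)
    x2 = x1 + random.gauss(0.0, 1.0)          # dependent pair (X1, X2)
    empirical += math.exp(s1 * x1 + s2 * x2)
empirical /= trials

analytic = math.exp(((s1 + s2)**2 + s2**2) / 2)
print(empirical, analytic)   # both close to 1.38
```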


Reposted from blog.csdn.net/weixin_42437114/article/details/113852111