Chapter 4 (Further Topics on Random Variables): Transforms (Moment Generating Functions)

This post is a set of reading notes on Introduction to Probability.

Transform

  • The transform associated with a random variable $X$ (also referred to as the associated moment generating function) is a function $M_X(s)$ of a scalar parameter $s$, defined by
    $$M_X(s)=E[e^{sX}]$$

The simpler notation $M(s)$ can also be used whenever the underlying random variable $X$ is clear from the context.

  • It is important to realize that the transform is not a number but rather a *function* of a parameter $s$.
  • Strictly speaking, $M(s)$ is only defined for those values of $s$ for which $E[e^{sX}]$ is finite.

Two useful and generic properties of transforms.

  • For any random variable $X$, we have
    $$M_X(0)=E[e^{0\cdot X}]=E[1]=1$$
    and if $X$ takes only nonnegative integer values, then
    $$\lim_{s\rightarrow-\infty}M_X(s)=P(X=0)$$
    (both properties are checked numerically in the sketch below).
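A minimal numerical sketch, assuming $X$ is Poisson with a hypothetical rate $\lambda = 1.5$ (any nonnegative-integer-valued distribution would do):

```python
import math

lam = 1.5  # hypothetical rate parameter, not from the text

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def mgf(s, lam, kmax=60):
    # M_X(s) = E[e^{sX}], computed directly from the PMF (sum truncated at kmax)
    return sum(math.exp(s * k) * poisson_pmf(k, lam) for k in range(kmax))

print(mgf(0.0, lam))      # ~1.0, since M_X(0) = E[1] = 1
print(mgf(-30.0, lam))    # approaches P(X = 0) = e^{-lam} ~ 0.2231
print(math.exp(-lam))     # 0.2231...
```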

Example 4.25. The Transform Associated with a Linear Function of a Random Variable.

  • Let $M_X(s)$ be the transform associated with a random variable $X$. Consider a new random variable $Y = aX + b$. We then have
    $$M_Y(s) = E[e^{s(aX+b)}] = e^{sb} E[e^{saX}] = e^{sb}M_X(sa)$$
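A quick Monte Carlo sanity check of this identity, with illustrative choices not taken from the text ($X$ exponential with parameter $\lambda = 2$, $Y = 3X + 1$, $s = 0.2$), using the exponential transform $\lambda/(\lambda - s)$ derived in Example 4.24 below:

```python
import random, math

random.seed(0)
lam, a, b, s = 2.0, 3.0, 1.0, 0.2   # need s*a < lam for M_X(sa) to be finite

samples = [random.expovariate(lam) for _ in range(200_000)]
empirical = sum(math.exp(s * (a * x + b)) for x in samples) / len(samples)
analytic = math.exp(s * b) * lam / (lam - s * a)   # e^{sb} * M_X(sa)

print(empirical, analytic)   # both close to 1.74
```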

Transforms for Common Random Variables

For reference, the transform pairs derived in the examples that follow are:

  • Bernoulli($p$): $M(s) = 1 - p + pe^s$
  • Binomial($n, p$): $M(s) = (1 - p + pe^s)^n$
  • Geometric($p$): $M(s) = \dfrac{pe^s}{1-(1-p)e^s}$
  • Poisson($\lambda$): $M(s) = e^{\lambda(e^s-1)}$
  • Exponential($\lambda$): $M(s) = \dfrac{\lambda}{\lambda - s}$, for $s < \lambda$
  • Normal($\mu, \sigma^2$): $M(s) = e^{(\sigma^2 s^2/2) + \mu s}$


Example 4.23. The Transform Associated with a Poisson Random Variable.

  • Let $X$ be a Poisson random variable with parameter $\lambda$. The corresponding transform is
    $$M(s)=\sum_{x=0}^\infty e^{sx}\frac{\lambda^x e^{-\lambda}}{x!}$$
  • We let $a = e^s\lambda$ and obtain
    $$M(s)=e^{-\lambda}\sum_{x=0}^\infty \frac{a^x}{x!}=e^{-\lambda}e^a=e^{\lambda(e^s-1)}$$

Example 4.24. The Transform Associated with an Exponential Random Variable.

  • Let $X$ be an exponential random variable with parameter $\lambda$. Then
    $$\begin{aligned}M(s)&=\lambda\int_0^\infty e^{sx}e^{-\lambda x}\,dx=\lambda\int_0^\infty e^{(s-\lambda)x}\,dx \\&=\lambda\,\frac{e^{(s-\lambda)x}}{s-\lambda}\bigg|^\infty_0 \qquad (\text{if } s<\lambda) \\&=\frac{\lambda}{\lambda-s} \qquad (\text{if } s<\lambda)\end{aligned}$$

Example 4.26. The Transform Associated with a Normal Random Variable.

  • Let $X$ be a normal random variable with mean $\mu$ and variance $\sigma^2$.
  • To calculate the corresponding transform, we first consider the special case of the standard normal random variable $Y$, and then use the formula of Example 4.25 for linear functions of a random variable.
    $$\begin{aligned} M_Y(s)&=\int_{-\infty}^\infty\frac{1}{\sqrt{2\pi}}e^{-y^2/2}e^{sy}\,dy \\&=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-y^2/2+sy}\,dy \\&=e^{s^2/2}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-(y-s)^2/2}\,dy \\&=e^{s^2/2} \end{aligned}$$
    where the third equality follows by completing the square ($-y^2/2+sy=-(y-s)^2/2+s^2/2$), and the last integral equals 1 because it is the integral of the PDF of a normal random variable with mean $s$ and unit variance.
  • A general normal random variable with mean $\mu$ and variance $\sigma^2$ is obtained from the standard normal via the linear transformation
    $$X=\sigma Y+\mu$$
    Then
    $$M_X(s)=e^{s\mu}M_Y(s\sigma)=e^{(\sigma^2s^2/2)+\mu s}$$
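A Monte Carlo sketch of the final formula, with illustrative values $\mu = 1$, $\sigma = 0.5$, and $s = 0.7$ (assumed here, not from the text):

```python
import random, math

random.seed(1)
mu, sigma, s = 1.0, 0.5, 0.7

samples = [random.gauss(mu, sigma) for _ in range(200_000)]
empirical = sum(math.exp(s * x) for x in samples) / len(samples)
analytic = math.exp(sigma**2 * s**2 / 2 + mu * s)

print(empirical, analytic)   # both close to 2.14
```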

From Transforms to Moments

Moment generating functions

  • The reason behind the alternative name “moment generating function” is that the moments of a random variable are easily computed once a formula for the associated transform is available.
  • To see this, let us consider a continuous random variable $X$, and let us take the derivative of both sides of the definition with respect to $s$. We obtain
    $$\frac{d}{ds}M(s)=\frac{d}{ds}\int_{-\infty}^\infty e^{sx}f_X(x)\,dx=\int_{-\infty}^\infty xe^{sx}f_X(x)\,dx$$
  • This equality holds for all values of $s$. By considering the special case where $s = 0$, we obtain
    $$\frac{d}{ds}M(s)\bigg|_{s=0}=\int_{-\infty}^\infty xf_X(x)\,dx=E[X]$$
    More generally, if we differentiate the function $M(s)$ $n$ times with respect to $s$, a similar calculation yields
    $$\frac{d^n}{ds^n}M(s)\bigg|_{s=0}=\int_{-\infty}^\infty x^nf_X(x)\,dx=E[X^n]$$

Example 4.27.

  • For an exponential random variable with PDF
    $$f_X(x) =\lambda e^{-\lambda x},\qquad x\geq 0$$
    we have
    $$M(s)=\frac{\lambda}{\lambda -s}$$
  • Thus,
    $$\frac{d}{ds}M(s)=\frac{\lambda}{(\lambda-s)^2},\qquad \frac{d^2}{ds^2}M(s)=\frac{2\lambda}{(\lambda-s)^3}$$
    By setting $s = 0$, we obtain
    $$E[X]=\frac{1}{\lambda},\qquad E[X^2]=\frac{2}{\lambda^2}$$
    (reproduced symbolically in the sketch below).
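A small symbolic check, assuming the sympy library is available:

```python
import sympy as sp

s, lam = sp.symbols('s lam', positive=True)
M = lam / (lam - s)   # transform of the exponential distribution

EX  = sp.simplify(sp.diff(M, s, 1).subs(s, 0))   # first derivative at s = 0
EX2 = sp.simplify(sp.diff(M, s, 2).subs(s, 0))   # second derivative at s = 0
print(EX, EX2)   # 1/lam, 2/lam**2
```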

Inversion of Transforms

Inversion of the moment generating function

Inversion Property

  • The transform $M_X(s)$ associated with a random variable $X$ uniquely determines the CDF of $X$, assuming that $M_X(s)$ is finite for all $s$ in some interval $[-a,a]$, where $a$ is a positive number.

Its proof is beyond our scope.


  • There exist explicit formulas that allow us to recover the PMF or PDF of a random variable starting from the associated transform, but they are quite difficult to use. In practice, transforms are usually inverted by “pattern matching,” based on tables of known distribution-transform pairs.

Example 4.28

  • We are told that the transform associated with a random variable $X$ is
    $$M(s) =\frac{1}{4}e^{-s}+\frac{1}{2}+\frac{1}{8}e^{4s}+\frac{1}{8}e^{5s}$$
  • Since $M(s)$ is a sum of terms of the form $e^{sx}$, we can compare with the general formula
    $$M(s) = \sum_x e^{sx}p_X(x)$$
    and infer that $X$ is a discrete random variable. The different values that $X$ can take are $-1$, $0$, $4$, and $5$, with probabilities
    $$P(X=-1)=\frac{1}{4},\quad P(X=0)=\frac{1}{2},\quad P(X=4)=\frac{1}{8},\quad P(X=5)=\frac{1}{8}$$
  • Generalizing from the last example, the distribution of a finite-valued discrete random variable can always be found by inspection of the corresponding transform.

  • The same procedure also works for discrete random variables with an infinite range, as in the example that follows.

Example 4.29. The Transform Associated with a Geometric Random Variable.

  • We are told that the transform associated with a random variable $X$ is of the form
    $$M(s)=\frac{pe^s}{1-(1-p)e^s}$$
    where $p$ is a constant in the range $0 < p<1$. We wish to find the distribution of $X$. We recall the formula for the geometric series:
    $$\frac{1}{1-\alpha} =1+\alpha+\alpha^2+\cdots$$
    which is valid whenever $|\alpha|< 1$. We use this formula with $\alpha = (1 - p)e^s$, and for $s$ sufficiently close to zero so that $(1 - p)e^s< 1$. We obtain
    $$M(s)=pe^s\bigl(1+(1 - p)e^s+(1 - p)^2e^{2s}+\cdots\bigr)$$
  • As in the previous example, we infer that this is a discrete random variable that takes positive integer values. The probability $P(X = k)$ is found by reading off the coefficient of the term $e^{ks}$. In particular,
    $$P(X = k) = p(1 - p)^{k -1},\qquad k = 1, 2, \ldots$$
    We recognize this as the geometric distribution with parameter $p$ (the expansion is reproduced symbolically below).
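The sketch, assuming sympy is available, writes $t = e^s$ so that the coefficient of $t^k$ is $P(X=k)$:

```python
import sympy as sp

t, p = sp.symbols('t p', positive=True)
M = p * t / (1 - (1 - p) * t)   # geometric transform, written in t = e^s

expansion = sp.series(M, t, 0, 6).removeO()
for k in range(1, 6):
    # each coefficient equals p*(1 - p)**(k - 1) (sympy may print an equivalent form)
    print(k, sp.simplify(expansion.coeff(t, k)))
```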

Example 4.30. The Transform Associated with a Mixture of Two Distributions.

  • Let $X_1 , \ldots , X_n$ be continuous random variables with PDFs $f_{X_1}, \ldots , f_{X_n}$. The value $y$ of a random variable $Y$ is generated as follows: an index $i$ is chosen with a corresponding probability $p_i$, and $y$ is taken to be equal to the value of $X_i$. Then,
    $$f_Y(y) = p_1f_{X_1}(y) +\cdots+ p_n f_{X_n}(y)$$
    and
    $$M_Y(s) =\int_{-\infty}^\infty e^{sy}f_Y(y)\,dy= p_1 M_{X_1}(s)+\cdots+p_n M_{X_n}(s)$$
  • The steps can be reversed. For example, we may be given that the transform associated with a random variable $Y$ is of the form
    $$\frac{1}{2}\cdot\frac{1}{2-s}+\frac{3}{4}\cdot\frac{1}{1-s}$$
    We can then rewrite it as
    $$\frac{1}{4}\cdot\frac{2}{2-s}+\frac{3}{4}\cdot\frac{1}{1-s}$$
    and recognize that $Y$ is the mixture of two exponential random variables with parameters 2 and 1, which are selected with probabilities $1/4$ and $3/4$, respectively.
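A Monte Carlo sketch of this particular mixture (with $s = 0.4$ chosen arbitrarily): draw from an exponential with parameter 2 with probability $1/4$, otherwise from an exponential with parameter 1, and compare the empirical $E[e^{sY}]$ with the transform above.

```python
import random, math

random.seed(2)
s = 0.4   # s < 1 is required for the transform to exist; keep it well below 1 for a stable estimate

trials = 200_000
empirical = 0.0
for _ in range(trials):
    lam = 2.0 if random.random() < 0.25 else 1.0   # pick a mixture component
    empirical += math.exp(s * random.expovariate(lam))
empirical /= trials

analytic = 0.25 * 2 / (2 - s) + 0.75 * 1 / (1 - s)
print(empirical, analytic)   # both close to 1.56
```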

Problem 37.
A pizza parlor serves $n$ different types of pizza, and is visited by a number $K$ of customers in a given period of time, where $K$ is a nonnegative integer random variable with a known associated transform $M_K(s) = E[e^{sK}]$. Each customer orders a single pizza, with all types of pizza being equally likely, independent of the number of other customers and the types of pizza they order. Give a formula, in terms of $M_K(\cdot)$, for the expected number of different types of pizzas ordered.

SOLUTION

  • Let $X$ be the number of different types of pizza ordered, and let $X_i$ be the indicator random variable
    $$X_i = \begin{cases} 1, & \text{if at least one pizza of type } i \text{ is ordered},\\ 0, & \text{otherwise.}\end{cases}$$
  • We have $X = X_1 +\cdots + X_n$, and by the law of iterated expectations,
    $$E[X] = E\bigl[E[X \mid K]\bigr]=E\bigl[E[X_1 +\cdots + X_n\mid K]\bigr]=nE\bigl[E[X_1\mid K]\bigr]$$
  • Furthermore, since the probability that a customer does not order a pizza of type 1 is $(n - 1)/n$, we have
    $$E[X_1\mid K = k] = 1-\Bigl(\frac{n-1}{n}\Bigr)^k$$
    so that
    $$E[X_1\mid K] = 1-\Bigl(\frac{n-1}{n}\Bigr)^K$$
    Thus, denoting
    $$p =\frac{n- 1}{n}$$
    we have
    $$E[X]=nE[1-p^K]=n-nE[p^K]=n-nE[e^{K\log p}]=n-nM_K(\log p)$$
    (a simulation check is given below).
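The sketch uses assumed, purely illustrative choices: $K$ Poisson with mean 5 and $n = 10$ pizza types, so that $M_K(s) = e^{5(e^s - 1)}$ by Example 4.23.

```python
import random, math

random.seed(3)
n, mean_k = 10, 5.0

def sample_poisson(lam):
    # Knuth's method: multiply uniforms until the product drops below e^{-lam}
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

trials = 100_000
total = 0
for _ in range(trials):
    K = sample_poisson(mean_k)
    total += len({random.randrange(n) for _ in range(K)})   # distinct types ordered
simulated = total / trials

# For a Poisson K, M_K(s) = exp(mean_k*(e^s - 1)), so M_K(log((n-1)/n)) = exp(mean_k*((n-1)/n - 1)).
analytic = n - n * math.exp(mean_k * ((n - 1) / n - 1))
print(simulated, analytic)   # both close to 3.93
```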

Problem 40.
Suppose that the transform associated with a discrete random variable $X$ has the form
$$M(s)=\frac{A(e^s)}{B(e^s)}$$
where $A(t)$ and $B(t)$ are polynomials of the generic variable $t$. Assume that $A(t)$ and $B(t)$ have no common roots and that the degree of $A(t)$ is smaller than the degree of $B(t)$. Assume also that $B(t)$ has distinct, real, and nonzero roots that have absolute value greater than $1$. Then it can be seen that $M(s)$ can be written in the form
$$M(s)=\frac{a_1}{1-r_1e^s}+\cdots+\frac{a_m}{1-r_me^s}$$
where $1/r_1, \ldots, 1/r_m$ are the roots of $B(t)$ and the $a_i$ are constants that are equal to $\lim_{e^s\rightarrow 1/r_i}(1 -r_ie^s)M(s)$, $i = 1, \ldots , m$.

  • (a) Show that the PMF of $X$ has the form
    $$P(X=k)=\begin{cases}\displaystyle\sum_{i=1}^m a_i r_i^k, & k\geq 0,\\ 0, & \text{otherwise.}\end{cases}$$
    [Note: For large $k$, the PMF of $X$ can be approximated by $a_{\bar i}\, r_{\bar i}^{\,k}$, where $\bar i$ is the index corresponding to the largest $|r_i|$ (assuming $\bar i$ is unique).]
  • (b) Extend the result of part (a) to the case where $M(s) = e^{bs}A(e^s)/B(e^s)$ and $b$ is an integer.

SOLUTION

  • (a) We have, for all $s$ such that $|r_i|e^s<1$,
    $$\frac{1}{1-r_ie^s}=1+r_ie^s+r_i^2e^{2s}+\cdots$$
    Therefore,
    $$M(s)=\sum_{i=1}^m a_i+\Bigl(\sum_{i=1}^m a_ir_i\Bigr)e^s+\Bigl(\sum_{i=1}^m a_ir_i^2\Bigr)e^{2s}+\cdots$$
    We see that
    $$P(X=k)=\sum_{i=1}^m a_ir_i^k$$
    for $k\geq 0$, and $P(X = k) = 0$ for $k < 0$.
  • (b) In this case, $M(s)$ corresponds to the translation by $b$ of a random variable whose transform is $A(e^s)/B(e^s)$, so we have
    $$P(X=k)=\begin{cases}\displaystyle\sum_{i=1}^m a_i r_i^{\,k-b}, & k\geq b,\\ 0, & \text{otherwise.}\end{cases}$$
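To make part (a) concrete, here is a symbolic sketch (sympy assumed) on a made-up transform: a mixture of two distributions with PMFs $(1-r)r^k$, $k \ge 0$, using $r_1 = 1/2$, $r_2 = 1/3$ and mixture weights $0.4$ and $0.6$, so that $a_1 = 0.2$ and $a_2 = 0.4$.

```python
import sympy as sp

t = sp.symbols('t')   # t plays the role of e^s
a1, r1 = sp.Rational(2, 10), sp.Rational(1, 2)
a2, r2 = sp.Rational(4, 10), sp.Rational(1, 3)

M = sp.together(a1 / (1 - r1 * t) + a2 / (1 - r2 * t))
print(M)                 # a single ratio A(t)/B(t)
print(sp.apart(M, t))    # partial fractions recover the a_i/(1 - r_i t) terms

expansion = sp.series(M, t, 0, 5).removeO()
for k in range(5):
    lhs = expansion.coeff(t, k)
    rhs = a1 * r1**k + a2 * r2**k     # the claimed PMF: sum_i a_i r_i^k
    print(k, lhs, rhs, sp.simplify(lhs - rhs) == 0)
```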

Sums of Independent Random Variables

  • Transform methods are particularly convenient when dealing with a sum of random variables. The reason is that addition of independent random variables corresponds to multiplication of transforms. This provides an often convenient alternative to the convolution formula.

  • If $X_1 , \ldots , X_n$ is a collection of independent random variables and
    $$Z = X_1+\cdots + X_n$$
    then the transform associated with $Z$ is, by definition,
    $$M_Z(s)=E[e^{sZ}]=E[e^{sX_1}e^{sX_2}\cdots e^{sX_n}]$$
    Since $X_i$ and $X_j$ are independent ($i\neq j$), $e^{sX_i}$ and $e^{sX_j}$ are independent random variables for any fixed value of $s$. Hence, the expectation of their product is the product of the expectations, and
    $$M_Z(s)=E[e^{sX_1}]E[e^{sX_2}]\cdots E[e^{sX_n}]=M_{X_1}(s)M_{X_2}(s)\cdots M_{X_n}(s)$$

Example 4.31. The Transform Associated with the Binomial.

  • Let $X_1, \ldots , X_n$ be independent Bernoulli random variables with a common parameter $p$. Then,
    $$M_{X_i}(s) = (1 - p)e^{0\cdot s} + pe^{1\cdot s} = 1 - p + pe^s,\qquad \text{for all } i$$
  • The random variable $Z = X_1 +\cdots+ X_n$ is binomial with parameters $n$ and $p$. The corresponding transform is given by
    $$M_Z(s)=(1 - p + pe^s)^n$$
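A symbolic sketch (sympy assumed) that expands this transform for $n = 4$ in powers of $t = e^s$ and reads off the binomial PMF from the coefficients:

```python
import sympy as sp

t, p = sp.symbols('t p')
n = 4
M = sp.expand((1 - p + p * t)**n)   # binomial transform, written in t = e^s

for k in range(n + 1):
    # the coefficient of t^k is P(Z = k); e.g. for k = 2 it factors as 6*p**2*(p - 1)**2,
    # i.e. binomial(4, 2) * p**2 * (1 - p)**2
    print(k, sp.factor(M.coeff(t, k)))
```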

Example 4.32. The Sum of Independent Poisson Random Variables is Poisson.

  • Let $X$ and $Y$ be independent Poisson random variables with means $\lambda$ and $\mu$, respectively, and let $Z = X + Y$. Then,
    $$M_X(s)=e^{\lambda(e^s-1)},\qquad M_Y(s)=e^{\mu(e^s-1)}$$
    and
    $$M_Z(s)=M_X(s)M_Y(s)=e^{(\lambda+\mu)(e^s-1)}$$
    Thus, the transform associated with $Z$ is the same as the transform associated with a Poisson random variable with mean $\lambda+\mu$. By the uniqueness property of transforms, $Z$ is Poisson with mean $\lambda+\mu$.

Similarly, the sum of independent normal random variables is normal: the product $e^{(\sigma_1^2 s^2/2)+\mu_1 s}\, e^{(\sigma_2^2 s^2/2)+\mu_2 s} = e^{((\sigma_1^2+\sigma_2^2)s^2/2)+(\mu_1+\mu_2)s}$ is the transform of a normal random variable with mean $\mu_1+\mu_2$ and variance $\sigma_1^2+\sigma_2^2$.

Transforms Associated with Joint Distributions

  • Consider $n$ random variables $X_1, \ldots , X_n$ related to the same experiment. Let $s_1,\ldots,s_n$ be scalar free parameters. The associated multivariate transform is a function of these $n$ parameters and is defined by
    $$M_{X_1,\ldots,X_n}(s_1,\ldots,s_n)=E[e^{s_1X_1+\cdots+s_nX_n}]$$
  • The inversion property of transforms discussed earlier extends to the multivariate case. In particular, if $Y_1, \ldots, Y_n$ is another set of random variables and if $M_{X_1,\ldots,X_n}(s_1,\ldots,s_n) = M_{Y_1,\ldots,Y_n}(s_1,\ldots,s_n)$ for all $(s_1,\ldots,s_n)$ belonging to some $n$-dimensional cube with positive volume, then the joint distribution of $X_1,\ldots,X_n$ is the same as the joint distribution of $Y_1,\ldots,Y_n$.
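A small Monte Carlo sketch of a bivariate transform under assumed distributions: $X_1 \sim N(0,1)$ and $X_2 = X_1 + W$ with $W \sim N(0,1)$ independent of $X_1$. Then $s_1 X_1 + s_2 X_2 = (s_1+s_2)X_1 + s_2 W$ is normal with mean 0 and variance $(s_1+s_2)^2 + s_2^2$, so $M_{X_1,X_2}(s_1,s_2)=\exp\bigl(((s_1+s_2)^2 + s_2^2)/2\bigr)$.

```python
import random, math

random.seed(4)
s1, s2 = 0.3, 0.4
trials = 200_000

empirical = 0.0
for _ in range(trials):
    x1 = random.gauss(0.0, 1.0)
    x2 = x1 + random.gauss(0.0, 1.0)          # dependent pair (X1, X2)
    empirical += math.exp(s1 * x1 + s2 * x2)
empirical /= trials

analytic = math.exp(((s1 + s2)**2 + s2**2) / 2)
print(empirical, analytic)   # both close to 1.38
```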


Reposted from blog.csdn.net/weixin_42437114/article/details/113852111