Maximum Likelihood Estimation with Missing Data: Exercise 7.16 of *Statistical Analysis with Missing Data*

I. Problem

a) Maximum likelihood estimation

Let $X$ be Bernoulli with $\text{Pr}(X = 1) = 1 - \text{Pr}(X = 0) = \pi$, and suppose that, given $X = j\ (j=0,1)$, $Y$ is normally distributed with mean $\mu_j$ and variance $\sigma^2$.

For a complete random sample $(x_i,y_i),\ i=1,\dots,n$, find the maximum likelihood (ML) estimates of $(\pi,\mu_0,\mu_1,\sigma^2)$ and of the marginal mean and variance of $Y$.

b) Maximum likelihood estimation with missing data

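Now suppose $X$ is completely observed but $n-r$ values of $Y$ are missing. Use the methods of Chapter 7 to find the ML estimates of the marginal mean and variance of $Y$.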

c) Drawing parameters from the posterior distribution

When the prior distribution has the form $p(\pi,\mu_0,\mu_1,\log\sigma^2) \propto \pi^{1/2}(1-\pi)^{1/2}$, describe how to draw the parameters $(\pi,\mu_0,\mu_1,\sigma^2)$ from their posterior distribution.


II. Solution

a) Maximum likelihood estimation

To write the joint density, start with the density of a single observation. (Inside the normal density, the $\pi$ in $\sqrt{2\pi\sigma^2}$ is the constant $3.14159\ldots$, not the Bernoulli parameter.)

$$
\begin{aligned}
& f(x_i,y_i \mid \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & f(x_i \mid \pi) \cdot f(y_i \mid x_i, \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & \pi^{x_i} (1 - \pi)^{1 - x_i} \cdot \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_1)^2}{2\sigma^2}\right\}\right)^{x_i} \cdot \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_0)^2}{2\sigma^2}\right\}\right)^{1 - x_i}
\end{aligned}
$$

The joint density of the $n$ observations is:

$$
\begin{aligned}
& f(X,Y \mid \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & \prod_{i = 1}^n f(x_i \mid \pi) \cdot f(y_i \mid x_i, \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & \prod_{i = 1}^n \pi^{x_i} (1 - \pi)^{1 - x_i} \cdot \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_1)^2}{2\sigma^2}\right\}\right)^{x_i} \cdot \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_0)^2}{2\sigma^2}\right\}\right)^{1 - x_i}
\end{aligned}
$$

The log-likelihood is:

$$
\begin{aligned}
& \ln f(X,Y \mid \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & \ln \prod_{i = 1}^n f(x_i \mid \pi) \cdot f(y_i \mid x_i, \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & \sum_{i = 1}^n \left\{ x_i \ln \pi + (1 - x_i) \ln (1 - \pi) - \frac{x_i}{2} \ln (2\pi\sigma^2) - \frac{x_i(y_i - \mu_1)^2}{2\sigma^2} - \frac{1-x_i}{2} \ln (2\pi\sigma^2) - \frac{(1-x_i)(y_i - \mu_0)^2}{2\sigma^2} \right\} \\
= {} & \sum_{i = 1}^n x_i \ln \pi + \left(n - \sum_{i = 1}^n x_i\right) \ln (1 - \pi) - \frac{n}{2} \ln (2\pi\sigma^2) - \sum_{i = 1}^n \frac{x_i(y_i - \mu_1)^2}{2\sigma^2} - \sum_{i = 1}^n \frac{(1-x_i)(y_i - \mu_0)^2}{2\sigma^2}
\end{aligned}
$$

Setting each partial derivative of the log-likelihood to $0$ gives the ML estimates:

$$
\begin{aligned}
\frac{\partial \ln f(X,Y \mid \mu_0,\mu_1,\sigma^2,\pi)}{\partial \mu_1} = {} & \frac{\partial \ln f(X,Y \mid \mu_0,\mu_1,\sigma^2,\pi)}{\partial \mu_0} \\
= {} & \frac{\partial \ln f(X,Y \mid \mu_0,\mu_1,\sigma^2,\pi)}{\partial \sigma^2} \\
= {} & \frac{\partial \ln f(X,Y \mid \mu_0,\mu_1,\sigma^2,\pi)}{\partial \pi} \\
= {} & 0
\end{aligned}
$$

Solving these equations gives:

$$
\begin{aligned}
\hat{\pi} = {} & \frac{\sum_{i = 1}^n x_i}{n} \\
\hat{\mu}_0 = {} & \frac{\sum_{i = 1}^n (1 - x_i) y_i}{\sum_{i = 1}^n (1 - x_i)} \\
\hat{\mu}_1 = {} & \frac{\sum_{i = 1}^n x_i y_i}{\sum_{i = 1}^n x_i} \\
\hat{\sigma}^2 = {} & \frac{\sum_{i = 1}^n y_i^2}{n} - \frac{\left[\sum_{i = 1}^n (1 - x_i) y_i\right]^2}{n\sum_{i = 1}^n (1 - x_i)} - \frac{\left(\sum_{i = 1}^n x_i y_i\right)^2}{n\sum_{i = 1}^n x_i}
\end{aligned}
$$

Because $\hat{\sigma}^2$ takes a little more work, its derivation is written out in full:

$$
\begin{aligned}
\frac{\partial \ln f(X,Y \mid \mu_0,\mu_1,\sigma^2,\pi)}{\partial \sigma^2} = {} & 0 \\
-\frac{n}{2\hat{\sigma}^2} + \frac{\sum_{i = 1}^n x_i(y_i - \hat{\mu}_1)^2}{2(\hat{\sigma}^2)^2} + \frac{\sum_{i = 1}^n (1-x_i)(y_i - \hat{\mu}_0)^2}{2(\hat{\sigma}^2)^2} = {} & 0 \\
\sum_{i = 1}^n x_i\left(y_i - \frac{\sum_{i = 1}^n x_i y_i}{\sum_{i = 1}^n x_i}\right)^2 + \sum_{i = 1}^n (1-x_i)\left(y_i - \frac{\sum_{i = 1}^n (1 - x_i) y_i}{\sum_{i = 1}^n (1 - x_i)}\right)^2 = {} & n\hat{\sigma}^2 \\
\frac{\sum_{i = 1}^n y_i^2}{n} - \frac{\left[\sum_{i = 1}^n (1 - x_i) y_i\right]^2}{n\sum_{i = 1}^n (1 - x_i)} - \frac{\left(\sum_{i = 1}^n x_i y_i\right)^2}{n\sum_{i = 1}^n x_i} = {} & \hat{\sigma}^2
\end{aligned}
$$

In other words, $\hat{\sigma}^2$ is the pooled within-group sum of squares divided by $n$.
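The closed-form estimates can be checked numerically. The following is a minimal sketch in Python with NumPy; the function name `complete_data_mle` and the toy data are illustrative assumptions, not part of the exercise. It computes $\hat{\sigma}^2$ as the pooled within-group sum of squares divided by $n$ and compares it with the expanded expression.

```python
import numpy as np

def complete_data_mle(x, y):
    """Closed-form ML estimates (pi, mu0, mu1, sigma2) from complete (x, y) pairs.

    Hypothetical helper for illustration; assumes both groups are non-empty.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    n1 = x.sum()              # number of cases with x = 1
    n0 = n - n1               # number of cases with x = 0
    pi_hat = n1 / n
    mu1_hat = (x * y).sum() / n1
    mu0_hat = ((1 - x) * y).sum() / n0
    # Residuals from each case's own group mean; divide by n, not n - 2.
    resid = y - np.where(x == 1, mu1_hat, mu0_hat)
    sigma2_hat = (resid ** 2).sum() / n
    return pi_hat, mu0_hat, mu1_hat, sigma2_hat

# Toy data (made up for the example).
x = np.array([0, 0, 1, 1, 1])
y = np.array([1.0, 3.0, 4.0, 6.0, 8.0])
pi_hat, mu0_hat, mu1_hat, sigma2_hat = complete_data_mle(x, y)

# Expanded form of sigma2_hat from the derivation.
n = x.size
sigma2_expanded = ((y ** 2).sum() / n
                   - ((1 - x) * y).sum() ** 2 / (n * (1 - x).sum())
                   - (x * y).sum() ** 2 / (n * x.sum()))
print(pi_hat, mu0_hat, mu1_hat, sigma2_hat, sigma2_expanded)
```

Both expressions for $\hat{\sigma}^2$ should agree up to floating-point error.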

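Marginal mean and variance of $Y$:
Summing out the random variable $X$ shows that the marginal distribution of $Y$ is a two-component normal mixture (not a weighted sum of two normal variables):

$$
f(y) = (1-\pi)\, f_0(y) + \pi\, f_1(y)
$$

where $f_0$ and $f_1$ are the densities of

$$
\begin{aligned}
Y_0 \sim {} & N(\mu_0, \sigma^2) \\
Y_1 \sim {} & N(\mu_1, \sigma^2)
\end{aligned}
$$

The mean follows from the law of total expectation, and the variance from the law of total variance, $Var(Y) = E[Var(Y \mid X)] + Var(E[Y \mid X])$:

$$
EY = (1-\pi) \mu_0 + \pi \mu_1
$$

$$
Var(Y) = \sigma^2 + \pi(1-\pi)(\mu_1 - \mu_0)^2
$$

By the invariance property of ML estimation, the estimate of the marginal mean of $Y$ is:

$$
(1-\hat{\pi}) \hat{\mu}_0 + \hat{\pi} \hat{\mu}_1
$$

and the estimate of the marginal variance is:

$$
\hat{\sigma}^2 + \hat{\pi}(1-\hat{\pi})(\hat{\mu}_1 - \hat{\mu}_0)^2
$$

Plug in the parameter estimates obtained above.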
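As a sanity check, the law of total variance gives $Var(Y) = \sigma^2 + \pi(1-\pi)(\mu_1-\mu_0)^2$ for this mixture. With complete data, the plug-in ML estimates of the marginal mean and variance should then reproduce the sample mean and the sample variance with divisor $n$. The sketch below verifies this on made-up data; `marginal_moments` is a hypothetical helper name.

```python
import numpy as np

def marginal_moments(pi, mu0, mu1, sigma2):
    """Marginal mean and variance of Y for the two-group normal mixture."""
    mean = (1 - pi) * mu0 + pi * mu1
    # Within-group variance plus between-group variance.
    var = sigma2 + pi * (1 - pi) * (mu1 - mu0) ** 2
    return mean, var

# Toy complete data (made up for the example).
x = np.array([0, 0, 1, 1, 1])
y = np.array([1.0, 3.0, 4.0, 6.0, 8.0])

# Complete-data ML estimates.
pi_hat = x.mean()
mu1_hat = y[x == 1].mean()
mu0_hat = y[x == 0].mean()
resid = y - np.where(x == 1, mu1_hat, mu0_hat)
sigma2_hat = (resid ** 2).mean()

mean_hat, var_hat = marginal_moments(pi_hat, mu0_hat, mu1_hat, sigma2_hat)
# np.var uses divisor n by default (ddof=0).
print(mean_hat, y.mean(), var_hat, np.var(y))
```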

b) Maximum likelihood estimation with missing data

Assume the missing-data mechanism is ignorable, and order the cases so that $y_1,\dots,y_r$ are observed and $y_{r+1},\dots,y_n$ are missing. The joint density of the observed data is:

$$
\begin{aligned}
& f(X,Y_{obs} \mid \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & \prod_{i = 1}^r f(x_i, y_i \mid \mu_0,\mu_1,\sigma^2,\pi) \cdot \prod_{i = r+1}^n f(x_i \mid \pi) \\
= {} & \prod_{i = 1}^r f(x_i \mid \pi) f(y_i \mid x_i, \mu_0,\mu_1,\sigma^2,\pi) \cdot \prod_{i = r+1}^n f(x_i \mid \pi) \\
= {} & \prod_{i = 1}^n f(x_i \mid \pi) \cdot \prod_{i = 1}^r f(y_i \mid x_i, \mu_0,\mu_1,\sigma^2,\pi) \\
= {} & \prod_{i = 1}^n \pi^{x_i}(1 - \pi)^{1 - x_i} \cdot \prod_{i = 1}^r \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_1)^2}{2\sigma^2}\right\}\right)^{x_i} \cdot \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_0)^2}{2\sigma^2}\right\}\right)^{1 - x_i}
\end{aligned}
$$

As before, take the logarithm and set the partial derivatives to $0$. The likelihood factors into a part for $\pi$ that uses all $n$ cases and a part for $(\mu_0,\mu_1,\sigma^2)$ that uses only the $r$ cases with $Y$ observed, so:

$$
\begin{aligned}
\hat{\pi} = {} & \frac{\sum_{i = 1}^n x_i}{n} \\
\hat{\mu}_1 = {} & \frac{\sum_{i = 1}^r x_i y_i}{\sum_{i = 1}^r x_i} \\
\hat{\mu}_0 = {} & \frac{\sum_{i = 1}^r (1 - x_i) y_i}{\sum_{i = 1}^r (1 - x_i)} \\
\hat{\sigma}^2 = {} & \frac{\sum_{i = 1}^r y_i^2}{r} - \frac{\left[\sum_{i = 1}^r (1 - x_i) y_i\right]^2}{r\sum_{i = 1}^r (1 - x_i)} - \frac{\left(\sum_{i = 1}^r x_i y_i\right)^2}{r\sum_{i = 1}^r x_i}
\end{aligned}
$$

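As in the complete-data case, the ML estimate of the marginal mean of $Y$ is:

$$
(1-\hat{\pi}) \hat{\mu}_0 + \hat{\pi} \hat{\mu}_1
$$

and, by the law of total variance, the ML estimate of the marginal variance is:

$$
\hat{\sigma}^2 + \hat{\pi}(1-\hat{\pi})(\hat{\mu}_1 - \hat{\mu}_0)^2
$$

Plug in the missing-data ML estimates above. Note that $\hat{\pi}$ uses all $n$ cases, while $\hat{\mu}_0$, $\hat{\mu}_1$ and $\hat{\sigma}^2$ use only the $r$ cases with $Y$ observed.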
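The missing-data estimates can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions: missing $Y$ values are coded as `np.nan`, the missing-data mechanism is taken to be ignorable, both groups contain at least one observed $Y$, and the marginal variance uses the law-of-total-variance form $\sigma^2 + \pi(1-\pi)(\mu_1-\mu_0)^2$. The function name `mle_with_missing_y` and the data are made up.

```python
import numpy as np

def mle_with_missing_y(x, y):
    """ML estimates when x is fully observed and some y are np.nan.

    Hypothetical helper: pi uses all n cases; mu0, mu1, sigma2 use only
    the r cases with y observed.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    obs = ~np.isnan(y)            # mask of cases with y observed
    r = obs.sum()
    pi_hat = x.mean()             # all n cases contribute to pi
    xo, yo = x[obs], y[obs]
    mu1_hat = yo[xo == 1].mean()
    mu0_hat = yo[xo == 0].mean()
    resid = yo - np.where(xo == 1, mu1_hat, mu0_hat)
    sigma2_hat = (resid ** 2).sum() / r
    mean_y = (1 - pi_hat) * mu0_hat + pi_hat * mu1_hat
    var_y = sigma2_hat + pi_hat * (1 - pi_hat) * (mu1_hat - mu0_hat) ** 2
    return {"pi": pi_hat, "mu0": mu0_hat, "mu1": mu1_hat,
            "sigma2": sigma2_hat, "mean_y": mean_y, "var_y": var_y}

# Toy data: n = 8 cases, r = 5 with y observed (made up for the example).
x = np.array([0, 0, 1, 1, 1, 1, 0, 1])
y = np.array([1.0, 3.0, 4.0, 6.0, 8.0, np.nan, np.nan, np.nan])
est = mle_with_missing_y(x, y)
print(est)
```

In this toy example the incomplete cases shift $\hat{\pi}$ from $3/5$ to $5/8$, which changes the estimated marginal mean even though the group means are unchanged.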

c) Drawing parameters from the posterior distribution

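The posterior distribution is proportional to the observed-data likelihood times the prior. The prior factor $\pi^{1/2}(1-\pi)^{1/2}$ enters once, not once per observation:

$$
\begin{aligned}
& f(\pi,\mu_0,\mu_1,\log\sigma^2 \mid X,Y_{obs}) \\
\propto {} & f(X,Y_{obs} \mid \mu_0,\mu_1,\sigma^2,\pi) \cdot p(\pi,\mu_0,\mu_1,\log\sigma^2) \\
\propto {} & \pi^{\frac{1}{2}}(1-\pi)^{\frac{1}{2}} \prod_{i = 1}^n \pi^{x_i}(1 - \pi)^{1 - x_i} \cdot \prod_{i = 1}^r \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_1)^2}{2\sigma^2}\right\}\right)^{x_i} \cdot \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_0)^2}{2\sigma^2}\right\}\right)^{1 - x_i} \\
= {} & \pi^{\sum_{i=1}^n x_i + \frac{1}{2}}(1 - \pi)^{n - \sum_{i=1}^n x_i + \frac{1}{2}} \cdot \prod_{i = 1}^r \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_1)^2}{2\sigma^2}\right\}\right)^{x_i} \cdot \left(\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{(y_i - \mu_0)^2}{2\sigma^2}\right\}\right)^{1 - x_i}
\end{aligned}
$$

The posterior therefore factors: $\pi$ is independent of $(\mu_0,\mu_1,\sigma^2)$ and follows a Beta distribution, while $(\mu_0,\mu_1,\sigma^2)$ have the usual normal and scaled inverse chi-squared form based on the $r$ cases with $Y$ observed.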

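Following the approach on page 141 of the book, draws of the parameters, and hence of any function $g_d$ of them, can be generated as follows: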

  1. Draw $b_{t}$ from a Beta distribution with parameters $(\sum_{i = 1}^n x_i+\frac{3}{2},\ n-\sum_{i = 1}^n x_i+\frac{3}{2})$; draw $x_{t}$ from a chi-squared distribution with $r-2$ degrees of freedom (one degree of freedom is lost for each of the two group means, and only the $r$ observed values of $Y$ contribute); draw independent $z_0$ and $z_1$ from the standard normal distribution.
  2. Compute $\phi^{(d)} = (\pi^{(d)}, \mu_1^{(d)}, \mu_0^{(d)}, {\sigma^2}^{(d)})$, where $\hat{\sigma}^2, \hat{\mu}_1, \hat{\mu}_0$ are the estimates from part b):
    $$
    \begin{aligned}
    \pi^{(d)} &= b_{t}\\
    {\sigma^2}^{(d)} &= r \hat{\sigma}^2 / x_t\\
    \mu_1^{(d)} &= \hat{\mu}_1 + z_0 \left({\sigma^2}^{(d)} \Big/ \sum_{i = 1}^r x_i\right)^{1/2}\\
    \mu_0^{(d)} &= \hat{\mu}_0 + z_1 \left({\sigma^2}^{(d)} \Big/ \sum_{i = 1}^r (1-x_i)\right)^{1/2}
    \end{aligned}
    $$
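The two steps can be sketched with NumPy's random generator. This is a minimal illustration, not the book's code. It assumes the stated prior $\pi^{1/2}(1-\pi)^{1/2}$, flat in $(\mu_0,\mu_1,\log\sigma^2)$, which yields a Beta$(\sum x_i + 3/2,\ n - \sum x_i + 3/2)$ posterior for $\pi$ and a $\chi^2_{r-2}$ draw for scaling $r\hat{\sigma}^2$. It also assumes missing $Y$ values are coded as `np.nan`. The function name `draw_posterior` and the data are made up.

```python
import numpy as np

def draw_posterior(x, y, n_draws, rng):
    """Draw (pi, mu0, mu1, sigma2) from the posterior; y may contain np.nan.

    Hypothetical helper; requires r > 2 and at least one observed y per group.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    obs = ~np.isnan(y)
    n, r = x.size, obs.sum()
    xo, yo = x[obs], y[obs]
    r1, r0 = xo.sum(), (1 - xo).sum()      # observed-y counts per group
    mu1_hat = yo[xo == 1].mean()
    mu0_hat = yo[xo == 0].mean()
    resid = yo - np.where(xo == 1, mu1_hat, mu0_hat)
    sigma2_hat = (resid ** 2).sum() / r

    # Step 1: raw draws.
    n1 = x.sum()
    b = rng.beta(n1 + 1.5, n - n1 + 1.5, size=n_draws)
    chi2 = rng.chisquare(r - 2, size=n_draws)
    z0 = rng.standard_normal(n_draws)
    z1 = rng.standard_normal(n_draws)

    # Step 2: transform to parameter draws.
    sigma2 = r * sigma2_hat / chi2
    mu1 = mu1_hat + z0 * np.sqrt(sigma2 / r1)
    mu0 = mu0_hat + z1 * np.sqrt(sigma2 / r0)
    return {"pi": b, "mu0": mu0, "mu1": mu1, "sigma2": sigma2}

rng = np.random.default_rng(0)
x = np.array([0, 0, 1, 1, 1, 1, 0, 1])
y = np.array([1.0, 3.0, 4.0, 6.0, 8.0, np.nan, np.nan, np.nan])
draws = draw_posterior(x, y, n_draws=20000, rng=rng)

# Any function of the parameters can be evaluated draw by draw,
# e.g. the marginal mean of Y.
mean_y_draws = (1 - draws["pi"]) * draws["mu0"] + draws["pi"] * draws["mu1"]
```

Summaries such as `np.median(mean_y_draws)` or `np.percentile(mean_y_draws, [2.5, 97.5])` then give posterior point and interval estimates for the marginal mean.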

Reposted from blog.csdn.net/weixin_41929524/article/details/84643845