Deriving the Solution of the Linear Regression Model

Derivation of the linear regression model
Linear fitting model (first with two features, then in general form with $x_0 = 1$ so that $\theta_0$ is the intercept):
$$
\begin{aligned}
h_\theta(x) &= \theta_0 + \theta_1 x_1 + \theta_2 x_2 \\
h_\theta(x) &= \sum_{i=0}^{n} \theta_i x_i = \theta^T x \quad \cdots ①
\end{aligned}
$$
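As a small illustration of equation ①, the sketch below evaluates $h_\theta(x) = \theta^T x$ for a batch of samples. The function name `predict` and the toy numbers are assumptions made for this example; the only idea taken from the text is that a constant feature $x_0 = 1$ is prepended so that $\theta_0$ acts as the intercept.

```python
import numpy as np

def predict(theta, X):
    """Return h_theta(x) = theta^T x for every row of X (X has no bias column)."""
    # Prepend x_0 = 1 to each sample so that theta[0] plays the role of theta_0.
    X_aug = np.column_stack([np.ones(len(X)), X])
    return X_aug @ theta

# Hypothetical parameters (theta_0, theta_1, theta_2) and two samples with two features.
theta = np.array([1.0, 2.0, 3.0])
X = np.array([[1.0, 1.0],
              [2.0, 0.0]])
predictions = predict(theta, X)
```

Writing the model as a single matrix product is what later allows the cost function and its gradient to be expressed in terms of $X$ and $y$.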

Error: the difference $\varepsilon$ between the true value and the predicted value.
For each sample:
$$
y_i = \theta^T x_i + \varepsilon_i \quad \cdots ②
$$

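Assumption: the errors $\varepsilon_i$ are independent and identically distributed, and follow a Gaussian distribution with mean 0 and variance $\sigma^2$.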

Because the error follows a Gaussian distribution, its density is:
$$
p(\varepsilon_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\varepsilon_i^2}{2\sigma^2}\right)
$$

Substituting $\varepsilon_i = y_i - \theta^T x_i$ (from ②) into the Gaussian density:
$$
p(y_i \mid x_i; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right)
$$

Use the likelihood function to solve for the optimal parameters.
Likelihood function:
$$
L(\theta) = \prod_{i=1}^{m} p(y_i \mid x_i; \theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right)
$$

Taking the logarithm gives the log-likelihood:
$$
\begin{aligned}
\log L(\theta) &= \log \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right) \\
&= \sum_{i=1}^{m} \log \frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{\sigma^2} \cdot \frac{1}{2} \sum_{i=1}^{m} (y_i - \theta^T x_i)^2
\end{aligned}
$$
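The step from the product to the sum can be checked numerically. The sketch below is only an illustration under assumed toy data: the function names, the design matrix `X` (which already contains the column of ones), and the choice $\sigma = 1$ are all assumptions made for this example. It evaluates the log-likelihood once as the log of the product of densities and once in the expanded form.

```python
import numpy as np

def log_likelihood_from_product(theta, X, y, sigma):
    """log of the product of Gaussian densities p(y_i | x_i; theta)."""
    resid = y - X @ theta
    dens = np.exp(-resid**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return np.log(np.prod(dens))

def log_likelihood_expanded(theta, X, y, sigma):
    """Expanded form: m * log(1 / (sqrt(2*pi)*sigma)) - (1/sigma^2) * (1/2) * sum of squared residuals."""
    resid = y - X @ theta
    m = len(y)
    return m * np.log(1.0 / (np.sqrt(2 * np.pi) * sigma)) - 0.5 * np.sum(resid**2) / sigma**2

# Toy data generated exactly by y = 1 + 2*x (first column of X is x_0 = 1).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
theta_good = np.array([1.0, 2.0])   # zero residuals
theta_bad = np.array([0.0, 0.0])    # large residuals

ll_good = log_likelihood_expanded(theta_good, X, y, sigma=1.0)
ll_bad = log_likelihood_expanded(theta_bad, X, y, sigma=1.0)
```

With these numbers the parameter vector with the smaller sum of squared residuals has the larger log-likelihood, which is the relationship the derivation relies on next. For large $m$ the direct product can underflow, which is one practical reason to work with the expanded form.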

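Goal: maximize the likelihood. The first term of the log-likelihood does not depend on $\theta$, and the second term enters with a negative sign, so maximizing $\log L(\theta)$ is equivalent to minimizing the sum of squared errors, i.e. the least squares method.
Let:
$$
J(\theta) = \frac{1}{2} \sum_{i=1}^{m} (y_i - \theta^T x_i)^2
$$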
Simplifying, with $X$ the matrix whose $i$-th row is $x_i^T$ and $y$ the vector of targets:
$$
J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left(h_\theta(x_i) - y_i\right)^2 = \frac{1}{2} (X\theta - y)^T (X\theta - y)
$$
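To make the equivalence between the per-sample sum and the matrix form concrete, here is a minimal sketch that computes the cost both ways. The function names and the toy data are assumptions for illustration; `X` is assumed to already contain the column of ones, and each residual is squared before summing.

```python
import numpy as np

def cost_sum(theta, X, y):
    """J(theta) as an explicit sum of squared residuals over the samples."""
    return 0.5 * sum((x_i @ theta - y_i) ** 2 for x_i, y_i in zip(X, y))

def cost_matrix(theta, X, y):
    """J(theta) in matrix form: 1/2 * (X theta - y)^T (X theta - y)."""
    r = X @ theta - y
    return 0.5 * (r @ r)

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
theta = np.array([0.0, 0.0])

j_sum = cost_sum(theta, X, y)
j_mat = cost_matrix(theta, X, y)
```

Both functions return the same value; the matrix form is the one that is convenient to differentiate.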

Taking the gradient with respect to $\theta$:
$$
\begin{aligned}
\nabla_\theta J(\theta) &= \nabla_\theta \left( \frac{1}{2} (X\theta - y)^T (X\theta - y) \right) \\
&= \nabla_\theta \left( \frac{1}{2} (\theta^T X^T - y^T)(X\theta - y) \right) \\
&= \nabla_\theta \left( \frac{1}{2} (\theta^T X^T X \theta - \theta^T X^T y - y^T X \theta + y^T y) \right) \\
&= \frac{1}{2} \left( 2 X^T X \theta - X^T y - (y^T X)^T \right) \\
&= X^T X \theta - X^T y
\end{aligned}
$$
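One way to gain confidence in the final expression $X^T X \theta - X^T y$ is to compare it with a finite-difference approximation of the gradient. The sketch below does this on randomly generated data; the helper names, the random seed, and the step size `eps` are assumptions chosen for illustration.

```python
import numpy as np

def cost(theta, X, y):
    """J(theta) = 1/2 * (X theta - y)^T (X theta - y)."""
    r = X @ theta - y
    return 0.5 * (r @ r)

def grad(theta, X, y):
    """Analytic gradient from the derivation: X^T X theta - X^T y."""
    return X.T @ X @ theta - X.T @ y

def numeric_grad(f, theta, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at theta."""
    g = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        g[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
theta = rng.normal(size=3)

analytic = grad(theta, X, y)
numeric = numeric_grad(lambda t: cost(t, X, y), theta)
```

Because $J$ is quadratic in $\theta$, the central difference agrees with the analytic gradient up to floating-point rounding.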

$J(\theta)$ attains its minimum where the gradient is zero. Assuming $X^T X$ is invertible, this gives $\theta = (X^T X)^{-1} X^T y$.
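The closed-form solution can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function name `fit_normal_equation` and the toy data are assumptions, and the code solves the linear system $X^T X \theta = X^T y$ with `np.linalg.solve` rather than forming the inverse explicitly, which is numerically preferable but mathematically the same when $X^T X$ is invertible.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Solve (X^T X) theta = X^T y for theta; assumes X^T X is invertible."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy data generated exactly by y = 1 + 2*x; the first column of X is x_0 = 1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

theta_hat = fit_normal_equation(X, y)

# Cross-check against NumPy's general least-squares solver.
theta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
```

When $X^T X$ is singular or badly conditioned (for example with duplicated features), `np.linalg.solve` may fail or be inaccurate, and a least-squares routine such as `np.linalg.lstsq` is the more robust choice.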

Evaluation
The most commonly used metric is $R^2$: $1 - \frac{\text{residual sum of squares}}{\text{total sum of squares}}$, which measures how much of the variation in the dependent variable is explained by the model.
$$
R^2 = 1 - \frac{\sum_{i=1}^{m} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2}
$$
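The $R^2$ formula translates directly into code. The sketch below uses an assumed function name `r_squared` and toy values; it shows the two reference points of the metric: a perfect prediction gives 1, and always predicting the mean $\bar{y}$ gives 0.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])

r2_perfect = r_squared(y_true, y_true.copy())                       # predictions equal the targets
r2_mean = r_squared(y_true, np.full_like(y_true, y_true.mean()))    # always predict the mean
```

The function assumes the targets are not all identical; otherwise the total sum of squares is zero and the ratio is undefined.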

Reposted from blog.csdn.net/rankiy/article/details/103381376