Machine Learning: Principles and Formula Derivation of the Perceptron Learning Algorithm

Perceptron Learning Algorithm

I. Condition

The sample set is linearly separable.

II. Principle

Find a hyperplane (a straight line in two dimensions) that separates the two classes of samples.

$h(x_{i1},x_{i2},\cdots,x_{id}) = \text{sign}\left(\sum\limits_{j=1}^d w_j x_{ij} - \theta\right)$, $i=1,2,\cdots,n$

$w_j$ can be viewed as the weight of a biological neuron, $x_{ij}$ as the stimulus it receives, and $\theta$ as the threshold. When $\sum\limits_{j=1}^d w_j x_{ij} > \theta$ the neuron is excited; when $\sum\limits_{j=1}^d w_j x_{ij} < \theta$ it is inhibited. This is why the method is called the perceptron learning algorithm.

Let $x_{i0}=1$ and $w_0=-\theta$, $i=1,2,\cdots,n$, so that the threshold is absorbed into the weight vector. Then $\vec x_i = \begin{bmatrix} 1&x_{i1}&x_{i2}&\cdots&x_{id} \end{bmatrix}^T$, $\vec w = \begin{bmatrix} -\theta&w_1&w_2&\cdots&w_d \end{bmatrix}^T$, and $h(\vec x_i)=\text{sign}(\vec w^T\cdot\vec x_i)$, since $\vec w^T\cdot\vec x_i = \sum\limits_{j=1}^d w_j x_{ij} - \theta$.
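The equivalence of the two forms of $h$ can be checked numerically. The following is a minimal sketch, not taken from the original post: the function names (`sign`, `predict_threshold`, `augment`, `predict_augmented`) and the sample numbers are hypothetical, chosen only for illustration.

```python
def sign(z):
    """Return +1 for z > 0, -1 for z < 0, and 0 exactly on the boundary."""
    return (z > 0) - (z < 0)


def predict_threshold(w, theta, x):
    """Original form: sign(sum_j w_j * x_j - theta)."""
    return sign(sum(wj * xj for wj, xj in zip(w, x)) - theta)


def augment(x):
    """Prepend the constant component x_0 = 1 to a sample."""
    return [1.0] + list(x)


def predict_augmented(w_aug, x_aug):
    """Augmented form: sign(w^T x) with w_0 = -theta and x_0 = 1."""
    return sign(sum(wj * xj for wj, xj in zip(w_aug, x_aug)))


# Hypothetical weights, threshold, and sample (d = 2).
w, theta = [2.0, -1.0], 0.5
w_aug = [-theta] + w          # w_0 = -theta absorbs the threshold
x = [1.0, 3.0]

label_a = predict_threshold(w, theta, x)
label_b = predict_augmented(w_aug, augment(x))
```

Both calls evaluate the same quantity, $2 \cdot 1 - 1 \cdot 3 - 0.5 = -1.5$, so they assign the sample to the same class.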

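  1. Construct the loss function

    1. $L(h) = \sum\limits_{i=1}^n\mathbb{I}(h(\vec x_i) \neq y_i)$

      This is the number of samples misclassified under the current hypothesis. However, the function is discontinuous, which makes it hard to optimize with analytical methods.
    2. $L(\vec w) = -\sum\limits_{\vec x\in M} y\,\vec w^T \cdot\vec x$, where $M$ denotes the set of misclassified samples and $y$ is the label of $\vec x$

      When a sample is misclassified, $y$ and $\vec w^T\cdot\vec x$ have opposite signs, so no term of the sum is negative and $L(\vec w) \geq 0$, with $L(\vec w) = 0$ when no sample is misclassified.
  2. Find the hypothesis $h$ for which the loss function attains its minimum value 0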
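The two loss functions and the search for a zero-loss hypothesis can be sketched as follows. The text above does not state an update rule, so this sketch assumes the classic perceptron update $\vec w \leftarrow \vec w + y\,\vec x$ applied to each misclassified sample (a stochastic gradient step on the second loss with step size 1). It also assumes labels $y \in \{+1, -1\}$ and samples already augmented with $x_0 = 1$. All function names and the toy data set (a logical AND) are hypothetical.

```python
def dot(w, x):
    """Inner product w^T x."""
    return sum(wj * xj for wj, xj in zip(w, x))


def is_misclassified(w, x, y):
    """True when sign(w^T x) != y, i.e. y * w^T x <= 0 for y in {+1, -1}."""
    return y * dot(w, x) <= 0


def zero_one_loss(w, X, Y):
    """First loss: the number of misclassified samples."""
    return sum(1 for x, y in zip(X, Y) if is_misclassified(w, x, y))


def perceptron_loss(w, X, Y):
    """Second loss: -sum of y * w^T x over the misclassified samples."""
    return -sum(y * dot(w, x) for x, y in zip(X, Y) if is_misclassified(w, x, y))


def train(X, Y, max_epochs=100):
    """Assumed update rule: w <- w + y * x for each misclassified sample."""
    w = [0.0] * len(X[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(X, Y):
            if is_misclassified(w, x, y):
                w = [wj + y * xj for wj, xj in zip(w, x)]
                mistakes += 1
        if mistakes == 0:      # a full pass without errors: loss is 0
            break
    return w


# Toy linearly separable data: logical AND, each sample augmented with x_0 = 1.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
Y = [-1, -1, -1, 1]

w_trial = [0, 1, 1]            # an arbitrary, poorly chosen weight vector
w_final = train(X, Y)
```

For `w_trial` the first loss counts three misclassified samples while the second loss sums their margins; for `w_final` both losses are 0 on this data. The stopping test relies on the linear separability condition from section I; on non-separable data the loop would simply run until `max_epochs`.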

Reposted from blog.csdn.net/qq_52554169/article/details/130888793