Neural Network (3): Supervised Learning, Perceptron and Decision Plane

1. Supervised Learning

1.1 Data and Goal

Data: A set of data records (also called examples, instances or cases) described by

  • k attributes: A1, A2, … Ak.
  • a class: Each example is labelled with a pre-defined class

Goal: To learn a classification model from the data that can be used to predict the classes of new (future, or test) cases/instances.
In other words: build a classification model from existing data and use it to predict the classes of unseen data.
Example: bank loan data, where each row is one data record.
(figure: a table of bank loan records, each labelled with a pre-defined class)
The classification mechanism built from these records is then used, for example, to decide whether a new loan application should be approved.
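A minimal sketch of this learn-then-predict workflow is shown below. The toy records, their numeric encoding, and the choice of a scikit-learn decision tree are illustrative assumptions; the original only describes the general idea of learning a model and applying it to a new case.

```python
# A hypothetical, minimal learn-then-predict sketch. The attribute
# encoding and the choice of classifier are assumptions for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each record: [age, has_job, owns_house, credit_rating], with a
# pre-defined class label (1 = approve the loan, 0 = reject it).
X_train = [
    [25, 0, 0, 1],
    [40, 1, 1, 2],
    [35, 1, 0, 2],
    [50, 0, 1, 0],
    [23, 0, 0, 0],
]
y_train = [0, 1, 1, 1, 0]

# Learning step: build the classification model from labelled data.
model = DecisionTreeClassifier().fit(X_train, y_train)

# Prediction step: classify a new (unseen) application.
print(model.predict([[30, 1, 0, 1]]))
```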

1.2 Supervised vs. Unsupervised Learning

Supervised learning:

  1. Every data record is given a pre-defined class label.
  2. Test data are classified into these same classes.
  3. The goal is to build a classification mechanism and use it to classify new data.

Unsupervised learning:

  1. The class labels of the data are unknown.
  2. The goal is to discover latent classes or clusters in the data and the relationships among them.

1.3 The Supervised Learning Process

Two steps:

  1. Learning: Learn a model using the training data.
  2. Testing: Test the model using unseen test data to assess the model accuracy.

One definition of accuracy:

Accuracy = number of correct classifications / total number of test cases
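As a concrete check of this definition, a small sketch (the label lists are made-up values):

```python
# Accuracy = correct classifications / total number of test cases.
def accuracy(y_true, y_pred):
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# 3 of the 4 predictions match the true labels -> 0.75
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```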

Assumption about learning:
The distribution of training examples is identical to the distribution of test examples (including future unseen examples).
That is, the training (learning) data and the test data should be collected under the same conditions, so that they share the same distribution (same mean and spread).

  • In practice, this assumption is often violated to a certain degree.
  • Strong violations will clearly result in poor classification accuracy.
  • To achieve good accuracy on the test data, training examples must be sufficiently representative of the test data.

If the training data cannot sufficiently represent the test data, the classification results will be poor.

2. Perceptron

Perceptrons are neural networks that change with “experience” using an error-correcting rule.
According to the rule, the weight of a response unit changes when it makes an erroneous response to stimuli presented to the network.

2.1 The Simplest Perceptron (Two-Layer)

2.1.1 Basic Concepts

(figure: a two-layer perceptron, with every input unit connected to every output unit)

  • The two layers are fully interconnected.
    Although every output unit is connected to all the inputs, the outputs may still differ from one another because the connection weights differ.
  • Processing elements of the perceptron are the abstract neurons.

2.1.2 Computing the Weighted Sum (Instant State)

The weighted sum, or instant state, of output unit j is

S_j = Σ_{i=0}^{n} w_ji · a_i

  • There is a special bias input unit, number 0, in the input layer.
  • The bias unit always produces an input of the fixed value +1.
  • The input of the bias unit functions as a constant term in the sum.
  • The bias unit's connection to output unit j has a weight w_j0, adjusted in the same way as all the other weights (see the sketch after this list).
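A small sketch of this computation (the concrete numbers are made up; NumPy is used for the dot product):

```python
import numpy as np

# Inputs a_1..a_n of one training pattern (illustrative values).
a = np.array([1.0, 0.5, -0.3])
# Weights w_j0..w_jn of output unit j; w[0] is the bias weight.
w = np.array([0.2, 0.4, -0.1, 0.7])

# Prepend the bias unit's constant input a_0 = +1.
a_with_bias = np.concatenate(([1.0], a))

# Instant state: S_j = sum_i w_ji * a_i, bias included as the i = 0 term.
S = np.dot(w, a_with_bias)
print(S)  # 0.2*1 + 0.4*1.0 - 0.1*0.5 + 0.7*(-0.3) = 0.34
```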

2.1.3 Output

• The output value of the output unit j depends on whether the weighted sum is above or below the unit’s threshold value θ_j:

  X_j = 1 if S_j ≥ θ_j, and X_j = 0 otherwise (see the sketch below).

• The outputs of all output units together form the output vector X = (X_1, …, X_n).
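The threshold activation itself is a one-liner; a sketch, assuming a threshold of 0 and outputs in {0, 1} as above:

```python
def step(S, threshold=0.0):
    """Step activation: fire (1) iff the weighted sum reaches the threshold."""
    return 1 if S >= threshold else 0

print(step(0.34))   # 1 -> the unit fires
print(step(-0.20))  # 0 -> the unit stays silent
```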

2.1.4 Perceptron Learning

  • Weights w_ji of the connections between the two layers are changed according to the perceptron learning rule.
  • The process of weight adjustment is called perceptron learning (or training).
  • Error of output unit j on a training pattern: e_j = t_j − X_j, the difference between the target output t_j and the actual output X_j.

The perceptron learning rule works in two steps:

  1. Compute the error e_j = t_j − X_j of each output unit on the current training pattern.
  2. Adjust each weight in proportion to the error and the input it carries: w_ji ← w_ji + C · e_j · a_i, where C is the learning rate.

The error-correction rule is the foundation of learning; it is also what makes a perceptron a perceptron.
The Delta rule shown above is the classic error-correction rule, but it is not the only one!

  1. The perceptron's weights change only when both the error and the corresponding input are non-zero; the change then moves the new output closer to the target output.
  2. For any linearly separable dataset, the simple perceptron is guaranteed to find a solution within a finite number of steps (a complete training sketch follows below).
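Putting the pieces together: a minimal perceptron trained with the delta rule above. The dataset (logical AND, which is linearly separable), the learning rate, and the initialisation are illustrative assumptions.

```python
import numpy as np

# Training patterns and target outputs: the logical AND function,
# which is linearly separable, so training is guaranteed to converge.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])

rng = np.random.default_rng(0)
w = rng.uniform(-0.5, 0.5, size=3)  # w[0] is the bias weight
C = 0.1                             # learning rate

for epoch in range(100):
    errors = 0
    for x, target in zip(X, t):
        a = np.concatenate(([1.0], x))       # bias input a_0 = +1
        out = 1 if np.dot(w, a) >= 0 else 0  # step activation, threshold 0
        e = target - out                     # error e_j = t_j - X_j
        if e != 0:
            w += C * e * a                   # delta rule: w_ji += C * e_j * a_i
            errors += 1
    if errors == 0:  # a full pass with no errors: the network has converged
        print(f"converged after {epoch + 1} epochs, w = {w}")
        break
```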

2.1.5 Network Performance

The best performance of the network corresponds to the minimum of the RMS error, and we adjust the weights of the connections in order to reach that minimum.

The root-mean-square (RMS) error value is

E_RMS = √( Σ_p Σ_j (t_j^p − X_j^p)² / (n_p · n_o) )

where p runs over the n_p training patterns and j over the n_o output units.
The inner sum computes the error of every output neuron on a single training pattern; the outer sum adds these errors up over all training patterns.
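A direct transcription of the formula (the targets and outputs are made-up 0/1 values):

```python
import numpy as np

# Targets t_j^p and actual outputs X_j^p: n_p = 3 patterns, n_o = 2 output units.
t = np.array([[1, 0], [0, 1], [1, 1]])
X = np.array([[1, 0], [1, 1], [0, 1]])

n_p, n_o = t.shape
# Inner sum over output units j, outer sum over patterns p, then normalise.
rms = np.sqrt(((t - X) ** 2).sum() / (n_p * n_o))
print(rms)  # sqrt(2 / 6) ≈ 0.577
```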

  • As the target output values t_j^p and the numbers n_p and n_o are constants, the RMS error is a function of the instant output values X_j^p only.
  • In turn, the instant outputs X_j^p are functions of the input values a_i^p, which are also constants, and of the weights of connections w_ji.

So the performance of the network, as measured by the RMS error, is also a function of the connection weights only: E_RMS = E_RMS(w_ji).

  • Initially, the adaptable weights are all set to small random values, and the network does not perform very well.
  • As weights are adjusted during training, performance improves; when the error rate is low enough, training stops and the network is said to have converged.

2.1.6 Perceptron Convergence

  1. If there exists a set of weights with which the perceptron responds correctly to every input, the training procedure will find it (Rosenblatt, 1962).
  2. Eventually performance stops improving, and the RMS error does not get smaller regardless of the number of iterations. That means the network has failed to learn all of the answers correctly.
  3. If the training is successful, the perceptron is said to have gone through supervised learning, and is able to classify patterns similar to those of the training set.

2.1.7 Perceptron as a Classifier

A. For a d-dimensional input vector we have:

  1. d weights, one per input dimension;
  2. a bias: the weight of the special input fixed at +1 (the special input unit from the previous section);
  3. a Threshold Activation Function.

(figure: a perceptron with a two-dimensional input)
In the figure above, the input is two-dimensional, x1 and x2, plus the special input whose weight is the bias.

B. The goal of training is to find a set of weights with which the perceptron classifies every training example correctly.

C. Decision boundary

(figure: a two-class dataset in the plane, separated by the line W·X = 0)

W·X = 0 is the decision boundary: because W·X computes S, and when the step function is used as the activation function, S > 0 and S < 0 each correspond to one class, so naturally S = 0, i.e. W·X = 0, is the boundary. W·X = 0 defines a hyperplane, which in 2D is a straight line.
Likewise, if no boundary that separates the whole dataset exists, then there is no weight vector W that classifies it perfectly.
In two dimensions, W·X can also be written out as W0 + W1·X1 + W2·X2.
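Rearranging W·X = 0 in the 2D case makes the straight-line form of the boundary explicit (a worked restatement, assuming W2 ≠ 0):

```latex
\[
w_0 + w_1 x_1 + w_2 x_2 = 0
\quad\Longrightarrow\quad
x_2 = -\frac{w_1}{w_2}\, x_1 - \frac{w_0}{w_2}
\]
```

So the weights set the slope of the line, and the bias w_0 shifts its intercept.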
A perceptron represents a hyperplane decision surface in d-dimensional space: for example, a line in 2D, or a plane in 3D.
The equation of the hyperplane is w·xᵀ = 0, because the threshold = 0.
This is the equation for the points in x-space that lie on the boundary.

(figures: decision lines for several different weight vectors and bias values)
The role of the bias is to position the decision plane: with a zero bias the boundary must pass through the origin, while a non-zero bias shifts it away from the origin.

Reposted from blog.csdn.net/qq_42141943/article/details/105601986