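Use the AdaBoost algorithm to learn a strong classifier

Training data set

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| y | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |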
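Solution:

Initialize the weight distribution over the training data:

$$D_1=(w_{1,1},w_{1,2},\ldots,w_{1,10}),\quad w_{1,i}=0.1,\quad i=1,2,\ldots,10$$

For $m=1$:

(a) On the training data with weight distribution $D_1$, compute the classification error rate for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$. For each threshold, the table lists the lower of the two error rates obtained from the two possible label orientations ($+1$ below the threshold, or $-1$ below it).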

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ν | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
| Classification error rate | 0.5 | 0.4 | 0.3 | 0.4 | 0.5 | 0.4 | 0.5 | 0.4 | 0.3 |
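
The threshold scan in step (a) can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the names `stump_predict`, `weighted_error` and `scan_thresholds` are hypothetical, and it assumes that each threshold is scored by the better of its two label orientations, which is what the tabulated error rates imply.

```python
# Training data from the example.
X = list(range(10))
Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
THRESHOLDS = [v + 0.5 for v in range(9)]  # 0.5, 1.5, ..., 8.5


def stump_predict(x, v, left_label):
    """Return left_label when x < v and the opposite label otherwise."""
    return left_label if x < v else -left_label


def weighted_error(v, left_label, weights):
    """Sum the weights of the samples that the stump misclassifies."""
    return sum(w for x, y, w in zip(X, Y, weights)
               if stump_predict(x, v, left_label) != y)


def scan_thresholds(weights):
    """For each threshold, keep the lower error of the two orientations."""
    rows = []
    for v in THRESHOLDS:
        err, left_label = min((weighted_error(v, s, weights), s) for s in (1, -1))
        rows.append((v, left_label, err))
    return rows


D1 = [0.1] * 10
for v, left_label, err in scan_thresholds(D1):
    print(f"v={v:.1f}  label for x<v: {left_label:+d}  error={err:.3f}")
```

Passing a different weight vector to `scan_thresholds` gives the corresponding table for a later round.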
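The error rate reaches its minimum of 0.3 at $\nu=8.5$ (the threshold $\nu=2.5$ ties at the same value; this solution uses $\nu=8.5$), so the base classifier is

$$G_1(x)=\begin{cases}1, & x<8.5\\ -1, & x\ge 8.5\end{cases}$$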
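(b) The error rate of $G_1(x)$ on the training data set is

$$e_1=P(G_1(x_i)\neq y_i)=0.3$$

(c) Compute the coefficient of $G_1(x)$ (natural logarithm):

$$\alpha_1=\frac{1}{2}\log\frac{1-e_1}{e_1}=0.4236$$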
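
As a quick numerical check of step (c), the coefficient can be evaluated directly. The function name `classifier_coefficient` is a hypothetical helper introduced for illustration; the formula is the one stated in the text, with the natural logarithm.

```python
import math


def classifier_coefficient(error):
    """alpha = 0.5 * ln((1 - e) / e) for a weighted error rate 0 < e < 1."""
    return 0.5 * math.log((1.0 - error) / error)


alpha1 = classifier_coefficient(0.3)
print(round(alpha1, 4))  # 0.4236
```

The coefficient grows as the error rate shrinks, so more accurate base classifiers receive a larger vote in the final combination.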
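(d) Update the weight distribution of the training data:

$$D_2=(w_{2,1},w_{2,2},\ldots,w_{2,10})$$

$$w_{2,i}=\frac{w_{1,i}}{Z_1}\exp(-\alpha_1 y_i G_1(x_i)),\quad i=1,2,\ldots,10$$

where $Z_1$ is the normalization factor that makes the weights in $D_2$ sum to 1.

$$D_2=(0.07142857,0.07142857,0.07142857,0.16666667,0.16666667,0.16666667,0.07142857,0.07142857,0.07142857,0.07142857)$$

$$f_1(x)=\alpha_1G_1(x)=0.4236\,G_1(x)$$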
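
The weight update in step (d) can be reproduced with the short sketch below. The helper name `update_weights` is hypothetical; the code assumes that the normalization factor is simply the sum of the unnormalized weights, so that the new distribution sums to 1.

```python
import math

Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
G1 = [1] * 9 + [-1]  # predictions of G1 (threshold 8.5) on x = 0, ..., 9
D1 = [0.1] * 10
alpha1 = 0.5 * math.log(0.7 / 0.3)


def update_weights(weights, alpha, labels, predictions):
    """Scale each weight by exp(-alpha * y * G(x)), then normalize."""
    raw = [w * math.exp(-alpha * y * g)
           for w, y, g in zip(weights, labels, predictions)]
    z = sum(raw)  # normalization factor Z
    return [r / z for r in raw]


D2 = update_weights(D1, alpha1, Y, G1)
print([round(w, 8) for w in D2])
```

The three misclassified samples (numbers 4, 5 and 6) end up with weight 1/6 each, and the seven correctly classified samples with 1/14 each, so the next round concentrates on the points that $G_1$ got wrong.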
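(e) The classifier $\operatorname{sign}[f_1(x)]$ misclassifies 3 points of the training data set (samples 4, 5 and 6).

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| G1(x) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| f1(x) | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| sign[f1(x)] | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| y | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |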
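
For $m=2$:

(a) On the training data with weight distribution $D_2$, compute the classification error rate for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$, where the error rate is the total weight of the misclassified samples:

$$e_m=\sum_{G_m(x_i)\neq y_i} w_{m,i}$$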

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ν | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
| Classification error rate | 0.357 | 0.286 | 0.214 | 0.381 | 0.452 | 0.286 | 0.357 | 0.429 | 0.5 |
The error rate is lowest at $\nu=2.5$, so the base classifier is

$$G_2(x)=\begin{cases}1, & x<2.5\\ -1, & x\ge 2.5\end{cases}$$
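(b) The error rate of $G_2(x)$ on the training data set is

$$e_2=P(G_2(x_i)\neq y_i)=0.214$$

(c) Compute the coefficient of $G_2(x)$:

$$\alpha_2=\frac{1}{2}\log\frac{1-e_2}{e_2}=0.6496$$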
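(d) Update the weight distribution of the training data:

$$D_3=(w_{3,1},w_{3,2},\ldots,w_{3,10})$$

$$w_{3,i}=\frac{w_{2,i}}{Z_2}\exp(-\alpha_2 y_i G_2(x_i)),\quad i=1,2,\ldots,10$$

$$D_3=(0.04545452,0.04545452,0.04545452,0.10606056,0.10606056,0.10606056,0.16666675,0.16666675,0.16666675,0.04545452)$$

$$f_2(x)=0.4236\,G_1(x)+0.6496\,G_2(x)$$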
(e) The classifier $\operatorname{sign}[f_2(x)]$ misclassifies 3 points of the training data set (samples 7, 8 and 9).

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| G1(x) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| G2(x) | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| α1G1(x) | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| α2G2(x) | 0.6496 | 0.6496 | 0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 |
| sign[f2(x)] | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| y | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
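
The misclassification counts in step (e) can be checked by evaluating the weighted vote directly. This is a small sketch under stated assumptions: the tuple layout `(alpha, threshold, label for x < threshold)` and the helper names `f`, `sign` and `misclassified` are hypothetical, and the coefficients are copied from the rounded values in the text.

```python
X = list(range(10))
Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

# (alpha, threshold, label assigned when x < threshold), copied from the text.
STUMPS = [(0.4236, 8.5, 1), (0.6496, 2.5, 1)]


def f(x, stumps):
    """Weighted sum of the base classifiers' votes at x."""
    return sum(a * (s if x < v else -s) for a, v, s in stumps)


def sign(value):
    return 1 if value > 0 else -1


def misclassified(stumps):
    """Return the 1-based sample numbers that sign[f(x)] gets wrong."""
    return [i + 1 for i, (x, y) in enumerate(zip(X, Y))
            if sign(f(x, stumps)) != y]


print(misclassified(STUMPS[:1]))  # [4, 5, 6]
print(misclassified(STUMPS))      # [7, 8, 9]
```

The count stays at three after the second round, but the errors move from samples 4 to 6 to samples 7 to 9, which are exactly the points carrying the largest weights in the next distribution.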
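
For $m=3$:

(a) On the training data with weight distribution $D_3$, compute the classification error rate for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$:

$$e_m=\sum_{G_m(x_i)\neq y_i} w_{m,i}$$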

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ν | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
| Classification error rate | 0.409 | 0.455 | 0.5 | 0.394 | 0.288 | 0.182 | 0.348 | 0.485 | 0.318 |
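The error rate is lowest at $\nu=5.5$, with the labels reversed relative to the first two rounds, so the base classifier is

$$G_3(x)=\begin{cases}-1, & x<5.5\\ 1, & x\ge 5.5\end{cases}$$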
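(b) The error rate of $G_3(x)$ on the training data set is

$$e_3=P(G_3(x_i)\neq y_i)=0.182$$

(c) Compute the coefficient of $G_3(x)$:

$$\alpha_3=\frac{1}{2}\log\frac{1-e_3}{e_3}=0.7520$$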
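(d) Update the weight distribution of the training data:

$$D_4=(w_{4,1},w_{4,2},\ldots,w_{4,10})$$

$$w_{4,i}=\frac{w_{3,i}}{Z_3}\exp(-\alpha_3 y_i G_3(x_i)),\quad i=1,2,\ldots,10$$

$$D_4=(0.125,0.125,0.125,0.06481478,0.06481478,0.06481478,0.10185189,0.10185189,0.10185189,0.125)$$

$$f_3(x)=0.4236\,G_1(x)+0.6496\,G_2(x)+0.7520\,G_3(x)$$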
(e) The classifier $\operatorname{sign}[f_3(x)]$ misclassifies 0 points of the training data set.

| No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| G1(x) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| G2(x) | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| G3(x) | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 |
| α1G1(x) | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| α2G2(x) | 0.6496 | 0.6496 | 0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 |
| α3G3(x) | -0.7520 | -0.7520 | -0.7520 | -0.7520 | -0.7520 | -0.7520 | 0.7520 | 0.7520 | 0.7520 | 0.7520 |
| sign[f3(x)] | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
| y | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |

The final classifier is therefore

$$G(x)=\operatorname{sign}[f_3(x)]=\operatorname{sign}[0.4236\,G_1(x)+0.6496\,G_2(x)+0.7520\,G_3(x)]$$
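
The three rounds can be tied together in one loop. The following is a sketch under stated assumptions rather than a definitive implementation: the function names `predict`, `best_stump` and `adaboost` are hypothetical, ties in the weighted error are broken in favour of the larger threshold (an assumption chosen so that round 1 selects 8.5 as in this solution), and the loop stops as soon as the combined classifier makes no training errors. It does not handle a base classifier with zero error, which does not occur on this data.

```python
import math

X = list(range(10))
Y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
THRESHOLDS = [v + 0.5 for v in range(9)]


def predict(x, v, s):
    """Decision stump: label s when x < v, label -s otherwise."""
    return s if x < v else -s


def best_stump(weights):
    """Pick the (threshold, orientation) pair with the lowest weighted error."""
    candidates = []
    for v in THRESHOLDS:
        for s in (1, -1):
            err = sum(w for x, y, w in zip(X, Y, weights)
                      if predict(x, v, s) != y)
            candidates.append((v, s, err))
    # Assumption: on ties (after rounding), prefer the larger threshold.
    return min(candidates, key=lambda c: (round(c[2], 10), -c[0]))


def adaboost(max_rounds=10):
    weights = [1.0 / len(X)] * len(X)
    stumps = []  # list of (alpha, threshold, orientation)
    for _ in range(max_rounds):
        v, s, err = best_stump(weights)
        alpha = 0.5 * math.log((1 - err) / err)
        stumps.append((alpha, v, s))
        # Re-weight: misclassified samples gain weight, the rest lose it.
        raw = [w * math.exp(-alpha * y * predict(x, v, s))
               for x, y, w in zip(X, Y, weights)]
        z = sum(raw)
        weights = [r / z for r in raw]
        # Stop once sign[f(x)] classifies every training sample correctly.
        votes = [sum(a * predict(x, t, u) for a, t, u in stumps) for x in X]
        if all((1 if f > 0 else -1) == y for f, y in zip(votes, Y)):
            break
    return stumps


for alpha, v, s in adaboost():
    print(f"alpha={alpha:.4f}  threshold={v}  label for x<threshold: {s:+d}")
```

On this data set the loop stops after three rounds, with thresholds 8.5, 2.5 and 5.5 and coefficients that round to 0.4236, 0.6496 and 0.7520.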