python:使用sklearn 计算 precision、recall、F1 score(多分类)

目录

参考:

0. 评估指标

1. 使用sklearn计算accuracy(accuracy_score)、混淆矩阵(confusion_matrix)

2. 使用sklearn计算精确率precision、召回率recall、F1 score


参考:

https://blog.csdn.net/kan2281123066/article/details/103237273  代码:利用sklearn 计算 precision、recall、F1 score

https://blog.csdn.net/blythe0107/article/details/75003890  代码:sklearn的precision_score, recall_score, f1_score使用

https://blog.csdn.net/Urbanears/article/details/105033731  二分类和多分类问题下的评价指标详析(Precision, Recall, F1,Micro,Macro)

https://blog.csdn.net/wf592523813/article/details/95202448  二分类和多分类问题的评价指标总结

https://zhuanlan.zhihu.com/p/147663370  知乎:多分类模型Accuracy, Precision, Recall和F1-score的超级无敌深入探讨

https://zhuanlan.zhihu.com/p/59862986  知乎:详解sklearn的多分类模型评价指标

https://www.zhihu.com/question/51470349  知乎:对多分类数据的模型比较选择,应该参考什么指标?

0. 评估指标

  • 精确度:precision,正确预测为正的,占全部预测为正的比例,TP / (TP+FP)
  • 召回率:recall,正确预测为正的,占全部实际为正的比例,TP / (TP+FN)
  • F1-score:精确率和召回率的调和平均数,2 * precision*recall / (precision+recall)


1. 使用sklearn计算accuracy(accuracy_score)、混淆矩阵(confusion_matrix)

    举的这个例子,标签label为5类,是个多分类问题

  • accuracy_score
  • confusion_matrix
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score, recall_score, f1_score


actual = data3['label']     #真实的类别标签(data3是个dataframe,label是其中的一列)
predicted = data3['Label']  #预测的类别标签

# 计算总的精度
acc = accuracy_score(actual, predicted)
print(acc)

# 计算混淆矩阵
confusion_matrix(actual, predicted)


'''
输出结果:

0.8969907407407407
array([[1032,    1,    0,    0,    0],
       [   3,  965,   17,    0,   15],
       [   0,   10,   42,    0,    6],
       [   0,   11,    3,    0,    0],
       [   0,   98,  103,    0,  286]])
'''

2. 使用sklearn计算精确率precision、召回率recall、F1 score

     参数average有5个选项:{‘micro’微平均, ‘macro’宏平均, ‘samples’, ‘weighted’, ‘binary’},默认是default=’binary’二分类

p = precision_score(actual, predicted, average='micro')
p2 = precision_score(actual, predicted, average='macro')
p3 = precision_score(actual, predicted, average='weighted')
#p4 = precision_score(actual, predicted, average='samples')

r = recall_score(actual, predicted, average='micro')
r2 = recall_score(actual, predicted, average='macro')
r3 = recall_score(actual, predicted, average='weighted')

f1score = f1_score(actual, predicted, average='micro')
f1score2 = f1_score(actual, predicted, average='macro')
f1score3 = f1_score(actual, predicted, average='weighted')
#f1score4 = f1_score(actual, predicted, average='samples')

print(p,p2,p3)
print(r,r2,r3)
print(f1score,f1score2,f1score3)

'''
输出结果:
0.8969907407407407 0.614528783337003 0.9212413779093402
0.8969907407407407 0.6550877741326565 0.8969907407407407
0.8969907407407407 0.6041619746930149 0.8986671343868164
'''

猜你喜欢

转载自blog.csdn.net/weixin_39450145/article/details/115284725