Computing binary classification metrics

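To compute LogLoss, AUC, and EER on a test dataset with a loaded pretrained model, choose the calculation method that fits your scenario. Three common methods follow: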

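1. Compute LogLoss:

```python
import torch
from torch.utils.data import DataLoader
import torch.nn.functional as F

# Load the model and the test dataset.
# torch.load here assumes the file holds the whole pickled model, not just a state_dict.
model = torch.load('pretrained_model.pth')
test_data = YourTestData(...)
test_loader = DataLoader(test_data, batch_size=64)

# Compute LogLoss on the test dataset
model.eval()
test_loss = 0.0
with torch.no_grad():
    for data, target in test_loader:
        output = model(data)  # assumed to be log-probabilities (log_softmax output)
        # Sum over the batch so the final division gives a per-sample average
        test_loss += F.nll_loss(output, target, reduction='sum').item()

test_loss /= len(test_loader.dataset)  # average over all test samples
print('Test set: Average Loss: {:.4f}'.format(test_loss))
```

The code above uses PyTorch's `F.nll_loss()` to compute the LogLoss on the test set. `F.nll_loss()` expects log-probabilities, so the model is assumed to end in `log_softmax`. The loss of each batch is summed with `reduction='sum'`, and the total is divided by the number of samples in the test set to get the average loss. Dividing by the number of batches instead would skew the result whenever the last batch is smaller than the others.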

2. Compute AUC:

```python
import torch
from torch.utils.data import DataLoader
from sklearn.metrics import roc_auc_score

# Load the model and the test dataset
model = torch.load('pretrained_model.pth')
test_data = YourTestData(...)
test_loader = DataLoader(test_data, batch_size=64)

# Collect predicted probabilities and true labels for the test dataset
model.eval()
y_true, y_score = [], []
with torch.no_grad():
    for data, target in test_loader:
        # Assumes the model outputs one logit per sample; sigmoid maps it to a probability in (0, 1)
        output = torch.sigmoid(model(data))
        y_true.extend(target.reshape(-1).tolist())   # flatten so each entry is a scalar label
        y_score.extend(output.reshape(-1).tolist())  # flatten so each entry is a scalar probability

# Compute AUC
auc = roc_auc_score(y_true, y_score)
print('Test set: AUC = {:.4f}'.format(auc))
```

The code above uses `roc_auc_score()` from `sklearn.metrics` to compute the AUC on the test set. This requires the predicted probability and the true label of every sample. `torch.sigmoid()` converts the model output to a probability between 0 and 1, and the probabilities and labels are appended to two lists. The outputs are flattened first so that a model returning shape `(N, 1)` does not produce nested lists. Finally, `roc_auc_score()` is called on the two lists.

3. Compute EER:

```python
import torch
import numpy as np
from scipy.optimize import brentq
from sklearn.metrics import roc_curve
from torch.utils.data import DataLoader


def calculate_eer(y_true, y_score):
    fpr, tpr, thresholds = roc_curve(y_true, y_score, pos_label=1)
    # EER is the point where FPR equals FNR, i.e. where 1 - TPR - FPR crosses zero
    eer = brentq(lambda x: 1. - x - np.interp(x, fpr, tpr), 0., 1.)
    return eer * 100

# Load the model and the test dataset
model = torch.load('pretrained_model.pth')
test_data = YourTestData(...)
test_loader = DataLoader(test_data, batch_size=64)

# Collect predicted probabilities and true labels for the test dataset
model.eval()
y_true, y_score = [], []
with torch.no_grad():
    for data, target in test_loader:
        # Assumes the model outputs one logit per sample; sigmoid maps it to a probability in (0, 1)
        output = torch.sigmoid(model(data))
        y_true.extend(target.reshape(-1).tolist())   # flatten so each entry is a scalar label
        y_score.extend(output.reshape(-1).tolist())  # flatten so each entry is a scalar probability

# Compute EER
eer = calculate_eer(y_true, y_score)
print('Test set: EER = {:.2f}%'.format(eer))
```

The code above defines a custom function, `calculate_eer()`, to compute the EER on the test set. It first calls `roc_curve()` from `sklearn.metrics` to get the FPR, the TPR, and the thresholds. It then uses `brentq()` from `scipy.optimize` to find the point on the interpolated ROC curve where the FPR equals 1 - TPR (the false negative rate). That FPR value is the EER itself, not a decision threshold. Finally, the result is multiplied by 100 to express the EER as a percentage.
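
As a sanity check on the three metrics above, here is a minimal dependency-free sketch that recomputes them directly from their definitions, given the collected `y_true` and `y_score` lists. The function names `binary_log_loss`, `pairwise_auc`, and `threshold_sweep_eer` are hypothetical, and the toy labels and scores are made up for illustration. The EER here comes from a plain threshold sweep rather than the `brentq()` interpolation, so it may differ slightly from `calculate_eer()` on real data.

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_sweep_eer(y_true, y_score):
    """Approximate EER: mean of FPR and FNR at the threshold where they are closest."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best_gap, best_eer = float("inf"), None
    for t in sorted(set(y_score)):
        # A sample is predicted positive when its score is >= t
        fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= t)
        fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < t)
        fpr, fnr = fp / n_neg, fn / n_pos
        if abs(fpr - fnr) < best_gap:
            best_gap, best_eer = abs(fpr - fnr), (fpr + fnr) / 2.0
    return best_eer


# Toy data (made up for illustration): two negatives, two positives
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

print('LogLoss = {:.4f}'.format(binary_log_loss(y_true, y_score)))
print('AUC = {:.4f}'.format(pairwise_auc(y_true, y_score)))
print('EER = {:.2f}%'.format(threshold_sweep_eer(y_true, y_score) * 100))
```

The pairwise AUC loop is quadratic in the number of samples, so it is only suitable for small checks. For a full test set, `roc_auc_score()` is the practical choice.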

Origin blog.csdn.net/qq_43663979/article/details/130508336