[MATLAB Issue 77] MATLAB code for dimensionality reduction, feature ranking, and data processing in regression/classification problems, based on a surrogate model algorithm [being updated]


This article introduces a collection of feature ranking methods that use a libsvm surrogate model, including:
1. Ranking based on the prediction accuracy of each individual feature (libsvm surrogate model)
2. Feature ranking based on the correlation coefficient, corr (libsvm surrogate model)
3. svmrfe_ker (binary classification) [to be added in a later update]
4. svmrfe_ori: feature ranking based on SVM-RFE recursive feature elimination (binary classification) [to be added in a later update]

1. Multi-input, single-output multi-class classification problem

Data settings:
classification data, 12 inputs, 1 output, 4 classes, 357 samples

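classdata = xlsread('数据集C.xlsx');        % load the classification dataset
X = classdata(:, 1:end-1)';                 % input variables (features x samples, as mapminmax expects)
Y = classdata(:, end);                      % output labels
[X, ps_input] = mapminmax(X, 0, 1);         % scale each feature to [0, 1]
X = X';                                     % back to samples x features
ptrain_per = 0.7;                           % training proportion
trainIdx = randperm(size(X,1), ceil(size(X,1)*ptrain_per));  % training sample indices
testIdx = setdiff(1:size(X,1), trainIdx);   % test sample indices
K = 10;                                     % number of cross-validation folds
cvObj = cvpartition(Y(testIdx), 'KFold', K);  % stratified K-fold partition of the test samples
userdata.cvObj = cvObj;
userdata.ft = X(testIdx, :);                % test-set inputs
userdata.target = Y(testIdx);               % test-set outputs

nSel = size(X, 2);                          % number of features to select; may be less than or equal to the number of features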

1. Ranking based on the prediction accuracy of each individual feature (libsvm surrogate model)

Each variable is used on its own as the single input feature, and the features are ranked by their average error rate over ten-fold cross-validation.
The cumulative contribution is 0.9.
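
The single-feature ranking described above can be sketched roughly as follows. This is not the article's packaged code, only a minimal illustration on synthetic data. It assumes the libsvm MATLAB interface is installed and that its svmtrain is the one found on the path. The names Xdemo, Ydemo, acc, and rankIdx are made up for this example.

```matlab
% Sketch: rank features by single-feature 10-fold CV accuracy with libsvm.
% Assumption: the libsvm MATLAB interface (svmtrain) is on the MATLAB path.
rng(0);                                   % reproducible synthetic data
n = 200;  d = 5;
Xdemo = rand(n, d);                       % hypothetical inputs, already in [0, 1]
Ydemo = double(Xdemo(:, 2) > 0.5) + 1;    % labels depend on feature 2 only

K = 10;                                   % number of folds
acc = zeros(1, d);
for j = 1:d
    % With '-v K', libsvm's svmtrain returns the K-fold cross-validation
    % accuracy (in percent) instead of a model; '-q' silences the output.
    acc(j) = svmtrain(Ydemo, Xdemo(:, j), sprintf('-s 0 -t 2 -v %d -q', K));
end

errRate = 1 - acc / 100;                  % average CV error rate per feature
[~, rankIdx] = sort(errRate, 'ascend');   % lowest error rate first
disp(rankIdx);
```

Because only feature 2 carries label information in this toy data, it should come out at the top of rankIdx. The remaining features are noise, so their relative order is arbitrary.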


2. Feature ranking based on the correlation coefficient, corr (libsvm surrogate model)

Fitness function: the average R2 on the test set is 0.88588.
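
For reference, a correlation-based ranking can be sketched in base MATLAB as below. The article does not show how its corr score is computed, so this sketch assumes the absolute Pearson correlation between each feature and the target, computed with corrcoef. The data and the names Xdemo, Ydemo, score, and rankIdx are hypothetical.

```matlab
% Sketch: rank features by |Pearson correlation| with the target.
% Assumption: the score is the absolute correlation coefficient.
rng(1);                                   % reproducible synthetic data
n = 150;
Xdemo = rand(n, 4);                       % hypothetical inputs
Ydemo = 3 * Xdemo(:, 3) + 0.5 * Xdemo(:, 1) + 0.05 * randn(n, 1);

d = size(Xdemo, 2);
score = zeros(1, d);
for j = 1:d
    R = corrcoef(Xdemo(:, j), Ydemo);     % 2-by-2 correlation matrix
    score(j) = abs(R(1, 2));              % strength of the linear relationship
end

[sortedScore, rankIdx] = sort(score, 'descend');  % most correlated first
disp(rankIdx);
```

In a full workflow, the top-ranked features would then be passed to the libsvm surrogate model and scored on the test set, which is presumably how the fitness value above is obtained.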

2. Multi-input, single-output regression problem

Data settings:
regression data, 7 inputs, 1 output, 107 samples

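%%  Clear the environment
warning off             % turn off warning messages
close all               % close open figure windows
clear                   % clear workspace variables
clc                     % clear the command window

%%  Import the data
res = xlsread('数据集.xlsx');

%%  Split into training and test sets
ptrain_per = 0.7;                                                   % training proportion
trainIdx = randperm(size(res,1), ceil(size(res,1)*ptrain_per));     % training sample indices
testIdx = setdiff(1:size(res,1), trainIdx);                         % test sample indices

P_train = res(trainIdx, 1:7)';      % training inputs (features x samples)
T_train = res(trainIdx, 8)';        % training targets
M = size(P_train, 2);               % number of training samples

P_test = res(testIdx, 1:7)';        % test inputs (features x samples)
T_test = res(testIdx, 8)';          % test targets
N = size(P_test, 2);                % number of test samples

%%  Normalize the data
[p_train, ps_input] = mapminmax(P_train, 0, 1);     % fit the scaling on the training inputs
p_test = mapminmax('apply', P_test, ps_input);      % apply the same scaling to the test inputs

[t_train, ps_output] = mapminmax(T_train, 0, 1);    % fit the scaling on the training targets
t_test = mapminmax('apply', T_test, ps_output);     % apply the same scaling to the test targets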
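K = 10;                                 % number of cross-validation folds
cvObj = cvpartition(N, 'KFold', K);     % K-fold partition of the N test samples
userdata.cvObj = cvObj;
userdata.ft = p_test';                  % test-set inputs (samples x features)
userdata.target = t_test';              % test-set outputs

nSel = size(p_test, 1);                 % number of features to select; may be less than or equal to the number of features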

1. Ranking based on the prediction accuracy of each individual feature (libsvm surrogate model)

Each variable is used on its own as the single input feature, and the features are ranked by their average error over ten-fold cross-validation.
The cumulative contribution is 0.9.
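
The article does not define "cumulative contribution". One common reading is that the per-feature scores are normalized to sum to 1, and the top-ranked features are kept until their running total reaches 0.9. The sketch below illustrates that reading only. The scores are made-up numbers, and the names score, threshold, nKeep, and selected are hypothetical.

```matlab
% Sketch: keep the top-ranked features whose normalized scores reach a
% cumulative contribution of 0.9. The scores below are made-up numbers.
score = [0.05 0.40 0.10 0.30 0.15];        % hypothetical per-feature scores
threshold = 0.9;                           % cumulative contribution to reach

[sortedScore, rankIdx] = sort(score, 'descend');   % best feature first
contrib = sortedScore / sum(sortedScore);  % each feature's share of the total
cumContrib = cumsum(contrib);              % running total of the shares

% Smallest number of top-ranked features whose shares add up to the threshold
nKeep = find(cumContrib >= threshold - 1e-12, 1, 'first');
selected = rankIdx(1:nKeep);
disp(selected);
```

With these numbers the running totals are 0.40, 0.70, 0.85, 0.95, and 1.00, so four features (2, 4, 5, 3) are kept. For an error-based ranking, the errors would first need to be converted into a "higher is better" score before normalizing.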

2. Feature ranking based on the correlation coefficient, corr (libsvm surrogate model)


3. Code acquisition

Send a private message on CSDN with the reply "Issue 77" to get the download method.


Origin blog.csdn.net/qq_29736627/article/details/133191376