Mathematical Modeling - Simulated Annealing Optimal Projection Pursuit




Foreword

  In comprehensive evaluation we have so far determined weights with both subjective and objective methods. Objective weighting depends entirely on the distribution of the data itself, which can run into awkward cases; as noted in the discussion of the entropy weight method, the importance of an indicator and the distribution of its data are not necessarily related. The projection pursuit method instead adjusts the weights according to the variance of the evaluation results and the mutual absolute differences between them, so as to obtain the projection that best reflects the original data (the projection direction can be understood as a set of importance values). Evaluation values obtained this way are a more reliable reference.


1. What is projection pursuit?

  Projection pursuit is a statistical method for processing and analyzing high-dimensional data. Its basic idea is to project high-dimensional data onto a low-dimensional (1-3 dimensional) subspace and search for the projection that reflects the structure or characteristics of the original high-dimensional data, so as to study and analyze that data. In 1974, Friedman and Tukey of Stanford University first named the method Projection Pursuit.
  Projection pursuit (PP for short) is a valuable technique developed by the international statistical community in the mid-1970s. It is an interdisciplinary product of statistics, applied mathematics and computer technology, and an emerging statistical method for analyzing and processing high-dimensional observation data, especially non-normal, nonlinear high-dimensional data. It works by projecting high-dimensional data onto low-dimensional subspaces and finding the projections that reflect the structure or characteristics of the original data. It is robust, resistant to interference and accurate, so it is widely used in many fields.
  The above explanation comes from Baidu Encyclopedia. In plain language: project the multi-dimensional data onto a plane or into a space where the projection is easier to observe.

2. What is simulated annealing?

  Simulated annealing is an optimization algorithm whose starting point is the similarity between the annealing process of solids in physics and general combinatorial optimization problems. The algorithm is derived from the principle of solid annealing and is probability-based: a solid is heated to a sufficiently high temperature and then allowed to cool slowly. During heating, the internal particles become increasingly disordered as the temperature rises and the internal energy grows; during slow cooling the particles gradually become ordered, reaching an equilibrium state at each temperature, finally reaching the ground state at room temperature with the internal energy reduced to a minimum.
  The simulated annealing algorithm itself is fairly easy to understand. Here are some articles from other bloggers that may help; I also learned from these platforms. A minimal toy sketch follows the links.
Zhihu (intelligent algorithm): https://zhuanlan.zhihu.com/p/266874840
Station B (Qingfeng): https://www.bilibili.com/video/BV1hK41157JL?spm_id_from=333.337.search-card.all.click
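As a minimal toy sketch of the idea (my own illustration, not taken from the links above), here is simulated annealing minimizing a hypothetical one-variable function in MATLAB:

% Toy sketch: minimize f(x) = (x-2)^2 with simulated annealing
f = @(x) (x-2).^2;            % hypothetical objective
T = 100;                      % initial temperature
x = 10*rand(); fx = f(x);     % random starting point
while T > 0.01
    xNew = x + randn()*T/100; % random neighbor; step size shrinks as T falls
    fNew = f(xNew);
    % Metropolis rule: always accept a better point, sometimes a worse one
    if fNew < fx || exp((fx-fNew)/T) > rand()
        x = xNew; fx = fNew;
    end
    T = 0.99*T;               % slow cooling
end
x  % ends up near the minimum at x = 2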

3. Simulated annealing-optimized projection pursuit

  Steps 1, 2 and 3 below construct the projection pursuit model; step 4 brings in simulated annealing to find the optimal solution. It takes some reading to understand fully; if you don't want to work through the theory, you can apply the code directly.

1. Data preprocessing

  You will notice that the first step of most weight calculations is data preprocessing, including forward transformation (making every indicator "bigger is better") and normalization.

For positive indicators (bigger is better), the normalization is as follows:
$$x_i=\frac{x_i-\min x_i}{\max x_i-\min x_i}$$
For negative indicators (the smaller the better), the normalization is as follows:
$$x_i=\frac{\max x_i-x_i}{\max x_i-\min x_i}$$
For intermediate indicators (the closer to a certain value, the better), the normalization is as follows:
$$x_i=1-\frac{\left| x_i-x_{best} \right|}{\max \left| x_i-x_{best} \right|}$$
  where $x_{best}$ is the optimal value of the indicator.
For interval indicators (the closer to a certain interval $[a,b]$, the better), the normalization is as follows:
$$x_i=\begin{cases} 1-\frac{a-x_i}{M}, & x_i<a\\ 1, & a\le x_i\le b\\ 1-\frac{x_i-b}{M}, & x_i>b\\ \end{cases}$$
  where $M=\max \left( a-\min \left( x_i \right) ,\ \max \left( x_i \right) -b \right)$
The common normalizations are the types above; others, such as the optimal multi-quantile method, are not covered here.
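As a quick sketch of the positive and negative normalizations on a single hypothetical column v (the full code in section 4 uses a slightly modified version with 0.98/0.02 factors):

v = [3; 7; 5; 1];                        % hypothetical indicator values
pos = (v - min(v)) / (max(v) - min(v));  % positive indicator: bigger is better
neg = (max(v) - v) / (max(v) - min(v));  % negative indicator: smaller is better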

2. Projection to low dimension

  We observe the indicator data from a variety of angles so as to fully mine the projection vector that best reflects the data characteristics. Let $a=(a_1,a_2,...,a_m)$ be an m-dimensional unit vector representing the projection direction over the m indicators; then the linear projection of the $i$-th sample onto this one-dimensional space is:
$$Z_i=\sum_{j=1}^m{x_{ij}a_j}$$
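In MATLAB the projection of every sample can be computed in one matrix product (a small sketch under the same notation, with x holding the samples row-wise):

% x: n-by-m data matrix, a: 1-by-m unit direction vector
Z = x * a';  % Z(i) = sum over j of x(i,j)*a(j), the projection of sample i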

3. Constructing the projection index function

  Let the projection values of the data be $Z_i$. Based on projection pursuit, a projection index function is constructed from the overall distribution characteristics of the information and the distribution characteristics of the local projection points: the overall distribution should spread out as much as possible, while the local projection points should be dense. The projection index function is then taken as the product of the two, which we maximize:
$$\max\ G\left( a \right) =S_a\cdot B_a$$
  where $B_a$ is the local density of the projection values $Z_i$ and $S_a$ is their standard deviation, expressed as:
$$B_a=\sum_{i=1}^n{\sum_{j=1}^n{\left( R-r_{ij} \right) u\left( R-r_{ij} \right)}}\qquad S_a=\sqrt{\frac{\sum_{i=1}^n{\left( Z_i-\bar{Z} \right) ^2}}{n-1}}$$
  In the formulas above, $\bar{Z}$ is the mean of the $Z_i$, and $u$ is the unit step function: it equals 1 when its argument is greater than or equal to 0, and 0 otherwise. $R$ is the window radius for computing the local density; it should be chosen so that the window contains a reasonable number of projection points, otherwise the sliding average deviation becomes too large. The distance $r_{ij}$ between projection values is $r_{ij}=\left| Z_i-Z_j \right|$.
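For reference, here is a compact vectorized sketch of the index function (my own rewrite; it is intended to match the loop-based Target function in section 4 under the same choice R = 0.1*S_a, and the implicit expansion in Z - Z' requires MATLAB R2016b or later):

function y = TargetVec(x, a)
% projection index G(a) = S_a * B_a for direction a
Z  = x * a(:);             % projection values Z_i
Sa = std(Z);               % standard deviation S_a
R  = 0.1 * Sa;             % window radius, as in the code of section 4
t  = R - abs(Z - Z');      % R - r_ij for every pair (i,j)
y  = Sa * sum(t(t >= 0));  % B_a keeps only the nonnegative terms
end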

4. Optimization of projection direction

  The simulated annealing algorithm is used to optimize the objective function in order to obtain the best projection that can reflect the original data.
The objective function and its constraints are:
$$\max\ G\left( a \right) =S_a\cdot B_a\quad s.t.\ \begin{cases} a_j>0\\ \sum_{j=1}^m{a_{j}^{2}}=1\\ \end{cases}$$
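A random direction satisfying both constraints can be drawn as follows (a sketch of what the suiji helper in section 4 does):

m = 7;           % number of indicators (example value)
b = rand(1, m);  % positive random components
a = b / norm(b); % now a > 0 and sum(a.^2) = 1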
  The basic principle of simulated annealing is to let high-temperature particles cool down slowly and naturally, finally reaching thermal equilibrium at a given temperature and attaining the lowest energy state. A transition between states with energies $E(i)$ and $E(j)$ obeys the following rules:
1) If $E(i)\ge E(j)$, the transition to the next state is accepted.
2) If $E(i)<E(j)$, the transition is accepted with a certain probability:
$$\mu =\exp \left( \frac{E\left( i \right) -E\left( j \right)}{KT} \right)$$
where $K$ is Boltzmann's constant and $T$ is the temperature of the particles.
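In code form this acceptance rule is one line (a sketch; the constant K is folded into the temperature, as the full code below also does):

% accept a better solution always, a worse one with probability exp(delta_e/T)
accept = (delta_e > 0) || (exp(delta_e/temperature) > rand());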
Based on these rules, the simulated annealing algorithm handles the objective function as follows:
1) Generate an initial direction: draw each $b_k$ as a $(0,1)$ random number and set $a_j=\sqrt{b_{j}^{2}/\sum_k{b_{k}^{2}}}$, so that the constraint $\sum_j{a_{j}^{2}}=1$ holds.
2) Generate a new candidate solution in the same way (for $k=1:m$, $b(k)=rand()$) and compute $\Delta e$, the change in the objective value.
3) Accept the candidate with probability
$$P=\begin{cases} 1, & \Delta e>0\\ \exp \left( \Delta e/T \right), & \Delta e\le 0\\ \end{cases}$$
4) Cool the temperature, $T_0=q\cdot T_0$, and repeat from 2) until $T$ falls below the stopping threshold (the code below uses $q=0.99$).

5. Computing the projection evaluation value

The projection evaluation value of each sample, using the optimal direction $a^*$, is:
$$Z_{i}^{*}=\sum_{j=1}^m{a_{j}^{*}\cdot x_{ij}}$$
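As a one-line sketch, with a_star standing for the optimal direction returned by the annealing (a hypothetical variable name):

Zstar = x * a_star(:);  % evaluation value of each sample; larger means better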

4. Code

clear
clc
%Import data: each column is an indicator, each row is a sample; compute the projection evaluation value of every sample
x=[0.81   0.00   0.37   0.00   0.15   0.00   0.97 
0.14   0.49   0.00   1.00   1.00   0.59   0.97 
0.57   0.43   0.11   0.97   0.00   0.73   0.83 
1.00   0.40   0.69   0.50   0.88   1.00   0.60 
0.73   0.26   1.00   0.00   0.88   1.00   0.63 
0.00   0.74   0.29   0.32   0.45   1.00   0.00 
0.84   0.46   0.26   0.71   0.97   0.64   0.50 
0.11   1.00   0.37   0.06   0.39   0.50   0.73 
0.27   0.09   0.49   0.29   0.94   0.86   0.40 
0.70   0.26   0.69   0.38   0.18   0.45   1.00 ];
%matrix dimensions, to simplify the calculations below
[a,b]=size(x);  %a = number of samples, b = number of indicators
disp('Indicator types: positive = 1, negative = 2, central = 3, interval = 4')
disp('e.g. if the 1st indicator is interval, the 2nd negative and the 3rd positive, enter [4,2,1]')
ank=input('Please enter the indicator types: ');
max1 = max(x);
min1 = min(x);
%Why multiply the max and min by 0.98 and 0.02 below? In short:
%without these factors the final r matrix may contain zeros, which distorts the result.
%Strictly speaking this is no longer normalization, since the results are not all within [0,1];
%each normalization variant is a point where papers differ (a potential novelty).
%There are other normalization methods as well.
for i=1:b
    if ank(i)==1       %positive indicator: bigger is better
        x(:,i)=(x(:,i)-0.02*min1(i))/(0.98*max1(i)-0.02*min1(i));
    elseif ank(i)==2   %negative indicator: smaller is better
        x(:,i)=(0.98*max1(i)-x(:,i))/(0.98*max1(i)-0.02*min1(i));
    elseif ank(i)==3   %central indicator
        best=input(['Please enter the optimal value of indicator ',num2str(i),': ']);
        M = 0.98*max(abs(x(:,i)-best));
        x(:,i) = 1 - abs(x(:,i)-best) / M;
    else               %interval indicator
        best=input(['Please enter the optimal interval of indicator ',num2str(i),', e.g. enter [1,2] for lower bound 1, upper bound 2: ']);
        lo=best(1); hi=best(2);  %do not overwrite a,b here: they hold the matrix dimensions
        M = 0.98*max([lo-min(x(:,i)),max(x(:,i))-hi]);
        for j = 1:a              %loop over the samples
            if x(j,i) < lo
               x(j,i) = 1-(0.02*lo-x(j,i))/M;
            elseif x(j,i) > hi
               x(j,i) = 1-(x(j,i)-0.98*hi)/M;
            else
               x(j,i) = 1;
            end
        end
    end
end
tic
for k=1:a
    %simulated annealing search for the optimal projection direction;
    %running the search a times is a multi-start scheme
    temperature=100;  %initial temperature
    iter=100;         %iterations per temperature level
    L=1;              %counts the cooling steps
    n=size(x,2);      %number of indicators
    c=suiji(n);       %random initial direction (the counterpart of the initial population in a genetic algorithm)
    p=c;
    y=Target(x,c);
    while temperature>0.01
        for i=L:iter
            c1=suiji(n);     %draw a fresh random direction, which helps the algorithm escape local optima
            y1=Target(x,c1); %objective value of the candidate
            delta_e=y1-y;
            if delta_e>0
                y=y1;
                p=c1;
            else
                if exp(delta_e/temperature)>rand()  %Metropolis acceptance criterion
                    y=y1;
                    p=c1;
                end
            end
        end
        L=L+1;
        temperature=temperature*0.99;  %cooling schedule, q = 0.99
    end
    w(k)=y;   %best objective value of run k
    e(k,:)=p; %best direction of run k
end
toc
%obtain the evaluation value of each sample
[~,idx]=max(w);  %index of the run with the best objective value
c=e(idx,:)       %c holds the weight of each indicator, i.e. a reference value the data analysis assigns to it
%e stores the best projection direction found in each run above
for i=1:a
    for j=1:b
        r(i,j)=x(i,j)*c(j);  %weighted value of indicator j for sample i
    end
end
sum(r,2)  %evaluation value of each sample

function a=suiji(n)
%initialization: give each indicator a random weight, normalized so that sum(a.^2) = 1
for k=1:n
    b(k)=rand();
end
temp=sum(b.^2);
a=sqrt((b.^2)./temp);
end
function y=Target(x,a)
%objective function value, see step 3 of the model
[m,n]=size(x);
for i=1:m
    s1=0;
    for j=1:n
        s1=s1+a(j)*x(i,j);
    end
    z(i)=s1;  %projection value Z_i of sample i
end
Sz=std(z);  %standard deviation S_a of the projection values
R=0.1*Sz;   %window radius for the local density
s3=0;
for i=1:m
    for j=1:m
        r=abs(z(i)-z(j));  %distance r_ij between projection values
        t=R-r;
        if t>=0            %unit step function u(R - r_ij)
            u=1;
        else
            u=0;
        end
        s3=s3+t*u;
    end
end
Dz=s3;     %local density B_a
y=Sz*Dz;   %projection index G(a) = S_a * B_a
end

Summary

  That is all for today. This article only briefly introduces using simulated annealing to optimize projection pursuit; projection pursuit can also be combined with other optimization algorithms. In theory the results should be similar, but in practice the outputs can differ widely, and different iteration counts also lead to very different results. If you are interested, give it a try.
  This article draws on the work of many bloggers; a full duplication check is impossible, so be careful about plagiarism.


Origin: blog.csdn.net/weixin_52952969/article/details/125455615