L1 vs. L2 regularization in neural networks: the differences, how to choose, and a code check

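The so-called regularization effect is this:
mathematically, the solution takes on certain properties of the penalty (correction) term that was added to the loss.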

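In plain words, what on earth is regularization?
It makes the solution you get from the Lagrange multiplier method for constrained extrema, the one we learned as undergraduates, have certain characteristics.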

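L1: (the penalty is the exponent of a Laplace distribution)
The result tends to be sparse (weights end up either close to 0 or fairly large).
The upside is faster feature learning, because many W are driven to 0,
but the regularization effect may not be very noticeable.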

L2: (the penalty is the exponent of a Gaussian distribution)
L2 shrinks W for unimportant features, but does not make it exactly 0.
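The "exponent" remarks in the two headings above can be made concrete with a short MAP (maximum a posteriori) derivation. This is only a sketch, assuming independent zero-mean priors on each weight; the symbols \mathcal{D} (training data), b, \sigma and \lambda are introduced here for illustration and are not from the original post.

```latex
\[
\begin{aligned}
\hat{w} &= \arg\max_w \; p(\mathcal{D}\mid w)\,p(w)
         = \arg\min_w \big[ -\log p(\mathcal{D}\mid w) - \log p(w) \big] \\[4pt]
\text{Laplace prior: } p(w_i) &= \tfrac{1}{2b}\, e^{-|w_i|/b}
  \;\Rightarrow\; -\log p(w) = \tfrac{1}{b}\sum_i |w_i| + \text{const}
  \quad (\text{L1 penalty},\ \lambda = 1/b) \\[4pt]
\text{Gaussian prior: } p(w_i) &= \tfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-w_i^2/(2\sigma^2)}
  \;\Rightarrow\; -\log p(w) = \tfrac{1}{2\sigma^2}\sum_i w_i^2 + \text{const}
  \quad (\text{L2 penalty},\ \lambda = 1/(2\sigma^2))
\end{aligned}
\]
```

Read this way, the data term plays the role of the original loss, and the choice of prior decides whether the added term is a sum of absolute values or a sum of squares.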

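How should we choose between L1 and L2?
In general, you pick the regularization term according to the prior distribution you assume (the Gaussian and Laplace densities actually look broadly similar, although the Laplace one has a sharper peak at 0 and heavier tails).
Google's take is: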
L1 regularization can’t help with multicollinearity.
L2 regularization can’t help with feature selection.
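In plain words:
When you want to extract rules (that is, select features), prefer L1.
When you want the features to be combined linearly, prefer L2.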
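The feature-selection rule of thumb above can be illustrated on a single weight, where both penalties have a closed-form solution. The sketch below is not what Keras does during training; it only shows the exact one-dimensional minimizers. The function names, the weight list and the strength `0.1` are hypothetical values chosen for illustration.

```python
def l1_step(w, t):
    """Soft-thresholding: the exact minimizer over x of 0.5*(x - w)**2 + t*abs(x)."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0  # anything inside [-t, t] is snapped to exactly zero


def l2_step(w, t):
    """The exact minimizer over x of 0.5*(x - w)**2 + 0.5*t*x**2: a uniform rescaling."""
    return w / (1.0 + t)


# Hypothetical unregularized coefficients, two of them small.
weights = [0.05, -0.3, 0.8, -0.01]
after_l1 = [l1_step(w, 0.1) for w in weights]
after_l2 = [l2_step(w, 0.1) for w in weights]

print("L1:", after_l1)
print("L2:", after_l2)
print("exact zeros under L1:", sum(1 for w in after_l1 if w == 0.0))
print("exact zeros under L2:", sum(1 for w in after_l2 if w == 0.0))
```

In this toy setting L1 discards the two small coefficients outright, which is the "selection" behavior, while L2 keeps every coefficient and only scales it down.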

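Why does L1 lead to sparsity?
Hardly any blog post online explains this clearly.
Remember the Lagrange multipliers from undergrad?
Here I use a figure from
https://stats.stackexchange.com/questions/45643/why-l1-norm-for-sparse-models
to illustrate:
(Figure from the linked answer, not reproduced here: elliptical loss contours and a green constraint region.)
The ellipses in the figure are the contours of the original, unregularized loss function,
and the green region is the constraint. The solution ends up on the boundary of the green region.
One thing above was not stated precisely:
what is really being used here is the "generalized Lagrangian" (which handles inequality constraints),
whereas what we learned as undergraduates was the "narrow" Lagrangian (which handles equality constraints).
So can L2 produce sparse solutions too? It can, but the probability is small, because its constraint region is a circle, with no corners sitting on the axes the way the diamond-shaped L1 region has.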
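The constrained view used in the figure discussion can be written out as follows. This is a sketch of the standard formulation; the symbols L(w) for the unregularized loss, \Omega(w) for the penalty, t for the radius of the constraint region and \lambda for the multiplier are notation assumed here, not taken from the original post.

```latex
\[
\min_w \; L(w) \quad \text{s.t.} \quad \Omega(w) \le t,
\qquad
\Omega(w) = \|w\|_1 \ \text{(L1)} \quad \text{or} \quad \Omega(w) = \|w\|_2^2 \ \text{(L2)}
\]
\[
\mathcal{L}(w, \lambda) = L(w) + \lambda \big( \Omega(w) - t \big), \qquad \lambda \ge 0
\]
\[
\text{KKT conditions:} \quad
\lambda \ge 0, \qquad \Omega(w) - t \le 0, \qquad \lambda \big( \Omega(w) - t \big) = 0
\]
```

When the unconstrained minimum lies outside the region, the constraint is active (\lambda > 0) and the solution sits on the boundary \Omega(w) = t. For a fixed \lambda the term -\lambda t is a constant, so minimizing over w is the same as minimizing the familiar penalized loss L(w) + \lambda\,\Omega(w).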

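OK, enough talk. Where is the code?
You can use the third experiment in Chapter 4 of "Deep Learning with Python".
The network structure is 10000 x 16 x 16 x 1.
To get results quickly, set epochs=1.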
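The post does not include the script itself, so here is a minimal sketch of what such an experiment could look like. It assumes the experiment is the IMDB weight-regularization example from that chapter, a penalty strength of 0.001, regularizers on the two hidden layers only, and the `rmsprop` optimizer with batch size 512; none of these settings are confirmed by the original. The function names are hypothetical, and TensorFlow is imported lazily so the helpers can be defined even where it is not installed.

```python
def vectorize_sequences(sequences, dimension=10000):
    """Multi-hot encode lists of word indices into a (samples, dimension) matrix."""
    import numpy as np

    results = np.zeros((len(sequences), dimension), dtype="float32")
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.0
    return results


def build_model(penalty, strength=0.001):
    """Build a 10000 -> 16 -> 16 -> 1 network with an L1 or L2 kernel penalty."""
    if penalty not in ("l1", "l2"):
        raise ValueError("penalty must be 'l1' or 'l2'")

    from tensorflow import keras
    from tensorflow.keras import layers, regularizers

    make_regularizer = regularizers.l1 if penalty == "l1" else regularizers.l2
    model = keras.Sequential([
        keras.Input(shape=(10000,)),
        # Assumption: only the hidden layers carry a weight penalty.
        layers.Dense(16, activation="relu",
                     kernel_regularizer=make_regularizer(strength)),
        layers.Dense(16, activation="relu",
                     kernel_regularizer=make_regularizer(strength)),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def run_experiment(penalty, epochs=1):
    """Train on IMDB for a few epochs and return the list of weight arrays."""
    import numpy as np
    from tensorflow import keras

    (train_data, train_labels), _ = keras.datasets.imdb.load_data(num_words=10000)
    x_train = vectorize_sequences(train_data)
    y_train = np.asarray(train_labels).astype("float32")

    model = build_model(penalty)
    model.fit(x_train, y_train, epochs=epochs, batch_size=512)
    return model.get_weights()
```

Calling `print("Output weights", run_experiment("l1"))` and then the same with `"l2"` would produce dumps in the same shape as the ones in this post (kernel and bias arrays for each of the three layers). The exact numbers will differ from run to run, and the first call downloads the IMDB dataset.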
With L1 regularization, the printed weights are as follows:

Output weights [array([[-4.27828636e-05, -1.00246782e-03, -2.79264990e-04, ...,
        -3.85033316e-04, -3.40257306e-04, -3.55066732e-08],
       [ 2.15981118e-02,  3.79165774e-03, -6.72453083e-03, ...,
         2.53116563e-02,  4.29332331e-02, -1.85270631e-03],
       [ 2.85863448e-02,  1.66764148e-02, -6.34254003e-03, ...,
         1.40961567e-02,  2.53925007e-02,  1.04373496e-04],
       ...,
       [-3.83841514e-04,  5.83783374e-04, -6.23644795e-04, ...,
        -1.19826291e-04,  1.29003369e-04, -5.33740851e-04],
       [-6.40181359e-04,  6.27052214e-04, -6.12081552e-04, ...,
        -9.73617774e-04, -3.70911177e-04,  1.00261578e-03],
       [-6.49964553e-04,  2.80193461e-04, -4.07341809e-04, ...,
         9.82345082e-04, -7.55024375e-04,  7.67573947e-05]], dtype=float32), array([ 0.01352753,  0.00530624,  0.01403685, -0.00621689,  0.00346374,
        0.01224227, -0.01186973,  0.00608102,  0.00767745,  0.02525727,
        0.00554247,  0.00680919,  0.00823556,  0.02523253,  0.02550968,
       -0.00733239], dtype=float32), array([[ 2.97359854e-01,  6.56183460e-04, -4.73043948e-01,
        -1.17567085e-01, -3.42536233e-02, -3.03927213e-01,
         4.69646633e-01, -3.84592921e-01,  1.52946264e-01,
        -1.82628393e-01,  3.07190239e-01,  1.88732699e-01,
        -3.68719488e-01, -3.30251426e-01, -3.66007872e-02,
         4.12766099e-01],
       [ 1.22636884e-01,  9.78616104e-02, -2.44927496e-01,
        -3.78500260e-02, -3.29815060e-01,  4.54631686e-01,
         1.32869394e-03, -2.15873808e-01,  1.01626828e-01,
        -1.52611211e-01, -3.60170454e-01, -3.46550457e-02,
         3.55746113e-02,  3.10409755e-01,  3.07094425e-01,
        -3.89622569e-01],
       [ 4.18785542e-01,  6.49755746e-02,  2.65271336e-01,
        -1.81596532e-01, -2.55371511e-01, -8.37184638e-02,
         4.29974437e-01, -1.55283764e-01, -3.39388162e-01,
        -3.18841726e-01, -4.97105066e-03, -2.07916439e-01,
        -1.47543848e-01, -8.37940574e-02,  3.37905467e-01,
         3.27208400e-01],
       [ 1.04914896e-01, -3.00677449e-01,  2.32164890e-01,
         1.62189871e-01, -1.11904912e-01, -1.14806369e-02,
        -3.23227465e-01, -1.23150116e-02,  2.32810229e-01,
         2.10369080e-01,  1.51899308e-01,  2.40044445e-01,
         1.14793181e-01,  5.89926494e-04, -8.19776803e-02,
         7.19810778e-04],
       [-3.87102336e-01,  3.51326197e-01,  9.02353227e-02,
        -2.63564795e-01, -3.27613801e-01, -2.86400300e-02,
        -1.87998384e-01,  3.43739748e-01,  2.73346812e-01,
         2.66616821e-01,  3.51429433e-02,  4.56109941e-01,
         5.48761450e-02, -3.60661447e-01, -3.88115913e-01,
         2.51187414e-01],
       [-2.37962544e-01,  1.66401789e-01, -3.98593396e-01,
         1.65419161e-01,  3.33086133e-01,  4.77736555e-02,
         2.00005323e-01, -2.52376407e-01, -2.90598810e-01,
        -1.85996607e-01, -2.25491524e-02,  1.13793194e-01,
         1.65100321e-01, -6.65912463e-04,  5.77541031e-02,
        -3.25353086e-01],
       [-1.40810832e-01, -2.48465851e-01, -1.19345643e-01,
        -8.56471481e-04, -2.67849237e-01, -1.44852057e-01,
        -9.15314704e-02,  1.34784952e-01, -1.29481718e-01,
        -1.04500920e-01, -1.77888229e-01, -1.47721738e-01,
        -2.19401658e-01, -2.23744530e-02, -2.98361719e-01,
         1.45486742e-01],
       [ 1.69071302e-01, -3.72374713e-01,  2.83467352e-01,
        -1.03206985e-01,  3.67821902e-01, -1.43115878e-01,
         1.25592351e-01, -3.89090292e-02, -2.01085940e-01,
         1.77833766e-01, -2.91119248e-01,  3.61348659e-01,
        -3.43382619e-02, -3.96245480e-01,  3.98543626e-01,
         4.63600516e-01],
       [ 4.63620007e-01, -2.45612651e-01,  3.48520666e-01,
         1.46613419e-01,  1.65358827e-01, -2.95230269e-01,
         4.20761257e-01, -2.00932339e-01, -1.33652672e-01,
         2.92670336e-02, -1.22803524e-01,  2.40687251e-01,
         3.18130404e-01, -6.91166497e-04, -3.25402856e-01,
        -1.30906135e-01],
       [ 1.62128076e-01, -1.19411573e-01,  3.45981359e-01,
        -3.86496191e-04,  4.05329019e-01,  1.49058387e-01,
         4.43916738e-01,  5.18011861e-02, -3.05147499e-01,
        -3.65549386e-01, -2.54479855e-01, -1.22571457e-02,
         1.56393483e-01,  5.07648513e-02,  2.26654470e-01,
        -3.36109191e-01],
       [-3.12535584e-01, -2.30290424e-02,  6.98565692e-02,
        -1.50468856e-01, -2.78825819e-01,  9.92865711e-02,
        -3.34635884e-01,  3.57187033e-01,  2.54794866e-01,
         1.91722021e-01,  5.36262877e-02,  1.83799900e-02,
        -5.85136586e-04,  3.57504547e-01, -2.61918098e-01,
         2.01858550e-01],
       [ 1.80302829e-01,  3.65201116e-01,  2.03263357e-01,
         1.17282532e-01,  1.65266767e-01, -4.04994518e-01,
        -3.51655126e-01,  3.97830069e-01, -7.66607746e-02,
        -9.62971300e-02,  1.73393369e-01, -2.00297937e-01,
         7.74533255e-04, -1.40481442e-01,  2.14320533e-02,
         3.77951324e-01],
       [-2.46189404e-02,  2.93494880e-01,  3.59376967e-01,
         3.20476014e-04,  3.01101089e-01,  3.21090758e-01,
        -3.75274122e-01,  9.95393726e-04,  2.46108666e-01,
         2.64105260e-01, -1.19236402e-01,  3.77319247e-01,
         6.48521120e-04,  3.39984924e-01,  2.55425870e-01,
         2.54246205e-01],
       [ 5.12674786e-02, -1.02096912e-03, -3.70046735e-01,
        -3.52790147e-01, -1.98903963e-01, -1.82327494e-01,
         3.54469061e-01,  1.71051875e-01,  3.73468578e-01,
         1.66834593e-01,  2.45054252e-07,  4.23564501e-02,
         9.42573650e-04, -1.89804733e-02,  4.39227995e-04,
         9.95820481e-03],
       [ 2.57087111e-01, -2.79899389e-01,  2.73097128e-01,
        -3.69274199e-01,  1.06317475e-01,  3.90571177e-01,
         1.57478735e-01,  2.42957503e-01,  4.03050303e-01,
        -3.74355882e-01, -2.04208896e-01,  1.89841297e-02,
        -3.78889889e-01,  2.43642956e-01,  2.69247919e-01,
         1.17503397e-01],
       [ 1.01470791e-01, -2.11673021e-01, -1.81737795e-01,
         3.25044870e-01, -1.60212040e-01,  2.00224802e-01,
         5.87655418e-03, -3.10205370e-01,  2.09311340e-02,
        -2.71605730e-01, -3.22293550e-01,  7.38748312e-02,
         6.16738871e-02, -8.31133649e-02, -1.60038099e-02,
        -7.09989516e-04]], dtype=float32), array([ 1.9303737e-02, -2.9504906e-02, -2.6506849e-02,  1.6427160e-03,
        2.9263936e-02,  5.9134695e-03,  2.8158128e-02,  2.4705507e-02,
        1.2207930e-02, -4.5786786e-05,  4.7801528e-05, -2.5754545e-02,
        6.4422688e-03,  2.7185101e-02,  2.4097716e-02,  3.2040365e-02],
      dtype=float32), array([[ 0.33899024],
       [ 0.09035949],
       [-0.40873846],
       [ 0.44341677],
       [ 0.55033386],
       [-0.48756945],
       [ 0.35685787],
       [-0.34343976],
       [-0.5151725 ],
       [-0.15856893],
       [ 0.01221188],
       [-0.31893036],
       [-0.2622632 ],
       [-0.29004556],
       [ 0.08608972],
       [ 0.29274988]], dtype=float32), array([0.02345355], dtype=float32)]

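We can see that many of the weights are on the order of e-4, that is, smaller than 0.001 in magnitude.
So what does the sparsity of L1 actually mean here?
Not, as people say online, that many weights are exactly 0,
but that many weights are close to 0.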
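Eyeballing a printed dump is error-prone, so the "close to 0" observation can also be counted directly. The helper below is a small sketch in plain Python; the function names, the threshold `tol` and the toy values are hypothetical and are not taken from the run above.

```python
def flatten(values):
    """Yield every scalar in an arbitrarily nested sequence (lists or arrays)."""
    try:
        items = iter(values)
    except TypeError:
        # Not iterable, so treat it as a single number.
        yield float(values)
        return
    for item in items:
        yield from flatten(item)


def near_zero_fraction(weights, tol=1e-3):
    """Fraction of entries whose magnitude is strictly below tol."""
    flat = list(flatten(weights))
    return sum(1 for w in flat if abs(w) < tol) / len(flat)


def exact_zero_fraction(weights):
    """Fraction of entries that are exactly 0.0."""
    flat = list(flatten(weights))
    return sum(1 for w in flat if w == 0.0) / len(flat)


# Hypothetical toy weights, only to show the calls.
toy = [[0.3, -4e-4, 0.0], [2e-2, 7e-5, -0.5]]
print("near zero:", near_zero_fraction(toy, tol=1e-3))
print("exactly zero:", exact_zero_fraction(toy))
```

Because `flatten` only relies on iteration, the same functions should also accept the list of arrays returned by a Keras `model.get_weights()` call, which makes it possible to compare the L1 and L2 runs with one number per layer instead of reading exponents by eye.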

Then, using the same code, switch to L2 regularization and look at the printed weights:

Output weights [array([[ 5.4510499e-19, -1.1375406e-14,  1.2810104e-09, ...,
         6.6694220e-13,  1.7195195e-21,  1.1844387e-18],
       [ 2.2681307e-02,  2.8639721e-02, -5.0795679e-03, ...,
         3.2248314e-02,  3.5097659e-02,  1.9943751e-02],
       [ 2.9682485e-02,  3.8339857e-02, -3.8643787e-03, ...,
        -7.3956484e-03,  1.5394368e-02,  6.4378373e-02],
       ...,
       [-2.2290610e-03, -5.2631828e-03, -1.1894613e-02, ...,
        -3.2023021e-03,  9.8316949e-03,  4.5698625e-03],
       [-6.3545518e-03, -2.8249058e-03, -1.4715969e-02, ...,
         3.9712125e-03, -2.1713993e-03,  6.4099208e-05],
       [ 4.3839724e-03,  7.1036047e-04, -6.1749844e-03, ...,
         3.3779065e-03,  3.6998792e-04, -2.5457949e-03]], dtype=float32), array([0.01308027, 0.02083343, 0.01824512, 0.01143504, 0.00499259,
       0.01486909, 0.01095271, 0.00404395, 0.02463059, 0.00872665,
       0.00801992, 0.00815683, 0.01039271, 0.01561781, 0.01411563,
       0.04340347], dtype=float32), array([[ 2.93407321e-01, -1.98400989e-02, -4.40114766e-01,
        -1.11424305e-01, -7.31087476e-02, -3.04403722e-01,
         4.64353442e-01, -3.29900682e-01,  1.63630053e-01,
        -1.84131727e-01,  3.08276862e-01,  1.94891691e-01,
        -4.41589683e-01, -3.05707157e-01, -4.41319868e-02,
         4.10211772e-01],
       [ 1.44314080e-01,  1.14107765e-01, -3.18847924e-01,
        -8.83864984e-02, -2.84857243e-01,  4.43122834e-01,
         3.62090170e-02, -2.15172529e-01,  9.76731330e-02,
        -1.61776304e-01, -3.61122280e-01, -5.60393780e-02,
         8.91952366e-02,  3.17961156e-01,  3.24503183e-01,
        -3.71593475e-01],
       [ 4.25948858e-01,  7.04302564e-02,  2.81940609e-01,
        -1.77070782e-01, -2.74864286e-01, -1.06579565e-01,
         4.36641574e-01, -1.12034686e-01, -3.45022917e-01,
        -3.19837213e-01, -1.43970661e-02, -2.16923535e-01,
        -2.31076464e-01, -8.27731341e-02,  3.60185146e-01,
         3.36787492e-01],
       [ 1.15281835e-01, -3.01896662e-01,  2.39346668e-01,
         1.83167055e-01, -1.16130240e-01, -2.12356411e-02,
        -3.89987141e-01, -2.43074540e-02,  3.24033946e-01,
         2.12604478e-01,  1.53906882e-01,  3.26046437e-01,
         2.10126624e-01,  8.62302035e-02, -1.64832115e-01,
         1.51150580e-02],
       [-3.90144706e-01,  3.52188319e-01,  2.51630321e-02,
        -3.43495667e-01, -2.54216045e-01, -3.87258083e-02,
        -1.94808662e-01,  2.56020427e-01,  2.74487942e-01,
         2.68538356e-01,  4.05583121e-02,  4.54750240e-01,
         1.47770867e-01, -3.62259477e-01, -3.83709610e-01,
         2.63715029e-01],
       [-2.85299212e-01,  1.69331729e-01, -3.38647544e-01,
         2.34549761e-01,  2.80789793e-01,  8.91473368e-02,
         1.77124396e-01, -2.13072211e-01, -2.61840492e-01,
        -1.87434465e-01, -2.91305147e-02,  1.61795199e-01,
         2.20224589e-01,  2.58004293e-02,  5.37811071e-02,
        -3.69709074e-01],
       [-1.43994346e-01, -2.49894559e-01, -2.04025045e-01,
         9.80285257e-02, -2.77742773e-01, -2.26973757e-01,
        -8.67364928e-02,  1.37876272e-01, -2.02594623e-01,
        -1.06818587e-01, -1.79687336e-01, -2.55178958e-01,
        -2.99385250e-01, -5.59285395e-02, -2.94983864e-01,
         2.53982246e-01],
       [ 1.84192955e-01, -3.73058200e-01,  2.97143936e-01,
        -1.03991538e-01,  2.90101379e-01, -1.68928549e-01,
         1.39721408e-01, -1.06037809e-02, -2.18328491e-01,
         1.80070952e-01, -2.92274624e-01,  3.65887821e-01,
        -1.29578754e-01, -3.81524444e-01,  3.92549694e-01,
         4.74777371e-01],
       [ 4.72008854e-01, -2.47345746e-01,  3.59616429e-01,
         2.52750129e-01,  1.19450204e-01, -3.10099781e-01,
         4.30306435e-01, -1.62968546e-01, -1.40832901e-01,
         3.73812504e-02, -1.25185639e-01,  2.44938642e-01,
         3.32485586e-01,  3.53299305e-02, -3.30613941e-01,
        -1.35950983e-01],
       [ 1.85852900e-01, -9.83352214e-02,  3.41978699e-01,
         8.67165811e-03,  3.46845627e-01,  1.29153579e-01,
         4.64443177e-01,  1.15139224e-02, -3.25556934e-01,
        -3.66379201e-01, -2.55769938e-01, -2.08554789e-02,
         1.51786834e-01,  5.86780272e-02,  2.55553305e-01,
        -3.25718910e-01],
       [-3.11689675e-01, -2.17617527e-02,  7.72571983e-03,
        -2.31704265e-01, -2.14245647e-01,  9.61495414e-02,
        -3.27444166e-01,  2.67990142e-01,  2.50948817e-01,
         1.95652723e-01,  5.79224415e-02,  9.13931709e-03,
         2.56390348e-02,  3.59594494e-01, -2.66508758e-01,
         2.20254958e-01],
       [ 1.94646284e-01,  3.66057843e-01,  2.14457333e-01,
         2.24085152e-01,  1.05342574e-01, -4.27612185e-01,
        -3.49920005e-01,  3.75871032e-01, -1.05346240e-01,
        -9.78173241e-02,  1.75176308e-01, -2.12537095e-01,
        -8.19739625e-02, -1.40039310e-01,  7.61785209e-02,
         3.92508149e-01],
       [-4.72818352e-02,  2.94093102e-01,  2.90557832e-01,
         1.40494900e-04,  2.79832035e-01,  3.35276634e-01,
        -3.79788160e-01, -7.71822548e-03,  2.58355170e-01,
         2.66037405e-01, -1.21681616e-01,  3.90908629e-01,
         6.70772269e-02,  3.55733871e-01,  2.57470012e-01,
         2.55136728e-01],
       [ 3.27138193e-02, -2.80077597e-06, -3.42442542e-01,
        -4.03933018e-01, -1.91840082e-01, -1.61959320e-01,
         3.39495540e-01,  1.39288634e-01,  4.23164964e-01,
         1.70850694e-01, -1.47289789e-08,  9.98789370e-02,
         7.75708482e-02, -7.56203895e-03, -3.06240357e-02,
        -6.50095474e-03],
       [ 2.32675731e-01, -2.35386729e-01,  2.33843103e-01,
        -4.32860196e-01,  7.58191794e-02,  4.14948165e-01,
         1.41167402e-01,  1.70569643e-01,  4.25751954e-01,
        -3.75422835e-01, -2.05775797e-01,  6.12163655e-02,
        -3.75813574e-01,  2.67257035e-01,  2.52756864e-01,
         1.00201644e-01],
       [ 1.61049694e-01, -2.18448177e-01, -2.63321877e-01,
         4.01718348e-01, -1.76443890e-01,  2.42350683e-01,
         7.09695518e-02, -3.05766135e-01,  6.77920505e-02,
        -2.73409277e-01, -3.23337317e-01,  1.05839022e-01,
         1.21519454e-01, -8.19710717e-02, -6.41441718e-02,
         1.70101207e-02]], dtype=float32), array([ 0.01596233, -0.00421972, -0.0345025 ,  0.03105383,  0.00776252,

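We can see that many of the values here are on the order of e-2, and under L2 regularization you see far fewer values around e-4. So:
L1 leads to sparse weights "more easily" than L2.
Note:
L1 is not the only thing that can lead to sparsity.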


Reposted from blog.csdn.net/appleyuchi/article/details/86531198