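Background

I implemented the RepVGG conversion (merging the training-time branches into a single conv) in Caffe, mainly as a learning exercise.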
https://github.com/imistyrain/RepVGG-caffe

1. Network analysis

[Figure 1]

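As Figure 1 shows, RepVGG_A0 is built from two types of stage: a 2-branch stage and a 3-branch stage. The whole network consists of nothing but these two types.
Only the stages at indices [0, 1, 3, 7, 21] are 2-branch (1×1 conv + 3×3 conv); all the others are 3-branch (identity branch + 1×1 conv + 3×3 conv).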
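
The index list above can be reproduced from the per-group block counts. The sketch below assumes the RepVGG-A0 block counts [2, 4, 14, 1] from the original RepVGG definition (they are not stated in this post) and assumes that the stem and the first block of each group are the ones without an identity branch, because they change the resolution or channel count. The function name `two_branch_indices` is made up for illustration.

```python
def two_branch_indices(num_blocks):
    """Return the indices of blocks that cannot have an identity branch.

    Index 0 is the stem block; after that, the first block of every group
    is assumed to downsample, so it only has the 1x1 and 3x3 branches.
    """
    indices = [0]          # stem block
    start = 1              # index of the first block in the current group
    for count in num_blocks:
        indices.append(start)
        start += count
    return indices


# Assumed RepVGG-A0 block counts per group
print(two_branch_indices([2, 4, 14, 1]))  # [0, 1, 3, 7, 21]
```

With these counts the network has 22 blocks in total (indices 0 to 21), which agrees with the largest index in the list.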

2. Method

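From the lowest level up, the fusion of RepVGG_A0 breaks down into:
- BN fusion
- 2-branch: fuse the 1×1 conv with the 3×3 conv
- 3-branch: fuse the identity with the 1×1 conv and the 3×3 conv. The **identity** branch is first converted to a 3×3 conv and then fused with its BN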
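
The branch-merging steps in the list can be sketched in NumPy as follows. This is a minimal illustration rather than the code from the repository: it assumes ordinary (groups=1) convolutions with kernels laid out as (out_channels, in_channels, kH, kW), and the helper names `pad_1x1_to_3x3`, `identity_to_3x3` and `merge_branches` are made up for this example.

```python
import numpy as np


def pad_1x1_to_3x3(kernel_1x1):
    """Zero-pad a (out, in, 1, 1) kernel so its value sits at the 3x3 centre."""
    return np.pad(kernel_1x1, ((0, 0), (0, 0), (1, 1), (1, 1)))


def identity_to_3x3(channels, dtype=np.float32):
    """Build the 3x3 kernel that copies each input channel to the same output channel."""
    kernel = np.zeros((channels, channels, 3, 3), dtype=dtype)
    idx = np.arange(channels)
    kernel[idx, idx, 1, 1] = 1.0
    return kernel


def merge_branches(fused_branches):
    """Sum the (weight, bias) pairs of branches that are already BN-fused 3x3 convs."""
    weight = sum(w for w, _ in fused_branches)
    bias = sum(b for _, b in fused_branches)
    return weight, bias
```

Because convolution is linear, adding the outputs of parallel 3×3 convs with the same stride and padding gives the same result as one conv whose kernel and bias are the sums of the branch kernels and biases. The identity kernel would be passed through the BN fusion (with `identity=True`) before being merged.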

Only the core code is shown here: the BN fusion.

# Fuse BN (Caffe BatchNorm + Scale) into the preceding conv
import numpy as np

EPS = 1e-5  # assumption: must match the BatchNorm eps in the prototxt (1e-5 is Caffe's default)

def fuse_bn(conv_ori, bn, scale, identity=False):
    print("Begin BN fusion ... ")

    if not identity:
        # conv_ori is a Caffe param list; blob 0 holds the weights,
        # shape (out_channels, in_channels, kH, kW). A conv bias is not used here.
        conv_weight = conv_ori[0].data.copy()
    else:
        # identity branch: conv_ori is already a NumPy 3x3 kernel
        conv_weight = conv_ori.copy()

    print("Conv kernel shape {}".format(conv_weight.shape))

    # flatten to (out_channels, in_channels * kH * kW) for per-channel scaling
    conv_weight_reshape = conv_weight.reshape(conv_weight.shape[0], -1)

    bn_mean = bn[0].data.copy()         # accumulated mean, shape (out_channels,)
    bn_var = bn[1].data.copy()          # accumulated variance, shape (out_channels,)
    num_bn_samples = bn[2].data.copy()  # accumulated scale factor, shape (1,)

    # avoid dividing by zero when the scale factor was never accumulated
    if num_bn_samples[0] == 0:
        num_bn_samples[0] = 1

    scale_scale = scale[0].data.copy()  # gamma, shape (out_channels,)
    scale_bias = scale[1].data.copy()   # beta, shape (out_channels,)

    # per-channel factor gamma / sqrt(running_var + eps)
    factor = scale_scale / np.sqrt(bn_var / num_bn_samples[0] + EPS)
    new_weight = factor[:, None] * conv_weight_reshape
    new_bias = scale_bias - (bn_mean / num_bn_samples[0]) * factor

    new_weight = new_weight.reshape(conv_weight.shape).astype(np.float32)
    new_bias = new_bias.astype(np.float32)

    if new_weight.shape[0] != new_bias.shape[0]:
        raise ValueError("Shape mismatch after BN fusion. Weights shape {}, Bias shape {}".format(new_weight.shape, new_bias.shape))

    return new_weight, new_bias
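
A quick way to convince yourself that this kind of folding is exact is to compare the two paths numerically. The sketch below is self-contained and does not need Caffe: it treats the conv at a single spatial location as a matrix product over the flattened input patch, and `fold_bn` is a stand-in written for this check, taking the already-normalized running mean and variance as inputs. All shapes and values are arbitrary test data.

```python
import numpy as np


def fold_bn(weight, mean, var, gamma, beta, eps=1e-5):
    """Fold y = gamma * (W x - mean) / sqrt(var + eps) + beta into W' x + b'."""
    factor = gamma / np.sqrt(var + eps)
    return weight * factor[:, None], beta - mean * factor


rng = np.random.default_rng(0)
out_ch, patch = 4, 27                      # e.g. 3 input channels * 3 * 3
weight = rng.standard_normal((out_ch, patch))
x = rng.standard_normal(patch)
mean = rng.standard_normal(out_ch)
var = rng.uniform(0.5, 2.0, out_ch)
gamma = rng.standard_normal(out_ch)
beta = rng.standard_normal(out_ch)

# Path 1: conv output followed by BN and Scale
reference = gamma * (weight @ x - mean) / np.sqrt(var + 1e-5) + beta

# Path 2: single conv with folded weight and bias
w_fused, b_fused = fold_bn(weight, mean, var, gamma, beta)
fused = w_fused @ x + b_fused

max_abs_diff = np.max(np.abs(reference - fused))
print("max abs difference:", max_abs_diff)
```

The difference should be at floating-point rounding level. A large difference on a real model is a hint that the statistics fed into the fold are not the ones the framework actually uses in its forward pass.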

3. Lessons learned

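At first the outputs did not match, and I eventually traced the problem to the BN fusion.
Caffe splits batch normalization into two layers: the normalization layer (BatchNorm) and the scaling layer (Scale). Scale is straightforward: it holds the two parameters γ and β. BatchNorm, however, holds three parameters where I had expected two (i.e. running_mean and running_var). At first I did not use the third parameter, and that caused the error. The first BatchNorm parameter is the accumulated moving mean, the second is the accumulated moving variance, and the third is the accumulated scale factor (a moving sum).
Put simply, the forward pass actually uses:
- running_mean = accumulated mean / accumulated scale factor
- running_var = accumulated var / accumulated scale factor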
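
The two formulas can be written as a small helper. This is a sketch with made-up numbers, and `caffe_bn_statistics` is a hypothetical name. One assumption to flag: here a zero scale factor yields zero statistics, which is my understanding of how Caffe's BatchNorm forward pass treats that case; the fusion code in this post replaces a zero factor with 1 instead.

```python
import numpy as np


def caffe_bn_statistics(acc_mean, acc_var, acc_factor):
    """Turn the three BatchNorm blobs into the mean and variance used at inference."""
    factor = float(acc_factor[0])
    # Assumption: a zero factor means nothing was accumulated, so scale by 0
    scale = 0.0 if factor == 0 else 1.0 / factor
    return acc_mean * scale, acc_var * scale


# Made-up blob contents for a 2-channel BatchNorm layer
acc_mean = np.array([2.0, 4.0])
acc_var = np.array([8.0, 2.0])
acc_factor = np.array([2.0])

running_mean, running_var = caffe_bn_statistics(acc_mean, acc_var, acc_factor)
# running_mean is [1, 2] and running_var is [4, 1];
# using acc_mean and acc_var directly would be off by a factor of 2 here.
```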

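Development log

2021-04-12 (Mon)
- The overall framework is finished, but the final outputs do not match
2021-04-13 (Tue)
- Traced the problem to the BN fusion
- The cause was that I had not understood the parameters of Caffe's BN layer…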
