Advanced NLP (7): Dilated Convolutional Neural Networks


The theory comes from "Multi-scale Context Aggregation by Dilated Convolutions" (ICLR 2016); the authors have released their code on GitHub.
The paper targets semantic segmentation, using dilated convolutions to aggregate multi-scale context information and improve segmentation quality.

1. Dilated Convolutions

Let us first look at the dilated convolution itself.

(Figure 1: 1-dilated, 2-dilated and 4-dilated 3x3 convolutions)

  • Panel (a): a standard 3x3 convolution, i.e. a 1-dilated convolution, produces F1; each position of F1 has a receptive field of 3x3 = 9.
  • Panel (b): on top of F1, a 2-dilated convolution is applied. Note where its kernel taps fall: they are no longer an adjacent 3x3 block. This produces F2, and each position of F2 has a receptive field of 7x7 = 49.
  • Panel (c): on top of F2, a 4-dilated convolution produces F3; each position of F3 has a receptive field of 15x15 = 225. Note that every dilated convolution here uses the same number of parameters, 3x3 = 9.

(Figure 2: receptive field growth of stacked dilated convolutions)

As the figures show, the number of kernel parameters stays constant while the receptive field grows exponentially with the dilation rate; the sketch below checks these numbers.
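A quick sanity check on these numbers (plain Python, not from the original post): the receptive field of a stack of k x k convolutions grows by (k - 1) * dilation per layer.

# Receptive field of stacked 3x3 dilated convolutions with dilations 1, 2, 4.
def receptive_field(kernel_size, dilations):
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d   # each layer widens the field by (k-1)*d
    return rf

print(receptive_field(3, [1]))        # 3  -> 3x3   = 9 positions   (F1)
print(receptive_field(3, [1, 2]))     # 7  -> 7x7   = 49 positions  (F2)
print(receptive_field(3, [1, 2, 4]))  # 15 -> 15x15 = 225 positions (F3)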

1.2 Understanding Through Animations

N.B.: Blue maps are inputs, and cyan maps are outputs.
(Dilated convolution animations)

1.2.2 Transposed Convolution Animations

N.B.: Blue maps are inputs, and cyan maps are outputs.

1.2.3 Understanding the Code

shape of input : [batch, in_height, in_width, in_channels]
shape of filter : [filter_height, filter_width, in_channels, out_channels]
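
Before the full layer, here is a minimal sketch (TensorFlow 1.x; the concrete sizes filter_width = 3, embedding_dim = 120 and num_filter = 100 are taken from the shape comment in the code below and are illustrative only) showing how a batch of sentences is treated as an image of height 1, and how SAME padding preserves the sequence length:

import numpy as np
import tensorflow as tf

# Illustrative sizes only, matching the [1, 3, 120, 100] shape comment below.
batch, seq_len, embedding_dim, num_filter = 2, 50, 120, 100

# A sentence batch is treated as an "image" of height 1:
# [batch, in_height=1, in_width=seq_len, in_channels=embedding_dim]
model_inputs = tf.constant(np.zeros([batch, 1, seq_len, embedding_dim], np.float32))

# Kernel: [filter_height=1, filter_width=3, in_channels, out_channels]
w = tf.get_variable("w_demo", shape=[1, 3, embedding_dim, num_filter])

# A 2-dilated convolution along the sequence; SAME padding keeps seq_len.
conv = tf.nn.atrous_conv2d(model_inputs, w, rate=2, padding="SAME")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(conv).shape)   # (2, 1, 50, 100)

With this shape convention in mind, the full IDCNN layer from the original post follows.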

# Fragment of the IDCNN layer (a method body; requires TensorFlow 1.x, and the
# self.* hyper-parameters, name, reuse and model_inputs of shape
# [batch, 1, seq_len, embedding_dim] come from the surrounding model class).
with tf.variable_scope("idcnn" if not name else name):
    # e.g. shape = [1, 3, 120, 100] = [1, filter_width, embedding_dim, num_filter]
    shape = [1, self.filter_width, self.embedding_dim, self.num_filter]
    print(shape)
    filter_weights = tf.get_variable(
        "idcnn_filter",
        shape=[1, self.filter_width, self.embedding_dim, self.num_filter],
        initializer=self.initializer)
    # Initial plain convolution, stride 1 in every dimension.
    layerInput = tf.nn.conv2d(model_inputs,
                              filter_weights,
                              strides=[1, 1, 1, 1],
                              padding="SAME",
                              name="init_layer",
                              use_cudnn_on_gpu=True)
    self.layerInput_test = layerInput
    finalOutFromLayers = []

    totalWidthForLastDim = 0
    # After the initial convolution, the output is fed through the stack of
    # dilated convolutions, repeated repeat_times times.
    for j in range(self.repeat_times):
        for i in range(len(self.layers)):
            # Dilations 1, 1, 2: dilation 1 is an ordinary convolution;
            # dilation 2 inserts one hole between kernel taps.
            dilation = self.layers[i]['dilation']
            isLast = True if i == (len(self.layers) - 1) else False
            with tf.variable_scope("atrous-conv-layer-%d" % i,
                                   reuse=True
                                   if (reuse or j > 0) else False):
                # w: [kernel height, kernel width, input channels, number of filters]
                w = tf.get_variable(
                    "filterW",
                    shape=[1, self.filter_width, self.num_filter,
                           self.num_filter],
                    initializer=tf.contrib.layers.xavier_initializer())
                if j == 1 and i == 1:
                    self.w_test_1 = w
                if j == 2 and i == 1:
                    self.w_test_2 = w
                b = tf.get_variable("filterB", shape=[self.num_filter])
                conv = tf.nn.atrous_conv2d(layerInput,
                                           w,
                                           rate=dilation,
                                           padding="SAME")
                self.conv_test = conv
                conv = tf.nn.bias_add(conv, b)
                conv = tf.nn.relu(conv)
                if isLast:
                    finalOutFromLayers.append(conv)
                    totalWidthForLastDim += self.num_filter
                layerInput = conv
    # Concatenate the last dilated layer of every repeat along the channel axis.
    finalOut = tf.concat(axis=3, values=finalOutFromLayers)
    keepProb = 1.0 if reuse else 0.5
    finalOut = tf.nn.dropout(finalOut, keepProb)
    # Drop the height-1 dimension and flatten to [batch * seq_len, width].
    finalOut = tf.squeeze(finalOut, [1])
    finalOut = tf.reshape(finalOut, [-1, totalWidthForLastDim])
    self.cnn_output_width = totalWidthForLastDim
    return finalOut
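
A hedged reading of the block above: the same dilation schedule is applied repeat_times times with shared weights (variables are reused whenever j > 0), and only the output of the last dilated layer of each pass is kept and concatenated. The sketch below does the bookkeeping under assumed hyper-parameters (filter_width = 3, a per-repeat dilation schedule of [1, 1, 2] as the comment suggests, and a hypothetical repeat_times = 4); the numbers are illustrative, not taken from the post.

# Bookkeeping for the IDCNN block above, under assumed hyper-parameters.
filter_width = 3                  # assumption
dilations_per_repeat = [1, 1, 2]  # suggested by the "1, 1, 2" comment
repeat_times = 4                  # hypothetical value
num_filter = 100                  # illustrative

# Receptive field along the sequence: each layer adds (k - 1) * dilation.
rf = 1
rf += (filter_width - 1) * 1            # initial conv2d ("init_layer")
for _ in range(repeat_times):
    for d in dilations_per_repeat:
        rf += (filter_width - 1) * d    # each atrous_conv2d layer
print("receptive field:", rf)           # 35 tokens

# Only the last dilated layer of each repeat is concatenated, so
print("cnn_output_width:", repeat_times * num_filter)   # 400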

2. Advantages of Dilated Convolutions

As summarized above, stacking dilated convolutions enlarges the receptive field exponentially with depth while the number of kernel parameters stays fixed, and, unlike pooling or strided downsampling, it does so without lowering the resolution of the feature map, which is what dense prediction tasks such as semantic segmentation need.

3. Applications

Dilated convolutions are used in image segmentation, speech synthesis, machine translation, and object detection.


Reposted from blog.csdn.net/qq_35495233/article/details/86638098