Why can a 1 × 1 convolution layer replace a fully connected layer?

The paper Network in Network proposed the 1 × 1 convolution layer. The question, then, is: why can a 1 × 1 convolution layer be used in place of a fully connected layer?

Suppose the current input tensor has dimensions 6 × 6 × 32 and the convolution kernel has dimensions 1 × 1 × 32. Take a certain position of the input tensor (the yellow area in the figure below) and compute the convolution with the kernel. If the 32 weights of the 1 × 1 × 32 kernel are viewed as a weight vector W, and the 1 × 1 × 32 slice of the input tensor at that position as the input x, then each such operation is equivalent to one neuron computing Wx; multiple convolution kernels mean multiple neurons, which is equivalent to a fully connected network.

In summary, a 1 × 1 convolution can be seen as a fully connected network whose inputs x are the 1 × 1 × 32 slices of the input tensor, and whose weights W (corresponding to the fully connected weights) are the convolution kernel, shared across all positions.
[Figure: a 1 × 1 × 32 kernel applied at one position (yellow) of a 6 × 6 × 32 input tensor]

Clearly, as a 1 × 1 convolution kernel W traverses the image, the channel vector at each pixel is multiplied by W, which has the same meaning as a fully connected layer multiplying its input by a weight matrix W; hence the two are equivalent.

To summarize:
The role of a fully connected layer is to combine the local features extracted by the convolutions, taking the entire image into account.

When the number of channels of a 1 × 1 convolution layer equals the number of nodes of a fully connected layer, the convolution layer can be seen as a fully connected layer in which each element along the spatial height and width corresponds to a sample, and the channels correspond to the features.
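Here is a minimal PyTorch sketch of this equivalence (the 6 × 6 × 32 input and the 64 output channels are illustrative choices, not from the paper): a 1 × 1 convolution and a fully connected layer that share the same weights produce identical outputs.

```python
import torch
import torch.nn as nn

# A 1x1 convolution over a 6x6x32 input (NCHW layout in PyTorch).
x = torch.randn(1, 32, 6, 6)

conv = nn.Conv2d(32, 64, kernel_size=1)   # 64 kernels, each 1x1x32
fc = nn.Linear(32, 64)

# Copy the convolution weights W and bias b into the linear layer.
with torch.no_grad():
    fc.weight.copy_(conv.weight.view(64, 32))
    fc.bias.copy_(conv.bias)

y_conv = conv(x)                           # shape (1, 64, 6, 6)

# View each of the 36 spatial positions as one sample with 32 features.
x_flat = x.permute(0, 2, 3, 1).reshape(-1, 32)              # (36, 32)
y_fc = fc(x_flat).reshape(1, 6, 6, 64).permute(0, 3, 1, 2)  # back to NCHW

print(torch.allclose(y_conv, y_fc, atol=1e-6))              # True
```

The permute/reshape step is exactly the view described above: every spatial position becomes one sample with 32 features.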

A natural follow-up question is: what is the role of the 1 × 1 convolution?

The role of the 1 × 1 convolution

First, dimensionality reduction and expansion along the channel (feature) dimension
  By controlling the number of convolution kernels, the number of channels can be scaled up or down (a short sketch follows). A pooling layer, by contrast, only changes the height and width; it cannot change the number of channels.
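A quick sketch of this point (the channel counts are examples only): 1 × 1 convolutions move the channel count in either direction, while pooling only shrinks height and width.

```python
import torch
import torch.nn as nn

# 1x1 convolutions scale the channel count; pooling cannot.
x = torch.randn(1, 192, 28, 28)

reduce = nn.Conv2d(192, 16, kernel_size=1)    # 192 -> 16 channels
expand = nn.Conv2d(16, 192, kernel_size=1)    # 16 -> 192 channels
pool = nn.MaxPool2d(kernel_size=2)

print(reduce(x).shape)          # torch.Size([1, 16, 28, 28])
print(expand(reduce(x)).shape)  # torch.Size([1, 192, 28, 28])
print(pool(x).shape)            # torch.Size([1, 192, 14, 14]) - channels unchanged
```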

Second, adding nonlinearity
  As described above, the 1 × 1 convolution corresponds to the computation of a fully connected layer, and it is likewise followed by a nonlinear activation function. This increases the nonlinearity of the network, allowing it to express more complex features.
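As a small illustration (the 192-channel width here is arbitrary), stacking 1 × 1 convolutions with activations inserts an extra nonlinearity at every pixel without changing the spatial resolution:

```python
import torch.nn as nn

# Each 1x1 convolution + activation adds one more nonlinearity
# at every spatial position, without touching height or width.
block = nn.Sequential(
    nn.Conv2d(192, 192, kernel_size=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(192, 192, kernel_size=1),
    nn.ReLU(inplace=True),
)
```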

Third, reducing convolution parameters and computation (simplifying the model)
  In the Inception network, many convolution operations are needed and the amount of computation is large. Introducing 1 × 1 convolutions reduces the amount of computation while preserving the effect. This can be quantified by comparing the following two concrete cases.

3.3.1 - Convolution without the 1 × 1 convolution

[Figure: a direct 5 × 5 convolution maps the 28 × 28 × 192 input to a 28 × 28 × 32 output, costing (28 × 28 × 32) × (5 × 5 × 192) ≈ 120 million multiplications]
3.3.2 - Convolution with the 1 × 1 convolution introduced

[Figure: a 1 × 1 × 192 convolution first reduces the input to 28 × 28 × 16, then a 5 × 5 convolution produces the 28 × 28 × 32 output]

The total computation required is (28 × 28 × 16) × (1 × 1 × 192) + (28 × 28 × 32) × (5 × 5 × 16) ≈ 12.4 million multiplications, significantly less than the computation of the convolution without the 1 × 1 convolution. I think its essence can be understood as follows: the 1 × 1 convolution extracts the important features of the input tensor (corresponding to dimensionality reduction), so the number of channels entering the 5 × 5 convolution can be reduced, and the resulting reduction in the 5 × 5 computation far outweighs the extra computation introduced by the 1 × 1 convolution.
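The two counts are easy to verify in plain Python (shapes taken from the example above):

```python
# Multiplication counts for the two cases above (28x28 spatial size,
# 192 input channels, 32 output channels, 16 bottleneck channels).
H, W = 28, 28

# Direct 5x5 convolution: 192 -> 32 channels.
direct = (H * W * 32) * (5 * 5 * 192)

# 1x1 bottleneck to 16 channels, then 5x5 convolution: 16 -> 32.
bottleneck = (H * W * 16) * (1 * 1 * 192) + (H * W * 32) * (5 * 5 * 16)

print(f"direct:     {direct:>12,}")      # 120,422,400  (~120 million)
print(f"bottleneck: {bottleneck:>12,}")  #  12,443,648  (~12.4 million)
```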

Advantages of replacing a fully connected layer with a 1 × 1 convolution layer:

1. It does not change the spatial structure of the image

A fully connected layer destroys the spatial structure of the image, while a 1 × 1 convolution layer does not.

2. The input can be of any size

The input size of a fully connected layer is fixed, because the number of parameters of a fully connected layer depends on the image size. The input size of a convolution layer is arbitrary, because the number of parameters of a convolution kernel is independent of the image size. Both points are illustrated in the sketch below.
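A short sketch of both advantages (the 32 input channels and 10 output channels are made-up numbers):

```python
import torch
import torch.nn as nn

# A 1x1 convolution head accepts any input size and keeps the
# spatial layout; a Linear head would fix the input size.
head = nn.Conv2d(32, 10, kernel_size=1)   # e.g. 10 per-pixel scores

for size in [(6, 6), (15, 20)]:           # arbitrary spatial sizes
    x = torch.randn(1, 32, *size)
    print(head(x).shape)
# torch.Size([1, 10, 6, 6])
# torch.Size([1, 10, 15, 20])

# nn.Linear(32 * 6 * 6, 10) would flatten away the spatial structure
# and only accept 6x6 inputs.
```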

