[Paper Notes] RefineDet: Single-Shot Refinement Neural Network for Object Detection

& Paper Overview

 

 

Paper: https://arxiv.org/abs/1711.06897

Code: https://github.com/sfzhang15/RefineDet

& Summary and personal views

This paper proposes RefineDet, a single-shot refinement neural network detector composed of two modules: the anchor refinement module (ARM) and the object detection module (ODM). The ARM filters out negative anchors to reduce the classifier's search space, and coarsely adjusts anchor locations and sizes to give the subsequent regressor a better initialization. The ODM takes the refined anchors from the ARM as input, regresses accurate object locations and sizes, and predicts multi-class labels. The whole network is trained end to end with a multi-task loss. RefineDet achieves state-of-the-art detection accuracy on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO while remaining efficient. The authors plan to use RefineDet to detect other specific kinds of objects, such as pedestrians, vehicles, and faces, and to introduce an attention mechanism for better results.

Looking at the overall structure, RefineDet is similar to FPN. The main difference is that FPN does not run classification and regression sub-networks on the bottom-up feature maps, whereas RefineDet does so in order to obtain the refined anchors. In addition, the TCB module fuses features somewhat differently from FPN.

& Contribution

1, Proposes a one-stage object detection framework built from the ARM and the ODM;

2, Designs the TCB to transfer features from the ARM to the ODM, so that the ODM can handle the more challenging task of predicting accurate object locations, sizes, and class labels;

3, RefineDet achieves state-of-the-art performance on the PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO datasets.

& Problems to be solved

Problem: achieve accuracy better than two-stage methods while keeping efficiency comparable to one-stage methods.

Analysis:

The detection accuracy of one-stage methods is generally lower than that of two-stage methods, and a major reason is class imbalance.

Recent approaches to this problem include: Kong et al. use an objectness prior constraint on the convolutional feature maps to reduce the search space of objects; Lin et al. reshape the standard cross-entropy loss so that training focuses on hard examples and down-weights the loss of well-classified examples; Zhang et al. design a max-out labeling mechanism to reduce the false positives caused by class imbalance.

The paper argues that two-stage methods have three main advantages over one-stage methods:

  • A two-stage structure with sampling heuristics to handle class imbalance
  • A two-step cascade to regress the object box parameters
  • Two-stage features to describe the objects

& Framework and main methods

1、  Main Architecture:

 

The network has two main components, the ARM and the ODM. The ARM coarsely filters out some negative anchors using a foreground/background score, and adjusts anchor locations and sizes through a regression loss to obtain refined anchors. The ODM takes the refined anchors as input and performs accurate classification and regression. The TCB transfers features from the ARM to the ODM.
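The negative anchor filtering step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `filter_negative_anchors` and the plain-list data layout are invented for the example, and the 0.99 threshold is the value the paper reports using empirically.

```python
def filter_negative_anchors(background_scores, threshold=0.99):
    """Return indices of anchors that the ARM does not confidently reject.

    background_scores: ARM background (negative) confidence per anchor, in [0, 1].
    threshold: anchors whose background confidence exceeds this value are
               dropped before the ODM sees them (assumed default, see above).
    """
    return [i for i, score in enumerate(background_scores) if score <= threshold]


# Hypothetical ARM outputs for five anchors.
scores = [0.999, 0.42, 0.995, 0.10, 0.98]
kept = filter_negative_anchors(scores)
print(kept)  # [1, 3, 4]
```

Only the surviving anchors are passed on, so the ODM classifier faces far fewer easy negatives.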

2、  TCB(Transfer Connection Block)

 

The TCB mainly fuses the features of the higher layer into the current layer: the high-level feature map is upsampled with a deconvolution to the same size as the current layer, the two are combined by element-wise sum, and the result is fed into a convolution layer. (Note: is this convolution layer there to ensure the discriminability of the features used for detection?)
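The fusion step can be illustrated with a toy single-channel example. This is only a sketch under simplifying assumptions: nearest-neighbour 2x upsampling stands in for the learned deconvolution, the convolution applied after the sum is omitted, and the helper names `upsample_2x` and `tcb_fuse` are hypothetical.

```python
def upsample_2x(feature):
    """Nearest-neighbour 2x upsampling; a stand-in for the learned deconvolution."""
    out = []
    for row in feature:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def tcb_fuse(current, higher):
    """Element-wise sum of the current map and the upsampled higher-level map."""
    up = upsample_2x(higher)
    if len(up) != len(current) or len(up[0]) != len(current[0]):
        raise ValueError("upsampled map must match the current layer's size")
    return [[c + u for c, u in zip(c_row, u_row)]
            for c_row, u_row in zip(current, up)]


current = [[1, 1, 1, 1],
           [1, 1, 1, 1],
           [1, 1, 1, 1],
           [1, 1, 1, 1]]
higher = [[10, 20],
          [30, 40]]
fused = tcb_fuse(current, higher)
print(fused[0])  # [11, 11, 21, 21]
print(fused[3])  # [31, 31, 41, 41]
```

In a real network both inputs would be multi-channel tensors and the upsampling weights would be learned, but the size matching and the element-wise sum work the same way.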

3、  Two-step Cascade Regression

Current one-stage methods use one-step regression, i.e., they predict object locations and sizes directly from feature layers of different scales. This is rather inaccurate in challenging scenarios, especially for small objects. The paper instead uses a two-step cascaded regression strategy: the ARM first adjusts the locations and sizes of the anchors, which gives the regression in the ODM a better initialization.

n anchor boxes are associated with each cell of a feature map. For each feature map cell, the ARM predicts four offsets of the refined anchor relative to the original anchor, plus two confidence scores indicating the presence of a foreground object. The refined anchors are then passed to the corresponding feature maps in the ODM for classification and precise localization.
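A small sketch of the two-step decoding is shown below. It assumes the common SSD/Faster R-CNN box parameterization, with boxes as (cx, cy, w, h) and offsets as (tx, ty, tw, th), and leaves out the variance scaling that some implementations apply; the notes above do not spell out the encoding, and the function names are hypothetical.

```python
import math


def decode(anchor, offsets):
    """Apply (tx, ty, tw, th) offsets to a (cx, cy, w, h) box.

    Assumed parameterization: centre shifts are scaled by the box size,
    width and height are scaled exponentially.
    """
    cx, cy, w, h = anchor
    tx, ty, tw, th = offsets
    return (cx + tx * w, cy + ty * h, w * math.exp(tw), h * math.exp(th))


def two_step_decode(anchor, arm_offsets, odm_offsets):
    """Step 1: the ARM refines the anchor. Step 2: the ODM regresses from the refined anchor."""
    refined = decode(anchor, arm_offsets)
    final = decode(refined, odm_offsets)
    return refined, final


anchor = (0.5, 0.5, 0.2, 0.2)
refined, final = two_step_decode(
    anchor,
    arm_offsets=(0.5, 0.0, 0.0, 0.0),            # shift right by half an anchor width
    odm_offsets=(0.0, 0.0, math.log(2.0), 0.0),  # double the refined width
)
print(refined, final)
```

The point of the cascade is that the ODM offsets are interpreted relative to the refined anchor rather than the original one, so the second regression starts from a box that is already closer to the object.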

4、  Loss Function 

In the loss function, the subscript i is the index of an anchor. p_i and x_i are the ARM's predicted confidence that anchor i is an object and the refined coordinates of anchor i; c_i and t_i are the object class and box coordinates predicted by the ODM. l_i* is the ground-truth class label and g_i* is the ground-truth location and size. N_arm and N_odm are the numbers of positive anchors in the ARM and the ODM. L_b is the binary cross-entropy (log) loss, and L_m is the multi-class softmax loss. When N_arm or N_odm is 0, the corresponding loss term is set to 0.
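Written out with the symbols defined above, the loss takes roughly the following form. This is a reconstruction that follows the paper's formulation rather than anything shown in these notes: the regression loss L_r (smooth L1 in the paper) and the indicator bracket [l_i* ≥ 1], which equals 1 for positive anchors and 0 otherwise, are not defined in the text above.

```latex
\begin{aligned}
\mathcal{L}(\{p_i\},\{x_i\},\{c_i\},\{t_i\})
  ={}& \frac{1}{N_{\mathrm{arm}}}\Big(\sum_i \mathcal{L}_b\big(p_i,\,[l_i^* \ge 1]\big)
      + \sum_i [l_i^* \ge 1]\,\mathcal{L}_r(x_i,\, g_i^*)\Big) \\
  &+ \frac{1}{N_{\mathrm{odm}}}\Big(\sum_i \mathcal{L}_m(c_i,\, l_i^*)
      + \sum_i [l_i^* \ge 1]\,\mathcal{L}_r(t_i,\, g_i^*)\Big)
\end{aligned}
```

The first line is the ARM part (object vs. background classification plus anchor refinement), and the second line is the ODM part (multi-class classification plus final box regression); the regression terms only count positive anchors.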

5、  Experiment

(1) RefineDet is compared with other methods on the PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO datasets.

 

The results table shows that RefineDet outperforms the best of the listed methods.

(2) Ablation: the contributions of the three components described in the paper are compared: negative anchor filtering, two-step cascaded regression, and the transfer connection block.

 

& Reflection and inspiration

Overall, the main innovation of this paper is that it is the first to apply two-step cascaded regression to a one-stage method, which makes it possible to refine the original anchors. The overall structure is quite similar to FPN.

 

 


Origin www.cnblogs.com/fanzhongjie/p/11861032.html