How do the 'route' and 'yolo' layers work in YOLOv3?

How does the 'route' layer work in YOLOv2 and YOLOv3?

How does the 'yolo' layer work in YOLOv3?

In the cfg file:


[yolo]
mask = 6,7,8
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=1
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1



[yolo]
mask = 3,4,5
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=1
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1


[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=1
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1

Every yolo layer has to know about all of the anchor boxes, but each one only predicts a subset of them. The name could probably be better, but the mask tells the layer which of the boxes (by index into the anchors list) it is responsible for predicting. The first yolo layer predicts 6,7,8 because those are the largest boxes and it is at the coarsest scale. The second yolo layer predicts some smaller ones, and so on.
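
To make the mask idea concrete, here is a minimal sketch that reads the `anchors` and `mask` strings from the cfg above and lists the (width, height) pairs each yolo layer would be responsible for. The helper name `select_anchors` is hypothetical and is not part of darknet; it only illustrates the indexing.

```python
# Hypothetical helper, not darknet code: map a mask onto the anchors list.
def select_anchors(anchors_str, mask_str):
    # "10,13,  16,30, ..." -> [10, 13, 16, 30, ...]; int() ignores the spaces.
    values = [int(v) for v in anchors_str.split(",")]
    # Group the flat list into (width, height) pairs.
    pairs = list(zip(values[0::2], values[1::2]))
    # The mask holds indices into the list of pairs.
    mask = [int(i) for i in mask_str.split(",")]
    return [pairs[i] for i in mask]


ANCHORS = "10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326"

# The three masks, in the order the [yolo] sections appear in the cfg.
for mask in ("6,7,8", "3,4,5", "0,1,2"):
    print(mask, "->", select_anchors(ANCHORS, mask))
```

With these values, mask 6,7,8 selects the three largest anchors and mask 0,1,2 the three smallest, which matches the description above.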


Other people's descriptions:

Route layer:
This layer concatenates a list of previous layers together. The previous layers must all have the same width and height.
So, if we have layers A, B and C with dimensions Ax*Ay*Ac, Bx*By*Bc and Cx*Cy*Cc, and a route layer D that links to A, B and C,
then D has dimensions Ax*Ay*(Ac+Bc+Cc) and simply contains the copied feature maps from layers A, B and C.
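
The route description can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not darknet's implementation: a layer is stored as a nested list indexed `[channel][row][col]`, and the names `shape`, `route` and `make_layer` are made up for this example.

```python
# Illustrative sketch only; layers are nested lists indexed [channel][row][col].
def shape(layer):
    """Return (width, height, channels) of a layer."""
    return len(layer[0][0]), len(layer[0]), len(layer)


def route(*layers):
    """Concatenate layers along the channel axis."""
    width, height, _ = shape(layers[0])
    for layer in layers:
        w, h, _ = shape(layer)
        if (w, h) != (width, height):
            raise ValueError("route inputs must share width and height")
    out = []
    for layer in layers:
        out.extend(layer)  # feature maps are appended in order: A, then B, then C
    return out


def make_layer(width, height, channels, fill):
    """Build a constant-valued layer for the demo."""
    return [[[fill] * width for _ in range(height)] for _ in range(channels)]


A = make_layer(4, 4, 2, 1.0)  # Ax*Ay*Ac = 4*4*2
B = make_layer(4, 4, 3, 2.0)  # Bx*By*Bc = 4*4*3
C = make_layer(4, 4, 5, 3.0)  # Cx*Cy*Cc = 4*4*5
D = route(A, B, C)
print(shape(D))  # (4, 4, 10), i.e. Ax*Ay*(Ac+Bc+Cc)
```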

Shortcut layer:
This appears to be the residual network (ResNet) operator. It adds the feature maps of the previous layer to those of another specified layer,
using a different stride when the widths and heights do not match.
So, if we have layers A and B with dimensions Ax*Ay*Ac and Bx*By*Bc, and a shortcut layer D that links to A and B (B being the previous layer),
then D has dimensions Bx*By*Bc and simply contains the summed feature maps of A and B, where D(0) = A(0)+B(0), D(1) = A(1)+B(1), ... (D(i) is feature map i of D).
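
A minimal sketch of the addition D(i) = A(i)+B(i), assuming A and B already have identical shapes. The strided case for mismatched widths and heights is deliberately left out, and the function name `shortcut` is chosen here for illustration only. Layers are again nested lists indexed `[channel][row][col]`.

```python
# Illustrative sketch only: element-wise sum of two equally shaped layers.
def shortcut(a, b):
    shape_a = (len(a), len(a[0]), len(a[0][0]))
    shape_b = (len(b), len(b[0]), len(b[0][0]))
    if shape_a != shape_b:
        raise ValueError("this sketch only handles inputs of identical shape")
    # D(i) = A(i) + B(i) for every feature map i, element by element.
    return [
        [[av + bv for av, bv in zip(a_row, b_row)]
         for a_row, b_row in zip(a_map, b_map)]
        for a_map, b_map in zip(a, b)
    ]


# Two layers with 2 feature maps of size 2x2 each.
A = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
B = [[[10, 20], [30, 40]], [[50, 60], [70, 80]]]
D = shortcut(A, B)
print(D[0])  # [[11, 22], [33, 44]]
print(D[1])  # [[55, 66], [77, 88]]
```

Unlike route, the output here has the same number of feature maps as the inputs, since the maps are summed rather than stacked.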

Is there no layer that can resize the feature maps (i.e. rescale Ax1*Ay1*Ac to Ax2*Ay2*Ac)?


Reposted from blog.csdn.net/honk2012/article/details/80535580