Previously we covered:
- Theory: [Pedestrian detection] miss rate versus false positives per image (FPPI), Past and Present (Theory)
- Source code walkthrough: [Pedestrian detection] miss rate versus false positives per image (FPPI), Past and Present (Practice 1)
Today we will use our own data to draw the FPPI curve.
(Sections one to six are in the previous article, so this article starts directly from section seven.)
Seven. Prepare gt and dt
In the original author's code, the annotations are stored in the vbb format, which is really unfriendly. Fortunately, reading through the source code gave us a way around it!
As long as we know the formats of dts and gts, we can load them directly with load(matfile) instead of going through loadDt and loadGt.
1. Analyzing dts
dts is a 1*N cell, where N is the number of algorithms. For example, the figure below shows 6 algorithms.
Each of these cells contains a 1*M cell, where M is the number of pictures. For example, the figure above shows 4024 pictures.
The following figure shows the contents of the second cell of dts (i.e., the detection results of the second algorithm).
The dimensions of the entries are not all the same, because the number of detections differs from picture to picture. For example, 3x5 double means 3 pedestrians were detected in that picture, and 4x5 double means 4 pedestrians were detected.
Click on any entry to look inside. Each row represents one detected pedestrian, and the columns are [x y w h score]: the x coordinate of the upper left corner, the y coordinate of the upper left corner, the width of the box, the height of the box, and the confidence level.
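The layout described above can be sketched in a few lines. This is only an illustration, written in Python with plain nested lists standing in for the MATLAB cells; the function name check_dts and the toy numbers are made up for the example and are not part of the original toolbox.

```python
# Nested lists stand in for the MATLAB cells described in the text:
# dts[a][i] holds the detections of algorithm a on picture i,
# and every detection is one row [x, y, w, h, score].

def check_dts(dts):
    """Validate the layout and return (n_algorithms, n_pictures)."""
    n_pictures = len(dts[0])
    for a, per_picture in enumerate(dts):
        if len(per_picture) != n_pictures:
            raise ValueError(
                f"algorithm {a} has {len(per_picture)} pictures, expected {n_pictures}"
            )
        for i, rows in enumerate(per_picture):
            for row in rows:
                if len(row) != 5:
                    raise ValueError(f"algorithm {a}, picture {i}: row needs 5 columns")
                if row[2] <= 0 or row[3] <= 0:
                    raise ValueError(f"algorithm {a}, picture {i}: w and h must be positive")
    return len(dts), n_pictures


# Toy data: 2 hypothetical algorithms, 3 pictures.
dts = [
    [  # algorithm 0
        [[10.0, 20.0, 30.0, 60.0, 0.9], [100.0, 40.0, 25.0, 50.0, 0.4]],  # "2x5"
        [],                                                               # no detections
        [[5.0, 5.0, 20.0, 40.0, 0.7]],                                    # "1x5"
    ],
    [  # algorithm 1 (scores on a different scale)
        [[12.0, 22.0, 28.0, 58.0, 3.1]],
        [[50.0, 60.0, 20.0, 45.0, -0.2]],
        [],
    ],
]

shape = check_dts(dts)
print(shape)  # (2, 3)
```

Running a check like this before exporting the results makes it easier to catch a missing picture or a row with the wrong number of columns.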
Scores from different algorithms may cover different ranges; some exceed 1 and some are even negative. This has no impact, because each algorithm's curve depends only on how its own detections are ranked by score.
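A small sketch of why the score range is harmless, assuming (as is usual for miss rate versus FPPI curves) that the curve is traced by sweeping a threshold over one algorithm's own scores, so that only the ranking matters. The helper rank_by_score and the numbers are hypothetical.

```python
def rank_by_score(rows):
    """Indices of the detections, sorted from highest to lowest score (5th column)."""
    return sorted(range(len(rows)), key=lambda k: rows[k][4], reverse=True)


# Three detections with scores in [0, 1].
rows_a = [
    [0, 0, 10, 20, 0.2],
    [0, 0, 10, 20, 0.9],
    [0, 0, 10, 20, 0.5],
]

# The same detections with scores rescaled to 10 * s - 6,
# which yields values above 1 and below 0.
rows_b = [[x, y, w, h, 10 * s - 6] for x, y, w, h, s in rows_a]

order_a = rank_by_score(rows_a)
order_b = rank_by_score(rows_b)
print(order_a == order_b)  # True
```

Because the ranking is unchanged, a threshold sweep visits the detections in the same order and accumulates the same true and false positives along the way.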
Note! This dts is equivalent to the output used during visualization, that is, the boxes left after non-maximum suppression. The confidence threshold used in visualization is
2. Analyzing gts
Since we only use one data set, gts is a 1*1 cell. That cell contains a 1*M cell, where M is the number of pictures. For example, the following figure shows 4024 pictures. This is the same layout as dts, but note that the order of the 4024 pictures must match the order used in dts.
Click to open the cell and keep looking. It shows the ground truth of each picture. Some pictures have no annotated pedestrians, so their entries are empty.
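One way to keep the picture order of gts and dts in step is to build both from a single shared list of picture names. The sketch below assumes the boxes are available as dictionaries keyed by file name; the names align_by_name, gt_by_name and dt_by_name are hypothetical, and the row contents are placeholders since only the ordering matters here.

```python
def align_by_name(names, boxes_by_name):
    """Return one entry per picture, in the order of `names`.

    Pictures without any box get an empty list, mirroring the empty
    cells for pictures that contain no annotated pedestrians.
    """
    return [boxes_by_name.get(name, []) for name in names]


# A single sorted list of names defines the order for both structures.
names = sorted(["img_0002.jpg", "img_0001.jpg", "img_0003.jpg"])

gt_by_name = {
    "img_0001.jpg": [[10, 20, 30, 60, 0]],
    "img_0003.jpg": [[5, 5, 20, 40, 0], [80, 30, 25, 55, 0]],
}
dt_by_name = {
    "img_0001.jpg": [[11, 21, 29, 59, 0.8]],
    "img_0002.jpg": [[40, 40, 20, 45, 0.3]],
}

gts = [align_by_name(names, gt_by_name)]  # outer list of length 1: one data set
dts = [align_by_name(names, dt_by_name)]  # outer list of length 1: one algorithm

print(len(gts[0]), len(dts[0]))  # 3 3
print(gts[0][1])                 # [] (img_0002.jpg has no ground truth)
```

Entry i of gts[0] and entry i of dts[0] then always refer to the same picture, whichever of the two happens to be empty.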
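Click on any picture and keep looking. The structure is very similar to dts. Each row is a ground-truth box with five columns. The first four are the same [x y w h] as in dts: the x coordinate of the upper left corner, the y coordinate of the upper left corner, the width of the box, and the height of the box. The fifth column is not a confidence level here: in the Caltech toolbox convention the ground-truth rows are [x y w h ignore], where ignore is a 0/1 flag marking boxes to be skipped during evaluation.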
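Both structures store a box as its upper left corner plus width and height. If your own detector or annotation tool outputs corner coordinates [x1 y1 x2 y2] instead, a conversion along these lines would be needed. This is a sketch under that assumption; it also assumes the right and bottom edges are exclusive (with inclusive pixel coordinates, add 1 to the width and height). The function names are hypothetical.

```python
def corners_to_xywh(x1, y1, x2, y2):
    """Convert corner coordinates to [x, y, w, h], with (x, y) the upper left corner."""
    return [x1, y1, x2 - x1, y2 - y1]


def convert_rows(rows):
    """Convert rows [x1, y1, x2, y2, last] to [x, y, w, h, last].

    The fifth column is passed through untouched.
    """
    return [corners_to_xywh(*row[:4]) + [row[4]] for row in rows]


converted = convert_rows([[10, 20, 40, 80, 0.9]])
print(converted)  # [[10, 20, 30, 60, 0.9]]
```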
[We only need to replace gts and dts with our own results; the details will be written up later.]