[Object detection] Introduction to IoU

0. What is IOU

IoU stands for Intersection over Union.
IoU measures how accurately an object is localized in a given dataset. It is a simple metric: any task whose output is a predicted region (bounding boxes) can be evaluated with IoU.
In object detection, IoU is the ratio of the overlap between the predicted box and the ground-truth box to the total area covered by the two boxes together.

1. Calculation of IOU

IoU is calculated by dividing the area of the intersection of the two boxes by the area of their union:
$$\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}$$
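As a minimal sketch of this calculation in Python, assuming axis-aligned boxes given in (x1, y1, x2, y2) corner format (the box format and the function name `iou` are illustrative choices, not taken from the original post):

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Union = area of A + area of B - intersection.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes offset by (1, 1): intersection area 1, union area 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

The guard on `union` only covers degenerate zero-area inputs; for ordinary boxes the result is simply intersection divided by union.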

1.1 Disadvantages of basic IOU

  • If the two boxes do not intersect, IoU = 0 by definition, so it cannot reflect how far apart the two boxes are. Moreover, the loss 1 − IoU is then constant, so there is no gradient to propagate back and the model cannot learn from such pairs.
  • IoU cannot precisely reflect how the two boxes overlap: differently aligned pairs of boxes can have the same IoU.
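The first drawback can be illustrated with a small sketch. It assumes boxes in (x1, y1, x2, y2) format and uses a finite difference in place of a real gradient; the box coordinates are made up for illustration.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


gt = (0, 0, 2, 2)
near = (3, 0, 5, 2)    # 1 unit to the right of gt
far = (30, 0, 32, 2)   # 28 units to the right of gt
iou_near, iou_far = iou(gt, near), iou(gt, far)

# Finite-difference estimate of d(IoU)/dx when nudging the near box toward gt.
eps = 1e-3
nudged = (near[0] - eps, near[1], near[2] - eps, near[3])
grad_estimate = (iou(gt, nudged) - iou_near) / eps

print(iou_near, iou_far, grad_estimate)  # prints: 0.0 0.0 0.0
```

Both predictions get the same score even though one is much closer to the ground truth, and moving the closer box slightly toward the target does not change the value at all.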

1.2 Advantages of Basic IOU

  • It directly reflects how well the predicted box matches the ground-truth box.
  • It is scale invariant. In regression tasks, IoU is the most direct indicator of the distance between the predicted box and the ground truth (gt): 1 − IoU satisfies non-negativity, identity of indiscernibles, symmetry, and the triangle inequality.
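Scale invariance and symmetry are easy to check numerically. The sketch below assumes (x1, y1, x2, y2) boxes; the helper `scale` and the chosen scale factors are only illustrative.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def scale(box, k):
    """Multiply every coordinate of a box by k (hypothetical helper)."""
    return tuple(k * v for v in box)


a, b = (0, 0, 2, 2), (1, 1, 3, 3)
base = iou(a, b)
scaled = [iou(scale(a, k), scale(b, k)) for k in (0.5, 10, 1000)]

# Each scaled value equals the base IoU up to floating-point rounding.
print(base, scaled)
# Swapping the arguments gives the same value (symmetry).
print(iou(a, b) == iou(b, a))
```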

2. GIOU

GIOU comes from the following paper:
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
For any two boxes A and B, first find the smallest box C that encloses both. Then compute the ratio of the area of C \ (A ∪ B) to the area of C, where the area of C \ (A ∪ B) is the area of C minus the area of A ∪ B. Finally, subtract this ratio from the IoU of A and B to obtain GIoU.
$$\mathrm{GIoU} = \mathrm{IoU} - \frac{|C \setminus (A \cup B)|}{|C|}$$
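The three steps described above can be sketched in Python as follows. This is a minimal illustration, assuming (x1, y1, x2, y2) boxes with positive area; the function name `giou` is an illustrative choice.

```python
def giou(box_a, box_b):
    """GIoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Area of C, the smallest axis-aligned box enclosing both A and B.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    # Subtract the fraction of C that is not covered by A or B.
    return iou - (c_area - union) / c_area


gt = (0, 0, 2, 2)
print(giou(gt, (3, 0, 5, 2)))    # disjoint, close to gt
print(giou(gt, (30, 0, 32, 2)))  # disjoint, far from gt
```

Unlike plain IoU, the two disjoint predictions now receive different scores, and the farther box gets the lower one.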

3. DIOU

DIoU matches the box regression mechanism better than GIoU. It takes into account the distance between the target and the anchor, the overlap ratio, and the scale, which makes box regression more stable and avoids problems such as divergence during training that can occur with IoU and GIoU.
$$L_{DIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2}$$
where $b$ and $b^{gt}$ are the center points of the predicted box and the ground-truth box, $\rho$ is the Euclidean distance between the two center points, and $c$ is the diagonal length of the smallest enclosing box that contains both the predicted box and the ground-truth box.
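A minimal sketch of the DIoU loss, assuming the common form 1 − IoU + ρ²/c² and boxes in (x1, y1, x2, y2) format with positive area (the function name `diou_loss` is illustrative):

```python
def diou_loss(pred, gt):
    """DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union
    # rho^2: squared Euclidean distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # c^2: squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 \
       + (max(py2, gy2) - min(py1, gy1)) ** 2
    return 1.0 - iou + rho2 / c2


gt = (0, 0, 2, 2)
print(diou_loss((1, 1, 3, 3), gt))    # overlapping prediction
print(diou_loss((3, 0, 5, 2), gt))    # disjoint, close to gt
print(diou_loss((30, 0, 32, 2), gt))  # disjoint, far from gt
```

Because the penalty depends on the center distance directly, the loss keeps decreasing as a disjoint prediction moves toward the target.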

4. CIOU

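The paper argues that, of the three geometric factors in bbox regression (overlap area, center point distance, and aspect ratio), the aspect ratio has not yet been taken into account, so CIoU is proposed on the basis of DIoU. The penalty term is as follows:
$$R_{CIoU} = \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$
where $\alpha$ is a trade-off weight and $v$ measures the consistency of the aspect ratio, defined as
$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2, \qquad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}$$
The complete CIoU loss function is:
$$L_{CIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$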
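A minimal sketch of the CIoU loss, assuming the usual definitions v = (4/π²)(arctan(w_gt/h_gt) − arctan(w/h))² and α = v / ((1 − IoU) + v). Boxes are (x1, y1, x2, y2) with positive width and height. The small `eps` in the denominator of α is an added assumption to avoid 0/0 when the boxes coincide; it is not part of the formula itself.

```python
import math


def ciou_loss(pred, gt, eps=1e-7):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + gw * gh - inter)
    # DIoU part: squared center distance over squared enclosing diagonal.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 \
       + (max(py2, gy2) - min(py1, gy1)) ** 2
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    alpha = v / ((1.0 - iou) + v + eps)
    return 1.0 - iou + rho2 / c2 + alpha * v


gt = (0, 0, 2, 2)
print(ciou_loss((1, 1, 3, 3), gt))  # same aspect ratio: v = 0, reduces to DIoU
print(ciou_loss((0, 0, 4, 1), gt))  # different aspect ratio: extra penalty
```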

5. EIOU

CIoU loss considers the overlap area, the center point distance, and the aspect ratio in bounding box regression. However, the $v$ in its formula reflects the difference in aspect ratio rather than the real differences between the widths and heights and their confidence, so it can sometimes hinder the model from optimizing similarity effectively. To address this, some researchers split the aspect-ratio term of CIoU into separate width and height terms, proposed EIoU loss, and added a Focal mechanism to focus on high-quality anchor boxes. The method comes from the 2021 paper "Focal and Efficient IOU Loss for Accurate Bounding Box Regression".
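As a rough sketch of the idea, the version below assumes the EIoU form 1 − IoU + ρ²/c² + (w − w_gt)²/C_w² + (h − h_gt)²/C_h², where C_w and C_h are the width and height of the smallest enclosing box, and the focal weighting IoU^γ. The function name, the (x1, y1, x2, y2) box format, and the default `gamma=0.0` (which disables the focal weighting) are illustrative assumptions.

```python
def eiou_loss(pred, gt, gamma=0.0):
    """EIoU loss for boxes given as (x1, y1, x2, y2); gamma > 0 adds focal weighting."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + gw * gh - inter)
    # Width and height of the smallest enclosing box.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    loss = (1.0 - iou
            + rho2 / (cw ** 2 + ch ** 2)   # center distance term
            + (pw - gw) ** 2 / cw ** 2     # width difference term
            + (ph - gh) ** 2 / ch ** 2)    # height difference term
    # Focal weighting: scale the loss by IoU^gamma.
    return iou ** gamma * loss


gt = (0, 0, 2, 2)
print(eiou_loss((0, 0, 4, 1), gt))             # plain EIoU
print(eiou_loss((0, 0, 4, 1), gt, gamma=0.5))  # Focal-EIoU style weighting
```

Width and height are penalized separately here, so a box with the wrong shape is pushed toward the target size in each dimension instead of only toward the target ratio.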

6. WIOU

Wise-IoU comes from the following paper:
Wise-IoU: Bounding Box Regression Loss with Dynamic Focusing Mechanism
Abstract of the article: Object detection is a core problem in computer vision, and its performance depends on the design of the loss function. The bounding box loss function is an important part of the object detection loss function, and a well-defined one brings significant performance improvements to the detection model. Most research in recent years has focused on strengthening the fitting ability of the bounding box loss, assuming that the examples in the training data are of high quality. However, we noticed that object detection training sets contain low-quality examples. Blindly strengthening bounding box regression on low-quality examples will clearly jeopardize the improvement of detection performance. Focal-EIoU v1 was proposed to solve this problem, but because its focusing mechanism is static, the potential of the non-monotonic focusing mechanism has not been fully exploited. Based on this view, we propose a dynamic non-monotonic focusing mechanism and design Wise-IoU (WIoU). The dynamic non-monotonic focusing mechanism uses the "outlier degree" instead of IoU to evaluate the quality of anchor boxes and provides a wise gradient gain allocation strategy. While reducing the competitiveness of high-quality anchor boxes, this strategy also reduces the harmful gradients produced by low-quality examples. This enables WIoU to focus on ordinary-quality anchor boxes and improves the overall performance of the detector. When WIoU is applied to the state-of-the-art single-stage detector YOLOv7, AP-75 on the MS-COCO dataset improves from 53.03% to 54.50%.
Wise-IoU comes in three versions, of which the third (v3) performs best.

6.1 Wise-IoU v1

Because training data inevitably contains low-quality examples, geometric factors such as distance and aspect ratio aggravate the penalty on low-quality examples and degrade the generalization performance of the model. A good loss function should weaken the penalty of geometric factors when the anchor box and the target box coincide well, and less intervention in training lets the model generalize better. On this basis, the authors construct distance attention from the distance metric and obtain WIoU v1, which has a two-layer attention mechanism:
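$$L_{WIoUv1} = R_{WIoU} L_{IoU}, \qquad R_{WIoU} = \exp\left(\frac{(x - x_{gt})^2 + (y - y_{gt})^2}{(W_g^2 + H_g^2)^*}\right)$$
where $(x, y)$ and $(x_{gt}, y_{gt})$ are the centers of the anchor box and the target box, $W_g$ and $H_g$ are the width and height of the smallest enclosing box, $L_{IoU} = 1 - \mathrm{IoU}$, and the superscript $*$ indicates that the term is detached from the computational graph.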
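A minimal sketch of WIoU v1, assuming the form L = R · (1 − IoU) with R = exp(center distance² / (W_g² + H_g²)), where W_g and H_g are the width and height of the smallest enclosing box. In the paper the denominator is detached from the computational graph; this plain-Python version has no autograd, so that step appears only as a comment. The function name and the (x1, y1, x2, y2) box format are illustrative.

```python
import math


def wiou_v1_loss(pred, gt):
    """WIoU v1 loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    loss_iou = 1.0 - inter / union
    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # Size of the smallest enclosing box. In an autograd framework this
    # denominator would be detached so that it contributes no gradient.
    wg = max(px2, gx2) - min(px1, gx1)
    hg = max(py2, gy2) - min(py1, gy1)
    r_wiou = math.exp(rho2 / (wg ** 2 + hg ** 2))
    return r_wiou * loss_iou


gt = (0, 0, 2, 2)
print(wiou_v1_loss((0, 0, 2, 2), gt))  # perfect match
print(wiou_v1_loss((1, 1, 3, 3), gt))  # offset prediction
```

The distance attention R is at least 1 and grows with the center distance, while the factor 1 − IoU shrinks the whole loss when the boxes already coincide well.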

6.2 Wise-IoU v2

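$$L_{WIoUv2} = \left(\frac{L_{IoU}^*}{\overline{L_{IoU}}}\right)^{\gamma} L_{WIoUv1}, \qquad \gamma > 0$$
where $\overline{L_{IoU}}$ is the running mean of $L_{IoU}$ and the superscript $*$ indicates that the term is detached from the computational graph.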

6.3 Wise-IoU v3

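The outlier degree describes the quality of an anchor box and is defined as:
$$\beta = \frac{L_{IoU}^*}{\overline{L_{IoU}}} \in [0, +\infty)$$
where $\overline{L_{IoU}}$ is the running mean of $L_{IoU}$ and the superscript $*$ indicates that the term is detached from the computational graph. A small outlier degree means a high-quality anchor box, and we assign a small gradient gain to it so that bounding box regression focuses on anchor boxes of ordinary quality. Assigning small gradient gains to anchor boxes with a large outlier degree also effectively prevents low-quality examples from producing large harmful gradients. We use $\beta$ to construct a non-monotonic focusing factor and apply it to WIoU v1:
$$L_{WIoUv3} = r L_{WIoUv1}, \qquad r = \frac{\beta}{\delta \alpha^{\beta - \delta}}$$
where $\alpha$ and $\delta$ are hyperparameters.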
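The non-monotonic behaviour of the focusing factor can be seen in a short sketch. It assumes the form r = β / (δ · α^(β − δ)) with β = L_IoU / mean(L_IoU); the function names and the hyperparameter values α = 1.9 and δ = 3 are illustrative assumptions, not values confirmed by this post.

```python
def outlier_degree(loss_iou, mean_loss_iou):
    """beta: IoU loss of one anchor box relative to the running mean IoU loss."""
    return loss_iou / mean_loss_iou


def focusing_factor(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gradient gain r = beta / (delta * alpha ** (beta - delta))."""
    return beta / (delta * alpha ** (beta - delta))


# The gain is small for very low beta (high-quality boxes), rises for
# ordinary-quality boxes, and falls again for large beta (low-quality boxes).
for beta in (0.1, 1.0, 1.5, 3.0, 6.0, 10.0):
    print(beta, round(focusing_factor(beta), 3))
```

With this form, r equals 1 exactly when β = δ, and the largest gain is reached at β = 1 / ln(α).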


7. To be updated as more good IoU variants are found


Origin blog.csdn.net/qq_43471945/article/details/129032719