DIOU

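DIoU = IoU - d^2/c^2

d^2: the squared distance between the center points of the predicted box and the ground-truth box (distance1).

c^2: the squared diagonal length of the smallest box enclosing both boxes, measured from its top-left corner to its bottom-right corner (distance2).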

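The smaller distance1 is and the larger distance2 is, the larger DIoU becomes; to some extent this means the two boxes are close together and overlap heavily.

L_DIoU = 1 - DIoU. A smaller loss is better, so a larger DIoU is better.
So distance1 should be as small as possible and distance2 as large as possible.
With the ground-truth box fixed, the center of the predicted box therefore has to move toward the center of the ground-truth box (distance1 gets smaller).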


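Suppose the predicted center has moved toward the ground-truth center. The only way left to make distance2 larger is to enlarge the predicted box's width and height, that is, its extent, and that would still satisfy the requirement.

But a larger distance1 also makes distance2 larger. Isn't that a contradiction? It is not, because the loss penalizes the ratio d^2/c^2 rather than either distance alone: c^2 acts as a normalizer, and the ratio goes to zero as the two centers coincide.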
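
The coupling between distance1 and distance2 can be checked numerically. The sketch below is only an illustration under assumed values: two 2x2 boxes, with the predicted box shifted to the right of the ground-truth box by a non-negative `shift`. The helper name `penalty_terms` is hypothetical and not part of the original post.

```python
# Assumed setup: ground-truth box [0, 0, size, size] and a predicted box of the
# same size shifted right by `shift` (shift >= 0).
def penalty_terms(shift, size=2.0):
    """Return (d2, c2, d2 / c2) for two equal squares offset horizontally."""
    d2 = shift ** 2                       # squared distance between centers
    enclosing_w = shift + size            # width of the smallest enclosing box
    enclosing_h = size                    # height is unchanged by a horizontal shift
    c2 = enclosing_w ** 2 + enclosing_h ** 2   # squared diagonal of that box
    return d2, c2, d2 / c2


for shift in (4.0, 3.0, 2.0, 1.0, 0.0):
    d2, c2, ratio = penalty_terms(shift)
    print(f"shift={shift:.0f}  d^2={d2:5.1f}  c^2={c2:5.1f}  d^2/c^2={ratio:.3f}")
```

In this setup both d^2 and c^2 shrink as the predicted box approaches, yet the ratio d^2/c^2 falls steadily from 0.400 at shift 4 to 0 when the centers coincide, so moving the centers together always lowers the penalty.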

The advantages of DIoU are as follows:
1. Like GIoU loss, DIoU loss still provides a moving direction for the predicted box when it does not overlap the target box.
2. DIoU loss directly minimizes the distance between the centers of the two boxes, whereas GIoU loss minimizes the part of the enclosing box that the two boxes do not cover, so DIoU loss converges much faster than GIoU loss.
3. When one box contains the other, or the two boxes are aligned horizontally or vertically, DIoU loss still regresses very fast, while GIoU loss almost degenerates into IoU loss.
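
Points 1 and 3 can be illustrated with a small scalar comparison. This is a minimal sketch with made-up boxes, not code from the original post; the function name `iou_giou_diou` and the box values are assumptions chosen for the example.

```python
def iou_giou_diou(a, b):
    """Return (IoU, GIoU, DIoU) for two boxes given as (xmin, ymin, xmax, ymax)."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    # intersection width and height, clipped at 0 for disjoint boxes
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = iw * ih
    union = area_a + area_b - inter
    iou = inter / union
    # smallest enclosing box
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    giou = iou - (cw * ch - union) / (cw * ch)
    # squared center distance and squared enclosing diagonal
    d2 = ((a[0] + a[2]) - (b[0] + b[2])) ** 2 / 4 + ((a[1] + a[3]) - (b[1] + b[3])) ** 2 / 4
    c2 = cw ** 2 + ch ** 2
    diou = iou - d2 / c2
    return iou, giou, diou


target = (0.0, 0.0, 4.0, 4.0)

# Case 1: a 1x1 predicted box fully inside the target, at two positions.
inside_corner = (0.0, 0.0, 1.0, 1.0)
inside_center = (1.5, 1.5, 2.5, 2.5)

# Case 2: a 4x4 predicted box disjoint from the target, near and far.
outside_near = (5.0, 0.0, 9.0, 4.0)
outside_far = (10.0, 0.0, 14.0, 4.0)

for name, box in [("inside_corner", inside_corner), ("inside_center", inside_center),
                  ("outside_near", outside_near), ("outside_far", outside_far)]:
    iou, giou, diou = iou_giou_diou(target, box)
    print(f"{name:14s} IoU={iou:.4f}  GIoU={giou:.4f}  DIoU={diou:.4f}")
```

For the two inside positions IoU and GIoU are both 0.0625, so neither tells them apart, while DIoU is about -0.078 at the corner and 0.0625 at the center. For the two disjoint boxes IoU is 0 in both cases, whereas GIoU and DIoU both get worse as the box moves away.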

Distance-IoU (DIoU) loss

Loss(DIoU) = 1 - DIoU = 1 - IoU + d^2/c^2

import numpy as np

def bboxes_diou(boxes1, boxes2):
    '''
    Compute the DIoU of two boxes, or of a batch of boxes against one box.
    :param boxes1: [xmin, ymin, xmax, ymax] or
                [[xmin, ymin, xmax, ymax], [xmin, ymin, xmax, ymax], ...]
    :param boxes2: [xmin, ymin, xmax, ymax]
    :return: DIoU value(s), one per box in boxes1
    '''
    # accept plain lists as well as NumPy arrays
    boxes1 = np.asarray(boxes1, dtype=np.float64)
    boxes2 = np.asarray(boxes2, dtype=np.float64)

    # areas of boxes1 and boxes2
    boxes1Area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2Area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    # top-left and bottom-right corners of the intersection rectangle
    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])

    # width and height are clipped at 0 so disjoint boxes give zero intersection
    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1Area + boxes2Area - inter_area
    # IoU, floored at a small positive epsilon
    ious = np.maximum(1.0 * inter_area / union_area, np.finfo(np.float32).eps)

    # smallest enclosing box and its squared diagonal (c^2)
    outer_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    outer_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    outer = np.maximum(outer_right_down - outer_left_up, 0.0)
    outer_diagonal_line = np.square(outer[..., 0]) + np.square(outer[..., 1])

    # squared distance between the two box centers (d^2)
    boxes1_center = (boxes1[..., :2] + boxes1[..., 2:]) * 0.5
    boxes2_center = (boxes2[..., :2] + boxes2[..., 2:]) * 0.5
    center_dis = np.square(boxes1_center[..., 0] - boxes2_center[..., 0]) + \
                 np.square(boxes1_center[..., 1] - boxes2_center[..., 1])

    # DIoU = IoU - d^2 / c^2
    dious = ious - center_dis / outer_diagonal_line

    return dious
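
The function above returns DIoU itself; the loss is 1 minus that value. The following is a compact sketch of the loss for a batch of predicted boxes against one target box, re-implemented here only so that the block runs on its own. The name `diou_loss` and the sample boxes are assumptions for illustration, and degenerate zero-area boxes are not handled.

```python
import numpy as np


def diou_loss(pred, target):
    """Per-box DIoU loss, 1 - IoU + d^2/c^2, for (xmin, ymin, xmax, ymax) boxes."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # IoU: intersection width/height clipped at 0 for disjoint boxes
    wh = np.maximum(
        np.minimum(pred[..., 2:], target[..., 2:]) - np.maximum(pred[..., :2], target[..., :2]),
        0.0,
    )
    inter = wh[..., 0] * wh[..., 1]
    area_p = np.prod(pred[..., 2:] - pred[..., :2], axis=-1)
    area_t = np.prod(target[..., 2:] - target[..., :2], axis=-1)
    iou = inter / (area_p + area_t - inter)

    # c^2: squared diagonal of the smallest enclosing box
    outer = np.maximum(pred[..., 2:], target[..., 2:]) - np.minimum(pred[..., :2], target[..., :2])
    c2 = np.sum(outer ** 2, axis=-1)

    # d^2: squared distance between centers (sum of corners is twice the center)
    center_diff = (pred[..., :2] + pred[..., 2:]) - (target[..., :2] + target[..., 2:])
    d2 = np.sum(center_diff ** 2, axis=-1) / 4.0

    return 1.0 - iou + d2 / c2


target = [0.0, 0.0, 2.0, 2.0]
preds = [
    [0.0, 0.0, 2.0, 2.0],   # identical to the target
    [1.0, 1.0, 3.0, 3.0],   # partial overlap
    [6.0, 6.0, 8.0, 8.0],   # no overlap
]
print(diou_loss(preds, target))
```

An identical box gives a loss of 0, and the loss grows as the overlap shrinks and the centers move apart; for the disjoint box here it is 1.5625, because the IoU term is 0 and d^2/c^2 is 72/128.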

  1. IoU: The IoU error curve shows that the closer an anchor is to the edge, the larger its error, and anchors that do not overlap the target box can hardly be regressed at all.
  2. GIoU: The GIoU error curve shows that GIoU handles some non-overlapping anchors better than IoU. However, because GIoU still relies heavily on IoU, the error stays large in the horizontal and vertical directions and convergence there is difficult. This is why GIoU is unstable.
  3. DIoU: The DIoU error curve shows that DIoU achieves better regression for anchors at different distances, in different directions, and with different areas and aspect ratios.

Origin blog.csdn.net/qq_41375318/article/details/114627857