Different IoU Losses for Faster and Accurate Object Detection
Learn about the Generalized IoU, Distance IoU, and Complete IoU losses used in state-of-the-art object detection algorithms
Object detection includes two sub-tasks: object classification and object localization.

Object Localization relies on a bounding box regression (BBR) module to localize objects.
Bounding Box Regression
Bounding box regression is a popular technique in object detection algorithms, used to predict a target object's location with a rectangular bounding box. It aims to refine the location of a predicted bounding box.
Bounding box regression commonly uses losses based on the overlap between the predicted bounding box and the ground truth bounding box, referred to as Intersection over Union (IoU) based losses.
Intersection over Union
IoU loss only works when the predicted bounding box overlaps with the ground truth box. For non-overlapping cases, IoU loss provides no gradient to move the predicted box.
IoU loss also converges slowly.

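IoU is the area of intersection of the predicted box B and the ground truth box B_gt divided by the area of their union, and the IoU loss is one minus that ratio:
IoU = |B ∩ B_gt| / |B ∪ B_gt|
L_IoU = 1 - IoU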

IoU loss fails when the predicted and ground truth boxes do not overlap.
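The failure case can be checked with a few lines of code. The following is a minimal sketch, assuming axis-aligned boxes stored as (x1, y1, x2, y2) tuples; the function names iou and iou_loss are illustrative and not taken from any library.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) (assumed format)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union


def iou_loss(pred, target):
    """IoU loss, defined as 1 - IoU."""
    return 1.0 - iou(pred, target)


target = (0.0, 0.0, 2.0, 2.0)
print(iou_loss((1.0, 1.0, 3.0, 3.0), target))  # partial overlap
print(iou_loss((3.0, 3.0, 5.0, 5.0), target))  # disjoint, close to the target
print(iou_loss((5.0, 5.0, 7.0, 7.0), target))  # disjoint, far from the target
```

Both disjoint boxes receive a loss of exactly 1, so the loss carries no information about which one is closer to the target.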
Generalized IoU (GIoU) Loss
GIoU loss maximizes the overlap area of the ground truth and predicted bounding boxes. For non-overlapping cases, it enlarges the predicted box so that it gradually moves towards the target box and overlaps it.

In the formula for GIoU loss, L_GIoU = 1 - IoU + |C \ (B ∪ B_gt)| / |C|, B and B_gt are the predicted and ground truth bounding boxes, and C is the smallest box covering both. The last term acts as a penalty that moves the predicted box closer to the ground truth box.

As shown in the figure above, GIoU loss initially increases the predicted bounding box's size and then slowly moves it towards the ground truth. It takes several iterations for the predicted box to overlap the ground truth box, especially when the two boxes lie in horizontal or vertical orientations relative to each other.
GIoU loss achieves better precision than MSE loss and IoU loss.
GIoU loss solves the vanishing gradient problem for non-overlapping cases but suffers from slow convergence and inaccurate regression, especially for boxes with extreme aspect ratios.
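To make the penalty term concrete, here is a minimal sketch of GIoU loss, assuming axis-aligned boxes stored as (x1, y1, x2, y2) tuples. The function name giou_loss is illustrative and not a library API.

```python
def giou_loss(pred, target):
    """GIoU loss for boxes given as (x1, y1, x2, y2) (assumed format)."""
    # Intersection area, clamped at zero for disjoint boxes.
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    iou = inter / union
    # Smallest enclosing box C that covers both boxes.
    cx1, cy1 = min(pred[0], target[0]), min(pred[1], target[1])
    cx2, cy2 = max(pred[2], target[2]), max(pred[3], target[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)
    # Penalty: fraction of C not covered by the union of the two boxes.
    penalty = (area_c - union) / area_c
    return 1.0 - iou + penalty


target = (0.0, 0.0, 2.0, 2.0)
near = (3.0, 0.0, 5.0, 2.0)  # disjoint, one unit to the right of the target
far = (6.0, 0.0, 8.0, 2.0)   # disjoint, four units to the right of the target
print(giou_loss(near, target), giou_loss(far, target))
```

Unlike plain IoU loss, the two disjoint boxes now get different losses, and the farther box is penalized more, which is what gives the optimizer a direction to move in.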
Distance IoU (DIoU) Loss
DIoU loss adds a penalty term to the IoU loss: the normalized distance between the center points of the predicted and ground truth boxes. This distance term helps with faster convergence and more accurate regression.

In the formula for DIoU loss, L_DIoU = 1 - IoU + d² / C², d represents the Euclidean distance between the center points of the predicted and ground truth boxes, and C is the diagonal length of the smallest enclosing box covering the two boxes.

DIoU loss is invariant to the scale of the regression problem and, like GIoU loss, provides a moving direction for the predicted bounding box in non-overlapping cases.
Unlike GIoU loss, DIoU loss directly minimizes the distance between the predicted and ground truth bounding boxes, so it converges much faster than GIoU loss, even when the boxes lie in horizontal or vertical orientations.
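The distance penalty can be sketched in a few lines. This is a minimal illustration, assuming axis-aligned boxes stored as (x1, y1, x2, y2) tuples; the function name diou_loss is hypothetical and not a library API.

```python
def diou_loss(pred, target):
    """DIoU loss for boxes given as (x1, y1, x2, y2) (assumed format)."""
    # Intersection over union.
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    iou = inter / (area_p + area_t - inter)
    # Squared Euclidean distance between the two box centers (d squared).
    d2 = ((pred[0] + pred[2] - target[0] - target[2]) / 2.0) ** 2 \
       + ((pred[1] + pred[3] - target[1] - target[3]) / 2.0) ** 2
    # Squared diagonal of the smallest enclosing box (C squared).
    c2 = (max(pred[2], target[2]) - min(pred[0], target[0])) ** 2 \
       + (max(pred[3], target[3]) - min(pred[1], target[1])) ** 2
    return 1.0 - iou + d2 / c2


target = (0.0, 0.0, 4.0, 4.0)
corner = (0.0, 0.0, 2.0, 2.0)    # inside the target, off-center
centered = (1.0, 1.0, 3.0, 3.0)  # inside the target, same center
print(diou_loss(corner, target), diou_loss(centered, target))
```

Both predictions have the same IoU with the target (0.25), yet the off-center one gets a larger DIoU loss, so the loss keeps pulling the centers together even when the overlap term alone cannot tell the two apart.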

When DIoU is employed as the criterion for non-maximum suppression (NMS), it gives more robust results in cases with occlusion.
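A rough sketch of how DIoU can replace IoU in greedy NMS follows. It assumes boxes as (x1, y1, x2, y2) tuples and the rule that a lower-scoring box is suppressed when its IoU with the selected box, minus the normalized center distance, reaches a threshold. The names diou and diou_nms and the default threshold of 0.5 are illustrative choices, not a library API.

```python
def diou(a, b):
    """IoU minus the normalized center distance of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Squared center distance and squared diagonal of the enclosing box.
    d2 = ((a[0] + a[2] - b[0] - b[2]) / 2.0) ** 2 + ((a[1] + a[3] - b[1] - b[3]) / 2.0) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / union - d2 / c2


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box only if its DIoU with a kept box >= threshold."""
    # Process candidates from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Retain only candidates that are sufficiently distinct from the kept box.
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # the near-duplicate second box is suppressed
```

Because the center distance is subtracted from the IoU, two boxes that overlap heavily but have clearly separated centers are less likely to be merged, which is the behavior that helps with occluded objects.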
Complete IoU (CIoU) Loss
CIoU loss for bounding box regression uses three geometric factors:
- Overlap area between the predicted box and the ground truth bounding box (IoU loss)
- Central point distance between the predicted box and the ground truth bounding box (DIoU loss)
- Aspect ratio of the predicted box and the ground truth box
As CIoU loss uses complete geometric factors, it converges faster than GIoU loss. It improves average precision (AP) and average recall (AR) for object detection and segmentation.
CIoU loss aggregates the overlap area, distance, and aspect ratio terms, hence the name Complete IoU loss: L_CIoU = S + D + V.

S is the overlap area term, given by S = 1 - IoU.
D is the normalized distance between the center points of the predicted and ground truth boxes, as in DIoU loss.
V measures the consistency of the aspect ratio.
S, V, and D are all invariant to the regression scale and are normalized to values between 0 and 1.
CIoU loss, like GIoU loss and DIoU loss, moves the predicted bounding box towards the ground truth bounding box for non-overlapping cases.
CIoU loss needs fewer iterations to converge than GIoU loss. It also speeds up regression for boxes with extreme aspect ratios.
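The three terms can be combined as in the sketch below. It assumes boxes as (x1, y1, x2, y2) tuples and the aspect ratio term as defined in the CIoU paper listed in the references: v = (4 / π²) (arctan(w_gt / h_gt) - arctan(w / h))², weighted by α = v / ((1 - IoU) + v). The function name ciou_loss and the guard for identical boxes are additions made for this illustration.

```python
import math


def ciou_loss(pred, target):
    """CIoU loss for boxes given as (x1, y1, x2, y2) (assumed format)."""
    # Overlap term S = 1 - IoU.
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = target[2] - target[0], target[3] - target[1]
    iou = inter / (w * h + wg * hg - inter)
    # Distance term D: squared center distance over squared enclosing diagonal.
    d2 = ((pred[0] + pred[2] - target[0] - target[2]) / 2.0) ** 2 \
       + ((pred[1] + pred[3] - target[1] - target[3]) / 2.0) ** 2
    c2 = (max(pred[2], target[2]) - min(pred[0], target[0])) ** 2 \
       + (max(pred[3], target[3]) - min(pred[1], target[1])) ** 2
    # Aspect ratio term V = alpha * v.
    v = (4.0 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    denom = (1.0 - iou) + v
    # Guard added for this sketch: identical boxes would give 0 / 0.
    alpha = v / denom if denom > 0.0 else 0.0
    return (1.0 - iou) + d2 / c2 + alpha * v


target = (0.0, 0.0, 4.0, 4.0)
narrow = (1.0, 0.0, 3.0, 4.0)  # same center as the target, half its width
print(ciou_loss(narrow, target))
```

For the narrow box the centers coincide, so the distance term is zero and the IoU is 0.5; the small remainder above 0.5 comes entirely from the aspect ratio term.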

CIoU loss has been applied in YOLO v3, YOLO v4, SSD, and Faster R-CNN.
The figure below shows the regression error sum curves of the different loss functions over the training iterations.

IoU loss only works when the predicted bounding box overlaps with the target box.
GIoU loss helps with non-overlapping cases by enlarging the predicted box so that it slowly moves towards the ground truth and overlaps it. GIoU loss converges very slowly, requiring a large number of iterations and a proper learning rate.
GIoU still has large errors for cases with extreme aspect ratios.
CIoU loss uses complete geometric measures for bounding box regression, which leads to faster convergence and better performance than IoU and GIoU losses.
DIoU loss also converges in fewer iterations and performs better than IoU and GIoU losses.
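The difference in behavior for non-overlapping boxes can be reproduced with a toy experiment. The sketch below is not the simulation behind the figure; it runs plain gradient descent with finite-difference gradients on a single box, and the helper names, step count, learning rate, and box coordinates are all arbitrary choices made for illustration.

```python
def box_terms(p, t):
    """Return (IoU, squared center distance, squared enclosing diagonal)."""
    iw = max(0.0, min(p[2], t[2]) - max(p[0], t[0]))
    ih = max(0.0, min(p[3], t[3]) - max(p[1], t[1]))
    inter = iw * ih
    union = (p[2] - p[0]) * (p[3] - p[1]) + (t[2] - t[0]) * (t[3] - t[1]) - inter
    d2 = ((p[0] + p[2] - t[0] - t[2]) / 2.0) ** 2 + ((p[1] + p[3] - t[1] - t[3]) / 2.0) ** 2
    c2 = (max(p[2], t[2]) - min(p[0], t[0])) ** 2 + (max(p[3], t[3]) - min(p[1], t[1])) ** 2
    return inter / union, d2, c2


def iou_loss(p, t):
    return 1.0 - box_terms(p, t)[0]


def diou_loss(p, t):
    iou, d2, c2 = box_terms(p, t)
    return 1.0 - iou + d2 / c2


def descend(loss_fn, pred, target, steps=200, lr=0.1, eps=1e-4):
    """Gradient descent on the four box coordinates using central differences."""
    pred = list(pred)
    for _ in range(steps):
        grad = []
        for k in range(4):
            hi = pred.copy()
            hi[k] += eps
            lo = pred.copy()
            lo[k] -= eps
            grad.append((loss_fn(hi, target) - loss_fn(lo, target)) / (2.0 * eps))
        pred = [x - lr * g for x, g in zip(pred, grad)]
    return pred


target = (0.0, 0.0, 2.0, 2.0)
start = (5.0, 5.0, 7.0, 7.0)  # does not overlap the target
print(descend(iou_loss, start, target))   # stays where it started
print(descend(diou_loss, start, target))  # coordinates have changed
```

With IoU loss the gradient is zero everywhere around the disjoint box, so it never moves, while DIoU loss lowers its value step by step from the same starting point.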
Summary:
The different IoU losses differ in convergence speed and localization accuracy.
- Generalized IoU (GIoU) loss increases the size of the predicted box so that it slowly moves towards the target box and overlaps it. It suffers from slow convergence and gives inaccurate regression for extreme aspect ratios.
- Distance IoU (DIoU) loss uses the normalized distance between the centers of the predicted box and the ground truth, and converges much faster in training than IoU and GIoU losses.
- Complete IoU (CIoU) loss is an aggregation of the overlap area (IoU), distance (DIoU), and aspect ratio terms. It converges in fewer iterations than IoU loss and GIoU loss.
References:
YOLOv4: Optimal Speed and Accuracy of Object Detection
Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression
https://github.com/Zzh-tju/CIoU
Focal and Efficient IOU Loss for Accurate Bounding Box Regression






