Juneta Tao

Summary

The webpage discusses advancements in object detection that focus on accurately localizing rotated or oriented objects in images, with methods that include rotated bounding boxes, transformed feature representations, Gaussian-based loss functions, and improved positive sample selection strategies.

Abstract

The content of the webpage delves into the limitations of traditional upright bounding boxes for object detection, which often include unnecessary background information. It introduces the concept of rotated bounding boxes as a solution, which involves regressing an additional orientation angle (theta) to more precisely localize objects. The page highlights several approaches to enhance the detection of rotated objects, such as modifying existing detection frameworks like RetinaNet and FCOS to include theta, and proposing new feature representations that are better suited for rotated objects. Notably, it discusses the use of a 2-D Gaussian distribution to represent rotated bounding boxes and the application of Kullback-Leibler Divergence (KLD) as a regression loss to improve detection performance. Additionally, the page covers the importance of positive sample selection in detection accuracy, presenting methods like Adaptive Training Sample Selection (ATSS) and Shape-adaptive Selection and Measurement (SASM) for more effective sample selection and quality estimation. The content also touches on the Gliding Vertex method for estimating vertex distances on bounding box boundaries, applicable to both rigid objects and more complex datasets like pedestrian detection.

Opinions

  • The traditional upright bounding boxes are deemed inefficient for object detection due to their inclusion of excessive background, which necessitates the development of rotated bounding box techniques.
  • Feature representation in existing detection frameworks is considered inadequate for rotated objects, prompting the need for methods that transform feature representation to align with the orientation of the objects.
  • The use of Gaussian-based loss functions, particularly KLD, is presented as a significant improvement over traditional overlapping loss calculations, with demonstrated effectiveness across multiple datasets and frameworks.
  • Positive sample selection is recognized as a critical factor in the performance disparity between anchor-based and anchor-free methods, with ATSS and SASM being proposed as solutions to enhance sample selection and quality estimation.
  • The Gliding Vertex method is suggested as a versatile approach for estimating vertex distances, showing promise beyond rigid objects and in diverse datasets.

Rotated/Oriented Object Detection

Upright bounding boxes often include a large portion of background. Rotated bounding boxes can localise target objects more accurately. An instance segmentation framework would be more precise still, but pixel-wise prediction (segmentation) frameworks are often much slower. Rotated (oriented) bounding boxes are therefore used to locate the objects.

Apart from (x, y, w, h), an orientation theta is also regressed to rotate the upright bounding box. There are different ways to define the orientation, e.g. OpenCV (oc), Long Edge 90 (le90) and Long Edge 135 (le135).

from [1]
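To make the (x, y, w, h, theta) parameterisation concrete, here is a minimal NumPy sketch that turns a rotated box into its four corners and normalises it to a long-edge convention. The function names are hypothetical, and the le90 range used here (w >= h, theta in [-pi/2, pi/2)) is an assumption about the convention rather than code taken from [1].

```python
import numpy as np

def rbox_to_corners(cx, cy, w, h, theta):
    """Return the four corners (4x2 array) of a box rotated by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    # Corners of the upright box centred at the origin.
    half = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    # Rotate each corner, then shift to the box centre.
    return half @ rot.T + np.array([cx, cy])

def to_le90(w, h, theta):
    """Assumed long-edge convention: w is the long edge, theta in [-pi/2, pi/2)."""
    if w < h:
        # Swapping the edges is equivalent to rotating by 90 degrees.
        w, h = h, w
        theta = theta + np.pi / 2
    # Wrap the angle into [-pi/2, pi/2).
    theta = (theta + np.pi / 2) % np.pi - np.pi / 2
    return w, h, theta
```

The same physical box can be written with several (w, h, theta) triples, which is why a detector has to commit to one angle definition before regressing theta.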

Rotate Feature

With an existing upright bounding box detection framework, e.g. RetinaNet, FCOS or RepPoints, it is possible to detect rotated bboxes by adding theta. However, the feature representation for classification and regression is not well suited to rotated objects.

[2,3,4] propose to rotate/transform the feature representation to improve classification and localisation performance. RoI Transformer [2] proposes to learn rotation parameters for upright RoIs and to apply spatial transformations to the RoI features before feeding them to the classification and regression heads.

from [2,3]

Oriented R-CNN [5] proposes to directly find rotated anchors with learned parameters.

from [5]

Gaussian Based Loss

Instead of using the rotated bbox to calculate an overlap loss, [6,7,8] propose to convert the rotated bounding box into a 2-D Gaussian distribution. [6] calculates the Kullback-Leibler Divergence (KLD) between the Gaussian distributions as the regression loss, and reports impressive improvements on multiple datasets and frameworks.

KLD vs GWD vs L1 from [6]
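The conversion and the divergence can be sketched in a few lines of NumPy. This assumes the commonly used mapping where the mean is the box centre and the covariance is R diag(w²/4, h²/4) Rᵀ; the function names are hypothetical, and any further nonlinear mapping that turns the divergence into the final training loss is left out.

```python
import numpy as np

def rbox_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to (mean, covariance) of a 2-D Gaussian."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    # Half-extents squared along the box axes, rotated into image axes.
    sigma = rot @ np.diag([w ** 2 / 4.0, h ** 2 / 4.0]) @ rot.T
    return np.array([cx, cy], dtype=float), sigma

def kld(mu_p, sigma_p, mu_t, sigma_t):
    """Closed-form KL(N_p || N_t) between two 2-D Gaussians."""
    inv_t = np.linalg.inv(sigma_t)
    d = mu_p - mu_t
    maha = d @ inv_t @ d                      # centre offset term
    trace = np.trace(inv_t @ sigma_p)         # shape mismatch term
    logdet = np.log(np.linalg.det(sigma_t) / np.linalg.det(sigma_p))
    return 0.5 * (maha + trace + logdet) - 1.0  # the -1 is -k/2 with k = 2
```

One useful property of this representation is that (w, h, theta) and (h, w, theta + pi/2) map to the same Gaussian, so the loss does not depend on which angle definition is used to describe the box.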

KFIoU [8] proposes to use a Kalman filter to obtain the overlapping area between the prediction and the ground truth; it was the state of the art at the time of writing.

from [8]
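A rough sketch of the idea, under stated assumptions: both boxes are represented by Gaussian covariances, the two centres are treated as already aligned (the centre offset is ignored here), and the overlap is approximated by the Kalman-filter product of the two Gaussians. The function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def box_covariance(w, h, theta):
    """Covariance of the Gaussian assumed to represent a rotated box."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return rot @ np.diag([w ** 2 / 4.0, h ** 2 / 4.0]) @ rot.T

def gaussian_area(sigma):
    """Area of the box whose covariance is sigma (equals w * h)."""
    return 4.0 * np.sqrt(np.linalg.det(sigma))

def kfiou(sigma_p, sigma_t):
    """Overlap ratio based on the Kalman-filter product of two Gaussians."""
    gain = sigma_p @ np.linalg.inv(sigma_p + sigma_t)  # Kalman gain
    sigma_o = sigma_p - gain @ sigma_p                 # covariance of the overlap
    v_p, v_t, v_o = map(gaussian_area, (sigma_p, sigma_t, sigma_o))
    return v_o / (v_p + v_t - v_o)

# Two identical boxes give the maximum value of this ratio, which is 1/3.
same = box_covariance(8, 4, 0.3)
print(kfiou(same, same))
```

Because the ratio tops out at 1/3 rather than 1 for identical boxes, it behaves like a rescaled overlap measure rather than an exact IoU.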

Positive Sample Selection

The selection of positive samples leads to the performance gap between anchor-based and anchor-free methods. ATSS [9] proposes Adaptive Training Sample Selection (ATSS) to automatically select positive and negative samples according to the statistical characteristics of the object.

from [9]
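The selection rule can be illustrated with a simplified sketch for upright boxes in (x1, y1, x2, y2) format. It assumes a single feature level and a single ground-truth box, whereas the method in [9] gathers candidates from every pyramid level; the function names and the default k are illustrative only.

```python
import numpy as np

def iou_xyxy(boxes, gt):
    """IoU between each row of boxes (N x 4) and one ground-truth box."""
    lt = np.maximum(boxes[:, :2], gt[:2])
    rb = np.minimum(boxes[:, 2:], gt[2:])
    inter = np.clip(rb - lt, 0, None).prod(axis=1)
    area = (boxes[:, 2:] - boxes[:, :2]).prod(axis=1)
    gt_area = (gt[2:] - gt[:2]).prod()
    return inter / (area + gt_area - inter)

def atss_select(anchors, gt, k=9):
    """Return indices of anchors chosen as positives for one ground truth."""
    centers = (anchors[:, :2] + anchors[:, 2:]) / 2.0
    gt_center = (gt[:2] + gt[2:]) / 2.0
    # 1. Candidates: the k anchors whose centres are closest to the gt centre.
    dist = np.linalg.norm(centers - gt_center, axis=1)
    cand = np.argsort(dist)[:k]
    # 2. Adaptive threshold: mean + std of the candidates' IoUs.
    ious = iou_xyxy(anchors[cand], gt)
    thr = ious.mean() + ious.std()
    # 3. Keep candidates above the threshold whose centre lies inside the gt.
    inside = ((centers[cand] > gt[:2]) & (centers[cand] < gt[2:])).all(axis=1)
    return cand[(ious >= thr) & inside]
```

The threshold adapts to each object: an object with many well-matched candidates gets a high threshold, while a poorly covered object gets a lower one, so no fixed IoU cut-off has to be tuned by hand.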

SASM [10] proposes flexible shape-adaptive selection (SA-S) and shape-adaptive measurement (SA-M) strategies for oriented object detection: SA-S for sample selection and SA-M for the quality estimation of positive samples.

from [10]

Others

Gliding Vertex [11] proposes to estimate the gliding distance of each vertex along the four boundaries of the upright bbox. Apart from rigid objects, e.g. ships, text and pedestrian datasets are also used for testing.
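A minimal decoding sketch of this representation follows. It assumes each vertex slides clockwise from one corner of the upright box along one side (top, right, bottom, left) by a ratio in [0, 1] of that side's length; the function name and the exact vertex ordering are assumptions for illustration, not taken from [11].

```python
import numpy as np

def decode_gliding_vertex(hbox, alphas):
    """Turn an upright box plus four gliding ratios into a quadrilateral.

    hbox   : (x_min, y_min, x_max, y_max) of the upright box
    alphas : (a1, a2, a3, a4), one gliding ratio per side
    """
    x_min, y_min, x_max, y_max = hbox
    a1, a2, a3, a4 = alphas
    w, h = x_max - x_min, y_max - y_min
    return np.array([
        [x_min + a1 * w, y_min],  # vertex on the top side
        [x_max, y_min + a2 * h],  # vertex on the right side
        [x_max - a3 * w, y_max],  # vertex on the bottom side
        [x_min, y_max - a4 * h],  # vertex on the left side
    ])

# Ratios of 0 reproduce the upright box; ratios of 0.5 give a diamond.
quad = decode_gliding_vertex((0, 0, 10, 4), (0.5, 0.5, 0.5, 0.5))
```

Regressing ratios along the box sides avoids predicting an angle directly, which sidesteps the ambiguity between equivalent (w, h, theta) descriptions of the same box.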

  1. https://github.com/open-mmlab/mmrotate
  2. Ding, Jian, et al. “Learning roi transformer for oriented object detection in aerial images.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
  3. Han, Jiaming, et al. “Align deep features for oriented object detection.” IEEE Transactions on Geoscience and Remote Sensing 60 (2021): 1–11.
  4. Han, Jiaming, et al. “Redet: A rotation-equivariant detector for aerial object detection.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
  5. Xie, Xingxing, et al. “Oriented R-CNN for object detection.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.
  6. Yang, Xue, et al. “Learning high-precision bounding box for rotated object detection via kullback-leibler divergence.” Advances in Neural Information Processing Systems 34 (2021): 18381–18394.
  7. Yang, Xue, et al. “Rethinking rotated object detection with gaussian wasserstein distance loss.” International Conference on Machine Learning. PMLR, 2021.
  8. Yang, Xue, et al. “The KFIoU loss for rotated object detection.” arXiv preprint, 2022.
  9. Zhang, Shifeng, et al. “Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
  10. Hou, Liping, et al. “Shape-adaptive selection and measurement for oriented object detection.” Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 1. 2022.
  11. Xu, Yongchao, et al. “Gliding vertex on the horizontal bounding box for multi-oriented object detection.” IEEE transactions on pattern analysis and machine intelligence 43.4 (2020): 1452–1459.