Machine Learning Notes - SAHI: A Brief Reading of Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection

1. Brief introduction

        Detecting small and distant objects in a scene is a major challenge in object detection. Small objects occupy only a few pixels and therefore lack sufficient detail, which makes them difficult for conventional detectors to find. The authors propose an open-source framework called Slicing Aided Hyper Inference (SAHI), which provides a generic slicing-aided inference and fine-tuning pipeline for small object detection. The technique is generic in the sense that it can be applied on top of any available object detector without any fine-tuning.

        The proposed technique has been integrated with Detectron2, MMDetection and YOLOv5 models.

2. Difficulty in detecting small objects

        Many object detection algorithms are in current use, such as Faster R-CNN, YOLO, SSD, RetinaNet and EfficientDet. Typically, these models are trained on the COCO dataset, a large dataset with many object categories and annotations, which makes it popular for training object detectors. However, these models turn out not to be particularly good at detecting small objects.

        The main reasons are roughly as follows:

        Limited receptive field: The receptive field is the spatial extent of the input image that affects the output of a specific neuron or filter in a convolutional neural network (CNN). In typical object detectors the receptive field may be limited, so the network may not capture enough contextual information around smaller objects. As a result, detectors may struggle to detect and localize these objects accurately.

        Low feature detail: Object detectors typically rely on learned features in CNN architectures to recognize objects. However, the inherent limitations of feature representations may hinder the detection of smaller objects, as the learned features may not adequately capture subtle and complex details.
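
        The two points above can be illustrated with some simple arithmetic. The sketch below uses the standard recurrence for the theoretical receptive field of a stack of convolution layers and then estimates how many feature-map cells a small object spans at the resulting stride. The five-layer stack of 3x3, stride-2 convolutions is a made-up example, not the backbone of any detector named above, and padding and dilation are ignored.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Theoretical receptive field of stacked conv layers.

    layers: (kernel_size, stride) pairs, ordered from input to output.
    Returns (receptive_field, cumulative_stride), both in input pixels.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride             # distance in input pixels between adjacent outputs
    return rf, jump


def feature_cells(object_px: float, cumulative_stride: int) -> float:
    """Approximate number of feature-map cells an object spans along one axis."""
    return object_px / cumulative_stride


if __name__ == "__main__":
    # Hypothetical backbone: five 3x3 convolutions, each with stride 2.
    rf, stride = receptive_field([(3, 2)] * 5)
    print("receptive field:", rf, "px, cumulative stride:", stride)
    print("cells spanned by a 16 px object:", feature_cells(16, stride))
```

        Under these assumptions, a 16-pixel object spans only half a cell on the final feature map, which gives a rough sense of how little detail remains for the detector to work with.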

        Excessive variation in object scale: Small objects exhibit

Origin blog.csdn.net/bashendixie5/article/details/131100123