DeepFusionMOT: real-time 3D tracking based on camera and LiDAR fusion


A 3D multi-object tracking framework based on camera-LiDAR fusion with a deep association mechanism

Paper: https://arxiv.org/abs/2202.12100

Code: https://github.com/wangxiyang2022/DeepFusionMOT

2D detector: RRC

3D detector: PointRCNN

Abstract: On the one hand, many 3D multi-object tracking (MOT) methods focus on tracking accuracy while neglecting computational speed, typically by designing rather complex cost functions and feature extractors. On the other hand, some methods emphasize computational speed at the expense of tracking accuracy.

To address these issues, this paper proposes a robust and fast camera-LiDAR fusion-based MOT method that strikes a good balance between accuracy and speed. Based on the characteristics of the camera and LiDAR sensors, an effective deep association mechanism is designed and embedded. This mechanism tracks a target in 2D while it is far away and detected only by the camera; once the target enters the LiDAR field of view, the acquired 3D information is used to update the 2D trajectory, achieving a smooth fusion of 2D and 3D trajectories. Extensive experiments on typical datasets demonstrate that the proposed method outperforms state-of-the-art MOT methods in both tracking accuracy and processing speed.
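The 2D-to-3D hand-over described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the `Track` class and all field names are assumptions made for clarity. A track starts as 2D-only while the object is seen only by the camera, and is upgraded to 3D once a LiDAR detection is associated with it.

```python
# Hedged sketch of the deep association idea: a track lives in 2D until a
# LiDAR (3D) detection is matched to it, then its state is kept in 3D.
# All names here are illustrative, not from the DeepFusionMOT codebase.

class Track:
    def __init__(self, box2d):
        self.box2d = box2d      # [x1, y1, x2, y2] image-plane box
        self.box3d = None       # 3D state, unknown until LiDAR sees it
        self.has_3d = False

    def update(self, box2d=None, box3d=None):
        """Update with whatever modality detected the object this frame."""
        if box2d is not None:
            self.box2d = box2d
        if box3d is not None:
            self.box3d = box3d
            self.has_3d = True  # smooth hand-over from 2D to 3D tracking


def fuse_update(track, det2d, det3d):
    """Apply a matched 2D detection and, if available, a 3D detection."""
    track.update(box2d=det2d, box3d=det3d)
    return track.has_3d


# Usage: a distant car tracked in 2D, then entering the LiDAR field of view.
t = Track(box2d=[100, 50, 140, 80])
fuse_update(t, det2d=[102, 51, 143, 82], det3d=None)   # still camera-only
fuse_update(t, det2d=[110, 55, 160, 95],
            det3d=[5.0, 1.2, 20.0, 1.8, 1.6, 4.2, 0.1])  # LiDAR picks it up
print(t.has_3d)  # → True
```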

Most MOT methods follow the tracking-by-detection paradigm, which mainly consists of two steps: 1) object detection, and 2) data association.
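The data association step can be sketched as a bipartite matching between existing tracks and new detections. The example below is a generic illustration of that step, not the paper's cost function: it uses 2D IoU as the affinity and SciPy's Hungarian solver, both of which are assumptions for the sketch.

```python
# Minimal sketch of step 2 (data association): match detections to tracks
# with the Hungarian algorithm on a 1 - IoU cost. Purely illustrative.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two axis-aligned boxes [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, iou_thresh=0.3):
    """Return matched (track_idx, det_idx) pairs above the IoU threshold."""
    if not tracks or not detections:
        return []
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols)
            if cost[r, c] <= 1.0 - iou_thresh]

# Usage: two tracks, two detections from the current frame.
tracks = [[0, 0, 10, 10], [50, 50, 60, 60]]
dets = [[51, 51, 61, 61], [1, 1, 11, 11]]
print(associate(tracks, dets))  # → [(0, 1), (1, 0)]
```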

Project the trajectory in the current frame into the next frame, then find matches directly within the mapped region and compute the corresponding cost function only there, thereby reducing the search area and the computational cost.
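The search-region trick above can be sketched as a gating step. This is an assumption-laden illustration: it predicts each track's box into the next frame with a simple constant-velocity shift (not the paper's exact motion model) and scores only the detections whose centers fall inside the enlarged predicted box.

```python
# Illustrative sketch: predict a track's box into the next frame and keep
# only nearby detections, shrinking the search area and cost computation.
# The constant-velocity model and margin value are assumptions.

def predict_box(box, velocity):
    """Shift box [x1, y1, x2, y2] by a per-frame velocity (dx, dy)."""
    dx, dy = velocity
    return [box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy]

def in_search_region(det, pred_box, margin=10.0):
    """Keep a detection only if its center lies in the expanded region."""
    cx = (det[0] + det[2]) / 2.0
    cy = (det[1] + det[3]) / 2.0
    return (pred_box[0] - margin <= cx <= pred_box[2] + margin and
            pred_box[1] - margin <= cy <= pred_box[3] + margin)

def gated_candidates(track_box, velocity, detections):
    """Return only the detections worth scoring for this track."""
    pred = predict_box(track_box, velocity)
    return [d for d in detections if in_search_region(d, pred)]

# Usage: the far-away detection is skipped before any cost is computed.
dets = [[12, 12, 22, 22], [300, 300, 320, 320]]
print(gated_candidates([0, 0, 10, 10], (2, 2), dets))  # → [[12, 12, 22, 22]]
```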

The common problems with existing methods are as follows: 1) Existing camera-based MOT methods usually lack the depth information required for 3D tracking. Some methods use stereo cameras to obtain distance information and thereby achieve 3D tracking, but the computation is heavy and the depth accuracy is inferior to that of LiDAR sensors. LiDAR-based tracking methods, on the other hand, cannot accurately track distant targets because of the lack of pixel information. Most existing camera-LiDAR fusion trackers are designed with complex feature extractors, so they usually need to run on a GPU and are hard to deploy in real-time applications. 2) Most methods do not make full use of both the visual data and the point-cloud data during camera-LiDAR fusion.

Visualization results:



Origin: blog.csdn.net/weixin_64043217/article/details/128808000