Motivation: Why did the authors want to address this problem?
- Recovering the 3D poses of multiple people from a single image remains a challenging problem
Contribution: What did the authors accomplish in this paper (innovative points)?
- Solves the 3D multi-person pose estimation (3D-MPPE) problem using a top-down structure
- Proposes a general framework with two parts:
3D localization of persons: estimates the root depth and the 2D coordinates of the root.
[1] suggests that a person's root depth can be estimated by adjusting the projected area with a correction factor. This paper proposes a more effective learning-based method. The projected area of a person is affected by several factors rather than a single one, including the person's depth, height, pose, and even mutual occlusion, so the correction factor from [1] can be decomposed into multiple factors to estimate the root depth more accurately. The paper therefore designs a 3D localization network that predicts each of these decomposed factors individually. Because a person's depth is inversely related to the projected area, once the factors are obtained the depth can be computed from the detected bounding box.
Relative 3D human pose estimation:
A multi-scale feature fusion module is proposed, and an attention mechanism [2] is introduced for relative 3D human pose estimation. This design lets the network integrate multi-scale information during upsampling while enhancing useful information and suppressing irrelevant information.
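To make the fusion idea concrete, here is a minimal NumPy sketch, not the paper's module. It assumes the coarse map is upsampled 2x and added to the fine map, and it swaps in a simplified channel gate (global average pooling, a linear layer, a sigmoid) in place of the coordinate attention of [2]. The function names `upsample2x`, `channel_gate`, and `fuse`, and the parameters `w` and `b`, are hypothetical.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def channel_gate(x, w, b):
    """Simplified channel attention (an assumption, not the attention of [2]).

    Global average pool -> linear layer -> sigmoid, giving one weight in
    (0, 1) per channel. w has shape (C, C) and b has shape (C,); both stand
    in for learned parameters.
    """
    pooled = x.mean(axis=(1, 2))                      # (C,)
    gate = 1.0 / (1.0 + np.exp(-(w @ pooled + b)))    # (C,)
    return x * gate[:, None, None]                    # rescale each channel

def fuse(coarse, fine, w, b):
    """Fuse a coarse map into a fine map during upsampling, then gate channels."""
    fused = upsample2x(coarse) + fine
    return channel_gate(fused, w, b)

rng = np.random.default_rng(0)
coarse = rng.normal(size=(8, 4, 4))   # low-resolution features
fine = rng.normal(size=(8, 8, 8))     # high-resolution features
w = rng.normal(size=(8, 8)) * 0.1     # hypothetical gate weights
b = np.zeros(8)

out = fuse(coarse, fine, w, b)
print(out.shape)
```

The gate can only shrink a channel's magnitude, which is one simple way to read "enhancing useful information and suppressing irrelevant information": channels with larger gate values are kept relatively stronger than the rest.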
Own opinion
- The method does not lift a 2D pose to a 3D pose. Instead, it generates a relative 3D pose and an absolute depth, and from these it finally generates the absolute 3D human pose
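The pipeline summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation. It assumes a pinhole camera, an assumed real-world reference area of 2000 mm x 2000 mm, decomposed factors that multiply the bounding-box area, and a relative pose given as camera-space offsets in millimetres from the root. All function and parameter names are hypothetical.

```python
import math

def root_depth(fx, fy, bbox_w, bbox_h, factors, real_area_mm2=2000.0 * 2000.0):
    """Root depth in mm from the bounding-box area under a pinhole model.

    Assumption: the predicted factors multiply the bounding-box area to give
    a corrected area, and depth = sqrt(fx * fy * real_area / corrected_area).
    """
    corrected_area = bbox_w * bbox_h * math.prod(factors)
    return math.sqrt(fx * fy * real_area_mm2 / corrected_area)

def back_project(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) at depth z into camera coordinates (mm)."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)

def absolute_pose(root_xyz, relative_joints):
    """Add root-relative joint offsets (mm) to the absolute root position."""
    rx, ry, rz = root_xyz
    return [(rx + dx, ry + dy, rz + dz) for dx, dy, dz in relative_joints]

# Hypothetical camera intrinsics and detections.
fx = fy = 1000.0
cx, cy = 500.0, 300.0

z = root_depth(fx, fy, bbox_w=400.0, bbox_h=400.0, factors=[1.0, 1.0])
root = back_project(700.0, 300.0, z, fx, fy, cx, cy)
pose = absolute_pose(root, [(0.0, 0.0, 0.0), (0.0, -500.0, 100.0)])
print(z, root, pose)
```

With these numbers, a 400 x 400 px box and unit factors give a root depth of 5000 mm. Increasing any factor enlarges the corrected area and so reduces the estimated depth, which matches the inverse relationship between depth and projected area.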
References
[1] Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image
[2] Coordinate Attention for Efficient Mobile Network Design