GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
-
Main Idea
-
Propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner from dense correspondence-based intermediate geometric representations.
-
Extensive experiments show that the approach remarkably outperforms state-of-the-art methods on the LM, LM-O and YCB-V datasets while remaining real-time and robust.
-
Establish 2D-3D correspondences while computing the final 6D pose estimate in a fully differentiable way.
-
Propose to learn the PnP optimization, exploiting the fact that the correspondences are organized in image space, which gives a significant boost in performance, outperforming all prior works.
-
-
Contribution
-
Revisit the key ingredients in direct 6D pose regression and observe that, by choosing appropriate representations for the pose parameters, methods based on direct regression show competitive performance compared with state-of-the-art correspondence-based indirect methods.
-
Propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to boost the performance of direct 6D pose regression via leveraging the geometric guidance from dense correspondence-based intermediate representations.
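
To illustrate what an "appropriate representation" for the rotation part of the pose can look like, the sketch below maps a continuous 6D rotation vector to a rotation matrix by Gram-Schmidt orthogonalization. The notes above do not name a specific parameterization, so treating this one as the relevant choice is an assumption, and the function name `rot6d_to_matrix` is hypothetical.

```python
import math


def _normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rot6d_to_matrix(r6):
    """Map a 6D vector (two stacked 3D vectors) to a 3x3 rotation matrix.

    Assumption: the first three values are the unnormalized first column
    and the last three the unnormalized second column of the rotation.
    """
    a1, a2 = list(r6[:3]), list(r6[3:])
    b1 = _normalize(a1)
    # Remove the component of a2 along b1, then normalize (Gram-Schmidt).
    d = _dot(b1, a2)
    b2 = _normalize([a2[i] - d * b1[i] for i in range(3)])
    # The third column is fixed by the right-hand rule.
    b3 = _cross(b1, b2)
    # Rows of the returned matrix; b1, b2, b3 are its columns.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]
```

Because any 6D vector with two non-parallel halves maps to a valid rotation, a network can regress the six values freely, without a normalization constraint on its output.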
-
-
Model
-
Given an RGB image I, our GDR-Net takes the zoomed-in RoI (Dynamic Zoom-In for training, off-the-shelf detections for testing) as input and predicts several intermediate geometric features.
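
A rough sketch of how a Dynamic Zoom-In step might jitter a ground-truth box during training, so that the network sees RoIs resembling imperfect detections. The jitter ratio of 25%, the padding factor of 1.5, and the function name `dynamic_zoom_in` are assumptions made for illustration and are not taken from the notes above.

```python
import random


def dynamic_zoom_in(box, ratio=0.25, pad=1.5, rng=None):
    """Return a jittered square RoI (cx, cy, size) for a box (x1, y1, x2, y2).

    ratio: maximum relative shift of the center and relative change of scale
           (assumed value).
    pad:   enlargement of the longer box side before jittering (assumed value).
    """
    rng = rng or random
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0

    # Shift the center uniformly, relative to the box width and height.
    cx += w * rng.uniform(-ratio, ratio)
    cy += h * rng.uniform(-ratio, ratio)

    # Rescale the padded square side uniformly.
    scale = 1.0 + rng.uniform(-ratio, ratio)
    size = max(w, h) * pad * scale
    return cx, cy, size
```

At test time no jitter is applied; the RoI would instead come from the off-the-shelf detector's box.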
-
The Patch-PnP directly regresses the 6D object pose from Dense Correspondences (M_2D-3D) and Surface Region Attention (M_SRA).
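
A minimal sketch of how a dense correspondence map could be laid out in image space: each pixel stores its own 2D coordinates next to the 3D object coordinates predicted for it, so a convolutional module can consume the correspondences as an ordinary feature map. The five-channel layout, the zero fill for background pixels, and the name `build_dense_correspondences` are assumptions for illustration; the Surface Region Attention input is not covered here.

```python
def build_dense_correspondences(xyz_map, mask):
    """Stack pixel coordinates onto per-pixel 3D object coordinates.

    xyz_map: H x W nested list of (x, y, z) predicted object coordinates.
    mask:    H x W nested list of booleans, True where the object is visible.
    Returns an H x W nested list of (u, v, x, y, z) tuples.
    """
    height, width = len(xyz_map), len(xyz_map[0])
    out = []
    for v in range(height):
        row = []
        for u in range(width):
            if mask[v][u]:
                x, y, z = xyz_map[v][u]
                row.append((float(u), float(v), x, y, z))
            else:
                # Background pixels carry no correspondence (assumed zero fill).
                row.append((0.0, 0.0, 0.0, 0.0, 0.0))
        out.append(row)
    return out
```

Keeping the correspondences on the pixel grid, rather than as an unordered point set, is what allows a learned module to replace a classical PnP solver.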
-
-
Data and Metrics
-
Dataset
- YCB-Video
- Occlusion LineMOD
-
Evaluation Metrics
- ADD-S
- n°, n cm
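
The two metrics can be sketched as follows, using their common definitions: ADD-S averages, over the model points, the distance from each ground-truth-transformed point to its closest predicted-transformed point, and the n°, n cm test accepts a pose when the rotation error is below n degrees and the translation error is below n centimetres. Translations given in metres and the function names are assumptions of this sketch.

```python
import math


def apply_pose(R, t, p):
    """Transform 3D point p by rotation R (3x3 nested list) and translation t."""
    return [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]


def add_s(R_gt, t_gt, R_pred, t_pred, model_points):
    """Average closest-point distance between the two transformed models."""
    gt = [apply_pose(R_gt, t_gt, p) for p in model_points]
    pred = [apply_pose(R_pred, t_pred, p) for p in model_points]
    return sum(min(math.dist(g, q) for q in pred) for g in gt) / len(gt)


def rotation_error_deg(R_gt, R_pred):
    """Angle in degrees of the relative rotation between the two poses."""
    # trace(R_gt^T R_pred) equals the sum of element-wise products.
    trace = sum(R_gt[i][j] * R_pred[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def within_n_deg_n_cm(R_gt, t_gt, R_pred, t_pred, n=5.0):
    """True if rotation error < n degrees and translation error < n cm."""
    trans_err_cm = 100.0 * math.dist(t_gt, t_pred)  # metres to centimetres
    return rotation_error_deg(R_gt, R_pred) < n and trans_err_cm < n
```

The closest-point search here is quadratic in the number of model points; for full meshes a spatial index such as a k-d tree would be the usual replacement.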
-
-
Result
1. Result on the YCB-Video Dataset
2. Result on the Occlusion LineMOD Dataset
-
Limitation and Future work
- Future work
- Extend our work to more challenging scenarios, such as the lack of annotated real data and unseen object categories or instances.
- pdf | code | Presentation