PR-GCN: A Deep Graph Convolutional Network with Point Refinement for 6D Pose Estimation
-
Main Idea
-
Proposes a novel deep learning approach, the Graph Convolutional Network with Point Refinement (PR-GCN), to simultaneously address the issues below in a unified way:
- (1) ineffective representation of depth data.
- (2) insufficient integration of different modalities.
-
Introduces the Point Refinement Network (PRN) to polish 3D point clouds, recovering missing parts with noise removed.
-
Introduces the Multi-Modal Fusion Graph Convolutional Network (MMF-GCN) to strengthen RGB-D combination, capturing geometry-aware inter-modality correlation through local information propagation in a graph convolutional network.
-
Extensive experiments are conducted on three widely used benchmarks (the LM, LM-O, and YCB-V datasets), and state-of-the-art performance is reached.
-
The PRN and MMF-GCN modules generalize well to other frameworks.
-
-
Contribution
-
Propose the PR-GCN approach to 6D pose estimation by enhancing depth representation and multi-modal combination.
-
Present the PRN module with a regularized multi-resolution regression loss for point-cloud refinement. To the best of our knowledge, this is the first work to apply 3D point generation to this task.
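The exact loss is not spelled out in these notes; a common choice for point-cloud regression is the Chamfer distance, summed over coarse-to-fine predictions. The sketch below illustrates that idea with NumPy — the function names, weights, and subsampling scheme are assumptions, not the paper's definition.

```python
import numpy as np

def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3)."""
    # Pairwise squared distances between every predicted and ground-truth point.
    d = np.sum((pred[:, None, :] - gt[None, :, :]) ** 2, axis=-1)
    # Nearest-neighbor term in both directions.
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def multi_resolution_loss(preds, gt, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of Chamfer distances over coarse-to-fine predictions.

    preds: list of (N_i, 3) arrays, one per resolution (coarse to fine).
    gt:    (M, 3) ground-truth cloud, subsampled to roughly match each level.
    """
    loss = 0.0
    for w, p in zip(weights, preds):
        # Randomly subsample ground truth to the resolution of this prediction.
        idx = np.random.choice(len(gt), size=min(len(gt), len(p)), replace=False)
        loss += w * chamfer_distance(p, gt[idx])
    return loss
```

Supervising every resolution level (rather than only the final cloud) is what makes the loss "multi-resolution": coarse levels stabilize the global shape while fine levels recover detail.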
-
Develop the MMF-GCN module to capture local geometry-aware inter-modality correlation for RGB-D fusion.
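One plausible reading of "local information propagation" is a graph convolution over a k-nearest-neighbor graph built from the 3D points, applied to concatenated RGB and geometry features. The following is a minimal sketch under that assumption; the layer shape and aggregation rule are illustrative, not taken from the paper.

```python
import numpy as np

def knn_graph(points, k):
    """Indices of the k nearest neighbors for each point in an (N, 3) cloud."""
    d = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-loops
    return np.argsort(d, axis=1)[:, :k]  # (N, k) neighbor indices

def fusion_gcn_layer(rgb_feat, geo_feat, points, weight, k=8):
    """One geometry-aware fusion step (hypothetical): concatenate per-point
    RGB and geometry features, average them over each point's k-NN
    neighborhood, then apply a shared linear transform with ReLU."""
    x = np.concatenate([rgb_feat, geo_feat], axis=1)  # (N, C_rgb + C_geo)
    nbr = knn_graph(points, k)                        # (N, k)
    x = (x + x[nbr].mean(axis=1)) / 2                 # local propagation
    return np.maximum(x @ weight, 0)                  # linear + ReLU
```

Because the graph is built in 3D space, neighbors that are close geometrically exchange appearance features — this is the sense in which the fusion is "geometry-aware".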
-
In short, PR-GCN couples the Point Refinement Network (PRN), which improves the quality of the depth representation, with the Multi-Modal Fusion Graph Convolutional Network (MMF-GCN), which fully explores local geometry-aware inter-modality correlations for sufficient combination.
-
Model
-
Given an RGB-D image, the model first localizes objects in the RGB image and generates their raw 3D point clouds from depth.
-
PRN generates refined 3D points to polish shape clues (i.e., improve the quality of the depth representation).
-
MMF-GCN integrates multi-modal features by propagating local geometry-aware information over the refined 3D points (fully exploring local geometry-aware inter-modality correlations for sufficient combination).
-
The 6D pose is finally inferred from the feature delivered by MMF-GCN.
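The four-stage pipeline above can be sketched as a composition of components. In the code below, `backproject` is the standard pinhole lifting of a depth crop to a point cloud; `prn`, `mmf_gcn`, and `pose_head` are stand-ins for the learned modules (their signatures are assumptions for illustration).

```python
import numpy as np

def backproject(depth_crop, K):
    """Lift a depth crop to a raw 3D point cloud using camera intrinsics K."""
    h, w = depth_crop.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_crop.ravel()
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z], axis=1)
    return pts[z > 0]                          # drop invalid (zero) depth

def estimate_pose(rgb_crop, depth_crop, K, prn, mmf_gcn, pose_head):
    """End-to-end flow of the PR-GCN pipeline (module internals elided)."""
    points = backproject(depth_crop, K)        # raw cloud from the depth crop
    points = prn(points)                       # PRN: refine / complete the shape
    feat = mmf_gcn(rgb_crop, points)           # fuse RGB and geometry features
    return pose_head(feat)                     # regress the 6D pose (R, t)
```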
-
-
Data and Metrics
-
Dataset
- YCB-Video
- LINEMOD
- Occlusion LINEMOD
-
Evaluation Metrics
- Average Distance (ADD)
- ADD-Symmetric (ADD-S)
Note: Average Distance (ADD) and ADD-Symmetric (ADD-S) are designed for general objects and symmetric objects, respectively.
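Both metrics are standard: ADD averages the distance between model points transformed by the predicted and ground-truth poses, while ADD-S matches each transformed point to its closest ground-truth counterpart so that symmetric objects are not penalized for equivalent poses.

```python
import numpy as np

def add_metric(R_pred, t_pred, R_gt, t_gt, model_pts):
    """ADD: mean distance between corresponding model points under the
    predicted and ground-truth poses."""
    p = model_pts @ R_pred.T + t_pred
    g = model_pts @ R_gt.T + t_gt
    return np.linalg.norm(p - g, axis=1).mean()

def add_s_metric(R_pred, t_pred, R_gt, t_gt, model_pts):
    """ADD-S: for symmetric objects, match each predicted point to its
    closest ground-truth point before averaging."""
    p = model_pts @ R_pred.T + t_pred
    g = model_pts @ R_gt.T + t_gt
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return d.min(axis=1).mean()
```

A pose is typically counted as correct when ADD (or ADD-S) falls below a threshold such as 10% of the object's diameter.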
-
-
Result
1. Result on the YCB-Video Dataset
2. Result on the LINEMOD Dataset
3. Result on the Occlusion LINEMOD Dataset
-
Limitation and Future Work
- pdf | code | Presentation