DPOD: 6D Pose Object Detector and Refiner

Arch

  • Main Idea

    • DPOD (Dense Pose Object Detector) is a deep learning method for 3D object detection and 6D pose estimation from RGB images.

    • End-to-end pipeline integrating a detector and pose estimator based on dense correspondences.

    • Estimates dense multi-class 2D-3D correspondence maps between an input image and available 3D models.

    • The 6DoF pose is computed from the correspondences via PnP and RANSAC, then refined with a custom deep learning-based refinement scheme for RGB images.

    • Evaluation on both synthetic and real training data demonstrates superior results and high-quality 6D poses before and after refinement.

    • Dense correspondences computed by the method allow for more robust and accurate 6D pose estimation.

    • DPOD is precise and works in real time.
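The PnP step mentioned above can be written as a reprojection-error minimization. The formulation below is the standard one and is given only as a sketch: the symbols $K$ (camera intrinsics), $\pi$ (perspective division), $x_i$ (image pixel), $X_i$ (matching 3D model point), and $\mathcal{I}$ (the inlier set selected by RANSAC) are notation assumed here, not taken from the notes.

```latex
(\hat{R}, \hat{t}) = \arg\min_{R,\, t} \sum_{i \in \mathcal{I}}
  \left\lVert \pi\!\left( K \left( R X_i + t \right) \right) - x_i \right\rVert_2^2,
\qquad
\pi\!\left( [a, b, c]^\top \right) = \left[ a / c,\; b / c \right]^\top
```

RANSAC repeatedly solves this problem on small random subsets of the correspondences and keeps the pose that agrees with the largest number of them, which is why a dense correspondence map with some outliers can still yield a stable pose.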


  • Contribution

    1. Proposed the Dense Pose Object Detector (DPOD), a method that regresses multi-class object masks and dense 2D-3D correspondences between image pixels and the corresponding 3D models.

    2. The proposed pose refinement approach also performs very well, achieving accurate poses while using a simpler and more lightweight backbone architecture. [Faster, simpler to train, and trainable on both synthetic and real data].


  • Model

    1. Given an input RGB image, the correspondence block, featuring an encoder-decoder neural network, regresses the object ID mask and the correspondence map.

    2. The correspondence map provides explicit 2D-3D correspondences, whereas the ID mask determines which correspondences belong to each detected object.

    3. The respective 6D poses are then efficiently computed by the pose block based on PnP+RANSAC.

    [Figure: DPOD model architecture]
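The interplay between the ID mask and the correspondence map in steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `collect_correspondences`, the `(u, v)` code representation of the correspondence map, and the `uv_to_xyz` lookup table are all assumptions made for the example.

```python
from collections import defaultdict


def collect_correspondences(id_mask, corr_map, uv_to_xyz):
    """Group 2D-3D correspondences by detected object.

    id_mask:   H x W nested list of ints; 0 is background, k > 0 is object k.
    corr_map:  H x W nested list of (u, v) codes predicted for each pixel.
    uv_to_xyz: {object_id: {(u, v): (x, y, z)}}, a hypothetical lookup from
               a correspondence code to a 3D point on that object's model.
    """
    matches = defaultdict(list)
    for row, (mask_row, corr_row) in enumerate(zip(id_mask, corr_map)):
        for col, (obj_id, uv) in enumerate(zip(mask_row, corr_row)):
            if obj_id == 0:
                continue  # background pixel: no correspondence to keep
            point_3d = uv_to_xyz.get(obj_id, {}).get(uv)
            if point_3d is not None:
                # Image points are stored as (x, y) = (col, row).
                matches[obj_id].append(((col, row), point_3d))
    return dict(matches)


# Toy 2 x 3 image containing a single object with ID 1.
id_mask = [[0, 1, 1],
           [0, 1, 0]]
corr_map = [[(0, 0), (10, 20), (11, 20)],
            [(0, 0), (10, 21), (0, 0)]]
uv_to_xyz = {1: {(10, 20): (0.0, 0.1, 0.2),
                 (11, 20): (0.01, 0.1, 0.2),
                 (10, 21): (0.0, 0.11, 0.2)}}

matches = collect_correspondences(id_mask, corr_map, uv_to_xyz)
```

Each per-object list of 2D-3D pairs is what the pose block would consume; in practice it could be handed to a PnP+RANSAC solver such as OpenCV's `cv2.solvePnPRansac` together with the camera intrinsics.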


  • Data and Metrics

    • Dataset

      • LINEMOD
      • Occlusion LineMOD
    • Evaluation Metrics

      • ADD
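As a reminder of what ADD measures, the sketch below computes the average distance between model points transformed by the ground-truth pose and by the estimated pose. It assumes the usual definition of the metric and the commonly used acceptance rule (ADD below 10% of the model diameter); the helper names, the toy points, and the diameter value are illustrative assumptions, not values from the notes.

```python
import math


def transform(R, t, p):
    """Apply the rigid transform R * p + t to a 3D point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def add_metric(model_points, R_gt, t_gt, R_est, t_est):
    """Mean distance between model points under the two poses."""
    total = 0.0
    for p in model_points:
        total += math.dist(transform(R_gt, t_gt, p), transform(R_est, t_est, p))
    return total / len(model_points)


def is_pose_correct(add_value, diameter, fraction=0.1):
    """Assumed acceptance rule: ADD below a fraction of the model diameter."""
    return add_value < fraction * diameter


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0)]  # toy model, metres

# The estimate is off by a pure translation of (3 mm, 0, 4 mm).
add_value = add_metric(points, identity, (0.0, 0.0, 1.0),
                       identity, (0.003, 0.0, 1.004))
accepted = is_pose_correct(add_value, diameter=0.14)  # illustrative diameter
```

Because the two poses here differ only by a translation, every model point moves by the same 5 mm, so the ADD value equals the length of that offset.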

  • Results

1. Results on the LINEMOD Dataset

[Image: results on the LINEMOD dataset]

2. Results on the Occlusion LineMOD Dataset

[Image: results on the Occlusion LineMOD dataset]


  • Limitations and Future Work