LEARNING RIGID OBJECT POSE ESTIMATION
PhD Thesis Proposal Defence

Title: "LEARNING RIGID OBJECT POSE ESTIMATION"

by

Mr. Yisheng HE

Abstract:

Rigid object pose estimation aims to predict a target object's orientation, position, and size. It is a key component of many real-world applications, including robotic manipulation, augmented reality, and autonomous driving. Traditional algorithms for this problem rely on hand-crafted features to extract correspondences between images and object mesh models, but their performance degrades in challenging scenarios such as changing illumination and heavy occlusion. In this thesis, we leverage deep learning techniques to advance rigid object pose estimation.

First, we decompose learning-based object pose estimation into two sub-modules: a representation learning backbone for feature extraction from RGBD inputs, and a subsequent output representation for pose estimation. For representation learning, we introduce a full-flow bidirectional fusion network that combines the complementary information residing in the RGB and depth images; features with rich semantic and geometric information are extracted for precise regression in different downstream tasks. For output representation, we introduce a 3D-keypoint-based algorithm with joint instance semantic segmentation and 3D keypoint detection. The pose parameters are then estimated by least-squares fitting. Our 3D-keypoint-based formulation fully leverages the geometric constraints of rigid objects and is easy for a network to learn and optimize.

Second, we study a few-shot open-set 6D pose estimation problem. Our goal is to remove two limitations of learning-based pose estimation algorithms: the closed-set assumption and the reliance on high-fidelity object CAD models. The proposed few-shot 6D pose estimation problem is to estimate the 6D pose of an unknown object given a few support views of it.
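As background for the least-squares fitting step mentioned above, the classic closed-form solution recovers a rotation and translation that best align detected 3D keypoints with the corresponding keypoints on the object model (the Kabsch/SVD procedure). The sketch below is illustrative only; the function name and use of NumPy are assumptions, not the thesis implementation.

```python
import numpy as np

def fit_rigid_transform(model_kps, pred_kps):
    """Least-squares rigid fit: find R, t minimizing
    sum_i || R @ model_kps[i] + t - pred_kps[i] ||^2,
    given N corresponding 3D keypoints as (N, 3) arrays."""
    mu_m = model_kps.mean(axis=0)
    mu_p = pred_kps.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (model_kps - mu_m).T @ (pred_kps - mu_p)
    U, _, Vt = np.linalg.svd(H)
    # Correct a possible reflection so det(R) = +1 (a proper rotation).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_p - R @ mu_m
    return R, t
```

With exact correspondences this recovers the ground-truth pose; with noisy network predictions it returns the least-squares optimum, which is why keypoint-based formulations pair well with a simple, differentiable-free fitting stage.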
We propose a large-scale photorealistic dataset (ShapeNet6D) for network pre-training and introduce a dense prototype matching network to estimate pose parameters. We also establish a benchmark to facilitate future research on this new and challenging problem.

Finally, we propose a self-supervised framework for category-level object pose and size estimation. Our goal is to free learning-based algorithms from their reliance on time- and labor-consuming manual labels. We propose a label-free method that learns to enforce geometric consistency between the category template mesh and the observed object point cloud in a self-supervised manner. Specifically, given the category template mesh and the observed scene object point cloud, we leverage differentiable shape deformation, registration, and rendering to enforce geometric consistency for self-supervision.

Date: Friday, 8 July 2022
Time: 4:00pm - 6:00pm
Zoom Meeting: https://hkust.zoom.us/j/4536985718

Committee Members:
Dr. Qifeng Chen (Supervisor)
Prof. Long Quan (Chairperson)
Dr. Dan Xu
Prof. Ling Shi (ECE)

**** ALL are Welcome ****