A deep learning framework for model-free 6 degree of freedom object tracking
Abstract
In this work, we address the challenging task of 6 degree-of-freedom (DoF), model-free object tracking. We propose a new deep learning framework that explores the merit of using weakly supervised semantic segmentation as part of the object tracking pipeline. Our framework represents each object pose in 7D form: a 3D vector for position and a unit quaternion for orientation. We present a novel CNN architecture, coined VGGSibs, that regresses the predicted pose. We collect a data set of several common items and evaluate our framework both on a test set withheld from our training data and on an “in the wild” set collected in a significantly different environment. Our approach achieves an average error of 7.92 cm and 21.98 degrees on the test set and 21.83 cm and 83.79 degrees on the in-the-wild data, demonstrating that our framework generalizes reasonably well to test data drawn from a distribution similar to that of the training data. In ablation experiments, we use our framework without segmentation as a baseline. Our full framework significantly outperforms this baseline on the in-the-wild data, demonstrating that semantic segmentation improves the framework's generalization when it is deployed in new environments.
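To make the 7D pose representation concrete, the following is a minimal sketch of how such a pose might be stored and how position and orientation errors of the kind reported above could be computed. The abstract does not state how the errors were measured, so the Euclidean distance and the quaternion geodesic angle used here are assumptions, as are the (w, x, y, z) quaternion ordering and all function names.

```python
import math


def normalize_quaternion(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("a zero quaternion does not describe an orientation")
    return tuple(c / norm for c in q)


def make_pose(position, quaternion):
    """Build a 7D pose (x, y, z, qw, qx, qy, qz).

    The component ordering is an assumption made for this sketch.
    """
    return tuple(position) + normalize_quaternion(quaternion)


def position_error(pose_a, pose_b):
    """Euclidean distance between the position parts, in the input units."""
    return math.dist(pose_a[:3], pose_b[:3])


def orientation_error_degrees(pose_a, pose_b):
    """Smallest rotation angle, in degrees, between the two orientations.

    q and -q encode the same rotation, so the absolute value of the dot
    product is used. The clamp guards acos against rounding above 1.0.
    """
    dot = abs(sum(a * b for a, b in zip(pose_a[3:], pose_b[3:])))
    return math.degrees(2.0 * math.acos(min(1.0, dot)))


# Hypothetical example: ground truth at the origin with identity orientation,
# prediction offset by (3, 4, 0) cm and rotated 90 degrees about the z axis.
truth = make_pose((0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
half_angle = math.radians(90.0) / 2.0
predicted = make_pose(
    (3.0, 4.0, 0.0),
    (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle)),
)
pos_err = position_error(truth, predicted)
ang_err = orientation_error_degrees(truth, predicted)
```

Normalizing the quaternion when the pose is built keeps the angle computation valid even if a regressed quaternion is not exactly unit length, which is a common concern when a network outputs the four components directly.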