A lightweight semantic SLAM solution for small form factor autonomous vehicles
Abstract
As interest in autonomous vehicles continues to grow, much effort has gone into developing technologies that enable vehicles to navigate independently and reliably in poorly mapped environments. Nonetheless, reliable navigation in GPS-denied scenarios or through indoor environments continues to present many challenges. Further, the very nature of these scenarios often imposes size, weight, power, and cost (SWaP-C) constraints on the vehicle. These constraints limit the computation available on board, which makes accurate navigation and path planning harder to achieve. To this end, this work first proposes the design and development of a unique RGB-LiDAR computer vision pipeline for low SWaP-C 3D object bounding box estimation. This thesis then details the design and integration of this pipeline into a Simultaneous Localization and Mapping (SLAM) solution for small form factor autonomous vehicles.
Computer vision, in conjunction with the proliferation of neural networks, has become an increasingly promising asset in the pursuit of autonomous navigation. With a clever architecture of the sensor suite, RGB and LiDAR data can be aligned and fused to identify surrounding objects and produce valuable information about them. Computationally lightweight algorithms have been developed that reliably estimate not only the range and bearing to an object but also the object’s physical dimensions in the form of a 3-dimensional bounding box.
In unknown or GPS-constrained environments, SLAM is a common method for providing an autonomous navigation solution. Among computer vision methods, the great bulk of SLAM solutions track low-context features such as points, corners, or edges sensed in the environment around the vehicle. The relatively heavy computational burden of such algorithms makes implementation on small form factor vehicles exceptionally challenging.
In contrast, the ability to track high-context features from the environment, such as complete objects, provides two distinct benefits. First, it reduces the dimensionality of the SLAM problem, which lends itself well to an Extended Kalman Filter (EKF) architecture and further supports the low SWaP-C objective. Second, it naturally produces a map of the environment, and information about it, with higher semantic value. This map can in turn feed valuable information into higher-level autonomy and decision-making.
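The dimensionality argument above can be illustrated with a minimal sketch. It is not the thesis implementation: the planar pose (x, y, heading), the 2D landmark positions, the feature counts, and all function names are assumptions chosen for illustration. The sketch compares the size of the dense covariance an EKF must maintain for a point-feature map against an object-level map, and shows a generic range and bearing measurement prediction for one object landmark.

```python
import math


def ekf_state_size(num_landmarks, landmark_dim=2, pose_dim=3):
    """Length of an EKF-SLAM state vector: vehicle pose plus every landmark.

    pose_dim=3 assumes a planar pose (x, y, heading); landmark_dim=2 assumes
    each landmark is stored as a 2D position. Both are illustrative choices.
    """
    return pose_dim + num_landmarks * landmark_dim


def covariance_entries(state_size):
    """Number of entries in the dense state covariance matrix."""
    return state_size * state_size


def range_bearing(pose, landmark):
    """Predicted range and bearing from a planar pose to a landmark position."""
    x, y, heading = pose
    dx = landmark[0] - x
    dy = landmark[1] - y
    rng = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - heading
    # Wrap the bearing into [-pi, pi) so angle differences stay well behaved.
    bearing = (bearing + math.pi) % (2.0 * math.pi) - math.pi
    return rng, bearing


# Hypothetical map sizes: many low-context points versus a few whole objects.
point_state = ekf_state_size(num_landmarks=500)
object_state = ekf_state_size(num_landmarks=10)

print("point-feature state size:", point_state)
print("object-level state size:", object_state)
print("covariance entries (points):", covariance_entries(point_state))
print("covariance entries (objects):", covariance_entries(object_state))
print("range, bearing to object:", range_bearing((0.0, 0.0, 0.0), (3.0, 4.0)))
```

Because the covariance grows with the square of the state length, tracking a handful of objects instead of hundreds of points shrinks the filter's bookkeeping by orders of magnitude in this toy comparison. An object landmark that also carries bounding box dimensions would need a larger `landmark_dim` than the value assumed here.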