Acoustic sensing on smart devices

Date

2019-09-18

Authors

Mao, Wenguang

Abstract

Smart devices, such as smartphones, smartwatches, and smart speakers, have become increasingly popular and are changing people's daily lives in profound ways. However, the usability and functionality of these devices remain limited, because effective sensing techniques for understanding users and environments are lacking. In particular, the motion and shape of a target, such as a user's hand or a nearby object, are the most important information for understanding user behaviors and gaining knowledge about environments, yet the existing sensors on smart devices cannot capture the fine-grained motion of a target or the shape of an object under darkness or occlusion.

To solve this problem, we study novel sensing techniques based on acoustic signals. Specifically, we use speakers to play inaudible signals. During propagation, these signals are affected by the motion and shape of a target object in a deterministic way, so by collecting and analyzing them, we can infer that information about the target. We explore acoustic signals because their low propagation speed makes high sensing accuracy achievable, and because they are widely available: most smart devices are already equipped with speakers and microphones.
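
As a back-of-the-envelope illustration of why the low propagation speed of sound helps (a Python sketch with illustrative numbers, not figures from the dissertation), consider how much larger a frequency change the same hand motion produces in an inaudible acoustic carrier than in a typical RF carrier:

# Illustrative comparison: round-trip Doppler shift of a reflected
# signal for a slow hand motion, acoustic carrier vs. RF carrier.
SPEED_OF_SOUND = 343.0   # m/s in air
SPEED_OF_LIGHT = 3.0e8   # m/s

def doppler_shift(carrier_hz, target_speed, wave_speed):
    # Round-trip Doppler shift for a reflector moving at target_speed.
    return 2.0 * target_speed / wave_speed * carrier_hz

hand_speed = 0.1  # m/s, a slow hand gesture

acoustic = doppler_shift(18_000, hand_speed, SPEED_OF_SOUND)  # ~10.5 Hz
rf = doppler_shift(2.4e9, hand_speed, SPEED_OF_LIGHT)         # ~1.6 Hz
print(f"acoustic: {acoustic:.1f} Hz, RF: {rf:.1f} Hz")

Because a given motion shifts the acoustic carrier by a much larger fraction of its frequency (equivalently, the acoustic wavelength at 18 kHz is about 1.9 cm), centimeter- and millimeter-scale motion becomes observable with commodity speakers and microphones.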

In this dissertation, we develop four acoustic sensing systems on smart devices. We first propose two motion tracking systems, called device-based tracking and device-free tracking, respectively, according to whether a user needs to carry a device (e.g., a smartphone or smartwatch) for tracking purposes. Building on these tracking techniques, we design a novel application that allows a drone to automatically follow a user for videotaping. Besides motion tracking, we also design an acoustic imaging system to sense object shapes under darkness or occlusion. We elaborate on these systems below.

First, we develop a device-based motion tracking system that turns a smartphone into a motion controller. It provides a new way to interact with and control video games, VR/AR headsets, and smart appliances. In this system, we use multiple speakers to play inaudible sounds. Based on the received signals, the smartphone estimates its distance to the various speakers and then derives its position. The key technique in our system is a distributed Frequency Modulated Continuous Waveform (FMCW) that can accurately estimate the distance between an unsynchronized speaker and microphone. Moreover, we incorporate MUltiple SIgnal Classification (MUSIC) into FMCW to minimize the interference caused by multipath. We further develop an optimization framework that combines FMCW estimation with Doppler shifts and Inertial Measurement Unit (IMU) measurements to enhance accuracy. We implement the system on a mobile phone and demonstrate through experiments that it achieves millimeter-level tracking accuracy.
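
The basic FMCW ranging principle that this system builds on can be sketched in a few lines of Python. The sketch below assumes a synchronized speaker and microphone and a single direct path; the dissertation's actual contributions, namely the distributed FMCW design for unsynchronized pairs and the MUSIC-based multipath suppression, are omitted, and all parameters are illustrative:

import numpy as np

C = 343.0                            # speed of sound, m/s
FS = 48_000                          # sample rate, Hz
F0, B, T = 17_000.0, 4_000.0, 0.04   # chirp start frequency, bandwidth, duration
DISTANCE = 1.2                       # ground-truth speaker-to-microphone distance, m

t = np.arange(int(T * FS)) / FS
tx = np.cos(2 * np.pi * (F0 * t + 0.5 * (B / T) * t ** 2))  # linear chirp

# Received signal: the chirp delayed by the one-way propagation time d / c.
delay = int(round(DISTANCE / C * FS))
rx = np.concatenate([np.zeros(delay), tx])[: len(tx)]

# Mixing tx and rx yields a beat tone at f_beat = (B / T) * (d / c).
beat = tx * rx
spectrum = np.abs(np.fft.rfft(beat * np.hanning(len(beat))))
freqs = np.fft.rfftfreq(len(beat), 1 / FS)
band = freqs < 2_000                 # the beat tone sits far below the chirp band
f_beat = freqs[band][np.argmax(spectrum[band])]

d_est = f_beat * (T / B) * C         # invert the beat-frequency relation
print(f"estimated distance: {d_est:.3f} m (ground truth {DISTANCE} m)")

In the actual system, the speaker and microphone clocks are not synchronized, which is precisely the problem the distributed FMCW technique addresses.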

Second, we develop a device-free motion tracking system that allows users to control smart speakers with hand gestures. To this end, we develop a novel recurrent neural network (RNN) based system that uses speakers and microphones to realize accurate room-scale tracking. Our system jointly estimates the propagation distance and angle-of-arrival (AoA) of signals reflected by the hand, based on AoA-distance profiles generated by 2D MUSIC. We design a series of techniques to significantly enhance the profile quality under low SNR. We then feed a recent history of profiles to our RNN to estimate the distance and AoA; in this way, we exploit the temporal structure among consecutive profiles to remove the impact of noise, interference, and mobility. We implement the system on a smart speaker development platform. To the best of our knowledge, this is the first acoustic device-free tracking system that works at room scale.
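
A minimal PyTorch sketch can make the data flow concrete: a recent history of AoA-distance profiles enters a recurrent network that outputs a distance and AoA estimate for the latest frame. The profile dimensions, layer sizes, and architecture below are placeholders rather than the dissertation's network, and the 2D MUSIC profile generation and training procedure are omitted:

import torch
import torch.nn as nn

class ProfileTracker(nn.Module):
    def __init__(self, n_aoa=60, n_dist=100, hidden=128):
        super().__init__()
        # Flatten each 2D AoA-distance profile into a feature vector,
        # embed it, then let a GRU smooth estimates over time.
        self.embed = nn.Sequential(
            nn.Flatten(start_dim=2),          # (batch, time, n_aoa * n_dist)
            nn.Linear(n_aoa * n_dist, hidden),
            nn.ReLU(),
        )
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)      # outputs: (distance, AoA)

    def forward(self, profiles):              # profiles: (batch, time, n_aoa, n_dist)
        h, _ = self.rnn(self.embed(profiles))
        return self.head(h[:, -1])            # estimate for the latest frame

# Example: a batch of 4 sequences, each holding 10 consecutive profiles.
model = ProfileTracker()
profiles = torch.randn(4, 10, 60, 100)
print(model(profiles).shape)                  # torch.Size([4, 2])

Carrying recurrent state across frames is what lets the network exploit temporal structure in consecutive profiles and suppress transient noise and interference.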

Third, building on these tracking techniques, we develop a novel system that allows a drone to automatically follow a user for videotaping indoors. This not only reduces the effort required of human photographers, but also supports videotaping in situations where it would otherwise be impossible. To this end, we apply our tracking techniques to determine the relative position between the drone and the user, and we develop a controller based on the model predictive control (MPC) framework to steer the drone. We solve the practical challenges of applying MPC in our system, including drone modeling, user movement prediction, and integrating the tracking information into the control feedback. We implement the system on a commercial drone and show that it can follow a user effectively while maintaining a specified following distance and orientation.
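
The receding-horizon idea behind MPC can be sketched with a deliberately simplified model: a one-dimensional double-integrator stand-in for the drone, a finite-horizon quadratic cost solved by backward Riccati recursion, and only the first control of each plan applied. The dissertation's actual controller additionally handles the drone's real dynamics, user movement prediction, and control constraints, none of which appear below; all numbers are illustrative:

import numpy as np

DT, HORIZON = 0.1, 20
A = np.array([[1.0, DT], [0.0, 1.0]])   # state: [position, velocity]
B = np.array([[0.5 * DT**2], [DT]])     # input: acceleration command
Q = np.diag([10.0, 1.0])                # penalize position/velocity error
R = np.array([[0.1]])                   # penalize control effort

def mpc_step(state, target):
    # Solve a finite-horizon LQR problem toward `target` and return
    # only the first control input of the plan (receding horizon).
    err = state - target
    P = Q.copy()
    K = None
    for _ in range(HORIZON):            # backward Riccati recursion
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return -(K @ err).item()

# Simulate: the drone starts 3 m behind the desired following point.
drone = np.array([0.0, 0.0])
user_pos, follow_dist = 4.0, 1.0
for step in range(50):
    target = np.array([user_pos - follow_dist, 0.0])
    u = mpc_step(drone, target)
    drone = A @ drone + (B * u).ravel()
print(f"final drone position: {drone[0]:.2f} m (target 3.00 m)")

Re-solving the horizon problem at every step and discarding all but the first control is what lets such a controller absorb updated tracking feedback and predicted user motion as they arrive.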

Fourth, we develop a novel acoustic imaging system using only a smartphone. It is an attractive alternative to camera-based imaging under darkness and obstruction. Our system is based on Synthetic Aperture Radar (SAR): to image an object, a user moves a smartphone along a predefined trajectory to mimic a virtual sensor array. SAR-based imaging poses several new challenges in our context, including strong interference, trajectory errors due to hand jitter, and severe speaker and microphone distortion. We address these challenges by developing a two-stage interference cancellation scheme, a new algorithm to compensate for trajectory errors, and an effective method to minimize the impact of signal distortion. Based on these techniques, we implement the first acoustic imaging system on a commercial smartphone.
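
The delay-and-sum (backprojection) principle underlying SAR imaging can be sketched on synthetic data: pulses are emitted from several known positions along the trajectory, and each image pixel accumulates the echo energy found at that pixel's round-trip delay. The sketch below omits the interference cancellation, trajectory-error compensation, and distortion handling described above, and uses an idealized impulse echo:

import numpy as np

C, FS = 343.0, 48_000
positions = np.stack([np.linspace(-0.15, 0.15, 16), np.zeros(16)], axis=1)
scatterer = np.array([0.05, 0.40])   # a point target 40 cm in front of the array

# Synthesize one echo per array position: an impulse at the round-trip delay.
n = 2048
echoes = np.zeros((len(positions), n))
for i, p in enumerate(positions):
    tau = 2 * np.linalg.norm(scatterer - p) / C
    echoes[i, int(round(tau * FS))] = 1.0

# Backprojection: each pixel sums its delay-matched samples over all positions.
xs = np.linspace(-0.2, 0.2, 81)
ys = np.linspace(0.2, 0.6, 81)
image = np.zeros((len(ys), len(xs)))
for i, p in enumerate(positions):
    for iy, y in enumerate(ys):
        for ix, x in enumerate(xs):
            tau = 2 * np.hypot(x - p[0], y - p[1]) / C
            image[iy, ix] += echoes[i, int(round(tau * FS))]

iy, ix = np.unravel_index(np.argmax(image), image.shape)
print(f"brightest pixel at x={xs[ix]:.2f} m, y={ys[iy]:.2f} m")  # ~ (0.05, 0.40)

In the real system, the "array positions" come from a hand-held trajectory, which is why compensating trajectory errors from hand jitter is essential to keeping the synthesized aperture coherent.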
