Side information, robustness, and self-supervision in imitation learning
Imitation learning refers to a family of algorithms that enable agents to learn directly from demonstrations provided by experts, practitioners, and users. While imitation learning has been successfully applied to many robot learning and autonomous driving problems, existing methods still perform poorly on certain classes of problems and lack robustness. Most existing imitation learning algorithms are designed to learn purely from demonstrations; however, several other sources of information could be exploited to improve learning from demonstrations. We identify the following shortcomings of imitation learning algorithms with respect to their failure to use these other available sources of information. First, most existing algorithms are oblivious to potentially available domain knowledge and side information. Second, they ignore the possibility of using the sparse rewards provided by the environment, which might guide learning and ease the requirement for informative demonstrations. Third, they ignore the fact that, in physically embodied applications, reward functions and policies have underlying structure, such as Lipschitz continuity. To achieve robust and fast imitation learning, in this dissertation we propose novel imitation learning algorithms that exploit these sources of information and structural priors, which have been ignored by existing methods.