Interactive learning from implicit human feedback: the EMPATHIC framework
Reactions such as gestures and facial expressions are an abundant, natural source of signal emitted by humans during interactions. An autonomous agent could leverage an interpretation of such implicit human feedback to improve its task performance at no cost to the human, in contrast with traditional agent-teaching methods based on demonstrations or other intentionally provided guidance. In this thesis, we first define the general problem of learning from implicit human feedback and propose as a solution a data-driven framework named EMPATHIC, which comprises two stages: (1) mapping implicit human feedback to corresponding task statistics; and (2) learning a task with the constructed mapping. To instantiate the framework, we collect a dataset of human facial reactions recorded while participants observe an agent execute a sub-optimal policy on a prescribed training task. We then train a neural network on this dataset and demonstrate the ability of the EMPATHIC framework to (1) infer a reward ranking of events in the training task from offline human reaction data; (2) improve the agent's policy online from live human reactions as observers watch the training task; and (3) generalize to a novel domain, in which robot manipulation trajectories are evaluated by the learned reaction mappings.
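The two-stage structure described above can be illustrated with a minimal sketch. All names and data here are hypothetical: a scalar "reaction feature" stands in for facial-expression features, and a least-squares linear fit stands in for the neural network the thesis trains; the functions `fit_reaction_mapping`, `rank_events`, and `improve_policy` are invented for illustration, not part of the EMPATHIC implementation.

```python
# Hypothetical sketch of a two-stage implicit-feedback pipeline.
# Stage 1: fit a mapping from human reaction features to reward estimates.
# Stage 2: use that mapping to rank events and pick better actions.

def fit_reaction_mapping(feats, rewards):
    """Least-squares fit of reward ~ w * feat + b over offline reaction data.
    Stands in for the learned neural-network mapping."""
    n = len(feats)
    mx, my = sum(feats) / n, sum(rewards) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(feats, rewards))
    var = sum((x - mx) ** 2 for x in feats)
    w = cov / var
    b = my - w * mx
    return lambda feat: w * feat + b

def rank_events(mapping, event_reactions):
    """Order events from highest to lowest inferred reward (stage 2, offline)."""
    return sorted(event_reactions,
                  key=lambda ev: mapping(event_reactions[ev]), reverse=True)

def improve_policy(mapping, action_reactions):
    """Greedy improvement: choose the action whose observed human reaction
    implies the highest reward under the learned mapping (stage 2, online)."""
    return max(action_reactions, key=lambda a: mapping(action_reactions[a]))

# Offline training data: reaction feature (e.g. smile intensity in [0, 1])
# paired with the task reward of the event the human was reacting to.
mapping = fit_reaction_mapping([0.1, 0.4, 0.6, 0.9], [-1.0, 0.0, 0.5, 1.0])

events = {"collision": 0.05, "near_goal": 0.7, "goal_reached": 0.95}
print(rank_events(mapping, events))                       # best to worst
print(improve_policy(mapping, {"left": 0.2, "right": 0.8}))
```

The sketch mirrors the framework's division of labor: the mapping is learned once from offline reaction data, after which it can both rank past events and guide online action selection, including in domains other than the one it was trained on.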