Leveraging program synthesis for robust long-term robot autonomy via interactive learning and adaptation

dc.contributor.advisor: Biswas, Joydeep (Assistant professor)
dc.contributor.committeeMember: Guha, Arjun
dc.contributor.committeeMember: Niekum, Scott
dc.contributor.committeeMember: Chaudhuri, Swarat
dc.creator: Holtz, Jarrett
dc.date.accessioned: 2022-09-12T23:45:42Z
dc.date.available: 2022-09-12T23:45:42Z
dc.date.created: 2022-05
dc.date.issued: 2022-03-15
dc.date.submitted: May 2022
dc.date.updated: 2022-09-12T23:45:43Z
dc.description.abstract: For autonomous robots to become as pervasive in uncontrolled human environments and our everyday lives as they are on campuses and in warehouses, they need to be deployable by end-users for various tasks. End-user deployment of autonomous robots over the long term requires robust behaviors that can leverage fundamental robot capabilities to achieve diverse goals while subject to various domains and user preferences. Achieving this goal requires a system for designing and adapting behaviors that is intuitive, data-efficient, easy to integrate, and can handle changes in user-imposed requirements over long deployments. State-of-the-art approaches to designing robot behaviors broadly fall into three categories: reinforcement learning, inverse reinforcement learning, and learning from demonstration. State-of-the-art approaches for these techniques widely leverage deep neural networks (DNNs) as function approximators to represent either the complete behavior, an optimal reward function, or both a value function and the behavior in actor-critic approaches. DNNs are a powerful tool for function approximation that has been the catalyst for significant successes across a wide range of learning applications. While DNN-based approaches are broadly applicable, they suffer from three key weaknesses when used for end-user robot behavior design and adaptation: 1) DNNs are black-box behavior representations and thus are opaque to the user and difficult to understand or verify, 2) learning with DNNs is extremely data-intensive, often requiring that data be collected in simulation, and 3) DNN behaviors are difficult to adapt and sensitive to changing domains or user preferences, such as when transferring from simulation to the real world. In this thesis, we present approaches to leverage program synthesis as an alternative function approximator for learning from demonstration to approximate behaviors and reward functions.
Program synthesis as a function approximator addresses some limitations of DNN-based approaches by yielding human-readable behavior representations that are amenable to program repair and parameter optimization for adaptation, and that can leverage the well-structured space of programs to learn behaviors in a data-efficient manner. However, due to two primary factors, existing state-of-the-art synthesis approaches are insufficient to learn general robot programs. First, these approaches are not designed to handle non-linear real arithmetic, vector operations, or dimensioned quantities, all commonly found in robot programs. Second, synthesis techniques are primarily limited by their ability to scale with the search space of potential programs, such that synthesis of many reasonably complex behaviors is intractable for existing approaches. To address the goal of end-user-guided robot behavior learning and adaptation, we present Physics Informed Program Synthesis (PIPS) as part of a learning from demonstration (LfD) and adaptation approach to lifelong robot learning. Towards this goal, this thesis presents the following contributions: 1) an algorithm for PIPS that addresses limitations of program synthesis for robotics by reasoning about physical quantities, 2) algorithms for LfD leveraging PIPS to learn robot behaviors as human-readable programs, 3) an approach to guiding lifelong robot learning by leveraging the structure of programmatic policies and demonstrations, 4) program repair and synthesis techniques for adapting these learned policies from iterative user guidance, and finally, 5) extensive evaluation results in the social robot navigation domain across simulated and real-world deployments that compare PIPS-based learning to DNN-based and traditional approaches.
dc.description.department: Computer Science
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2152/115645
dc.identifier.uri: http://dx.doi.org/10.26153/tsw/42543
dc.language.iso: en
dc.subject: Robotics
dc.subject: Program synthesis
dc.subject: Learning from demonstration
dc.subject: Social robot navigation
dc.title: Leveraging program synthesis for robust long-term robot autonomy via interactive learning and adaptation
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Computer Sciences
thesis.degree.discipline: Computer Science
thesis.degree.grantor: The University of Texas at Austin
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy

Access full-text files

Original bundle

Name: HOLTZ-DISSERTATION-2022.pdf
Size: 29.93 MB
Format: Adobe Portable Document Format
