Quantifying grasp quality using an inverse reinforcement learning algorithm

Horn, Matthew William

Date

2017-05

Abstract

This thesis considers the problem of using a learning algorithm to recognize when a mechanical gripper and sensor combination has achieved a robust grasp. Robotic hands continue to evolve toward finer motor control and more degrees of freedom, which can make it harder for an operator to determine whether a gripper has achieved a successful grasp. Robots working in hazardous environments especially need confirmation of a successful grasp, as the cost of failure is often higher than in traditional factory environments. The object set found in a nuclear environment is the focus of this effort. Objects in this environment are typically expensive (or one-of-a-kind), rigid, radioactive (or toxic), dense, and susceptible to dents, scratches, and oxidation. To validate the robustness of a grasp option, an online inverse reinforcement learning approach is evaluated as a method to quantify grasp quality. This approach is applied to an industrial-grade under-actuated robotic hand equipped with 36 pressure sensors. An expert trains the inverse reinforcement learning algorithm to generate a reward function that scores each grasp and, when combined with fuzzy logic, provides a general success-or-fail result along with a confidence level. Using the trained inverse reinforcement learning algorithm in a glovebox environment reduces the number of potentially failing and untrustworthy grasps: executed grasps are scored, grasps similar to prior failed grasps are rejected, and further movement is allowed only when a grasp has been scored highly. In experimental hardware tests, the trained algorithm incorrectly classified grasps of insufficient quality less than 5% of the time, showing that the algorithm can be applied to the glovebox environment to improve grasp safety. Thus, the combination of grasp selection and pressure sensor validation provides a more efficient, robust, and redundant method to assure that items can be safely handled during remote automation processes.
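The scoring pipeline described in the abstract (a reward function over 36 pressure readings, followed by a fuzzy success-or-fail verdict with a confidence level) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the linear form of the reward, the function names `grasp_score` and `fuzzy_verdict`, and the `low`/`high` thresholds are all hypothetical. In the thesis the reward is learned from expert demonstrations by inverse reinforcement learning; this sketch simply takes the weights as given.

```python
from typing import List, Tuple

NUM_SENSORS = 36  # the hand described in the abstract carries 36 pressure sensors


def grasp_score(pressures: List[float], weights: List[float]) -> float:
    """Score a grasp as a weighted sum of pressure readings.

    The linear form is an assumption for illustration; the weights stand in
    for whatever reward the learning algorithm would produce.
    """
    if len(pressures) != NUM_SENSORS or len(weights) != NUM_SENSORS:
        raise ValueError("expected 36 pressure readings and 36 weights")
    return sum(w * p for w, p in zip(weights, pressures))


def fuzzy_verdict(score: float, low: float, high: float) -> Tuple[str, float]:
    """Map a score to (label, confidence) using a linear ramp membership.

    Scores at or below `low` are fully 'fail', scores at or above `high`
    are fully 'success', and scores in between belong partly to both sets.
    The thresholds are hypothetical tuning parameters.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    success = min(1.0, max(0.0, (score - low) / (high - low)))
    fail = 1.0 - success
    if success >= fail:
        return "success", success
    return "fail", fail


if __name__ == "__main__":
    weights = [1.0] * NUM_SENSORS      # placeholder weights, not learned values
    readings = [0.25] * NUM_SENSORS    # placeholder sensor readings
    score = grasp_score(readings, weights)
    print(score, fuzzy_verdict(score, low=0.0, high=36.0))
```

A controller built along these lines would proceed with a motion only when the verdict is "success" with high confidence, and would otherwise release and regrasp, mirroring the accept-or-reject behavior the abstract describes.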