Browsing by Subject "Monte Carlo tree search"
Now showing 1 - 2 of 2
Item Digital twinning of well construction operations for improved decision-making (2020-11-17)
Saini, Gurtej Singh; Oort, Eric Van; Verdin, Carlos T; Pyrcz, Michael J; Ashok, Pradeepkumar; Pournazari, Parham
Well construction is a highly technical, inherently unpredictable, and non-holonomic multi-step process with vast state and action spaces that requires complex decision-making and action planning at every step. Action planning demands careful evaluation of the vast action space against the system's long-term objective. Current human-centric decision-making introduces a degree of bias, which can result in reactive rather than proactive decisions; the consequences range from minor operational inefficiencies to catastrophic health, safety, and environmental incidents. A system that can automatically generate an optimal action sequence from any given state to meet an operation's objectives is therefore highly desirable. Moreover, an intelligent agent capable of self-learning can offset the computation and memory costs associated with evaluating the action space. This dissertation details the development of such intelligent planning systems for well construction operations using digital twinning, reward shaping, and reinforcement learning techniques. To this end, a methodology for structuring unbiased, purpose-built sequential decision-making systems for well construction operations is proposed. This entails formulating the given operation as a Markov decision process (MDP), which requires carefully defining states and action values, defining goal states, building a digital twin to model the process, and appropriately shaping reward functions to provide feedback. An iterative method for building digital twins, which are vital components of this MDP structure, is also developed. Finally, a simulation-based, decision-time planning algorithm, Monte Carlo tree search (MCTS), is adapted and utilized for learning and planning.
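The abstract above describes adapting MCTS for decision-time planning over a discrete-state, discrete-action MDP. As a rough illustration of the general technique only, not the dissertation's implementation, a minimal UCT-style search loop on a toy MDP might look like the following sketch; all names here (`Node`, `uct_search`, the `step` function) are hypothetical:

```python
import math
import random

# Toy deterministic MDP on a number line: move left/right, reward on
# reaching a goal state. Purely illustrative; the dissertation's system
# would query a digital twin instead.
ACTIONS = [-1, +1]
GOAL = 5

def step(state, action):
    """Transition function: reward 1.0 on the step that reaches the goal."""
    next_state = state + action
    reward = 1.0 if next_state == GOAL else 0.0
    done = next_state == GOAL or abs(next_state) > 10
    return next_state, reward, done

class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = {}   # action -> child Node
        self.visits = 0
        self.value = 0.0     # running mean of sampled returns

def ucb1(parent, child, c=1.4):
    """UCB1 score: exploitation term plus exploration bonus."""
    if child.visits == 0:
        return float("inf")
    return child.value + c * math.sqrt(math.log(parent.visits) / child.visits)

def rollout(state, depth=20):
    """Random playout used to estimate the value of a leaf state."""
    total = 0.0
    for _ in range(depth):
        state, reward, done = step(state, random.choice(ACTIONS))
        total += reward
        if done:
            break
    return total

def uct_search(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCB1 while all actions are expanded.
        while node.children and len(node.children) == len(ACTIONS):
            parent = node
            node = max(parent.children.values(), key=lambda ch: ucb1(parent, ch))
        # 2. Expansion: add one untried child.
        untried = [a for a in ACTIONS if a not in node.children]
        if untried:
            a = random.choice(untried)
            next_state, _, _ = step(node.state, a)
            child = Node(next_state, parent=node)
            node.children[a] = child
            node = child
        # 3. Simulation: random rollout from the new leaf.
        ret = rollout(node.state)
        # 4. Backpropagation: update mean values along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += (ret - node.value) / node.visits
            node = node.parent
    # Act greedily with respect to visit counts at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

In the dissertation's setting, `step` would be replaced by simulation against the digital twin, and the random rollout policy by the domain-tailored heuristic the abstract mentions.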
The developed methodology is demonstrated by building and utilizing a finite-horizon decision-making system with discrete state and action spaces for hole-cleaning advisory during well construction. A digital twin integrating hydraulics, cuttings-transport, and rig-state-detection models is built to simulate hole-cleaning operations, and a non-sparse reward function to quantify state-action transitions is defined. Finally, the MCTS algorithm, enhanced by a well-designed heuristic function tailored for hole-cleaning operations, is utilized for action planning. The plan (action sequence) output by the system results in significant performance improvement over the original decision maker's actions, as quantified by the long-term reward and the final system state.
Item On-demand coordination of multiple service robots (2017-05)
Khandelwal, Piyush; Stone, Peter, 1971-; Grauman, Kristen; Niekum, Scott; Thomaz, Andrea; Veloso, Manuela
Research in recent years has made it increasingly plausible to deploy a large number of service robots in home and office environments. Given that multiple mobile robots may be available in an environment performing routine duties such as cleaning, building maintenance, or patrolling, and that each robot may have a set of basic interfaces and manipulation tools for interacting with one another as well as with humans, is it possible to coordinate multiple robots for a previously unplanned, on-demand task? The research presented in this dissertation aims to begin answering this question. This dissertation makes three main contributions. The first contribution of this work is a formal framework for coordinating multiple robots to perform an on-demand task while balancing two objectives: (i) completing the on-demand task as quickly as possible, and (ii) minimizing the total time each robot is diverted from its routine duties.
We formalize this stochastic sequential decision-making problem, termed on-demand multi-robot coordination, as a Markov decision process (MDP). Furthermore, we study this problem in the context of a specific on-demand task called multi-robot human guidance, where multiple robots must coordinate to efficiently guide a visitor to their destination. Second, we develop and analyze stochastic planning algorithms to efficiently solve the on-demand multi-robot coordination problem in real time. Monte Carlo tree search (MCTS) planning algorithms have demonstrated excellent results solving MDPs with large state spaces and high action branching factors. We propose variants of the MCTS algorithm that use biased backpropagation techniques for value estimation, which can help MCTS converge quickly to reasonable, if suboptimal, policies compared with standard unbiased Monte Carlo backpropagation. In addition to using these planning algorithms to solve the on-demand multi-robot coordination problem in real time, we also analyze their performance on benchmark domains from the International Planning Competition (IPC). The third and final contribution of this work is the development of a multi-robot system built on top of the Segway RMP platform at the Learning Agents Research Group, UT Austin, and the implementation and evaluation of the on-demand multi-robot coordination problem and two different planning algorithms on this platform. We also perform two studies using simulated environments, in which real humans control a simulated avatar, to test the implementation of the MDP formalization and planning algorithms presented in this dissertation.
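The contrast between unbiased Monte Carlo backpropagation and the biased variants mentioned above can be sketched schematically. The lambda-weighted backup below, which mixes the sampled return with the best visited child's estimate on the way to the root, is an illustrative approximation of the general idea, not the dissertation's exact algorithm; all names are hypothetical:

```python
# Schematic comparison of MCTS backpropagation rules. The Node class and
# the lambda-weighted "biased" backup are illustrative sketches only.

class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0   # running mean estimate of return from this node

def backprop_monte_carlo(leaf, ret):
    """Standard unbiased backup: every node on the path to the root
    averages in the raw sampled return from the rollout."""
    node = leaf
    while node is not None:
        node.visits += 1
        node.value += (ret - node.value) / node.visits
        node = node.parent

def backprop_biased(leaf, ret, lam=0.5):
    """Biased backup: at each level, blend the backed-up target with the
    best visited child's estimate before passing it upward. lam=1.0
    recovers the unbiased Monte Carlo rule; lam=0.0 backs up purely
    bootstrapped (greedy one-step) values."""
    node = leaf
    target = ret
    while node is not None:
        node.visits += 1
        node.value += (target - node.value) / node.visits
        if node.children:
            visited = [c.value for c in node.children if c.visits > 0]
            if visited:
                # Bias the target toward the best known continuation.
                target = lam * target + (1 - lam) * max(visited)
        node = node.parent
```

Smaller values of `lam` pull value estimates toward the best continuation seen so far, which can speed convergence at the cost of optimality, the trade-off the abstract describes.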