Browsing by Subject "Curriculum learning"
Now showing 1 - 4 of 4
Item
Curriculum learning in reinforcement learning (2021-05-06)
Narvekar, Sanmit Santosh; Stone, Peter, 1971-; Niekum, Scott; Mooney, Raymond; Brunskill, Emma

In recent years, reinforcement learning (RL) has been increasingly successful at solving complex tasks. Despite these successes, one of the fundamental challenges is that many RL methods require large amounts of experience and thus can be slow to train in practice. Transfer learning is a recent area of research that has been shown to speed up learning on a complex task by transferring knowledge from one or more easier source tasks. Most existing transfer learning methods treat this transfer of knowledge as a one-step process, where knowledge from all the sources is directly transferred to the target. However, for complex tasks, it may be more beneficial (and even necessary) to gradually acquire skills over multiple tasks in sequence, where each subsequent task requires and builds upon knowledge gained in a previous task. This idea is pervasive throughout human learning, where people acquire complex skills gradually by training via a curriculum. The goal of this thesis is to explore whether autonomous reinforcement learning agents can also benefit from training via a curriculum, and whether such curricula can be designed fully autonomously. To answer these questions, this thesis first formalizes the concept of a curriculum and the methodology of curriculum learning in reinforcement learning. Curriculum learning consists of three main elements: 1) task generation, which creates a suitable set of source tasks; 2) sequencing, which determines how to order these tasks into a curriculum; and 3) transfer learning, which considers how to transfer knowledge between tasks in the curriculum. This thesis introduces several methods to both create suitable source tasks and automatically sequence them into a curriculum. We show that these methods produce curricula that are tailored to the individual sensing and action capabilities of different agents, and show how the curricula learned can be adapted for new, but related, target tasks. Together, these methods form the components of an autonomous curriculum design agent that can suggest a training curriculum customized to both the unique abilities of each agent and the task in question. We expect this research on the curriculum learning approach will increase the applicability and scalability of RL methods by providing a faster way of training reinforcement learning agents, compared to learning tabula rasa.
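The three elements named in this abstract lend themselves to a compact illustration. The sketch below is a toy construction of our own (not code from the thesis), using a family of chain MDPs of increasing length: task generation yields shorter chains as easier sources, sequencing orders them from easy to hard, and transfer carries the Q-table from each task into the next:

```python
import random
from collections import defaultdict

def greedy(q, s):
    # break ties randomly so unexplored actions are not systematically ignored
    return max(random.sample([-1, 1], 2), key=lambda a: q[(s, a)])

def q_learning(n, q, episodes=200, alpha=0.5, gamma=0.95, eps=0.1):
    """Tabular Q-learning on a length-n chain: start at state 0, reward 1 at n-1.
    `q` maps (state, action) -> value and may be transferred from an earlier task."""
    for _ in range(episodes):
        s = 0
        for _ in range(50 * n):                  # step cap per episode
            a = random.choice([-1, 1]) if random.random() < eps else greedy(q, s)
            s2 = min(max(s + a, 0), n - 1)
            r = 1.0 if s2 == n - 1 else 0.0
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, -1)], q[(s2, 1)]) - q[(s, a)])
            s = s2
            if s == n - 1:
                break
    return q

# 1) Task generation: easier source tasks are shorter chains.
# 2) Sequencing: order the tasks from easy to hard (here, by length).
# 3) Transfer: reuse the Q-table from each task to initialize the next.
q = defaultdict(float)
for n in [4, 8, 16]:                             # length 16 is the target task
    q = q_learning(n, q)
print(f"learned {len(q)} Q-values via the curriculum")
```

Because the short chain's states are a subset of the longer chains' states, the transferred Q-values immediately guide the agent through the early part of each harder task; that overlap is an assumption of this toy, not a claim about the thesis's transfer methods.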
Item
Gradient-based optimization and implicit regularization over non-convex landscapes (2020-12-11)
Wu, Xiaoxia (Shirley); Ward, Rachel, 1983-; Bottou, Léon; Israel, Arie; Sanghavi, Sujay; Tsai, Yen-Hsi

Large-scale machine learning problems reduce to non-convex optimization problems when state-of-the-art models such as deep neural networks are applied. One of the most widely used algorithms is the first-order iterative gradient-based algorithm, i.e., the (stochastic) gradient descent method. Two main challenges arise in understanding gradient-based algorithms over non-convex landscapes: the convergence complexity and the nature of the algorithm's solutions. This thesis aims to tackle these two challenges by providing a theoretical framework and an empirical investigation of three popular gradient-based techniques, namely, adaptive gradient methods [39], weight normalization [138], and curriculum learning [18].

For convergence, the stepsize, or learning rate, plays a pivotal role in the iteration complexity. However, it depends crucially on the (generally unknown) Lipschitz smoothness constant and the noise level of the stochastic gradient. A popular family of stepsize auto-tuning methods is the adaptive gradient methods, such as AdaGrad, which update the learning rate on the fly according to the gradients received along the way. Yet, the theoretical guarantees to date for AdaGrad are for online and convex optimization. We bridge this gap by providing theoretical guarantees for the convergence of AdaGrad for smooth, non-convex functions; we show that it converges to a stationary point at the O(log(N)/√N) rate in the stochastic setting and at the optimal O(1/N) rate in the batch (non-stochastic) setting. Extensive numerical experiments are provided to corroborate our theory. Regarding the solutions that gradient-based algorithms find, we study weight normalization (WN) methods in the setting of an over-parameterized linear regression problem, where WN decouples the weight vector into a scale and a unit direction. We show that this reparametrization has beneficial regularization effects compared to gradient descent on the original objective: WN adaptively regularizes the weights and converges close to the minimum ℓ₂-norm solution, even for initializations far from zero. To further understand stochastic gradient-based algorithms, we study a continuation method, curriculum learning (CL), inspired by the observation from cognitive science that humans learn in a simple-to-complex order. CL proposes ordering examples during training based on their difficulty, while anti-CL proposes the opposite ordering. Both CL and anti-CL have been suggested as improvements over standard i.i.d. training. We set out to investigate the relative benefits of ordered learning in three settings: standard-time, short-time, and noisy-label training. We find that both orderings have only marginal benefits on standard benchmark datasets. However, with a limited training time budget or noisy data, curriculum ordering, but not anti-curriculum ordering, can improve performance.
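To make the auto-tuned stepsize idea concrete, here is a minimal numpy sketch of the classic coordinate-wise AdaGrad update on a smooth non-convex test function. The abstract does not pin down the exact AdaGrad variant analyzed, so this is the textbook update, not the thesis's code, and the test function is our own choice:

```python
import numpy as np

def adagrad(grad, x0, eta=1.0, eps=1e-8, steps=500):
    """AdaGrad: per-coordinate stepsize eta / sqrt(accumulated squared
    gradients); no Lipschitz constant or noise level needs to be known."""
    x = np.asarray(x0, dtype=float)
    g2 = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        g2 += g ** 2                          # running sum of squared gradients
        x -= eta * g / (np.sqrt(g2) + eps)    # effective stepsize shrinks on its own
    return x

# Smooth but non-convex test function: f(x) = sum_i x_i^2 + 3 sin^2(x_i),
# whose gradient is 2x + 3 sin(2x).
grad_f = lambda x: 2 * x + 3 * np.sin(2 * x)
print("stationary point:", adagrad(grad_f, x0=[2.5, -1.7]))
```

The point of the construction is visible in the update line: early large gradients inflate `g2`, automatically shrinking later steps, which is what removes the need to know the smoothness constant in advance.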
Item
Modeling human motor learning traits using reinforcement learning (2021-08-12)
Masetty, Bharath; Deshpande, Ashish D.; Stone, Peter, 1971-

Learning a new motor skill is a complex process that requires extensive training and practice. Several theories from motor learning, neuroscience, education, and game design suggest that curriculum-based training may be the key to efficient skill acquisition. However, traditional methods for designing such training curricula are often time-consuming and costly, and can yield ineffective motor skill learning. Systematizing and automating the curriculum generation process may improve how humans learn motor skills. This work is a stepping stone towards the long-term goal of automating curriculum generation for human motor skill learning. Recent advances in artificial intelligence have introduced curriculum learning using reinforcement learning, which has enabled impressive speed-ups in artificial agents' abilities to learn complex tasks. This thesis draws its inspiration from a two-stage hierarchical model of curriculum learning consisting of two learning agents: a student agent that learns a given task, and a teacher agent that learns the optimal curriculum for training the student agent. The core idea of this thesis is to bring this two-stage curriculum learning approach to designing curricula for human motor skill acquisition (a toy sketch of the teacher-student loop appears below, after the last item).

To accomplish this, we must replace the student agent with a model of human learning, which poses three main challenges: (1) it is not straightforward to accurately represent a human's skill level or state of knowledge; (2) unlike artificial agents, humans face limits on training time and repetitions; and (3) human learning cannot be paused or externally controlled. In this thesis, we address these challenges by creating an artificial representation of human motor learning behavior. Our model of human motor learning is developed in the context of a specific motor task called Reach Ninja. We first model Reach Ninja as a Markov decision process (MDP) to enable RL agents to learn the task. Using human demonstrations, we then identify the constraints needed to limit the performance of an RL agent on the Reach Ninja MDP so that its learning behavior comes close to that of humans. The resulting approximate model demonstrates pre-training and post-training performance similar to that of humans. We then design a static curriculum capable of effectively training the artificial agent in our approximate model, and test whether the same static curriculum can induce similar learning behavior in humans. Preliminary tests with human subjects show that training with this static curriculum did not improve learning efficiency compared to training directly on the target task. Finally, we discuss a methodology for learning a dynamic curriculum based on our model of Reach Ninja and human motor learning.

Item
Selective machine learning for stock market prediction (2024-05)
Barcelona, Jacob; Plaxton, C. Greg; Muthuraman, Kumar

Applying state-of-the-art machine learning models to predicting stock returns has been a common focus of research among practitioners. However, these models often face challenges due to the inclusion of a wide universe of stocks, leading to performance degradation caused by significant noise. In this study, we apply two mechanisms from Selective Machine Learning and Curriculum Learning to enhance the robustness of our model to noise and improve overall performance. Additionally, we explore several avenues for future research in selective machine learning within the domain of finance.
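The stock-prediction abstract above names Selective Machine Learning and Curriculum Learning as noise-robustness mechanisms without detailing their form. One common instantiation, shown below on synthetic noisy regression data, is small-loss selection with a pacing schedule: train on the lowest-loss fraction of examples and grow that fraction over time. This is a generic illustration under our own assumptions, not the thesis's actual method or data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
flip = rng.random(n) < 0.2                 # 20% of targets get heavy noise
y[flip] += rng.normal(scale=5.0, size=flip.sum())

w = np.zeros(d)
for epoch in range(100):
    resid = X @ w - y
    losses = resid ** 2
    # Selective / curriculum step: keep only the easiest (smallest-loss)
    # fraction of examples, growing that fraction as training proceeds.
    keep_frac = min(1.0, 0.5 + 0.005 * epoch)
    k = int(keep_frac * n)
    idx = np.argsort(losses)[:k]           # small-loss selection
    w -= 0.05 * (2 * X[idx].T @ resid[idx] / k)
print("error vs. true weights:", round(np.linalg.norm(w - w_true), 3))
```

Because the corrupted targets keep producing large residuals, the small-loss filter tends to exclude them from the gradient, which is the noise-robustness effect the abstract alludes to.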
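Returning to the motor-learning item above: its two-stage teacher-student hierarchy can be caricatured in a few lines. In the sketch below, a toy of our own devising with made-up learning dynamics (not the thesis's Reach Ninja model), the student's skill improves fastest on tasks near its current level, and the teacher is an epsilon-greedy bandit that keeps assigning the task showing the best recent learning progress:

```python
import random

class Student:
    """Toy stand-in for a learner (e.g., a model of human motor learning):
    skill improves fastest on tasks near the current skill level."""
    def __init__(self):
        self.skill = 0.0
    def train(self, difficulty):
        gain = 0.5 * max(0.0, 1.0 - abs(difficulty - self.skill))  # made-up dynamics
        self.skill += gain
        return gain                          # observed learning progress

def teach(student, tasks, steps=40, eps=0.2):
    """Teacher agent: an epsilon-greedy bandit over tasks that tracks a
    running estimate of each task's learning progress."""
    progress = {t: 0.5 for t in tasks}       # optimistic initial estimates
    curriculum = []
    for _ in range(steps):
        t = random.choice(tasks) if random.random() < eps else \
            max(tasks, key=progress.get)
        progress[t] = 0.7 * progress[t] + 0.3 * student.train(t)
        curriculum.append(t)
    return curriculum

student = Student()
print("curriculum:", teach(student, tasks=[0.5, 1.5, 2.5, 3.5]))
print("final skill:", round(student.skill, 2))
```

Run repeatedly, the teacher tends to walk the student up the difficulty ladder and abandon tasks once they stop producing progress; the thesis's challenge, per its abstract, is that a human student cannot be reset, paused, or queried for skill level the way this toy one can.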