Browsing by Subject "Evolutionary computation"
Now showing 1 - 11 of 11
Item: Adaptive representations for reinforcement learning (2007-05)
Whiteson, Shimon Azariah; Stone, Peter, 1971-
In reinforcement learning, an autonomous agent seeks an effective control policy for tackling a sequential decision task. Unlike in supervised learning, the agent never sees examples of correct or incorrect behavior but receives only a reward signal as feedback. One limitation of current methods is that they typically require a human to manually design a representation for the solution (e.g., the internal structure of a neural network). Since poor design choices can lead to grossly suboptimal policies, agents that automatically adapt their own representations have the potential to dramatically improve performance. This thesis introduces two novel approaches for automatically discovering high-performing representations. The first approach synthesizes temporal difference methods, the traditional approach to reinforcement learning, with evolutionary methods, which can learn representations for a broad class of optimization problems. This synthesis is accomplished via (1) on-line evolutionary computation, which customizes evolutionary methods to the on-line nature of most reinforcement learning problems, and (2) evolutionary function approximation, which evolves representations for the value function approximators that are critical to the temporal difference approach. The second approach, called adaptive tile coding, automatically learns representations based on tile codings, which form piecewise-constant approximations of value functions. It begins with coarse representations and gradually refines them during learning, analyzing the current policy and value function to deduce the best refinements. This thesis also introduces a novel method for devising input representations. In particular, it presents a way to find a minimal set of features sufficient to describe the agent's current state, a challenge known as the feature selection problem. The technique, called Feature Selective NEAT (FS-NEAT), is an extension of NEAT, a method for evolving neural networks used throughout this thesis. While NEAT evolves both the topology and weights of a neural network, FS-NEAT goes one step further by learning the network's inputs too. Using evolution, it automatically and simultaneously determines the network's inputs, topology, and weights. In addition to introducing these new methods, this thesis presents extensive empirical results in multiple domains demonstrating that these techniques can substantially improve performance over methods with manual representations.

Item: Competitive multi-agent search (2014-12)
Bahceci, Erkin; Miikkulainen, Risto
While evolutionary computation is well suited for automatic discovery in engineering, it can also be used to gain insight into how humans and organizations could perform more effectively. Using a real-world problem of innovation search in organizations as the motivating example, this dissertation formalizes human creative problem solving as competitive multi-agent search. It differs from existing single-agent and team-search problems in that the agents interact through knowledge of other agents' searches and through the dynamic changes in the search landscape caused by these searches. The main hypothesis is that evolutionary computation can be used to discover effective strategies for competitive multi-agent search. This hypothesis is verified in experiments using an abstract domain based on the NK model, i.e., partially correlated and tunably rugged fitness landscapes, and a concrete domain in the form of a social innovation game. In both domains, different specialized strategies are evolved for each competitive environment, as well as strategies that generalize across environments. Strategies evolved in the abstract domain are more effective and more complex than hand-designed strategies and a strategy based on traditional tree search. Using a novel spherical visualization of the fitness landscapes of the abstract domain, insight is gained about how successful strategies work, e.g., by tracking positive changes in the landscape. In the concrete game domain, human players were modeled using backpropagation and used as opponents to create environments for evolution. Evolved strategies scored significantly higher than the human models by using a different proportion of actions, providing insights into how performance could be improved in social innovation domains. The work thus provides a possible framework for studying various human creative activities as competitive multi-agent search in the future.
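The NK model referenced in this abstract is straightforward to sketch. Below is a minimal Python illustration with hypothetical parameter choices, not the dissertation's code: each of N binary loci contributes a fitness value that depends on itself and its K neighbors via a random lookup table, so increasing K increases epistasis and makes the landscape more rugged.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_nk_landscape(N=20, K=3):
    # One random contribution table per locus, indexed by the 2^(K+1)
    # possible joint states of the locus and its K circular neighbors.
    tables = rng.random((N, 2 ** (K + 1)))
    def fitness(genome):
        total = 0.0
        for i in range(N):
            idx = 0
            for j in range(K + 1):
                idx = (idx << 1) | int(genome[(i + j) % N])
            total += tables[i, idx]
        return total / N  # mean of per-locus contributions
    return fitness

f = make_nk_landscape()
genome = rng.integers(0, 2, size=20)
print(f(genome))
```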
Item: Cultural enhancement of neuroevolution (2002)
McQuesten, Paul Herbert; Miikkulainen, Risto
Any transmission of behavior from one generation to the next via non-genetic means is a process of culture. Culture provides major advantages for survival in the biological world. This dissertation develops four methods that harness the mechanisms of culture to enhance the power of neuroevolution: culling overlarge litters, mate selection by complementary competence, phenotypic diversity maintenance, and teaching offspring to respond like an elder. The methods are efficient because they operate without requiring additional fitness evaluations, and because each method addresses a different aspect of neuroevolution, they also combine smoothly. The combined system balances diversity and selection pressure, and improves performance in terms of both learning speed and solution quality in sequential decision tasks.

Item: Discovering multi-purpose modules through deep multitask learning (2019-02-14)
Meyerson, Elliot Keeler; Miikkulainen, Risto; Grauman, Kristen; Durrett, Greg; Nitschke, Geoff
Machine learning scientists aim to discover techniques that can be applied across diverse sets of problems. Such techniques need to exploit regularities that are shared across tasks. This raises the question: what shared regularity is not yet being exploited? Complex tasks may share structure that is difficult for humans to discover. The goal of deep multitask learning is to discover and exploit this structure automatically by training a joint model across tasks. To this end, this dissertation introduces a deep multitask learning framework for collecting generic functional modules that are used in different ways to solve different problems. Within this framework, a progression of systems is developed based on assembling shared modules into task models and leveraging the complementary advantages of gradient descent and evolutionary optimization. In experiments, these systems confirm that modular sharing improves performance across a range of application areas, including general video game playing, computer vision, natural language processing, and genomics, yielding state-of-the-art results in several cases. The conclusion is that multi-purpose modules discovered by deep multitask learning can exceed those developed by humans in performance and generality.
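For readers unfamiliar with module sharing, here is a minimal sketch of the basic idea behind assembling shared modules into task models, with illustrative layer sizes and PyTorch as an assumed framework; the dissertation's systems assemble modules in far more flexible, evolved ways. A single multi-purpose module receives gradients from every task that uses it.

```python
import torch
import torch.nn as nn

# One shared "multi-purpose" module, reused by every task model.
shared = nn.Sequential(nn.Linear(32, 64), nn.ReLU())

class TaskModel(nn.Module):
    def __init__(self, shared_module, n_outputs):
        super().__init__()
        self.shared = shared_module           # same parameters in every task
        self.head = nn.Linear(64, n_outputs)  # task-specific decoder
    def forward(self, x):
        return self.head(self.shared(x))

task_a = TaskModel(shared, n_outputs=10)
task_b = TaskModel(shared, n_outputs=3)

# Gradients from both tasks flow into the shared module's weights.
loss = task_a(torch.randn(8, 32)).sum() + task_b(torch.randn(8, 32)).sum()
loss.backward()
```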
Item: Efficient evolution of neural networks through complexification (2004)
Stanley, Kenneth Owen; Miikkulainen, Risto
Artificial neural networks can potentially control autonomous robots, vehicles, factories, or game players more robustly than traditional approaches. Neuroevolution, i.e., the artificial evolution of neural networks, is a method for finding the right topology and connection weights to specify the desired control behavior. The challenge for neuroevolution is that difficult tasks may require complex networks with many connections, all of which must be set to the right values. Even if a network exists that can solve the task, evolution may not be able to find it in such a high-dimensional search space. This dissertation presents the NeuroEvolution of Augmenting Topologies (NEAT) method, which makes the search for complex solutions feasible. In a process called complexification, NEAT begins by searching in a space of simple networks and gradually makes them more complex as the search progresses. By starting minimally, NEAT is more likely to find efficient and robust solutions than neuroevolution methods that begin with large fixed or randomized topologies; by elaborating on existing solutions, it can gradually construct even highly complex solutions. In this dissertation, NEAT is first shown to be faster than traditional approaches on a challenging reinforcement learning benchmark task. Second, by building on existing structure, it is shown to maintain an "arms race" even in open-ended coevolution. Third, NEAT is used to successfully discover complex behavior in three challenging domains: the game of Go, an automobile warning system, and a real-time interactive video game. Experimental results in these domains demonstrate that NEAT makes entirely new applications of machine learning possible.

Item: Evolutionary neural architecture search for deep learning (2019-02-08)
Liang, Jason Zhi; Miikkulainen, Risto; Stone, Peter; Baldick, Ross; Huang, Qixing
Deep neural networks (DNNs) have produced state-of-the-art results in many benchmarks and problem domains. However, the success of DNNs depends on the proper configuration of their architectures and hyperparameters. DNNs are often not used to their full potential because it is difficult to determine what architectures and hyperparameters should be used. While several approaches have been proposed, the computational complexity of searching large design spaces makes them impractical for large modern DNNs. This dissertation introduces an efficient evolutionary algorithm (EA) for simultaneous optimization of DNN architecture and hyperparameters. It builds upon extensive past research on evolutionary optimization of neural network structure. Various improvements to the core algorithm are introduced, including: (1) discovering DNN architectures of arbitrary complexity; (2) generating the modular, repetitive structures commonly seen in state-of-the-art DNNs; (3) extending to the multitask learning and multiobjective optimization domains; and (4) maximizing performance and reducing wasted computation through asynchronous evaluations. Experimental results in image classification, image captioning, and multialphabet character recognition show that the approach is able to evolve networks that are competitive with or even exceed hand-designed networks. Thus, the method enables an automated and streamlined process to optimize DNN architectures for a given problem and can be widely applied to solve harder tasks.
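As a rough illustration of the evolutionary loop such a search relies on, here is a toy sketch with a hypothetical genome encoding and a placeholder fitness; the actual algorithm evolves far richer architectures and hyperparameters and uses asynchronous evaluation. Genomes encode a stack of layer widths, and mutation grows, shrinks, or resizes the stack.

```python
import random

def mutate(genome):
    g = list(genome)
    op = random.choice(["add", "remove", "resize"])
    if op == "add":
        g.insert(random.randrange(len(g) + 1), random.choice([16, 32, 64]))
    elif op == "remove" and len(g) > 1:
        g.pop(random.randrange(len(g)))
    else:
        i = random.randrange(len(g))
        g[i] = max(8, g[i] + random.choice([-8, 8]))
    return g

def evaluate(genome):
    # Placeholder fitness. In practice this is the expensive step:
    # train the candidate DNN and return validation accuracy.
    return -abs(sum(genome) - 128) - len(genome)

population = [[32, 32] for _ in range(10)]
for _ in range(50):
    population.sort(key=evaluate, reverse=True)
    # Replace the worst half with mutated copies of the best half.
    population[-5:] = [mutate(p) for p in population[:5]]
print(max(population, key=evaluate))
```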
Item: General-purpose optimization through information maximization (2012-05)
Lockett, Alan Justin; Miikkulainen, Risto; Ghosh, Joydeep; Mooney, Raymond; Ravikumar, Pradeep; Zitkovic, Gordan
The primary goal of artificial intelligence research is to develop a machine capable of learning to solve disparate real-world tasks autonomously, without relying on specialized problem-specific inputs. This dissertation suggests that such machines are realistic: if the No Free Lunch theorems applied to all real-world problems, the world would be utterly unpredictable; since it is not, such theorems cannot hold universally over real-world problems. In response, the dissertation proposes the information-maximization principle, which claims that the optimal optimization methods are those that make the best use of the information available to them. This principle results in a new algorithm, evolutionary annealing, which is shown to perform well, especially on challenging problems with irregular structure.

Item: Improving deep learning through loss-function evolution (2020-12-10)
Gonzalez, Santiago; Miikkulainen, Risto; Banzhaf, Wolfgang; Durrett, Greg; Huang, Qixing
As the complexity of neural network models has grown, it has become increasingly important to optimize their design automatically through metalearning. Methods for discovering hyperparameters, topologies, and learning rate schedules have led to significant increases in performance. This dissertation tackles a new type of metalearning: loss-function optimization. Loss functions define a model's core training objective and thus present a clear opportunity. Two techniques, GLO and TaylorGLO, were developed to tackle this metalearning problem using genetic programming and evolution strategies. Experiments show that neural networks trained with metalearned loss functions are more accurate, have higher data utilization, train faster, and are more robust against adversarial attacks. A theoretical framework was developed to analyze how and why different loss functions bias training towards different regions of the parameter space. Using this framework, their performance gains are found to result from a regularizing effect that is tailored to each domain. Overall, this dissertation demonstrates that new, metalearned loss functions can result in better-trained models, and it provides the next stepping stone towards fully automated machine learning.
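To make loss-function evolution concrete, here is a toy sketch of a genetic-programming representation, with illustrative primitives only; GLO's actual primitive set and evaluation, which trains networks to score each candidate loss, are more involved. A candidate loss is a small expression tree over the prediction y and target t.

```python
import random, math

# Primitive set: name -> (arity, implementation).
PRIMS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "neg": (1, lambda a: -a),
    "log": (1, lambda a: math.log(max(a, 1e-12))),  # guarded log
}
LEAVES = ["y", "t", 1.0]

def random_tree(depth=2):
    if depth == 0 or random.random() < 0.3:
        return random.choice(LEAVES)
    name = random.choice(list(PRIMS))
    arity = PRIMS[name][0]
    return [name] + [random_tree(depth - 1) for _ in range(arity)]

def evaluate(tree, y, t):
    if tree == "y": return y
    if tree == "t": return t
    if isinstance(tree, float): return tree
    name, *args = tree
    return PRIMS[name][1](*(evaluate(a, y, t) for a in args))

# A cross-entropy-like candidate, -t*log(y), expressed as a tree:
ce = ["neg", ["mul", "t", ["log", "y"]]]
print(evaluate(ce, y=0.9, t=1.0))      # ~0.105
print(evaluate(random_tree(), y=0.9, t=1.0))
```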
Item: Opponent modeling and exploitation in poker using evolved recurrent neural networks (2018-08-07)
Li, Xun, Ph. D. in computer sciences; Miikkulainen, Risto; Ballard, Dana; Fernandez, Benito; Mok, Aloysius
As a classic example of imperfect information games, poker, in particular Heads-Up No-Limit Texas Hold'em (HUNL), has been studied extensively in recent years. A number of computer poker agents have been built with increasingly higher quality. While agents based on approximated Nash equilibrium have been successful, they lack the ability to exploit their opponents effectively. In addition, the performance of equilibrium strategies cannot be guaranteed in games with more than two players and multiple Nash equilibria. This dissertation focuses on devising an evolutionary method to discover opponent models based on recurrent neural networks. A series of computer poker agents called Adaptive System for Hold'Em (ASHE) was evolved for HUNL. ASHE models the opponent explicitly using Pattern Recognition Trees (PRTs) and LSTM estimators. The default and board-texture-based PRTs maintain statistical data on the opponent's strategies at different game states. The Opponent Action Rate Estimator predicts the opponent's moves, and the Hand Range Estimator evaluates the showdown value of ASHE's hand. Recursive Utility Estimation is used to evaluate the expected utility/reward for each available action. Experimental results show that (1) ASHE exploits opponents with high to moderate levels of exploitability more effectively than Nash-equilibrium-based agents, and (2) ASHE can defeat top-ranking equilibrium-based poker agents. Thus, the dissertation introduces an effective new method for building high-performance computer agents for poker and other imperfect information games. It also provides a promising direction for future research in imperfect information games beyond the equilibrium-based approach.
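The recursive utility idea can be sketched in a few lines. This is a toy model with made-up actions and fixed opponent response rates; ASHE's estimators are evolved LSTMs, not the constants used here. A state's value is the best action's expected utility, averaging over the opponent model's predicted responses.

```python
# Stand-in opponent model: P(opponent move). In ASHE these rates are
# predicted per state by an evolved LSTM estimator.
OPP_RATES = {"fold": 0.2, "call": 0.5, "raise": 0.3}

def showdown_value(pot, strength):
    # Placeholder for a hand-range-based estimate of equity * pot.
    return strength * pot

def utility(pot, strength, depth):
    if depth == 0:
        return showdown_value(pot, strength)
    best = float("-inf")
    for bet in (0, 10):                       # toy actions: check or bet 10
        exp = 0.0
        for move, p in OPP_RATES.items():
            if move == "fold":
                exp += p * pot                # opponent folds, we take the pot
            else:
                raise_amt = 10 if move == "raise" else 0
                exp += p * utility(pot + 2 * bet + raise_amt,
                                   strength, depth - 1)
        best = max(best, exp - bet)           # net of our own bet
    return best

print(utility(pot=20, strength=0.6, depth=2))
```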
Item: Real-time robotic tasks for cyber-physical avatars (2017-05-03)
Huang, Pei-Chi; Mok, Aloysius Ka-Lau; Miikkulainen, Risto; Stone, Peter; Sentis, Luis; Han, Song; Fok, Chien-Liang
Although modern robots can perform complex tasks using sophisticated algorithms that are specialized to a particular task and environment, creating robots capable of completing tasks in unstructured environments without human guidance (e.g., through teleoperation) remains a challenge. In this research, we present a framework to meet this challenge for a "cyberphysical avatar," which is defined to be a semi-autonomous robotic system that adjusts to an unstructured environment and performs physical tasks subject to critical timing constraints while under human supervision. This thesis first realizes a cyberphysical avatar that integrates three key technologies: (1) whole-body compliant control, (2) skill acquisition from machine learning (neuroevolution methods and deep learning), and (3) vision-based control through visual servoing. Body-compliant control is essential for operator safety because avatars perform cooperative tasks in close proximity to humans; machine learning enables "programming" avatars so that they can be used by non-experts for a large array of tasks, some unforeseen, in an unstructured environment; and the visual servoing technique is indispensable for facilitating feedback control in human-avatar interaction. This thesis proposes and demonstrates a systematically incremental approach to automating robotic tasks by decomposing a non-trivial task into stages, each of which may be automated by integrating the aforementioned techniques. We design and implement the controllers for two semi-autonomous robots that integrate these three key techniques for grasping and pick-and-place tasks. While a general theory is beyond reach, we present a study of the tradeoffs among three design metrics for robotic task systems: (1) the amount of training effort for the robots to perform the task, (2) the time available to complete the task when the command is given, and (3) the quality of the result of the performed task. The tradeoff study in this design space uses the imprecise computation model as a framework to evaluate specific types of tasks: (1) grasping an unknown object and (2) placing the object in a target position. We demonstrate the generality of our integration methodology by applying it to two different robots, Dreamer and Hoppy. Our approach is evaluated by the performance of the robots in trading off task completion time, training time, and task completion success rate, in an environment similar to those in the recent Amazon Picking Challenge.

Item: Robust non-linear control through neuroevolution (2003)
Gomez, Faustino John; Miikkulainen, Risto
Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real-world systems, but often it is unclear what kind of behavior is required to solve the task. Reinforcement learning approaches have made progress in such problems but have so far not scaled well. Neuroevolution has improved upon conventional reinforcement learning but has still not been successful in full-scale, non-linear control problems. This dissertation develops a methodology for solving real-world control tasks consisting of three components: (1) an efficient neuroevolution algorithm that solves difficult non-linear control tasks by coevolving neurons, (2) an incremental evolution method to scale the algorithm to the most challenging tasks, and (3) a technique for making controllers robust so that they can transfer from simulation to the real world. The method is faster than other approaches on a set of difficult learning benchmarks and is used in two full-scale control tasks, demonstrating its applicability to real-world problems.
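Neuron-level cooperative coevolution, the first component above, can be sketched as follows. This is a toy in the spirit of such methods, on XOR with assumed hyperparameters, not the dissertation's implementation: each hidden neuron is drawn from its own subpopulation, and a network's score is credited back to the neurons that formed it.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
Y = np.array([0, 1, 1, 0], float)
H, POP = 3, 20   # hidden neurons per network, neurons per subpopulation
# Each neuron genome: 2 input weights, bias, output weight.
subpops = rng.normal(size=(H, POP, 4))

def score(neurons):
    # neurons: (H, 4) -- one neuron per subpopulation, assembled into a net.
    hid = np.tanh(X @ neurons[:, :2].T + neurons[:, 2])
    out = hid @ neurons[:, 3]
    return -np.mean((out - Y) ** 2)   # negative MSE as fitness

for gen in range(200):
    fit = np.zeros((H, POP))
    cnt = np.full((H, POP), 1e-9)
    for _ in range(10 * POP):         # random assemblies this generation
        pick = rng.integers(0, POP, size=H)
        s = score(subpops[np.arange(H), pick])
        fit[np.arange(H), pick] += s  # credit network score to members
        cnt[np.arange(H), pick] += 1
    avg = fit / cnt
    for h in range(H):                # evolve each subpopulation separately
        order = np.argsort(avg[h])
        worst, best = order[:POP // 2], order[POP // 2:]
        parents = subpops[h, rng.choice(best, POP // 2)]
        subpops[h, worst] = parents + rng.normal(scale=0.2, size=parents.shape)

best_assembly = subpops[np.arange(H), np.argmax(avg, axis=1)]
print("final MSE:", -score(best_assembly))
```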