
    Sample efficient multiagent learning in the presence of Markovian agents

    View/Open
    chakraborty_dissertation_20129.pdf (1.309 MB)
    Date
    2012-12
    Author
    Chakraborty, Doran
    Abstract
    The problem of multiagent learning (MAL) concerns how agents can learn and adapt in the presence of other agents that are simultaneously adapting. The problem is often studied in the stylized setting of repeated matrix games. The goal of this thesis is to develop MAL algorithms for this setting that achieve a set of objectives no previous algorithm has achieved together. The thesis makes three main contributions.

    The first main contribution is a novel MAL algorithm, Convergence with Model Learning and Safety (or CMLeS), that is the first to achieve all three of the following objectives: (1) it converges to playing a Nash equilibrium joint policy in self-play; (2) it achieves close to the best response when interacting with a set of memory-bounded agents whose memory size is upper-bounded by a known value; and (3) it ensures an individual return very close to its security value when interacting with any other set of agents.

    The second main contribution is another novel MAL algorithm that models a significantly more complex class of agent behavior, Markovian agents, which subsumes the class of memory-bounded agents. Called Joint Optimization against Markovian Agents (or Joma), it achieves the following two objectives: (1) it achieves a joint return very close to the social-welfare-maximizing joint return when interacting with Markovian agents; and (2) it ensures an individual return very close to its security value when interacting with any other set of agents.

    Finally, the third main contribution shows how a key subroutine of Joma can be extended to solve a broader class of reinforcement-learning problems, namely "Structure Learning in factored-state MDPs". All of the algorithms presented in this thesis are backed by rigorous theoretical analysis, including an analysis of sample complexity wherever applicable, as well as representative empirical tests.
    Department
    Computer Sciences
    Description
    text
    Subject
    Artificial intelligence
    Multiagent learning
    URI
    http://hdl.handle.net/2152/19459
    Collections
    • UT Electronic Theses and Dissertations
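
    The abstract's second CMLeS objective, near-best-response play against memory-bounded agents with a known memory bound, can be made concrete with a small sketch. The code below is a toy illustration, not the thesis's CMLeS or Joma: it assumes a two-action repeated matrix game with invented payoffs, an opponent that conditions only on the last K joint actions (with K known), and a myopic one-step best response to the empirically learned opponent model.

        # Toy sketch: modeling a memory-bounded opponent in a repeated
        # 2-player matrix game and best-responding to the learned model.
        # Assumptions (not from the thesis): K is a known bound on the
        # opponent's memory, payoffs are Prisoner's-Dilemma-like, and
        # the example opponent is memory-1 tit-for-tat.
        from collections import defaultdict, Counter

        K = 2                      # assumed known memory bound
        ACTIONS = [0, 1]           # our two actions
        PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}  # our payoff

        # counts[context][opponent_action]: empirical opponent model,
        # where context is the last K joint actions.
        counts = defaultdict(Counter)

        def predict(context):
            """Estimated distribution over the opponent's next action."""
            c = counts[context]
            total = sum(c.values())
            if total == 0:
                return {a: 1.0 / len(ACTIONS) for a in ACTIONS}  # uniform prior
            return {a: c[a] / total for a in ACTIONS}

        def best_response(context):
            """Myopic best response to the modeled opponent."""
            dist = predict(context)
            return max(ACTIONS, key=lambda me: sum(p * PAYOFF[(me, opp)]
                                                   for opp, p in dist.items()))

        def opponent(history):
            """Example memory-1 opponent: tit-for-tat on our last action."""
            return history[-1][0] if history else 0

        history = []
        for _ in range(1000):
            context = tuple(history[-K:])   # last K joint actions
            me = best_response(context)
            opp = opponent(history)
            counts[context][opp] += 1       # update the opponent model
            history.append((me, opp))

    A full best response would treat the K-step context as the state of an induced MDP and plan over it rather than acting greedily per step; the thesis's algorithms additionally guarantee safety (return near the security value) against agents outside the modeled class.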