Title page for ETD etd-08172012-110113

Type of Document Master's Thesis
Author Cunningham, Bryan
URN etd-08172012-110113
Title Non-Reciprocating Sharing Methods in Cooperative Q-Learning Environments
Degree Master of Science
Department Computer Science and Applications
Advisory Committee
  Advisor Name            Title
  Cao, Yong               Committee Chair
  Cao, Yang               Committee Member
  Kavanaugh, Andrea L.    Committee Member
Keywords
  • Information Exchanges in Multi-Agent Systems
  • Multi-Agent Reinforcement Learning
  • Agent Interaction Protocols
  • Cooperative Learning
Date of Defense 2012-08-09
Availability unrestricted
Abstract
Past research on multi-agent simulation with cooperative reinforcement learning (RL) for homogeneous agents focuses on developing sharing strategies that are adopted and used by all agents in the environment. These sharing strategies are considered reciprocating because all participating agents have a predefined agreement regarding what type of information is shared, when it is shared, and how the participating agents' policies are subsequently updated. The sharing strategies are specifically designed around manipulating this shared information to improve learning performance. This thesis targets situations where the assumption that all agents employ a single sharing strategy is not valid. This work seeks to address how agents with no predetermined sharing partners can exploit groups of cooperatively learning agents to improve learning performance when compared to independent learning. Specifically, several intra-agent methods are proposed that do not assume a reciprocating sharing relationship and that leverage the pre-existing agent interface associated with Q-Learning to expedite learning. The other agents' functions and their sharing strategies are unknown and inaccessible from the point of view of the agent(s) using the proposed methods. The proposed methods are evaluated in simulation on physically embodied agents from the multi-agent cooperative robotics field that learn a navigation task. The experiments focus on how the following factors affect the performance of the proposed non-reciprocating methods: scaling the number of agents in the environment, limiting the communication range of the agents, and scaling the size of the environment.
Filename Cunningham_BL_T_2012.pdf
Size 1.62 Mb
Approximate Download Time (Hours:Minutes:Seconds)
  28.8 Modem             00:07:28
  56K Modem              00:03:50
  ISDN (64 Kb)           00:03:22
  ISDN (128 Kb)          00:01:41
  Higher-speed Access    00:00:08
