AIREX: Cooperation and Coordination Between Fuzzy Reinforcement Learning Agents in Continuous State Partially Observable Markov Decision Processes

タイトル	Cooperation and Coordination Between Fuzzy Reinforcement Learning Agents in Continuous State Partially Observable Markov Decision Processes
本文（外部サイト）	http://hdl.handle.net/2060/20000085879
著者（英）	Berenji, Hamid R.; Vengerov, David
著者所属（英）	NASA Ames Research Center
発行日	1999-06-14
言語	eng
内容記述	Successful operations of future multi-agent intelligent systems require efficient cooperation schemes between agents sharing learning experiences. We consider a pseudo-realistic world in which one or more opportunities appear and disappear in random locations. Agents use fuzzy reinforcement learning to learn which opportunities are most worthy of pursuing based on their promise rewards, expected lifetimes, path lengths and expected path costs. We show that this world is partially observable because the history of an agent influences the distribution of its future states. We consider a cooperation mechanism in which agents share experience by using and-updating one joint behavior policy. We also implement a coordination mechanism for allocating opportunities to different agents in the same world. Our results demonstrate that K cooperative agents each learning in a separate world over N time steps outperform K independent agents each learning in a separate world over K*N time steps, with this result becoming more pronounced as the degree of partial observability in the environment increases. We also show that cooperation between agents learning in the same world decreases performance with respect to independent agents. Since cooperation reduces diversity between agents, we conclude that diversity is a key parameter in the trade off between maximizing utility from cooperation when diversity is low and maximizing utility from competitive coordination when diversity is high.
NASA分類	Behavioral Sciences
権利	No Copyright
URI	https://repository.exst.jaxa.jp/dspace/handle/a-is/226935