Document Type: Conference Paper
Date of Publication: 2006-06-21
Indexed by: EI, CPCI-S, Scopus
Volume: 2
Page Number: 205-205
Key Words: reinforcement learning; MAS; actor-critic; RoboCup; function approximation
Abstract: The Actor-Critic method combines the fast convergence of value-based learning (the Critic) with the directed policy search of policy-gradient methods (the Actor), making it well suited to problems with large state spaces. In this paper, the Actor-Critic method with tile-coding linear function approximation is analysed and applied to a RoboCup simulation subtask named "Soccer Keepaway". Experiments on Soccer Keepaway show that the policy learned by the Actor-Critic method outperforms the policies learned by value-based Sarsa(λ) and the benchmark policies.
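The record does not include the paper's implementation. As a rough illustration of the technique the abstract names, the sketch below pairs a one-step Actor-Critic update (a TD-error Critic and a softmax policy-gradient Actor) with a simple tile coder over a one-dimensional toy task. The TileCoder class, the toy environment, and all step sizes are hypothetical stand-ins for illustration, not the paper's Keepaway setup.

```python
import numpy as np

# --- Tile coding: map a continuous state in [0, 1] to sparse binary features.
# Several overlapping tilings, each shifted by a fraction of a tile width,
# give a linear approximator local generalisation. (Illustrative, not the
# paper's feature construction.)
class TileCoder:
    def __init__(self, n_tilings=8, n_tiles=10):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.n_features = n_tilings * n_tiles

    def features(self, s):
        """Return the index of the active tile in each tiling."""
        idx = []
        for t in range(self.n_tilings):
            offset = t / (self.n_tilings * self.n_tiles)  # per-tiling shift
            tile = int((s + offset) * self.n_tiles) % self.n_tiles
            idx.append(t * self.n_tiles + tile)
        return np.array(idx)

# --- Toy episodic task (hypothetical, NOT Keepaway): walk along [0, 1];
# action 1 moves right, action 0 moves left; reward +1 at the right edge.
def step(s, a):
    s2 = np.clip(s + (0.05 if a == 1 else -0.05), 0.0, 1.0)
    done = s2 >= 1.0
    return s2, (1.0 if done else 0.0), done

n_actions = 2
coder = TileCoder()
w = np.zeros(coder.n_features)                    # Critic: state-value weights
theta = np.zeros((n_actions, coder.n_features))   # Actor: action preferences

def v(phi):
    # Linear state value: sum of the weights of the active tiles.
    return w[phi].sum()

def policy(phi):
    # Softmax over linear action preferences.
    prefs = theta[:, phi].sum(axis=1)
    prefs -= prefs.max()                          # numerical stability
    p = np.exp(prefs)
    return p / p.sum()

alpha_w = 0.1 / coder.n_tilings   # Critic step size (assumed)
alpha_theta = 0.01                # Actor step size (assumed)
gamma = 1.0                       # episodic, undiscounted
rng = np.random.default_rng(0)

for episode in range(500):
    s, done = 0.1, False
    while not done:
        phi = coder.features(s)
        pi = policy(phi)
        a = rng.choice(n_actions, p=pi)
        s2, r, done = step(s, a)
        # One-step TD error drives both the Critic and the Actor.
        delta = r + (0.0 if done else gamma * v(coder.features(s2))) - v(phi)
        w[phi] += alpha_w * delta                 # Critic: TD(0) update
        # Actor: policy-gradient step; for a softmax-linear policy,
        # grad log pi(a|s) w.r.t. theta[b] is (1{b=a} - pi(b)) * phi(s).
        for b in range(n_actions):
            g = (1.0 if b == a else 0.0) - pi[b]
            theta[b, phi] += alpha_theta * delta * g
        s = s2
```

The split mirrors the abstract's point: the Critic's value estimate supplies a low-variance learning signal (fast convergence), while the Actor adjusts an explicit policy along the gradient direction the TD error indicates.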