Learning Imitation Strategies using Cost-based Policy Mapping and Task Rewards

S.V. Gudla and M. Huber (USA)


Keywords: Imitation, Reinforcement Learning, Policy Mapping


Learning by imitation represents a powerful approach for efficient learning and low-overhead programming. An important part of the imitation process is the mapping of observations to an executable control strategy. This mapping is particularly important when the capabilities of the imitating and the demonstrating agent differ significantly. This paper presents an approach that addresses the problem by optimizing a cost function, yielding an executable strategy that resembles the observed effects of the demonstrator on the environment as closely as possible. To ensure that the imitating agent replicates the important aspects of the observed task, a learning component is introduced that learns an appropriate cost function from the rewards obtained while executing the imitation strategy. The performance of this approach is illustrated within the context of a simulated multi-agent environment.
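The two components described in the abstract can be sketched as follows: (1) map each observed effect to the imitating agent's own action set by minimizing a weighted cost, and (2) adapt the cost weights from the task reward obtained by executing the resulting strategy. This is a minimal illustrative sketch only; the action set, effect features, reward function, and the hill-climbing weight update are all assumptions, not the paper's actual formulation.

```python
import random

# Hypothetical imitator action set: action -> effect on the environment (dx, dy).
# The demonstrator may have different capabilities, so effects only approximate
# what was observed.
ACTIONS = {
    "step": (1.0, 0.0),
    "jump": (2.0, 0.5),
    "turn": (0.0, 1.0),
}

def cost(effect, observed, weights):
    # Weighted squared distance between an action's effect and the
    # observed effect of the demonstrator on the environment.
    return sum(w * (e - o) ** 2 for w, e, o in zip(weights, effect, observed))

def map_strategy(observed_effects, weights):
    # Policy mapping: for each observed effect, pick the imitator action
    # minimizing the cost, producing an executable imitation strategy.
    return [min(ACTIONS, key=lambda a: cost(ACTIONS[a], obs, weights))
            for obs in observed_effects]

def learn_weights(observed_effects, reward_fn, weights, steps=200, seed=0):
    # Learning component (sketched here as simple hill climbing): adjust the
    # cost weights based on the task reward obtained when executing the
    # imitation strategy they induce.
    rng = random.Random(seed)
    best_r = reward_fn(map_strategy(observed_effects, weights))
    for _ in range(steps):
        cand = [max(1e-3, w + rng.gauss(0, 0.2)) for w in weights]
        r = reward_fn(map_strategy(observed_effects, cand))
        if r > best_r:
            weights, best_r = cand, r
    return weights

if __name__ == "__main__":
    # Observed demonstrator effects (assumed data).
    demo = [(1.9, 0.4), (0.1, 0.9), (1.1, 0.1)]
    # Assumed task reward: +1 per action matching an intended sequence.
    intended = ["jump", "turn", "step"]
    reward = lambda strat: sum(a == b for a, b in zip(strat, intended))
    w = learn_weights(demo, reward, [1.0, 1.0])
    print(map_strategy(demo, w))  # -> ['jump', 'turn', 'step']
```

The design choice worth noting is the separation of concerns: the cost function determines which of the imitator's actions best reproduces each observed effect, while the reward signal only shapes the cost weights, so the imitator replicates the task-relevant aspects of the demonstration rather than its literal motions.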