Fourth International Conference on Machine Learning and Applications

Abstract

In application domains characterized by dynamic changes and non-deterministic action outcomes, it is frequently difficult for agents or robots to operate without any human supervision. Although human feedback can help an agent learn a rich representation of the task and domain, humans may not have the expertise or time to provide elaborate and accurate feedback in complex domains. Widespread deployment of intelligent agents hence requires that agents operate autonomously using sensory inputs and limited high-level feedback from non-expert human participants. Towards this objective, this paper describes an augmented reinforcement learning framework that combines bootstrap learning and reinforcement learning principles. In the absence of human feedback, the agent learns by interacting with the environment. When high-level human feedback is available, the agent robustly merges it with environmental feedback by incrementally revising the relative contributions of the two feedback mechanisms to the action-choice policy. The framework is evaluated in two simulated domains: Tetris and Keepaway soccer.
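The abstract does not spell out the merging mechanism, so the sketch below is only one plausible reading of it, not the paper's method: a tabular Q-learner that keeps separate value estimates for environmental reward and human feedback and blends them with an incrementally revised trust weight. All identifiers here (AugmentedQAgent, beta, human_feedback, the 0.05 adaptation step) are hypothetical.

```python
import random
from collections import defaultdict

class AugmentedQAgent:
    """Hypothetical sketch of merging environmental and human feedback.

    Keeps two value tables: q (learned from environmental reward) and
    h (learned from occasional high-level human feedback). Actions are
    chosen from a convex combination of the two, weighted by beta, and
    beta itself is revised incrementally as the abstract describes.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, beta=0.5):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.beta = beta                    # relative trust in human feedback, in [0, 1]
        self.q = defaultdict(float)         # environmental Q-values
        self.h = defaultdict(float)         # human-feedback value estimates

    def _score(self, state, action):
        # Action choice blends both feedback sources.
        return (1 - self.beta) * self.q[(state, action)] + self.beta * self.h[(state, action)]

    def act(self, state):
        # Epsilon-greedy over the merged score.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self._score(state, a))

    def update(self, state, action, reward, next_state, human_feedback=None):
        # Standard Q-learning update from environmental reward.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

        if human_feedback is not None:
            # Human feedback modeled as a coarse scalar critique of the action.
            self.h[(state, action)] += self.alpha * (human_feedback - self.h[(state, action)])
            # Incrementally revise the merge weight: raise trust in human
            # feedback when it agrees in sign with the environmental TD
            # target, lower it otherwise (0.05 step is an arbitrary choice).
            agreement = 1.0 if human_feedback * td_target > 0 else -1.0
            self.beta = min(1.0, max(0.0, self.beta + 0.05 * agreement))
```

Under these assumptions, the bounded, slowly adapting weight lets the policy fall back on environmental feedback alone when human input is absent or unreliable, matching the behavior the abstract describes.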
