Yuhuai Wu, University of Toronto
May 28, 2018, Mon, 14:00-15:00
Sparse reward is one of the most challenging problems in Reinforcement Learning. Hindsight Experience Replay (HER) addresses this issue by converting a failed experience into a successful one through relabeling the goals. Despite its effectiveness, HER has limited application because it lacks a compact and universal goal representation. We present Augmenting experienCe via TeacheR’s adviCE (ACTRCE), an efficient reinforcement learning technique that extends the HER framework by using natural language as the goal representation. We demonstrate the performance of our method in both 2D and 3D navigation tasks and analyze the benefits of the natural language representation through various experiments. We show that ACTRCE significantly outperforms previous methods and achieves at least a 3x improvement in sample efficiency.
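The relabeling idea in the abstract can be illustrated with a toy sketch. This is not the speaker's implementation: the `Transition` record, the `relabel_with_advice` helper, the `toy_teacher` function, and the reward of 1.0 on the final step are all assumptions made for illustration. It only shows how a failed episode might be copied into the replay data under a language goal that describes what the agent actually achieved.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Transition:
    """One step of experience (hypothetical record layout)."""
    state: str
    action: str
    next_state: str
    goal: str       # natural-language instruction the agent was following
    reward: float   # sparse: nonzero only when the goal is achieved


def relabel_with_advice(
    episode: List[Transition],
    teacher: Callable[[str], List[str]],
) -> List[Transition]:
    """Return hindsight copies of an episode, relabeled with teacher advice.

    The teacher describes, in language, what the final state achieved.
    Each description becomes a goal under which the episode is a success.
    """
    final_state = episode[-1].next_state
    relabeled = []
    for goal in teacher(final_state):
        for i, step in enumerate(episode):
            is_last = i == len(episode) - 1
            # Assumption: sparse reward of 1.0 only on the achieving step.
            relabeled.append(
                replace(step, goal=goal, reward=1.0 if is_last else 0.0)
            )
    return relabeled


def toy_teacher(state: str) -> List[str]:
    """Hypothetical teacher: one sentence describing the reached state."""
    return [f"Reach the {state}"]


# A failed episode: the instruction was the green torch, but the agent
# ended at the red pillar and never received a reward.
episode = [
    Transition("start", "forward", "hallway", "Reach the green torch", 0.0),
    Transition("hallway", "left", "red pillar", "Reach the green torch", 0.0),
]

# Original and relabeled transitions would both go into the replay buffer.
augmented = relabel_with_advice(episode, toy_teacher)
```

In this sketch the original transitions are left untouched, so the replay data contains both the failed attempt under the original instruction and a successful one under the teacher's description.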
Yuhuai is a third-year PhD student at the University of Toronto, under the supervision of Roger Grosse. In the past, he was a student of Geoffrey Hinton, Yoshua Bengio, and Ruslan Salakhutdinov. He is a recipient of the 2017 Google PhD Fellowship in machine learning. He did an internship at OpenAI in 2017 with John Schulman and Pieter Abbeel, and will join DeepMind for an internship in the summer of 2018. His main research interests are reinforcement learning and optimization.