Augmenting Experience via Teachers' Advice


Speaker

Yuhuai Wu, University of Toronto

Time

2018-05-28 14:00:00 ~ 2018-05-28 15:00:00

Location

Room 213, Zhongyuan (中院)

Host

Weinan Zhang

Abstract
Sparse reward is one of the most challenging problems in Reinforcement Learning. Hindsight Experience Replay (HER) tried to address this issue by converting a failure experience to a successful one through relabeling the goals. Despite its effectiveness, HER has limited application because it lacks a compact and universal goal representation. We present Augmenting experienCe via TeacheR’s adviCE (ACTRCE), an efficient reinforcement learning technique that extends the HER framework using natural language as goal representation. We demonstrate the performance of our method in both 2D and 3D navigation tasks and analyze the benefits that the natural language representation brings via various experiments. We show ACTRCE significantly outperforms the previous methods and achieves at least 3x improvement in the sample efficiency.
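The relabeling idea in the abstract can be illustrated with a minimal sketch. This is not the speaker's implementation: the `Transition` fields, the `relabel_with_advice` helper, and the lookup-table `toy_teacher` are all hypothetical names chosen for illustration, and the reward scheme (1.0 on the final step, 0.0 elsewhere) is an assumption. The sketch only shows how a failed episode might be stored a second time with a natural-language goal that describes what the agent actually achieved.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Transition:
    state: str
    action: str
    next_state: str
    goal: str      # natural-language instruction the agent was following
    reward: float


def relabel_with_advice(
    episode: List[Transition],
    teacher: Callable[[str], Optional[str]],
) -> List[Transition]:
    """Return a relabeled copy of the episode, or [] if the teacher has no advice.

    The teacher describes the final state in natural language; that description
    replaces the original goal, and the last step is rewarded (assumed scheme).
    """
    if not episode:
        return []
    achieved = teacher(episode[-1].next_state)
    if achieved is None:
        return []
    last = len(episode) - 1
    return [
        replace(t, goal=achieved, reward=1.0 if i == last else 0.0)
        for i, t in enumerate(episode)
    ]


def toy_teacher(state: str) -> Optional[str]:
    # Hypothetical teacher: a fixed lookup table standing in for real advice.
    descriptions = {
        "at_red_door": "go to the red door",
        "at_blue_key": "reach the blue key",
    }
    return descriptions.get(state)


# A failed episode: the instruction was "reach the blue key",
# but the agent ended up at the red door.
episode = [
    Transition("start", "forward", "hallway", "reach the blue key", 0.0),
    Transition("hallway", "left", "at_red_door", "reach the blue key", 0.0),
]

# Store both the original (failed) and the relabeled (successful) experience.
replay_buffer = list(episode) + relabel_with_advice(episode, toy_teacher)
```

Because the goal is a sentence instead of a state vector, the same relabeling step applies whenever a teacher can describe the outcome in words, which is the property the abstract attributes to the natural-language representation.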
Bio
Yuhuai is a third-year PhD student at the University of Toronto, under the supervision of Roger Grosse. In the past, he was a student of Geoffrey Hinton, Yoshua Bengio, and Ruslan Salakhutdinov. He is a recipient of the 2017 Google PhD Fellowship in machine learning. He completed an internship at OpenAI in 2017 with John Schulman and Pieter Abbeel, and will join DeepMind for an internship in the summer of 2018. His main research interests are reinforcement learning and optimization.
© John Hopcroft Center for Computer Science, Shanghai Jiao Tong University

Address: Expert Building, Software Building, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai
Email: jhc@sjtu.edu.cn Phone: 021-54740299
Postal code: 200240