Rethinking Supervision in Meta-Reinforcement Learning
- Gupta, Abhishek
- Invited Talk
Standard reinforcement learning techniques can be highly data-inefficient and scale poorly across tasks. Meta-reinforcement learning offers a solution by leveraging prior experience on a distribution of related tasks to learn efficient reinforcement learning algorithms. While this often works well, it requires expensive hand-specification of task distributions and rewards at training time. In this talk, I will discuss how we can rethink this notion of supervision in meta-reinforcement learning by proposing unsupervised meta-reinforcement learning. I will introduce the general idea of unsupervised meta-RL, discuss a provably optimal instantiation of this paradigm, and show how these insights lead to a practical algorithm that learns much more efficiently than tabula rasa RL techniques.