Exploration in Approximate Hyper-State Space
- Zintgraf, Luisa M*; Feng, Leo; Igl, Maximilian; Hartikainen, Kristian; Hofmann, Katja; Whiteson, Shimon
- Accepted abstract
[PDF]
[Slides]
[Join poster session]
Poster session from 15:00 to 16:00 EAT and from 20:45 to 21:45 EAT
Obtain the Zoom password from ICLR
Abstract
Bayes-optimal agents optimally trade off exploration and exploitation under task uncertainty, i.e., they maximise online return while learning. Although computing such policies is intractable for most problems, recent advances in meta-learning and approximate variational inference make it possible to learn approximately Bayes-optimal behaviour for tasks drawn from a given prior distribution. In this paper, we address the problem of exploration during meta-learning, i.e., gathering the data an agent needs to learn how to learn in an initially unknown task. Our approach uses reward bonuses that incentivise the agent to explore in hyper-state space, i.e., the joint state and belief space. On a sparse-reward HalfCheetahDir task, we show that our method learns adaptation strategies where existing meta-learning methods fail.
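To make the idea of a hyper-state exploration bonus concrete, below is a minimal sketch of one plausible instantiation: a random-network-distillation-style novelty bonus computed over the concatenated state and belief. The abstract does not specify the exact form of the bonus, so the network shapes, names, and training setup here are illustrative assumptions rather than the paper's method.

```python
import torch
import torch.nn as nn

class HyperStateNoveltyBonus(nn.Module):
    """Illustrative novelty bonus over hyper-states (state + belief).

    A fixed random 'target' network and a trained 'predictor' network both
    embed the hyper-state; the predictor's error is used as an exploration
    bonus that shrinks as a region of hyper-state space becomes familiar.
    (Assumed RND-style design, not necessarily the bonus used in the paper.)
    """

    def __init__(self, state_dim: int, belief_dim: int, embed_dim: int = 64):
        super().__init__()
        hyper_dim = state_dim + belief_dim
        self.target = nn.Sequential(
            nn.Linear(hyper_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim))
        self.predictor = nn.Sequential(
            nn.Linear(hyper_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim))
        for p in self.target.parameters():  # target network stays fixed
            p.requires_grad_(False)

    def forward(self, state: torch.Tensor, belief: torch.Tensor) -> torch.Tensor:
        # Hyper-state = environment state concatenated with the belief over tasks.
        hyper_state = torch.cat([state, belief], dim=-1)
        # Prediction error serves both as the bonus and as the predictor's loss.
        return (self.predictor(hyper_state) - self.target(hyper_state)).pow(2).mean(-1)


# Hypothetical usage: add the (detached) bonus to the environment reward,
# and separately minimise the same error to train the predictor.
# r_total = r_env + beta * bonus(state, belief).detach()
```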