Safely Transferring to Unsafe Environments with Constrained Reinforcement Learning

  • Knight, Ethan*; Achiam, Joshua
  • Accepted abstract
  • Poster session from 15:00 to 16:00 EAT and from 20:45 to 21:45 EAT


Agents deployed in the real world should operate safely, under constraints appropriate to the environment around them. In this work, we consider the problem of safe transfer: learning a safe, general policy from a low-stakes environment, and then transferring that policy to a more complex, high-stakes environment while continuing to satisfy safety constraints. In our experiments, we investigate safe transfer in an obstacle-avoidance setting, where we train a vision-based locomotion agent for transfer between simulated environments with different kinds of obstacles. In the low-stakes environment, the agent navigates around walls in its path, and in the complex high-stakes environment, the agent must avoid bumping into humanoids that are performing random actions from a motion capture dataset. We find that agents pre-trained in the low-stakes environment incur much lower cumulative cost than agents trained from scratch in the high-stakes environment while maintaining comparable performance, providing evidence and hope that future large-scale constrained reinforcement learning deployments can benefit from the safe transfer approach.
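
The abstract does not say which constrained reinforcement learning algorithm was used, so the following is only a minimal sketch of one common approach, a Lagrangian penalty, under stated assumptions. It illustrates the two quantities the abstract compares: task performance (return) and safety cost accumulated over training. The names `LagrangeMultiplier`, `penalized_return`, `cumulative_cost`, the learning rate, and the cost limit are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LagrangeMultiplier:
    """Dual variable that prices constraint violations (hypothetical helper)."""

    value: float = 0.0
    lr: float = 0.05  # assumed step size, not taken from the abstract

    def update(self, episode_cost: float, cost_limit: float) -> float:
        # Dual ascent: raise the penalty when cost exceeds the limit,
        # lower it otherwise, and clip at zero so it never rewards cost.
        self.value = max(0.0, self.value + self.lr * (episode_cost - cost_limit))
        return self.value


def penalized_return(episode_return: float, episode_cost: float, lam: float) -> float:
    """Lagrangian objective a policy update would maximize: return minus priced cost."""
    return episode_return - lam * episode_cost


def cumulative_cost(episode_costs: Iterable[float]) -> float:
    """Total safety cost incurred across all training episodes so far."""
    return sum(episode_costs)
```

In a safe-transfer setup of the kind described above, one plausible use would be to keep the policy (and possibly the multiplier) learned in the low-stakes environment and continue these updates in the high-stakes one, tracking `cumulative_cost` from the first high-stakes episode onward so it can be compared with a from-scratch run.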
