Tutorial on Deep Reinforcement Learning

Abstract: Reinforcement learning considers the problem of learning to act and is poised to power next-generation AI systems, which will need to go beyond input-output pattern recognition (even if such simpler AI has sufficed for speech, vision, and machine translation) and will have to generate intelligent behavior. Example application domains include robotics, marketing, dialogue, HVAC, healthcare optimization, and supply chains.

Reinforcement learning poses significant challenges beyond pattern recognition, including exploration, credit assignment, stability, and safety. While these challenges are far from solved, there have recently been several major success stories. These include learning to play Atari games from raw pixels, beating the Go World Champion, learning complex locomotion behaviors, acquiring advanced manipulation skills, and controlling datacenter energy consumption. These successes have relied on the synergy between deep neural nets and reinforcement learning, i.e., deep reinforcement learning (Deep RL).

In this tutorial we will cover the foundations of Deep RL (including, but not limited to, CEM, DQN, TRPO, PPO, and SAC), dive into the specifics of some of the main success stories, and provide perspective on where the field is headed.

To get the most out of this tutorial, attendees should have basic familiarity with neural networks, optimization, and probability.

Bio: Professor Pieter Abbeel is Director of the Berkeley Robot Learning Lab and Co-Director of the Berkeley Artificial Intelligence Research (BAIR) Lab. Abbeel's research strives to build ever more intelligent systems, and his lab pushes the frontiers of deep reinforcement learning, deep imitation learning, deep unsupervised learning, transfer learning, meta-learning, and learning to learn, and studies the influence of AI on society. His lab also investigates how AI could advance other science and engineering disciplines. Abbeel's Intro to AI class has been taken by over 100K students through edX, and his Deep RL and Deep Unsupervised Learning materials are standard references for AI researchers. Abbeel has founded three companies: Gradescope (AI to help teachers with grading homework and exams), Covariant (AI for robotic automation of warehouses and factories), and Berkeley Open Arms (low-cost, highly capable 7-DOF robot arms). He advises many AI and robotics start-ups and is a frequently sought-after speaker worldwide for C-suite sessions on AI strategy and the future of AI. Abbeel has received many awards and honors, including the PECASE, NSF-CAREER, ONR-YIP, DARPA-YFA, and TR35. His work is frequently featured in the press, including the New York Times, Wall Street Journal, BBC, Rolling Stone, Wired, and Tech Review.