We’ve trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation, using a new technique called Automatic Domain Randomization (ADR). This shows that reinforcement learning isn’t just a tool for virtual tasks, but can solve physical-world problems requiring unprecedented dexterity.
Machine Learning & Artificial Intelligence
At OpenAI, I co-led a team working on learning-based robotics. This means developing methods to allow robots to learn new tasks quickly, without being programmed explicitly for a specific task.
We’ve trained a human-like robot hand to manipulate physical objects with unprecedented dexterity. Our system, called Dactyl, is trained entirely in simulation and transfers its knowledge to reality, adapting to real-world physics using techniques we’ve been working on for the past year.
We’re releasing eight simulated robotics environments and a Baselines implementation of Hindsight Experience Replay, all developed for our research over the past year.
We show how an off-policy reinforcement learning algorithm can learn from its own failures in a multi-goal environment.
Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
NIPS 2017, Long Beach
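As a rough illustration of what "learning from its own failures" means in Hindsight Experience Replay, here is a minimal sketch of goal relabeling. It is not the Baselines implementation: the `Transition` type, the helper names, and the toy bit-vector states are all assumptions made for this example. The idea is that an episode that missed its intended goal is stored a second time as if the state it actually reached had been the goal, so a sparse reward still produces a useful learning signal.

```python
from typing import List, NamedTuple, Tuple

State = Tuple[int, ...]


class Transition(NamedTuple):
    """One step of experience in a goal-conditioned task (hypothetical type)."""
    state: State
    action: int
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State) -> float:
    """Sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Copy an episode, pretending the last state reached was the goal."""
    if not episode:
        return []
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


# Toy episode: the agent wanted (1, 1, 1) but only reached (1, 1, 0).
goal = (1, 1, 1)
episode = [
    Transition((0, 0, 0), 0, (1, 0, 0), goal, sparse_reward((1, 0, 0), goal)),
    Transition((1, 0, 0), 1, (1, 1, 0), goal, sparse_reward((1, 1, 0), goal)),
]

# An off-policy learner would sample from both the original and relabeled data.
relabeled = relabel_with_final_goal(episode)
replay_buffer = episode + relabeled
```

In the original episode every reward is -1, so there is nothing to distinguish good actions from bad ones. In the relabeled copy the final step earns a reward of 0, which gives an off-policy algorithm something to learn from.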
We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.
We show how an object detector can be trained entirely in simulation and successfully transfer to the real world.
Josh Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, Pieter Abbeel
IROS 2017, Vancouver
We show how a robot can identify a task shown by a human in virtual reality, and replicate it in previously unseen conditions.
Yan Duan, Marcin Andrychowicz, Bradly C. Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba
NIPS 2017, Long Beach
Gym is the industry-standard toolkit for benchmarking the performance of reinforcement learning algorithms.
The next big AI breakthroughs will require both new learning algorithms and new infrastructure to scale machine learning to thousands of computers.
Machine Learning Systems at Scale (2017)
Thinking about the entirety of the stack is crucial for achieving scalable performance in distributed deep learning.
MLconf 2017, San Francisco
Building the Infrastructure that Powers the Future of AI (2017)
How OpenAI uses Kubernetes and TensorFlow to run machine learning experiments across thousands of machines.
Keynote, KubeCon 2017, Berlin