Google, Amazon and University Researchers Program a Robot to Teach Itself to Walk

Can a robot teach itself to walk without human intervention? It can if it has been programmed with a multi-task learning procedure, an automatic reset controller, and a safety-constrained, reinforcement learning framework developed by a team of researchers from Google, Berkeley's artificial intelligence (AI) research lab and the Georgia Institute of Technology.

The underlying technique is called deep reinforcement learning (DRL). It combines deep learning (DL) and reinforcement learning (RL) principles to create algorithms that can be applied to robotics, as well as video games, finance, and even healthcare. The ability to reinforce learned behavior in robots is not new, but it has generally relied on simulation, in which a clone of the machine goes through its trials and errors in a virtual environment and the lessons learned are then downloaded into the device. What Google and the other researchers did was to create a way for "legged" robots to learn and adapt to ambulating challenges in the real world.

They laid out the details of the system they developed in a recently published paper, "Learning to Walk in the Real World with Minimal Human Effort."

The problem with systems that rely on virtual environments is, as the researchers put it, "building fast and accurate simulations to model the robot and the rich environments that the robot may be operating in is extremely difficult." But real-world RL presents challenges, too. It may take hundreds or even thousands of tries before the robot gets it right, which is extremely time consuming, and the robot can easily damage itself in the process. It might also leave the training area, which would require humans to fetch it.

The researchers addressed these challenges creatively: "By simultaneously learning to walk in different directions, the robot stays within the workspace. By automatically adjusting the balance between reward and safety, the robot falls dramatically less. By building hardware infrastructure and designing stand-up controllers, the robot automatically resets its states and enables continuous data collection."
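
To make the first two ideas in that quote concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' implementation: the function names, the two-task setup, the origin-centered workspace, and the simple multiplier update rule are all hypothetical. The first function picks whichever walking direction carries the robot back toward the middle of its training area; the second nudges a safety weight up when the robot is behaving less safely than a chosen threshold and down when it has margin to spare.

```python
import math


def pick_task(x, y, heading):
    """Choose the walking task that moves the robot toward the workspace center.

    Assumptions (not from the article): the workspace center is the origin,
    `heading` is the robot's yaw in radians, and only two tasks exist.
    """
    forward = (math.cos(heading), math.sin(heading))  # direction of forward travel
    to_center = (-x, -y)                              # vector from robot to center
    progress = forward[0] * to_center[0] + forward[1] * to_center[1]
    # Walking forward helps if it points toward the center; otherwise back up.
    return "forward" if progress >= 0.0 else "backward"


def update_safety_weight(weight, safety_margin, threshold, step_size=0.1):
    """Adjust the weight that trades reward against safety.

    Assumption: `safety_margin` is a hypothetical score where larger means
    safer. If it falls below `threshold`, the weight on safety grows;
    if it exceeds the threshold, the weight shrinks, but never below zero.
    """
    return max(0.0, weight + step_size * (threshold - safety_margin))
```

For example, a robot sitting at (2, 0) and facing away from the center would be assigned the backward task, so each training episode tends to pull it back into the workspace rather than out of it.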

They developed their system using a small, four-legged robot on a variety of "terrains," including flat ground, a soft mattress, and a doormat with crevices. The results of this phase of what is sure to be ongoing research are impressive:

"Our system can learn to walk on these terrains in just a few hours, with minimal human effort, and acquire distinct and specialized gaits for each one. In contrast to the prior work, in which approximately a hundred manual resets are required in the simple case of walking on the flat ground, our system requires zero manual resets in this case. We also show that our system can train four policies simultaneously (walking forward, backward and turning left and right), which form a complete skill-set for navigation and can be composed into an interactive directional walking controller at test time."

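One way to picture how four separately trained policies could be "composed into an interactive directional walking controller" is the Python sketch below. It is only a sketch under assumptions: the class name, the command strings, and the stub policies (which return constant actions instead of running a trained network) are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Sequence

# A policy maps an observation to a list of motor actions.
Policy = Callable[[Sequence[float]], List[float]]


def make_stub_policy(value: float) -> Policy:
    """Stand-in for a trained policy: returns a constant action per input entry."""
    def policy(observation: Sequence[float]) -> List[float]:
        return [value] * len(observation)
    return policy


class DirectionalController:
    """Routes each control step to the policy matching the user's command."""

    def __init__(self, policies: Dict[str, Policy]) -> None:
        self._policies = dict(policies)

    def act(self, command: str, observation: Sequence[float]) -> List[float]:
        if command not in self._policies:
            raise ValueError(f"unknown command: {command!r}")
        return self._policies[command](observation)


# Hypothetical skill set mirroring the four directions named in the quote.
controller = DirectionalController({
    "forward": make_stub_policy(1.0),
    "backward": make_stub_policy(-1.0),
    "left": make_stub_policy(0.5),
    "right": make_stub_policy(-0.5),
})
```

Calling controller.act("left", observation) at each step would hand control to the left-turn policy, so an operator could steer the robot by switching commands at test time.
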
The growing demand for data-intensive, learning-based machine learning (ML) methods is leading to the adoption of DRL across a range of industries, from healthcare and education to manufacturing and marketing. That's the conclusion of another recently published study from Research and Markets ("Reinforcement Learning: An Introduction to the Technology"). In that study, market watchers found a growing demand for a general framework for DRL (also known as a semi-supervised learning model in the machine learning paradigm).

The team working on the self-learning robot project included Sehoon Ha, assistant professor in the School of Interactive Computing at the Georgia Institute of Technology; Stanford University professor Peng Xu, currently working as an applied scientist at Amazon Web Services; Zhenyu Tan and Jie Tan, software engineers working at Google Brain; and Sergey Levine, assistant professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley.

The researchers also published a video on YouTube demonstrating their new system that's worth a look.

About the Author

John K. Waters is the editor in chief of a number of sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at