Computer science: AI revisits its past to solve complex tasks
Nature
February 25, 2021
A family of reinforcement learning algorithms that score higher than human players and state-of-the-art artificial intelligence systems at classic Atari video games, such as Montezuma’s Revenge and Pitfall, is reported in this week’s Nature. Collectively known as Go-Explore, the algorithms offer a way to improve the exploration of complex environments, which may be an important step towards creating truly intelligent learning agents.
Reinforcement learning can be used to train artificial intelligence systems to make decisions by exploring and understanding complicated environments, and to learn how to optimally acquire rewards. Rewards may include a robot reaching a specific location or completing a level in a video game. However, existing reinforcement learning algorithms tend to struggle in complex environments that offer little feedback.
Adrien Ecoffet, Joost Huizinga and colleagues identify two main impediments to effective exploration and present a family of algorithms that addresses both. Go-Explore explores environments thoroughly while building up an archive of the places it has been, so that it does not forget the route to a promising intermediate stage or successful outcome (the reward). The authors demonstrate the potential of the family of algorithms by using them to solve all previously unsolved Atari 2600 games. Go-Explore quadruples previous scores on Montezuma’s Revenge and surpasses average human performance on Pitfall (where previous algorithms were unable to score any points). Go-Explore can also solve a simulated robotic task in which a robot arm must pick up an object and put it on one of four shelves, two of which are behind latched doors.
The simple principles of remembering and returning to promising areas for exploration are a powerful and general approach to exploration, the authors note. They suggest that the algorithms presented here could have applications in robotics, language understanding and drug design.
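The remember-and-return principle can be illustrated with a minimal sketch. This is not the authors' implementation: the `Corridor` environment, the `restore` method, the selection weighting and all parameter values are assumptions made for illustration, and the sketch relies on a deterministic environment whose state can be restored directly.

```python
import random

# Hypothetical toy environment (not one of the paper's tasks): a corridor of
# positions 0..goal where the only reward sits at the far end (sparse feedback).
class Corridor:
    def __init__(self, goal=20):
        self.goal = goal
        self.pos = 0

    def restore(self, pos):
        """Jump back to a saved state (assumed possible: the toy is deterministic)."""
        self.pos = pos

    def step(self, action):
        """action is -1 or +1; returns (new_state, reward)."""
        self.pos = max(0, min(self.goal, self.pos + action))
        return self.pos, (1.0 if self.pos == self.goal else 0.0)


def go_explore(env, iterations=5000, explore_steps=5, seed=0):
    rng = random.Random(seed)
    # Archive: state -> shortest known action sequence reaching it, plus a
    # count of how often the state was chosen as a starting point.
    archive = {0: {"path": [], "visits": 0}}
    for _ in range(iterations):
        # 1. Select: favour rarely chosen states (an assumed, simple weighting).
        states = list(archive)
        weights = [1.0 / (1 + archive[s]["visits"]) ** 0.5 for s in states]
        start = rng.choices(states, weights=weights)[0]
        archive[start]["visits"] += 1
        # 2. Return: go back to the remembered state before exploring.
        env.restore(start)
        path = list(archive[start]["path"])
        # 3. Explore: take a few random actions from there.
        for _ in range(explore_steps):
            action = rng.choice([-1, 1])
            state, reward = env.step(action)
            path.append(action)
            # 4. Remember: store new states, or shorter routes to known ones.
            if state not in archive or len(path) < len(archive[state]["path"]):
                visits = archive[state]["visits"] if state in archive else 0
                archive[state] = {"path": list(path), "visits": visits}
            if reward > 0:
                return archive[state]["path"]
    return None  # goal not reached within the iteration budget
```

Calling `go_explore(Corridor(goal=20))` returns a sequence of actions that, replayed from position 0, ends at the rewarded position. The archive is what keeps the search from forgetting how to get back to the frontier it has already reached.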
After the embargo ends, the full paper will be available at: https://www.nature.com/articles/s41586-020-03157-9
doi: 10.1038/s41586-020-03157-9