November 21, 2016
Google’s DeepMind division has improved the speed and performance of its machine learning system with a technique that resembles how animals are thought to dream. Dubbed “Unreal” (Unsupervised Reinforcement and Auxiliary Learning), the system learned to complete Labyrinth, a 3D maze, ten times faster than the best existing artificial intelligence software and can now play at up to 87 percent of expert human performance. The faster training means DeepMind researchers will be able to try out new ideas much more quickly.
Bloomberg reports that the software replicates how animals are thought to dream of “positively or negatively rewarding events more frequently.” In that way, Unreal learns by replaying “its own past attempts at the game, focusing especially on situations in which it had scored points before.”
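The replay idea described above can be illustrated with a minimal Python sketch. It assumes past experience is stored as a list of transitions and that rewarding and non-rewarding moments are drawn with equal probability. The function name, the buffer layout and the 50/50 split are illustrative assumptions, not DeepMind's code.

```python
import random

def sample_skewed(buffer, rng, p_rewarding=0.5):
    """Pick one stored transition, favoring rewarding ones (hypothetical helper)."""
    rewarding = [t for t in buffer if t["reward"] != 0]
    neutral = [t for t in buffer if t["reward"] == 0]
    # With only one kind of experience stored, fall back to uniform sampling.
    if not rewarding or not neutral:
        return rng.choice(buffer)
    pool = rewarding if rng.random() < p_rewarding else neutral
    return rng.choice(pool)

rng = random.Random(0)
# Toy history: one step in ten earned a point.
buffer = [{"step": i, "reward": 1.0 if i % 10 == 0 else 0.0} for i in range(100)]
samples = [sample_skewed(buffer, rng) for _ in range(10_000)]
rewarding_share = sum(s["reward"] != 0 for s in samples) / len(samples)
print(f"rewarding share in buffer: 0.10, in replayed samples: {rewarding_share:.2f}")
```

In this toy setup, rewarding moments make up a tenth of the stored history but roughly half of what gets replayed, which is the sense in which such events are revisited "more frequently."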
DeepMind developed Labyrinth “loosely based” on the video game series “Quake,” which “involves a machine having to navigate routes through a maze, scoring points by collecting apples.” The ability to score points reinforces positive behaviors.
“Our agent is far quicker to train, and requires a lot less experience from the world to train, making it much more data efficient,” said DeepMind researchers Max Jaderberg and Volodymyr Mnih, two of the seven scientists who published a paper on the topic. DeepMind, which has also taught its AI products to play the retro Atari title “Breakout,” “helped the system learn faster by asking it to maximize several different criteria at once, not simply its overall score in the game.”
One criterion was to have the system change its visual environment by performing certain actions. “The emphasis is on learning how your actions affect what you will see,” said Jaderberg and Mnih, who noted that this is “similar to the way newborn babies learn to control their environment to gain rewards,” including exposure to visual stimuli such as a shiny or colorful object. The researchers said it is premature to discuss “real-world applications” for Unreal “or similar systems.”
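As a rough illustration of optimizing several criteria at once, the Python sketch below scores a step by its game reward plus a weighted bonus for how much the picture changed. The article does not say how DeepMind combines these signals, so folding them into a single number is a simplifying assumption made here, and the function names and the 0.1 weight are hypothetical.

```python
def pixel_change(prev_frame, next_frame):
    """Mean absolute intensity change between two equally sized grayscale frames."""
    total = 0.0
    count = 0
    for prev_row, next_row in zip(prev_frame, next_frame):
        for a, b in zip(prev_row, next_row):
            total += abs(b - a)
            count += 1
    return total / count

def combined_objective(game_reward, aux_signals, aux_weights):
    """Game reward plus a weighted sum of auxiliary signals (illustrative only)."""
    return game_reward + sum(aux_weights[name] * value
                             for name, value in aux_signals.items())

before = [[0.0, 0.0], [0.0, 0.0]]
after = [[1.0, 0.0], [0.0, 1.0]]  # two of the four pixels changed
change = pixel_change(before, after)
score = combined_objective(1.0, {"pixel_change": change}, {"pixel_change": 0.1})
print(change, score)
```

An action that changes what the agent sees earns a small bonus even when it collects no apple, so the agent gets a learning signal long before game points arrive.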
So far, Unreal has learned 57 vintage Atari games faster, and with higher scores, than “the company’s existing software,” playing “on average” 880 percent better than the best human players (compared to 835 percent with DeepMind’s previous AI agent). The system showed its strength in the more complex Atari games such as “Montezuma’s Revenge,” on which Unreal reached 3,000 points, “greater than 50 percent of an expert human’s best effort.” The previous AI agent “scored zero points.”
DeepMind’s most recent breakthrough came earlier this year “when its AlphaGo software beat one of the world’s reigning champions in the ancient strategy game ‘Go’.” DeepMind research scientist Oriol Vinyals reports that Unreal will also be used to help create an interface for Blizzard Entertainment’s sci-fi video game “StarCraft II.”