November 6, 2017
In 2012, University of Toronto professor Geoffrey Hinton and two of his grad students showed off artificial neural networks, a technology that enabled machines to understand images far better than before. Google hired Hinton and the two students six months later; Hinton now splits his time between Google and the university. Although neural networks now underlie speech transcription and many other tasks, Hinton isn’t enthused about the technology he helped launch. Instead, he’s now bullish on an “old” idea that could help reshape artificial intelligence.
Wired reports that Hinton, who is working with two Google colleagues in Toronto, just released two research papers that describe his approach, known as capsule networks, which are “a twist on neural networks intended to make machines better able to understand the world through images or video.”
In the papers, “Hinton’s capsule networks matched the accuracy of the best previous techniques on a standard test of how well software can learn to recognize handwritten digits” and “almost halved the best previous error rate on a test that challenges software to recognize toys such as trucks and cars from different angles.”
Hinton first came up with the idea that “vision systems need such an inbuilt sense of geometry” in 1979, and “laid out a preliminary design for capsule networks in 2011.” In these new networks, capsules, which are “small groups of crude virtual neurons,” are designed to track different parts of an object, such as a cat’s nose and ears, along with their relative positions in space, and a network of them can “use that awareness to understand when a new scene is in fact a different view of something it has seen before.”
This is a vast improvement over machine-learning systems that need “thousands of photos covering a variety of perspectives” to learn to recognize an image. Instead, the technology becomes more like a toddler, who doesn’t “need such explicit and extensive training to learn to recognize a household pet.”
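The core intuition, that an object can be recognized from the relative geometry of its parts rather than from thousands of example views, can be illustrated with a toy sketch. This is not Hinton’s actual capsule-routing algorithm, and all part names and coordinates below are invented for illustration; it only shows why encoding relative positions makes recognition invariant to where the object appears:

```python
# Toy illustration of the capsule idea: each "capsule" reports a part
# identity plus its pose (here, just a 2-D position), and a higher-level
# capsule recognizes an object when the parts' *relative* geometry matches
# a template, regardless of where the whole object sits in the image.
# (Hypothetical template and numbers; not Hinton's actual algorithm.)

CAT_FACE = {"left_ear": (-2.0, 3.0), "right_ear": (2.0, 3.0), "nose": (0.0, 0.0)}

def matches_cat_face(detections, tol=0.5):
    """detections: dict mapping part name -> (x, y) pose from lower capsules."""
    if set(detections) != set(CAT_FACE):
        return False
    nx, ny = detections["nose"]
    for part, (ox, oy) in CAT_FACE.items():
        px, py = detections[part]
        # Compare each part's position relative to the nose with the template.
        if abs((px - nx) - ox) > tol or abs((py - ny) - oy) > tol:
            return False
    return True

# The same face shifted elsewhere in the image is still recognized,
# because only relative positions matter; scrambled parts are rejected.
shifted = {"left_ear": (8.0, 13.0), "right_ear": (12.0, 13.0), "nose": (10.0, 10.0)}
scrambled = {"left_ear": (12.0, 7.0), "right_ear": (8.0, 13.0), "nose": (10.0, 10.0)}
print(matches_cat_face(shifted))    # True
print(matches_cat_face(scrambled))  # False
```

A convolutional network would typically have to see the face at many positions and angles to generalize; a geometry-aware representation gets translation invariance for free from one view.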
“Everyone has been waiting for it and looking for the next great leap from Geoff,” said NYU professor Kyunghyun Cho, who works on image recognition. Hinton, however, says that “capsule networks still need to be proven on large image collections, and that the current implementation is slow compared to existing image-recognition software,” but he is “optimistic he can address those shortcomings.”
Twenty Billion Neurons co-founder and University of Montreal professor Roland Memisevic is also upbeat, saying that “Hinton’s basic design should be capable of extracting more understanding from a given amount of data than existing systems.”