August 20, 2021
Google revealed its work on a new AI-enabled Internet search tool dubbed MUM (Multitask Unified Model), which can “read” the nuances of human language across the globe. The company says that users will be able to find information more readily and ask more abstract questions. MUM is not yet publicly available, but Google independently used it for a COVID-19-related project. Vice president of search Pandu Nayak and a colleague designed an “experience” that gave in-depth information on vaccines when users searched for them.
Popular Science reports that Google planned to set up a site that would offer information on how vaccines work and where they are available. Because people around the world refer to COVID vaccines by different names, however, the team “spent hundreds of hours combing through resources to identify all the different names for COVID itself.”
This year, Nayak said, the new tool meant the team was “able to set up a very simple experiment with MUM that within seconds was able to generate over 800 names for 17 different vaccines in 50 different languages.”
He predicted that, over time, MUM will improve “a lot of language tasks that need to be solved, whether it’s classification, ranking, information extraction, and a whole host of others.” “Existing features and existing experiences will just work that much better,” he said.
Senior vice president Prabhakar Raghavan debuted MUM at the Google I/O developer conference. The company now says that “MUM is able to acquire deep knowledge of the world, understand language and generate it, and train across 75 languages at once.”
Internal pilots are testing if MUM “can be multimodal — that is, able to simultaneously understand different forms of information like text, images, and video.” The goal is to enable users to ask more complex questions in natural language. Nayak said engineers are “training MUM to recognize the relationship between words and images, and it’s going well.”
Like BERT (Bidirectional Encoder Representations from Transformers), Google’s machine learning technique for natural language processing pre-training, MUM can understand language. But Google says MUM is “about 1,000 times more powerful” than BERT and that it is “trained on a high-quality subset of the public web corpus across all the different languages that Google serves.”
Although the team removed “adult content, explicit content, hate speech” from the training data, Nayak noted the challenges of working with large language models, including “whether it reflects or reinforces biases that are there in the web.”
The team is also building “on an assembly of innovative features … to make search better.” “Today, when people come to search, it’s not like they come with fully formed queries in their heads,” Nayak said. “You have to take this fuzzy need that you have, convert it into one or more queries that you can issue to Google, learn about different aspects of the problem and put it together.”