August 22, 2016
To keep track of the massive amount of data shared on Facebook, the company’s Artificial Intelligence Research (FAIR) lab created fastText, a library whose techniques make classifying that text faster and more accurate. Today, Facebook is making fastText open source, available on GitHub, so developers can use its libraries anywhere. Among the techniques fastText uses are “bag of words” and “subword information.” Facebook will use fastText to cut down on “clickbait,” an ever-present irritation on the Internet.
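The two techniques named above can be illustrated with a short Python sketch. This is not fastText's own code: the function names, the `<` and `>` word-boundary markers, and the n-gram length range are illustrative assumptions. A bag of words counts which words occur while ignoring their order, and subword information breaks each word into overlapping character fragments so that related or unseen words share features.

```python
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, ignoring word order (hypothetical helper)."""
    return Counter(text.lower().split())


def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word.

    The boundary markers and the 3-to-6 length range are assumptions
    chosen for illustration, not settings taken from the article.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


if __name__ == "__main__":
    print(bag_of_words("the cat saw the dog"))
    print(char_ngrams("where", 3, 3))
```

Because the bag of words discards order, "dog bites man" and "man bites dog" produce identical counts, which is one reason a classifier may combine it with other features.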
TechCrunch quotes a post from Facebook authors Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov who explain that the fastText tools support “both text classification and learning word vector representations.”
“In order to be efficient on datasets with a very large number of categories, fastText uses a hierarchical classifier, in which the different categories are organized in a tree, instead of a flat structure (think binary tree instead of a list),” they say.
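To make the tree-versus-list comparison concrete, here is a minimal Python sketch. The quote does not say how the tree is built, so the sketch assumes a Huffman-style binary tree in which frequent categories sit closer to the root; the function name, category names, and counts are all hypothetical. It reports how many binary decisions are needed to reach each category, compared with scoring every category in a flat list.

```python
import heapq
import itertools


def tree_depths(category_counts):
    """Build a Huffman-style binary tree over categories (an assumed
    construction) and return the depth of each category's leaf."""
    tie_breaker = itertools.count()  # keeps heap entries comparable on ties
    heap = [(count, next(tie_breaker), [name])
            for name, count in category_counts.items()]
    heapq.heapify(heap)
    depths = {name: 0 for name in category_counts}
    while len(heap) > 1:
        # Merge the two least frequent subtrees under a new parent node.
        count_a, _, names_a = heapq.heappop(heap)
        count_b, _, names_b = heapq.heappop(heap)
        merged = names_a + names_b
        for name in merged:
            depths[name] += 1  # every leaf below the new parent moves one level down
        heapq.heappush(heap, (count_a + count_b, next(tie_breaker), merged))
    return depths


if __name__ == "__main__":
    # Hypothetical category frequencies, for illustration only.
    counts = {"sports": 50, "politics": 30, "tech": 15, "recipes": 5}
    for name, depth in tree_depths(counts).items():
        print(f"{name}: {depth} binary decisions (a flat list scores all {len(counts)})")
```

With equally frequent categories the tree is balanced, so the number of decisions grows with the logarithm of the category count rather than with the count itself. That is the intuition behind organizing hundreds of thousands of categories in a tree.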
Wired explains fastText as “Facebook’s AI-driven text classification system, bag-of-tricks approach that helps machines efficiently glean information from the order in which words appear.” Getting AI to understand grammar that adults grasp intuitively is quite a feat. FastText uses a technique that “essentially take[s] a qualitative analysis problem and force[s] it to be quantitative through the addition of statistics,” which allows it to be “faster than traditional deep learning methods.” Best of all, fastText can work with a wide variety of languages, including German, Spanish, French and Czech.
Facebook makes some impressive claims for fastText, notes TechCrunch, saying it can be “trained on more than 1 billion words in less than 10 minutes using a standard multicore CPU” and “classify a half-million sentences among more than 300,000 categories in less than five minutes.”
According to Wired, Facebook has suggested that developers might use fastText to block spam, as well as “power search engines and autocomplete fields.” Recommendation engines could also benefit from fastText. In addition to open-sourcing fastText, Facebook has made several other AI projects available to the public, including a tool for spotting code bugs and “designs for AI-optimized hardware.” Google, Microsoft, Baidu, Amazon and Yahoo have also open-sourced some of their artificial intelligence technologies.