Next-Gen Music Retrieval: Free Million-Song Dataset Released by Echo Nest

  • The Echo Nest, a music application company, has released the Million Song Dataset for free to facilitate research into music recommendation engines. The dataset consists of audio features and metadata (but not the actual music) for a million popular music tracks.
  • Ars Technica reports that the dataset is a “freely-available collection of audio features and metadata for a million contemporary popular music tracks” and that it is being analyzed by Columbia University’s Laboratory for the Recognition and Organization of Speech and Audio.
  • Currently, services like Pandora rely on musicologists to catalog the characteristics of songs. Researchers are looking at ways for computers to analyze songs directly and make recommendations based on a listener’s preferences. The dataset could be used to develop a new generation of Music Information Retrieval services.
  • The National Science Foundation is also conducting The Listening Machine Project, which focuses on analyzing “the individual sources present in a real-world sound recording.” That work could lead to improved perception for robots, new prosthetic devices for the hearing impaired and “a wide range of novel applications in content-based multimedia indexing,” explained LMP’s Dan Ellis, associate professor of electrical engineering at Columbia.