You music lovers out there probably think we’re living in a Golden Age. iTunes, Pandora, Rhapsody: music distribution and discovery couldn’t get any better, right? With the proliferation of music sites and apps, we must, after all, be at some sort of saturation point, the telos of digital music technology.

But spend a bit of time talking to Brian Whitman, cofounder of The Echo Nest, and you realize that we’re really in a digital music Stone Age. Sure, we’ve come a long way, but there’s still plenty we can’t do: our recommendation engines are limited, as is our ability to sift information automatically from songs (to tell the sex of a singer just from his or her voice, for instance). The Echo Nest, a five-year-old company devoted to aggregating, indexing, using, and sharing vast troves of music data, just announced a collaboration with Columbia University’s LabROSA (Laboratory for the Recognition and Organization of Speech and Audio) on something called the Million Song Dataset, which music researchers can use free of charge for non-commercial work.

Let’s begin with music recommendation. What’s wrong with Pandora? Repetition. Even cofounder and Chief Strategy Officer Tim Westergren admitted recently that the company is working on the repetition. Pandora’s site declares that there are 800,000 songs and counting in its database. Not a negligible number, by any means. But The Echo Nest has 30 million. “It’s great for a top 40 radio experience,” Whitman tells Fast Company, giving Pandora credit where due. “But if you want to dig deep down into a lot more music, you need some automated discovery platform.” Whereas Pandora proudly employs people to manually go through music and classify it, The Echo Nest, says Whitman, “understands the world of music automatically.” And not just how it sounds.

The Echo Nest crawls the web, Google-style, in search of music and writing about music; it also partners with major labels like Universal and aggregators like 7Digital. It then devours data about the music, on both the “acoustic side” (tempo, key, and the like, which The Echo Nest’s system crunches in about 10 seconds per song) and the “cultural side” (what reviewers are saying about the music, for instance). It is ravenous for new musical information. If you tweet about the band you saw last night, “we have that in our databases within the hour,” says Whitman.

What are the uses of data on 30 million songs? Broadly, there are two categories: commercial and academic.