Audio association might be easy for humans, but teaching a computer to do it is actually pretty challenging. Disney researchers trained an AI to recognize the sounds that go with images by feeding it a collection of videos, each showing an object making a specific sound, but background noise, narration or sounds made by other objects could easily confuse the system. If the system was fed samples with most of the uncorrelated sounds filtered out, however, it did a pretty good job of suggesting the right sound for each image. Still, the system isn't perfect: the team reports that it occasionally had trouble telling images of a car and a tram apart, causing it to sometimes suggest the wrong sound for a particular vehicle.

Audio image recognition probably isn't useful to most of the population, but the team hopes it can be used to create an automatic Foley processing system for video production -- making it easier for editors to add sound effects during the production process. The technology may also be able to help the visually impaired by creating an image personification system, enabling them to 'hear' objects on a computer screen. Still, Disney Research has a lot of work to do before it gets close to making either of those futures a reality.