This is yet another attempt at maintaining a list of datasets directly related to MIR. Other lists that I have found are this wiki, the ISMIR page, this web page, and this web page. If you are interested in speech processing, you can find a table of speech datasets on this page. If you are interested in multitrack audio, the Open Multitrack Testbed should be a good starting point. UPF also has an excellent page with datasets for world music, including Indian art music, Turkish Makam music, and Beijing Opera. A curated list of MIDI sources can be found here. Two additional general resources are piano-midi.de for MIDI files and freesound.org for audio files.

If you know of other datasets that should be included in this list, and eventually in the book, please send me a note or post a comment.