September 20, 2019 Dataset Open Access

MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection

Purohit, Harsh; Tanabe, Ryo; Ichige, Kenji; Endo, Takashi; Nikaido, Yuki; Suefusa, Kaori; Kawaguchi, Yohei

The MIMII dataset is a sound dataset for malfunctioning industrial machine investigation and inspection. It contains sounds generated by four types of industrial machines: valves, pumps, fans, and slide rails. Each machine type includes seven individual product models*1, and the data for each model contains normal sounds (5000 to 10000 seconds) and anomalous sounds (about 1000 seconds). To resemble real-life scenarios, various anomalous sounds were recorded (e.g., contamination, leakage, rotating unbalance, and rail damage). In addition, background noise recorded in multiple real factories was mixed with the machine sounds. The sounds were recorded with an eight-channel microphone array at a 16 kHz sampling rate and 16 bits per sample. The MIMII dataset serves as a benchmark for sound-based machine fault diagnosis. Users can evaluate performance on specific tasks such as unsupervised anomaly detection, transfer learning, and noise robustness. Details of the dataset are described in [1][2].
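As a quick sanity check after extracting an archive, the recording format stated above (eight channels, 16 kHz, 16 bits per sample) can be confirmed per file. The following is a minimal sketch using only the Python standard library. It assumes the recordings are stored as uncompressed PCM WAV files, which this page does not state explicitly; the function names `describe_wav` and `read_channel` are illustrative and are not part of the dataset or its baseline code.

```python
import sys
import wave
from array import array


def describe_wav(path):
    """Return (channels, sample_rate_hz, bits_per_sample, duration_s) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        bits = wav.getsampwidth() * 8
        duration = wav.getnframes() / sample_rate
    return channels, sample_rate, bits, duration


def read_channel(path, channel=0):
    """Return the samples of one channel of a 16-bit PCM WAV file as a list of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        n_channels = wav.getnchannels()
        if not 0 <= channel < n_channels:
            raise ValueError("channel index out of range")
        frames = wav.readframes(wav.getnframes())
    samples = array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    # Samples are interleaved frame by frame, so take every n-th value.
    return samples[channel::n_channels].tolist()
```

For a file that matches the description, `describe_wav` should report 8 channels, a 16000 Hz sampling rate, and 16 bits per sample. The path passed in is whatever location the archive was extracted to; the internal directory layout of the archives is not described on this page.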

This dataset is made available by Hitachi, Ltd. under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Baseline sample code for anomaly detection is available on GitHub: https://github.com/MIMII-hitachi/mimii_baseline/

*1: This version, "public 1.0", contains four models (model IDs 00, 02, 04, and 06). The remaining three models will be released in a future edition.

[1] Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” arXiv preprint arXiv:1909.09347, 2019.

[2] Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019.

Files (100.2 GB)

Name              Size      Checksum
-6_dB_fan.zip     10.9 GB   md5:f02ae808a58d84b6815b7ec38ff30879
-6_dB_pump.zip    8.2 GB    md5:d20b783a0ff9c93d58f452f98c37b112
-6_dB_slider.zip  8.0 GB    md5:49913eda7d37f182cbf8ed5c984140e0
-6_dB_valve.zip   8.0 GB    md5:fdfaf185fea61b21e11952a070a4ada7
0_dB_fan.zip      10.4 GB   md5:6354d1cc2165c52168f9ef1bcd9c7c52
0_dB_pump.zip     7.9 GB    md5:488748295c3f60b25de07b58fe75b049
0_dB_slider.zip   7.5 GB    md5:4d674c21474f0646ecd75546db6c0c4e
0_dB_valve.zip    7.5 GB    md5:178478eb0d11c79080a35562bfdeee71
6_dB_fan.zip      10.2 GB   md5:0890f7d3c2fd8448634e69ff1d66dd47
6_dB_pump.zip     7.7 GB    md5:a09ba6060c10fc09cd4c8770213b0b9f
6_dB_slider.zip   7.1 GB    md5:838c2b3441858359c4704ef13a1b27ff
6_dB_valve.zip    6.9 GB    md5:fe5fb7c337cd701b1d31dc641e621892
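
Because each archive is several gigabytes, it may be worth checking a download against the md5 checksum listed above before extracting it. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_md5` are illustrative, and the file is read in chunks so that the whole archive never has to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 matches `expected`.

    `expected` may be given with or without the "md5:" prefix used in the file list.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, assuming `6_dB_valve.zip` was saved to the current directory, `verify_md5("6_dB_valve.zip", "md5:fe5fb7c337cd701b1d31dc641e621892")` should return `True` for an intact download.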