This code produces an infinite supply of digit images derived from the well-known MNIST dataset using pseudo-random deformations and translations. It is a streamlined version of the code used for the experiments reported in (Loosli, Canu, Bottou, 2007). A subset of the examples generated by this code is known as MNIST8M. Unfortunately, the original MNIST8M files have been deleted from the NEC servers. However, you can use InfiMNIST to regenerate these files, or to generate much larger files if you prefer. You can even use this code to generate deformed MNIST examples on the fly.

Each InfiMNIST example is identified by a long integer index that determines both the source of the example and the transformations applied to the pattern. The examples numbered 0 to 9999 are the standard MNIST testing examples. The examples numbered 10000 to 69999 are the standard MNIST training examples. Each example with index i>=70000 is generated by applying a pseudo-random transformation to the MNIST training example numbered 10000+((i-10000)%60000). Because the pseudo-random transformations are deterministically derived from the example number, the same index always yields the same image, which is similar to having a file containing about one trillion distinct MNIST examples.
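
The index arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the numbering scheme, not part of the InfiMNIST code itself: the function name `infimnist_source` and its return format are hypothetical, and the sketch does not compute the deformation, only which MNIST example an index refers to.

```python
def infimnist_source(index):
    """Map an InfiMNIST index to (split, position_in_split, is_deformed).

    Hypothetical helper illustrating the numbering scheme only;
    it does not generate or deform any image.
    """
    if index < 0:
        raise ValueError("InfiMNIST indices are non-negative")
    if index < 10000:
        # 0..9999: standard MNIST testing examples, unmodified
        return ("test", index, False)
    if index < 70000:
        # 10000..69999: standard MNIST training examples, unmodified
        return ("train", index - 10000, False)
    # i >= 70000: deformed copy of training example 10000 + ((i - 10000) % 60000)
    base = 10000 + ((index - 10000) % 60000)
    return ("train", base - 10000, True)


if __name__ == "__main__":
    for i in (0, 9999, 10000, 69999, 70000, 130001):
        print(i, infimnist_source(i))
```

For instance, index 70000 maps back to the first training example (InfiMNIST index 10000) and index 130001 to the second, in both cases with a deformation applied.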