Just came across a dataset that Criteo has released for academic purposes. Earlier, Criteo had made only a part of this dataset available. Here are a few salient features of the dataset:

More than 4 billion rows in total

More than 1 TB in size

Here is a link to benchmark performance

The dataset is hosted on the Microsoft Azure platform and can be downloaded from Criteo Labs

A big thanks to Criteo for dwarfing my machine! This should now become a good source of information for people working in online display and learning.