Kavanaugh Twitter Dataset

Dataset Details and Method of Ingestion

This dataset was collected using the Twitter search API over a roughly three-week period, from September 22 through October 9, 2018. The following keywords and hashtags were tracked during this time: Kavanaugh, #Kavanaugh, “Supreme Court”, #KavanaughHearings, #KavanaughHearing, and #kavanaughNomination.

A total of 56 million tweets from 3.2 million unique accounts were collected. The collection should cover the bulk of publicly available tweets about Kavanaugh that match the search terms listed above. The dataset has a compressed size of 11 GB and an uncompressed size of 315 GB. Each tweet object has an added key, “retrieved_on”, which holds the epoch time at which the tweet was ingested. The file is in ndjson format (newline-delimited JSON, one object per line).
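
Because the uncompressed data is far larger than typical memory, one reasonable way to work with the file is to stream it line by line without decompressing it to disk first. The following is a minimal Python sketch, not an official loader. The function names iter_tweets and tweets_per_day are hypothetical, and the code assumes that “retrieved_on” is an epoch time in seconds and that the file is xz-compressed, as its extension suggests.

```python
import json
import lzma
from collections import Counter
from datetime import datetime, timezone


def iter_tweets(path):
    """Yield one tweet (a dict) per non-empty line of an xz-compressed ndjson file."""
    # Text-mode lzma.open decompresses lazily, so only one line is held at a time.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def tweets_per_day(tweets):
    """Count tweets by UTC ingestion date, using the added retrieved_on key."""
    counts = Counter()
    for tweet in tweets:
        # Assumption: retrieved_on is epoch time in seconds (not milliseconds).
        ingested = datetime.fromtimestamp(int(tweet["retrieved_on"]), tz=timezone.utc)
        counts[ingested.date().isoformat()] += 1
    return counts
```

Calling tweets_per_day(iter_tweets("Brett_Kavanaugh_Nomination_Tweets.ndjson.xz")) on a local copy of the download would then give a per-day tally of ingested tweets. Note that this groups by ingestion time, which may differ from the time each tweet was posted.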

Timeline of Dataset with Number of Tweets

Download this dataset:

https://files.pushshift.io/misc/Brett_Kavanaugh_Nomination_Tweets.ndjson.xz