I recently ran across an article titled “The 100 Most Influential People in Crypto” and decided to test my chops by analysing their Twitter accounts to see how they felt about Bitcoin Cash. I’ll be using Python 3 for this project, and all the code will be available here.

Data Preprocessing

Hint: If you find preprocessing boring (can’t blame you), skip to the Data Analysis section below.

Scraping Tweets

First, I need to get a list of all the Twitter handles listed on the website. Using the urllib and BeautifulSoup libraries I can write a function to download/parse the page’s HTML, then extract the Twitter handles:
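As a rough sketch of this step, the snippet below downloads a page and pulls Twitter handles out of its links. It assumes the handles appear as ordinary links of the form twitter.com/<handle>, which may not match the article's real markup. To stay dependency-free it uses the standard library's html.parser in place of BeautifulSoup, and the names TwitterHandleParser, extract_handles and fetch_html are illustrative, not taken from the original code.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.request import Request, urlopen


class TwitterHandleParser(HTMLParser):
    """Collect Twitter handles from the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.handles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parsed = urlparse(href)
        if parsed.netloc.lower() not in ("twitter.com", "www.twitter.com"):
            return
        # Assumption: a profile link has exactly one path segment,
        # e.g. https://twitter.com/peterktodd
        match = re.match(r"^/@?(\w{1,15})/?$", parsed.path)
        if match and match.group(1) not in self.handles:
            self.handles.append(match.group(1))


def extract_handles(html):
    """Return the unique Twitter handles linked from an HTML string."""
    parser = TwitterHandleParser()
    parser.feed(html)
    return parser.handles


def fetch_html(url):
    """Download a page and return its body as text."""
    # Some sites reject urllib's default User-Agent, so send a browser-like one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

With the article's address in a variable such as article_url, the handles would come from extract_handles(fetch_html(article_url)). Single-segment links that are not profiles (for example a "share" link) would slip through this pattern and need filtering by hand.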

This may look like a ton of code just to get 100 Twitter handles, but most of it is boilerplate. In fact, you can use my web scraping boilerplate to kickstart any project that requires web scraping (if you have no experience with web scraping, see this video by Data Science Dojo).

Now that we have all 100 Twitter handles, it’s time to get the Tweets associated with each handle. I’ve found that the Twitter API is the most reliable way of accessing a user’s Tweets, even though it only returns the most recent ~3,200 Tweets from each user. In the past I’ve tried web scraping Twitter to get more than that from an account, but my results were spotty at best.

After some Googling, I found this GitHub Gist: a Python script that downloads and saves a user’s Tweets in a CSV file. After modifying it to work with Python 3 and scrape 100 accounts in one go, I’m left with this:
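A minimal sketch of a script of this kind is shown below, assuming the tweepy library and the v1.1 user_timeline endpoint. It is not the Gist itself: the function names are illustrative, the credentials are placeholders you would supply, and the paging logic is kept separate from tweepy so it can be exercised with any callable that returns a page of tweets.

```python
import csv


def fetch_all_tweets(fetch_page, screen_name, page_size=200):
    """Page backwards through a timeline until an empty page comes back.

    fetch_page(screen_name, count, max_id) is a hypothetical callable that
    returns a list of objects with .id, .created_at and .text attributes.
    """
    tweets = []
    max_id = None
    while True:
        page = fetch_page(screen_name, page_size, max_id)
        if not page:
            break
        tweets.extend(page)
        # Next request: only tweets strictly older than the oldest seen so far.
        max_id = min(t.id for t in page) - 1
    return tweets


def save_tweets_csv(tweets, path):
    """Write tweets to a CSV file with the columns id, created_at, text."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "created_at", "text"])
        for t in tweets:
            writer.writerow([t.id, t.created_at, t.text])


def make_tweepy_fetcher(consumer_key, consumer_secret, access_key, access_secret):
    """Build a fetch_page callable backed by tweepy (assumed to be installed)."""
    import tweepy  # imported lazily so the helpers above work without it

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    def fetch_page(screen_name, count, max_id):
        kwargs = {"screen_name": screen_name, "count": count}
        if max_id is not None:
            kwargs["max_id"] = max_id
        return api.user_timeline(**kwargs)

    return fetch_page
```

Looping over the 100 handles then amounts to calling fetch_all_tweets and save_tweets_csv once per handle. The page size of 200 is the most the timeline endpoint returns per request.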

Running the script above results in an error. It turns out the article used Peter Todd’s old Twitter handle (@petertoddbtc), which is set to private, so the Twitter API throws an error when asked for its Tweets. So let’s change the outdated handle to Peter’s current handle (@peterktodd), and now we’re off to the races.
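One way to keep a single bad handle from stopping a long run is to catch the failure per account and report it at the end. The sketch below is an assumption about how that could be done, not part of the original script. download_one stands for whatever function downloads one account, and the except clause is deliberately broad because tweepy's exception class is named differently across versions (TweepError in 3.x, TweepyException in 4.x).

```python
def download_many(handles, download_one):
    """Call download_one(handle) for every handle and collect failures.

    Returns a dict mapping each failed handle to its error message.
    """
    failed = {}
    for handle in handles:
        try:
            download_one(handle)
        except Exception as exc:  # broad on purpose, see the note above
            failed[handle] = str(exc)
    return failed
```

Any handle that comes back in the returned dict, such as a private or renamed account, can then be corrected and retried on its own.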

After running for an hour, the script produces 100 CSV files (40 MB in total) containing ~300,000 Tweets:

Tweets saved in CSV format

Each of the 100 CSV files has the headings “id”, “created_at”, and “text”:
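As a quick sanity check on that layout, the files can be read back with the standard csv module. This is a sketch under two assumptions: the files sit together in one directory, and each one is named after its handle (for example peterktodd.csv). The function names are illustrative.

```python
import csv
from pathlib import Path

EXPECTED_COLUMNS = ["id", "created_at", "text"]


def load_tweets(csv_path):
    """Return the rows of one tweet CSV as a list of dicts."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != EXPECTED_COLUMNS:
            raise ValueError(f"unexpected columns in {csv_path}: {reader.fieldnames}")
        return list(reader)


def load_all_tweets(directory):
    """Map each file stem (assumed to be the handle) to its list of tweets."""
    return {p.stem: load_tweets(p) for p in sorted(Path(directory).glob("*.csv"))}
```

Every value read this way is a string, so the id and created_at columns would still need converting before any date-based analysis.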