Topic Modeling of Twitter Followers

This notebook accompanies this article on my blog.

We use LDAvis to visualize several LDA models of the followers of the @alexip account.

The LDA models were trained with the following parameters:

10 topics, 10 passes, alpha = 0.001

50 topics, 50 passes, alpha = 0.01

40 topics, 100 passes, alpha = 0.001
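As a rough sketch, the three configurations above could be passed to gensim's `LdaModel` as shown here. This assumes the models were trained with gensim; `CONFIGS` and `train_models` are hypothetical names introduced for illustration, and `corpus` and `dictionary` stand for the bag-of-words corpus and dictionary built beforehand.

```python
# The three parameter sets listed above, as keyword arguments for LdaModel.
CONFIGS = [
    {"num_topics": 10, "passes": 10, "alpha": 0.001},
    {"num_topics": 50, "passes": 50, "alpha": 0.01},
    {"num_topics": 40, "passes": 100, "alpha": 0.001},
]


def train_models(corpus, dictionary, configs=CONFIGS):
    """Train one LDA model per configuration (hypothetical helper).

    corpus     -- iterable of bag-of-words documents (lists of (id, count))
    dictionary -- gensim Dictionary mapping ids to words
    """
    # Imported here so the configurations can be inspected without gensim.
    from gensim.models import LdaModel

    return [
        LdaModel(corpus=corpus, id2word=dictionary, **cfg)
        for cfg in configs
    ]
```

A small scalar alpha is a symmetric prior that pushes each document toward only a few topics.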

The data was extracted from Twitter with this Python 2 script, and the dictionary and corpus were created with this one.

For the best results, set lambda between 0.5 and 0.6. Lowering lambda gives more weight to words that are discriminative for the selected topic, that is, the words that best define it.
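To make the effect of lambda concrete, here is a minimal sketch of the relevance score that LDAvis uses to rank words within a topic: lambda * log p(w|t) + (1 - lambda) * log(p(w|t) / p(w)). The probabilities below are made-up toy values, not taken from the models in this notebook, and the names `relevance`, `common` and `specific` are illustrative only.

```python
import math


def relevance(p_w_given_t, p_w, lam):
    """Relevance of a word for a topic, for a lambda in [0, 1].

    p_w_given_t -- probability of the word within the topic
    p_w         -- overall probability of the word in the corpus
    """
    lift = p_w_given_t / p_w
    return lam * math.log(p_w_given_t) + (1 - lam) * math.log(lift)


# Toy values (assumptions for illustration only).
common = {"p_w_given_t": 0.05, "p_w": 0.04}     # frequent in every topic
specific = {"p_w_given_t": 0.02, "p_w": 0.002}  # concentrated in this topic

for lam in (1.0, 0.6, 0.5, 0.0):
    ranked = sorted(
        [("common", relevance(lam=lam, **common)),
         ("specific", relevance(lam=lam, **specific))],
        key=lambda pair: pair[1],
        reverse=True,
    )
    print(lam, [name for name, _ in ranked])
```

With lambda at 1 the ranking follows raw in-topic frequency, so the common word comes first; at lower values the topic-specific word moves to the top.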

You can skip the first two models and jump to the last one (40 topics), which is the best.

A working version of this notebook is available on nbviewer.