
My NaNoWriMo Stats

Post created 2013-12-21 15:12 by Gabe Koss.



On a total whim I decided to participate in National Novel Writing Month. This is a month-long writing marathon in which participants attempt to write a 50,000-word novel in the month of November. I cheated a little bit and started on October 26.

Total Words: 54173
October Words: 2917
November Words: 51256
Avg Words/Day (Nov): 1709
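As a quick sanity check on those figures, here is a minimal sketch of the arithmetic. It assumes the daily average divides the November count by 30 calendar days and rounds to the nearest word; the post does not say how the average was computed, and the variable names are only illustrative.

```shell
# Figures copied from the stats above.
october_words=2917
november_words=51256
days_in_november=30   # assumption: average is taken over all 30 days

total_words=$((october_words + november_words))
# awk does the division so the result is rounded rather than truncated.
avg_per_day=$(awk -v w="$november_words" -v d="$days_in_november" \
  'BEGIN { printf "%.0f", w / d }')

echo "total: $total_words"
echo "average per November day: $avg_per_day"
```

Under that assumption the script reproduces both the 54173 total and the 1709 average.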

Progress Over Time

The vertical axis represents the word count of the story as it grew. Each bar shows the total word count reached on that day. Hovering your mouse over a bar shows the exact word count for that date. The light line plots the word count recorded each time I made a substantial save.
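Because the story was tracked with Git, a series like the one behind the light line can be rebuilt from the repository history. The following is only a sketch, assuming every substantial save is a commit that touched the story file; `word_history` is a hypothetical helper name, not the script actually used for this chart.

```shell
# Print "YYYY-MM-DD word_count" for every commit that touched a file,
# oldest first. Run it from the directory that contains the file.
word_history() {
  file=$1
  git log --reverse --date=short --format='%H %ad' -- "$file" |
    while read -r commit day; do
      # "commit:./path" resolves the path relative to the current directory.
      words=$(git show "$commit:./$file" | wc -w | tr -d ' ')
      printf '%s %s\n' "$day" "$words"
    done
}
```

Running `word_history story.md` prints one line per commit, which can be fed to a charting tool. Keeping only the last line for each date would give a per-day series like the bars.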

Common Words

After excluding common English stop words such as "that" or "is", the 10 most common words in my story were as follows:

sage: 631 instances
out: 315 instances
rama: 249 instances
back: 184 instances
one: 165 instances
down: 144 instances
looked: 139 instances
here: 138 instances
more: 125 instances
know: 124 instances

Common Bigrams

Bigrams are two-word units such as "depraved heathen" or "kind soul". The most common two-word groupings were as follows:

of the: 390 instances
in the: 222 instances
to the: 188 instances
on the: 163 instances
into the: 151 instances
she had: 107 instances
was a: 104 instances
from the: 92 instances
out of: 91 instances
she was: 90 instances

Code Snippets

I wrote the story with Vim and tracked my progress with Git. I analyzed the data using a combination of Ruby, D3.js, and the Linux command line. Much of the analysis was inspired by the classic Unix for Poets.

Here are some of the tools I used to do this analysis.

Extract top 10 words:

# One word per line, lowercased; drop short words and stop words, then count.
tr -sc '[A-Z][a-z]' '[\012*]' < story.md |
  tr '[A-Z]' '[a-z]' |
  sort |
  grep -E -v '^.{0,2}$' |
  grep -E -v -f ../stop_words.grep |
  uniq -c | sort -n | tail -n 10
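The pipeline above reads its stop words from `../stop_words.grep`, which is not shown in the post. A plausible shape for that file is one anchored pattern per line, so that "is" removes only the whole word and not "this". The sketch below builds such a file from a short, made-up word list and wraps the same filtering steps in a function; `make_stop_words` and `top_words` are hypothetical names, and the function prints the most frequent word first rather than last.

```shell
# Hypothetical stop-word patterns: one anchored regex per line, so each
# pattern matches a whole word only.
make_stop_words() {
  printf '^%s$\n' the and that was with had her for not but
}

# Read text on stdin and print the N most frequent remaining words.
# $1 = how many words to print, $2 = path to the stop-word pattern file.
top_words() {
  limit=$1
  patterns=$2
  tr -sc 'A-Za-z' '\n' |
    tr 'A-Z' 'a-z' |
    grep -E -v '^.{0,2}$' |
    grep -E -v -f "$patterns" |
    sort | uniq -c | sort -rn | head -n "$limit"
}
```

For example, `make_stop_words > stop_words.grep` followed by `top_words 10 stop_words.grep < story.md` would print a top-ten list for a story file. A real stop-word list would be much longer than the ten words used here.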

Extract top 10 bigrams:

# One word per line, then pair each word with the word that follows it.
tr -sc '[A-Z][a-z]' '[\012*]' < story.md > nano.words
tail -n +2 nano.words > nano.next
paste nano.words nano.next | sort | uniq -c | sort -n > nano.bigrams
tail -n 10 nano.bigrams
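The bigram commands above never lowercase their input, so "She had" and "she had" are counted as different pairs. The variant below is a sketch of one way to fold case and avoid the temporary files. It swaps the `paste` step for a single `awk` pass that remembers the previous word, so it is a different technique from the one used for the numbers in this post. `top_bigrams` is a hypothetical name.

```shell
# Read text on stdin and print the N most frequent lowercase bigrams,
# most frequent first. $1 = how many bigrams to print.
top_bigrams() {
  limit=$1
  tr -sc 'A-Za-z' '\n' |
    tr 'A-Z' 'a-z' |
    # Skip empty lines; pair each word with the word seen just before it.
    awk 'NF { if (prev != "") print prev, $0; prev = $0 }' |
    sort | uniq -c | sort -rn | head -n "$limit"
}
```

Calling `top_bigrams 10 < story.md` would print the ten most frequent pairs. Because case is folded, its counts may be somewhat higher than those from the commands above.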
