Most valuable language

First, we need to know how many posts identified the languages used. I ran this against the last two years of data and found that there were 7845 posts flagged [OC], and the algorithm above identified the language(s) used in 4189 of them.

Next, we can do a simple comparison of language usage versus posts that specified languages, and that's where you get the post at the beginning (% = 100*(posts specifying this language/posts specifying any language)):

Note that the numbers above add up to >100% because some posts specified multiple languages (511 of the 4189 posts with identifiable languages).

And that's the original goal. It's clear that Excel wins by a landslide. I guess it makes sense because almost everyone can use Excel and it's really quick to get plots out. Python dominating MATLAB surprised me at first but makes sense in retrospect since MATLAB is not free and has fewer users (it's just really great for working with data).

To make it interesting, I wanted to see if any languages predicted more success on Reddit. I tried doing that a few different ways. A simple one is to get the average score per post per language:

That looks odd. We can't assume post scores have a normal distribution though, so another test is using medians:

That's a huge disparity between median and average. How weird is the distribution? A histogram with logarithmic bins yields:

That is much clearer to me. One interesting thing is that it spikes up in the 3 to 10 thousand score range, so I'm guessing that's when a post makes it to the front page. An idea then is to look at the score distributions by language:

It's pretty clear from this that Excel is more bottom heavy than some of the others. A huge number of posts with a score of 0 used it, and it has very few posts with extremely high scores, especially considering that it is the most popular language/tool for this.
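
For anyone who wants to reproduce the share, average, and median comparisons described above, here is a minimal sketch of how I'd expect them to be computed. It is not the actual script: the `(score, languages)` tuple layout, the `posts` list, and the `language_summary` helper are all assumptions, and the numbers are made-up toy values rather than the real data.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical toy data: (score, set of languages identified in the post).
# These values are invented for illustration, not taken from the real dataset.
posts = [
    (0, {"Excel"}),
    (12, {"Excel"}),
    (250, {"Python"}),
    (4000, {"Python", "R"}),
    (35, {"R"}),
    (7, set()),  # no language identified, so it is left out of the denominator
]


def language_summary(posts):
    """Return {language: (share_pct, mean_score, median_score)}."""
    # Only posts with at least one identified language count toward the share.
    tagged = [(score, langs) for score, langs in posts if langs]

    scores_by_lang = defaultdict(list)
    for score, langs in tagged:
        for lang in langs:  # a multi-language post counts once per language
            scores_by_lang[lang].append(score)

    return {
        lang: (100 * len(scores) / len(tagged), mean(scores), median(scores))
        for lang, scores in scores_by_lang.items()
    }


summary = language_summary(posts)
for lang, (share, avg, med) in sorted(summary.items()):
    print(f"{lang:8s} share={share:5.1f}%  mean={avg:8.1f}  median={med:8.1f}")
```

Because the post that uses both Python and R is counted under each language, the shares in this toy example add up to more than 100%, which is the same effect noted for the real numbers.
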
It looks like MATLAB and Adobe tools have the highest percentage of high-scoring posts, but they have so few samples it's hard to know. Among the popular languages/tools, Python and R appear to do best.

A final way to answer what languages/tools are most likely to yield a high score is to see what percentage of posts using the language/tool yield a score above 100:
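
A rough sketch of that last calculation, under the same assumptions as before: the `(score, languages)` layout, the `pct_above` helper, and the toy numbers are hypothetical and only meant to show the idea. It returns the sample size next to each percentage, since a language with only a handful of posts can easily show an extreme value.

```python
from collections import defaultdict

THRESHOLD = 100  # "a score above 100", as in the text

# Hypothetical toy data: (score, set of languages identified in the post).
# Invented for illustration only.
posts = [
    (0, {"Excel"}),
    (12, {"Excel"}),
    (150, {"Excel"}),
    (40, {"Python"}),
    (250, {"Python"}),
    (4000, {"Python"}),
    (5000, {"MATLAB"}),
]


def pct_above(posts, threshold=THRESHOLD):
    """Return {language: (pct_of_posts_scoring_above_threshold, n_posts)}."""
    total = defaultdict(int)
    high = defaultdict(int)
    for score, langs in posts:
        for lang in langs:
            total[lang] += 1
            if score > threshold:  # strictly above the threshold
                high[lang] += 1
    return {lang: (100 * high[lang] / n, n) for lang, n in total.items()}


result = pct_above(posts)
for lang, (pct, n) in sorted(result.items()):
    print(f"{lang:8s} {pct:5.1f}% above {THRESHOLD} (n={n})")
```

In this toy data MATLAB comes out at 100% from a single post, which is exactly the small-sample problem mentioned above.
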