As marijuana has become legal in several states, it's been a frequent topic of interest in the news. And as with any interesting topic, I like to find useful ways to visually analyze the data. In this case, let's have a look at the price of marijuana, and how it varies from state to state...

First, I needed some data. After a bit of Google searching, I found a website called priceofweed.com that has crowd-sourced marijuana price information. It lets you search for various areas (such as states), and here is an example of the type of data they have for North Carolina:

Their website also has a map of the US, with color-coded '$' characters representing the price of marijuana in each state. It's a cute map, but not very analytical in my opinion.

So of course I decided to try creating my own version. The most difficult part was getting all the data. The data for each state was on a separate web page, embedded in the HTML code. Therefore, I wrote a SAS data step to loop through each US state and call a macro that I set up to 'scrape' the data out of the HTML code. I combined the data for all 50 states into a single dataset, and then saved it in a permanent library (so I wouldn't have to re-read all 50 web pages each time I experimented with my graphs). Here's a link to the code, in case you'd like to see the exact syntax.
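For readers who don't use SAS, here is a rough Python sketch of the same loop-and-scrape idea. It is not my actual SAS code. The HTML fragment, the regular expression, and the `fetch` function are all assumptions made up for illustration, the prices in the sample are placeholders rather than real data, and the real page layout on priceofweed.com may well be different.

```python
import re

# Hypothetical HTML fragment with made-up placeholder prices.
# The real page layout may differ, so the pattern below is an assumption.
SAMPLE_HTML = """
<table>
<tr><td>High Quality</td><td>$350.12</td><td>1234</td></tr>
<tr><td>Medium Quality</td><td>$290.50</td><td>987</td></tr>
</table>
"""

# Matches a quality label cell followed by a dollar-amount cell.
ROW_PATTERN = re.compile(
    r"<td>\s*(\w+) Quality\s*</td>\s*<td>\s*\$([\d,]+\.?\d*)\s*</td>"
)


def scrape_prices(html):
    """Pull {quality: price} pairs out of one state's page."""
    return {
        quality.lower(): float(price.replace(",", ""))
        for quality, price in ROW_PATTERN.findall(html)
    }


def scrape_all(states, fetch):
    """Loop over the states and combine the results into one list of rows.

    `fetch` is a hypothetical callable that returns the HTML for a state;
    in practice it would download the page for that state.
    """
    rows = []
    for state in states:
        prices = scrape_prices(fetch(state))
        rows.append({"state": state, **prices})
    return rows


if __name__ == "__main__":
    # Stand-in fetch function so the sketch runs without touching the web.
    print(scrape_all(["North Carolina"], lambda state: SAMPLE_HTML))
```

Passing `fetch` in as a parameter keeps the parsing separate from the downloading, which makes it easy to save the combined rows once and experiment with graphs without re-reading every page.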

Rather than plotting colored markers on my map, I used a choropleth map. And rather than having three colors (green, orange, red) for three price ranges (0-300, 300-400, 400+), I used quintile binning so that each of my 5 gradient shades of color represents 1/5 of the states. I also added a title and timestamp, and each state has HTML mouse-over text (click the screen-capture below to see the interactive map).
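If you're curious how quintile binning works, here is a minimal Python sketch (again, not the SAS code I used for the map). It finds the four cut points that split the values into five equal-sized groups, then assigns each value a bin from 1 (lowest prices) to 5 (highest). The function name and the choice of the "inclusive" quantile method are my own assumptions for illustration; other quantile definitions can put values near a cut point into a neighboring bin.

```python
from bisect import bisect_left
from statistics import quantiles


def quintile_bins(values):
    """Return a bin number (1 = lowest fifth, 5 = highest fifth) per value."""
    # Four cut points divide the data into five groups.
    cuts = quantiles(values, n=5, method="inclusive")
    # Count how many cut points fall below each value to get its bin.
    return [bisect_left(cuts, v) + 1 for v in values]


if __name__ == "__main__":
    # Made-up numbers, only to show the mechanics.
    example = [10, 1, 9, 2, 8, 3, 7, 4, 6, 5]
    print(quintile_bins(example))
```

Each bin number can then be mapped to one of the 5 gradient shades, so that roughly the same number of states ends up in each color.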

I was pretty happy with my improved map ... but I still felt something was lacking. It was interesting to see how the prices varied state-by-state, but I thought it would be even more useful to have an easy way to compare the prices of the states. Therefore, I set up a sorted bar chart (and color-coded it the same way as the map, for easy cross-referencing). I think the combination of the map and bar chart provides a lot of insight into the price data.

What other ways would you like to graphically analyze this data? I wonder if there might also be a correlation between price and the legalization status in the states (see my previous blog post showing which states have legalized marijuana use). And how might you re-use these techniques with data other than marijuana prices? I'll give this some more thought ... but I think I'll grab a snack first!

Note that buying marijuana is not legal in many states (see my prior post for when and where it’s been legalized). And although there is price data for all states, of course I'm not encouraging anyone to buy marijuana in states where it's illegal.