I have seen some debates about this lately. Let's take a quick and dirty look at the data. Etherscan.io lets us download CSV files with daily data for the ETH price and hash rate (in reality these are semicolon-separated values). We download them into a directory, and they look like this:

etherprice

...

1493078400;50.11

1493164800;53.28

1493251200;63.10

1493337600;72.42

1493424000;69.85

1493510400;79.83

1493596800;77.60

1493683200;77.24

hashrate

...

1493078400;20759.4397

1493164800;20518.0576

1493251200;20819.5764

1493337600;22094.2863

1493424000;22674.7072

1493510400;22071.8930

1493596800;22307.6652

1493683200;22218.5493

The numbers in the first column are seconds since Epoch (January 1, 1970). The latest number, 1493683200, translates to May 2, 2017.
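To make the timestamp conversion concrete, here is a minimal sketch that turns the first-column values into calendar dates. It assumes the timestamps mark midnight UTC; the helper name `epoch_to_date` is made up for illustration.

```python
from datetime import datetime, timezone

def epoch_to_date(seconds):
    """Convert seconds since the Unix epoch to a UTC calendar date."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).date()

# Last and first timestamps from the sample rows shown above.
print(epoch_to_date(1493683200))  # 2017-05-02
print(epoch_to_date(1493078400))  # 2017-04-25
```

Passing an explicit `tz` avoids the result shifting by a day depending on the local time zone of the machine running it.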

Now let's do some crappy Python :)

from scipy.stats import linregress
import csv
import numpy as np
import matplotlib.pyplot as plt
import datetime

# Read the price file once: column 0 is the Unix timestamp, column 1 the price.
with open('etherprice') as f:
    price_rows = list(csv.reader(f, delimiter=';'))
prices = np.array([float(row[1]) for row in price_rows])
# fromtimestamp() converts to local time, which is good enough for a daily plot.
times = np.array([datetime.datetime.fromtimestamp(int(row[0])) for row in price_rows])

# The hash rate file is assumed to have the same timestamps in the same order.
with open('hashrate') as f:
    hashrates = np.array([float(row[1]) for row in csv.reader(f, delimiter=';')])

# Fit hashrate = price * slope + intercept.
slope, intercept, r_value, p_value, std_err = linregress(prices, hashrates)
print('slope=', slope, 'intercept=', intercept, 'r_value=', r_value, 'p_value=', p_value, 'std_err=', std_err)

# Actual price, then the price implied by the hash rate through the fitted line.
plt.plot(times, prices)
plt.plot(times, (hashrates - intercept) / slope)
plt.show()

This code reads both files, assuming that the timestamps are the same in both (and they are: there are 643 points in both files if you download them today). Then it runs a linear regression and prints the results:

slope= 346.494231032 intercept= 600.14396646 r_value= 0.910826320068 p_value= 1.66351082006e-248 std_err= 6.20240860419
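The script relies on both files listing the same timestamps in the same order. If that ever stops being true, one way to guard against it is to join the two series on the timestamp before regressing. The following is only a sketch: `read_series` and `align` are hypothetical helpers, the files are assumed to have no header row, and the in-memory samples stand in for the real `etherprice` and `hashrate` files.

```python
import csv
import io

def read_series(f):
    """Read 'timestamp;value' rows into a dict keyed by integer timestamp."""
    return {int(row[0]): float(row[1]) for row in csv.reader(f, delimiter=';') if row}

def align(series_a, series_b):
    """Keep only timestamps present in both series, in ascending order."""
    common = sorted(set(series_a) & set(series_b))
    return common, [series_a[t] for t in common], [series_b[t] for t in common]

# In-memory stand-ins built from rows shown earlier; the hash rate sample
# has one extra day so that the join has something to drop.
price_file = io.StringIO("1493596800;77.60\n1493683200;77.24\n")
hash_file = io.StringIO(
    "1493510400;22071.8930\n1493596800;22307.6652\n1493683200;22218.5493\n"
)

times, prices, hashrates = align(read_series(price_file), read_series(hash_file))
print(len(times), prices, hashrates)
```

With real data, `open('etherprice')` and `open('hashrate')` would replace the `StringIO` objects, and the aligned lists can be passed to `linregress` as before.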

And finally, it plots the price against the timestamps (blue curve), and also the price calculated from the hash rate, as if the regression equality always holds (green curve).

hashrate = price * slope + intercept
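Solving the regression equation for price gives the value that the green curve plots. As a rough worked example, using the slope and intercept printed above and the hash rate from the last sample row (the function names are made up for illustration):

```python
# Regression results as printed by the script above.
slope = 346.494231032
intercept = 600.14396646

def hashrate_from_price(price):
    """The fitted line: hashrate = price * slope + intercept."""
    return price * slope + intercept

def price_from_hashrate(hashrate):
    """The same line solved for price; this is what the second plot shows."""
    return (hashrate - intercept) / slope

# Hash rate on the last sample row (1493683200).
implied = price_from_hashrate(22218.5493)
print(round(implied, 2))  # 62.39
```

The actual price in the sample for that day is 77.24, so at that point the price implied by the hash rate sits below the real one.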

I know it is a weird way of showing it, because I should have plotted actual hashrate against the hashrate calculated linearly from the price. But it would have just swapped the colors of the lines. I plotted the prices instead because people relate to price numbers much better than to hash rate numbers. Here are the graphs:

I am not going to draw any conclusions at this point, because the analysis was quick and dirty and unscientific. But hopefully it helps discussions and generates more ideas.