TL;DR: Among S&P 500 companies with at least 250 reviews on Glassdoor.com, each additional star of rating (on a scale of 1-5) corresponded to a roughly 11% increase in stock return over the past year.

This post came out of a need to quantify the effect of employee happiness on the bottom line/stock return. Glassdoor ratings give a quantifiable measure of employee happiness and are often used by job seekers to check how well a company of interest treats its employees. While the correlation between employee happiness and Glassdoor ratings is not perfect, IMO it is better than companies' self-reported employee satisfaction, since employees are understandably reluctant to air their feelings on a survey conducted by their employer.

The obvious hypothesis to test is whether higher Glassdoor ratings lead to a higher stock return over a reasonable period of time. Given the nature of the economy and companies' focus on the quarterly bottom line, it is only fair to evaluate them on a similar timescale. In this study the stock return is calculated over one year (about 4 quarters) to filter out some of the short-term noise. The analysis is performed over the 500 companies forming the S&P 500 index. It is assumed that an aggregate analysis over 500 companies avoids falling victim to the law of small numbers, and that the observed trends are not biased by one-off events at individual companies.

The results are plotted in the figure above. An interactive version with datatips showing the company name and the number of reviews is available on the plot.ly website. The Glassdoor ratings have a resolution of 0.1, hence the discrete values along the x axis. A linear regression fit to the data shows that each additional star of rating corresponds to a roughly 5% increase in stock return. However, the p-value for this fit was 0.07, which is higher than the commonly used 0.05 threshold. This is not to say that the result is statistically insignificant, only that it does not meet the usually accepted norm for statistical significance.

It should also be noted that quite a few of the companies had very few reviews on Glassdoor, and including such companies may not be the best idea when testing the hypothesis. So the hypothesis was also tested on strong respondents only, i.e. companies with at least 250 reviews on Glassdoor. For these strong respondents (225 companies), the linear regression fit showed that each additional star of rating corresponds to a roughly 11% increase in stock return, with a p-value of 0.009! Raising the requirement to 500 reviews (149 companies in total) gave a fit showing a 12% increase in stock return per additional star (and a p-value of 0.007).
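
To make the filtering-then-fitting step concrete, here is a minimal sketch in Python (the original analysis was done in Matlab, so this is not the code behind the numbers above). The function name `fit_rating_vs_return`, the dictionary keys, and the sample rows are all hypothetical. It keeps only companies at or above a review threshold, skips missing returns, and fits an ordinary least-squares line. It reports the t-statistic of the slope; a p-value would then come from a Student's t distribution with n - 2 degrees of freedom, which a statistics library can supply.

```python
import math


def fit_rating_vs_return(companies, min_reviews=0):
    """Least-squares fit of stock return on rating for well-reviewed companies."""
    # Keep companies that meet the review threshold; skip missing (NaN) returns.
    points = [
        (c["rating"], c["stock_return"])
        for c in companies
        if c["reviews"] >= min_reviews and not math.isnan(c["stock_return"])
    ]
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 companies to fit a line and its error")

    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all ratings are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

    slope = sxy / sxx                      # change in return per extra star
    intercept = mean_y - slope * mean_x
    # Residual sum of squares gives the standard error of the slope.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    std_err = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.copysign(math.inf, slope)
    return {"n": n, "slope": slope, "intercept": intercept, "t_stat": t_stat}


# Made-up rows for illustration only; these are NOT the scraped data.
sample = [
    {"rating": 3.0, "stock_return": 0.10, "reviews": 300},
    {"rating": 3.5, "stock_return": 0.18, "reviews": 900},
    {"rating": 3.8, "stock_return": 0.22, "reviews": 600},
    {"rating": 4.0, "stock_return": 0.20, "reviews": 1200},
    {"rating": 3.2, "stock_return": float("nan"), "reviews": 400},  # no stock data
    {"rating": 2.5, "stock_return": 0.40, "reviews": 20},           # too few reviews
]
fit = fit_rating_vs_return(sample, min_reviews=250)
print(fit["n"], round(fit["slope"], 3), round(fit["t_stat"], 2))
```

Running the same function with `min_reviews=0`, `250`, and `500` would mirror the three cases compared in this post.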

Since the p-value among strong respondents is sufficiently low, and the increase in return per additional star is high, it can be concluded that the hypothesis holds for now. This leads to the obvious question of whether Glassdoor ratings can be used as an investment strategy. My guess is possibly yes, although I believe you can’t pick individual stocks based on Glassdoor ratings. Rather, I would expect a basket of well-rated companies with a relatively large number of reviews to outperform poorly rated companies with a relatively large number of reviews. Another conclusion for the annals of obvious findings :).

Method:

The Glassdoor ratings were queried using Google search via Matlab, and the resulting URL text was mined to obtain the rating and the number of reviews for each of the 500 companies. To calculate the stock return, files available on the Matlab File Exchange were used. The stock return for each company was calculated including dividends, using basic formulas. Plot.ly was used to create the plots.
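
The post does not spell out which "basic formulas" were used, so the following Python sketch is only one plausible reading: a simple one-period total return, where dividends paid during the period are added to the price change and not reinvested. The function name `total_return` and the example figures are hypothetical.

```python
def total_return(start_price, end_price, dividends=()):
    """Simple total return over one period, as a fraction (0.10 means 10%).

    Assumes dividends are paid out in cash and not reinvested.
    """
    if start_price <= 0:
        raise ValueError("start_price must be positive")
    return (end_price - start_price + sum(dividends)) / start_price


# Hypothetical stock: bought at 50.00, worth 54.00 a year later,
# with four quarterly dividends of 0.25 each.
example = total_return(50.00, 54.00, dividends=[0.25, 0.25, 0.25, 0.25])
print(f"{example:.1%}")
```

A reinvested-dividend return would give a slightly different number; over a single year the gap is usually small.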

Notes: Due to an issue with stock data on Google Finance, the returns data for 4 companies could not be obtained using the scripts. Hence the analysis for all respondents was done on 496 companies. The regression analysis was done in Matlab, which ignores the NaN values assigned to those four companies. In the case of strong respondents (250 reviews), only one company of the 225 did not have stock data, and in the case of strong respondents (500 reviews), all companies had stock data. The stock data was collected for the period 7/25/2013-7/25/2014. The Glassdoor ratings were scraped on 7/25/2014. There was some discrepancy between plot.ly’s regression fit and Matlab’s fit; the fit from Matlab was chosen. The results and conclusions do not change drastically either way.