After cleanup, the data looks reasonable, with each variable spanning the range one would expect of temperatures and precipitation:

```r
summary(weather)
#         EDT      Max.TemperatureF Mean.TemperatureF Min.TemperatureF Max.Dew.PointF MeanDew.PointF
#  2012-2-16:  1   Min.   : 26.0    Min.   :21.0      Min.   :14.0     Min.   :12.0   Min.   : 1
#  2012-2-17:  1   1st Qu.: 53.0    1st Qu.:45.0      1st Qu.:37.0     1st Qu.:39.8   1st Qu.:33
#  2012-2-18:  1   Median : 68.0    Median :58.0      Median :50.0     Median :54.0   Median :48
#  2012-2-19:  1   Mean   : 66.9    Mean   :58.7      Mean   :50.6     Mean   :52.5   Mean   :47
#  2012-2-20:  1   3rd Qu.: 82.0    3rd Qu.:73.0      3rd Qu.:66.0     3rd Qu.:67.0   3rd Qu.:63
#  2012-2-21:  1   Max.   :100.0    Max.   :87.0      Max.   :80.0     Max.   :79.0   Max.   :74
#  (Other)  :522
#
#  Min.DewpointF Max.Humidity   Mean.Humidity Min.Humidity  Max.Sea.Level.PressureIn
#  Min.   :-5.0  Min.   : 42.0  Min.   :28.0  Min.   :12.0  Min.   :29.4
#  1st Qu.:25.0  1st Qu.: 82.0  1st Qu.:59.0  1st Qu.:37.0  1st Qu.:30.0
#  Median :42.0  Median : 89.0  Median :69.0  Median :48.0  Median :30.1
#  Mean   :41.2  Mean   : 85.7  Mean   :67.8  Mean   :48.6  Mean   :30.1
#  3rd Qu.:58.0  3rd Qu.: 93.0  3rd Qu.:77.2  3rd Qu.:60.0  3rd Qu.:30.2
#  Max.   :73.0  Max.   :100.0  Max.   :97.0  Max.   :93.0  Max.   :30.6
#
#  Mean.Sea.Level.PressureIn Min.Sea.Level.PressureIn Max.VisibilityMiles Mean.VisibilityMiles
#  Min.   :29.1              Min.   :28.7             Min.   : 6.00       Min.   : 2.0
#  1st Qu.:29.9              1st Qu.:29.8             1st Qu.:10.00       1st Qu.: 9.0
#  Median :30.0              Median :29.9             Median :10.00       Median :10.0
#  Mean   :30.0              Mean   :29.9             Mean   : 9.97       Mean   : 9.1
#  3rd Qu.:30.2              3rd Qu.:30.1             3rd Qu.:10.00       3rd Qu.:10.0
#  Max.   :30.5              Max.   :30.5             Max.   :10.00       Max.   :10.0
#
#  Min.VisibilityMiles Max.Wind.SpeedMPH Mean.Wind.SpeedMPH Max.Gust.SpeedMPH PrecipitationIn
#  Min.   : 0.00       Min.   : 6.0      Min.   : 1.00      Min.   : 0.0      Min.   :0.000
#  1st Qu.: 3.00       1st Qu.:13.0      1st Qu.: 6.00      1st Qu.: 0.0      1st Qu.:0.000
#  Median : 9.00       Median :15.0      Median : 8.00      Median :22.0      Median :0.000
#  Mean   : 6.88       Mean   :16.4      Mean   : 8.04      Mean   :18.6      Mean   :0.124
#  3rd Qu.:10.00       3rd Qu.:20.0      3rd Qu.:10.00      3rd Qu.:28.0      3rd Qu.:0.030
#  Max.   :10.00       Max.   :40.0      Max.   :27.00      Max.   :56.0      Max.   :6.000
#
#  CloudCover   Events       WindDirDegrees.br... MP            DayLength
#  Min.   :0.0  Min.   :1.00  Min.   :  1         Min.   :1.00  Min.   : 9.27
#  1st Qu.:3.0  1st Qu.:1.00  1st Qu.:138         1st Qu.:2.00  1st Qu.:10.31
#  Median :4.5  Median :1.00  Median :208         Median :3.00  Median :12.10
#  Mean   :4.4  Mean   :1.47  Mean   :202         Mean   :2.98  Mean   :12.07
#  3rd Qu.:6.0  3rd Qu.:2.00  3rd Qu.:283         3rd Qu.:4.00  3rd Qu.:13.87
#  Max.   :8.0  Max.   :2.00  Max.   :360         Max.   :4.00  Max.   :14.73
```

CloudCover spanning 0–8 looks a bit odd, since one might have expected a decimal, a percentage, or some other quantification of thickness; but it turns out to be a standard measurement, the “okta”. We might expect some seasonal effects, but graphing a sensitive LOESS moving average and then a per-week spineplot suggests just 1 anomaly: there were some early days I rated “1” but never again subsequently. (This has an explanation: as I began keeping the series, my car caught on fire and was totaled; I was then put through the wringer with the insurance & junkyard, and felt truly miserable at multiple points.)

```r
par(mfrow=c(2,1))
scatter.smooth(x=weather$EDT, y=weather$MP, span=0.3, col="#CCCCCC",
               xlab="Days", ylab="MP rating")
weeks <- c(seq(from=1, to=length(weather$MP), by=7), length(weather$MP))
spineplot(c(1:length(weather$MP)), as.factor(weather$MP), breaks=weeks,
          xlab="Weeks", ylab="MP rating ratios")
# we don't use the date past the spineplot &
# it causes problems with all the models, so delete it
# (rm() only removes whole objects, so NULL out the column instead)
weather$EDT <- NULL
```

MP from February 2012 to March 2013
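Before interpreting, one sanity check is autocorrelation in the MP series, since ratings on adjacent days may not be independent. A minimal sketch of such a check, assuming the `weather` data frame is still loaded, using R's built-in `acf` plot and a Ljung-Box test (the choice of 20 lags here is an arbitrary illustration, not the original analysis):

```r
# plot the autocorrelation function of the MP ratings at increasing lags;
# spikes outside the dashed confidence band would suggest serial dependence
acf(weather$MP, main="Autocorrelation of MP ratings")
# formal test of independence across the first 20 lags
Box.test(weather$MP, lag=20, type="Ljung-Box")
```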

A formal check for autocorrelation in MP ratings (most methods assume independence) turns up little enough that I feel free to ignore it. So the data looks clean and tame; now we can begin real interpretation. The first and most obvious thing to do is to see what the overall correlation matrix with MP looks like:

```r
# all weather correlations with mood/productivity
round(cor(weather)[23,], digits=3)
#          Max.TemperatureF         Mean.TemperatureF          Min.TemperatureF
#                     0.013                     0.016                     0.022
#            Max.Dew.PointF            MeanDew.PointF             Min.DewpointF
#                     0.016                     0.019                     0.016
#              Max.Humidity             Mean.Humidity              Min.Humidity
#                     0.042                     0.026                     0.011
#  Max.Sea.Level.PressureIn Mean.Sea.Level.PressureIn  Min.Sea.Level.PressureIn
#                     0.006                     0.004                     0.015
#       Max.VisibilityMiles      Mean.VisibilityMiles       Min.VisibilityMiles
#                     0.066                     0.004                    -0.016
#         Max.Wind.SpeedMPH        Mean.Wind.SpeedMPH         Max.Gust.SpeedMPH
#                     0.008                     0.002                     0.018
#           PrecipitationIn                CloudCover                    Events
#                    -0.037                     0.041                    -0.001
#      WindDirDegrees.br...                        MP                 DayLength
#                    -0.006                     1.000                     0.036

tail(sort(abs(round(cor(weather)[23,], digits=3))), 6)
#           DayLength     PrecipitationIn          CloudCover        Max.Humidity
#               0.036               0.037               0.041               0.042
# Max.VisibilityMiles                  MP
#               0.066               1.000
```

All the r values are very small: <0.1. But some of the more plausible correlations may be statistically significant, so we'll look at the 5 largest correlations:

```r
sapply(c("DayLength", "PrecipitationIn", "CloudCover",
         "Max.Humidity", "Max.VisibilityMiles"),
       function(x) { cor.test(weather$MP, weather[[x]]) })
# ...
#          DayLength  PrecipitationIn
# p.value  0.4099     0.394
# estimate 0.03594    -0.03717
#
#          CloudCover Max.Humidity
# p.value  0.3445     0.3319
# estimate 0.04122    0.04231
#
#          Max.VisibilityMiles
# p.value  0.1292
# estimate 0.06611
```

These specific variables turn out to be failures too, with large p-values.
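Worse, these 5 variables were cherry-picked as the largest of 23 correlations, so any multiple-comparison correction can only weaken them further. A quick sketch with R's `p.adjust`, plugging in the p-values above (a Holm correction over just these 5 tests; an illustration, not part of the original analysis):

```r
# the 5 p-values reported by cor.test above
p <- c(DayLength=0.4099, PrecipitationIn=0.394, CloudCover=0.3445,
       Max.Humidity=0.3319, Max.VisibilityMiles=0.1292)
# Holm step-down correction for testing 5 hypotheses at once
round(p.adjust(p, method="holm"), digits=3)
# Max.VisibilityMiles adjusts to 0.646; every other p-value is pushed to 1
```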

One useful technique is to convert metrics into standardized units (of standard deviations), sum them all into a single composite variable, and then test that; this can reveal influences not obvious if one looked at the metrics individually. (For example, in my potassium sleep experiments, where I was interested in an overall measure of reduced sleep quality rather than single metrics like sleep latency.) Perhaps this would work here?
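A minimal sketch of that technique on this data, assuming the date column has been dropped so every column is numeric: standardize each weather variable with `scale`, sum the z-scores row-wise into a composite, and test the composite against MP. (This naive composite weights every variable equally and ignores sign, so variables that plausibly push mood in opposite directions will partly cancel; it is an illustration of the mechanics, not a tuned index.)

```r
# z-score every weather variable except MP itself
mp.col  <- which(names(weather) == "MP")
zscores <- scale(weather[, -mp.col])   # center each column & divide by its SD
# collapse each day's weather into a single composite score
composite <- rowSums(zscores)
# test the composite against the mood/productivity ratings
cor.test(weather$MP, composite)
```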