Update #9: Quandl API is Deprecated

According to an email I got from Quandl (and a few commenters corroborating), the Quandl EOD data API is no longer supported and is not providing data past March 27th. According to the CIO of Quandl, it was being provided for free by a 3rd party. That 3rd party is no longer providing the data, forcing us to search for other options.

Update #8: Yahoo! Finance API Back Up

The plot thickens once again! According to quantmod’s getSymbols() function, the Yahoo! Finance API is currently the best free API that does not require registration. The URL endpoint of the Yahoo API has lengthened considerably, as have the steps required to make a single download.

Writing out the code required to make a single download from the new Yahoo API is more of a lesson in networking and cookie management than trading, so I will refrain from publishing it for the moment.

Switch to Quandl for Less Drama

In better news, Quandl has created an excellent API for downloading daily stock data. Their daily stock API is sort of a “free trial” for their premium subscription products. I spoke to the Quandl team about using it for frequent downloads and educational purposes, and they were fully supportive. So, I urge readers to stop depending on the fickle free services from tech giants, and sign up for an account and API key with Quandl.

Credit to GitHub user johnatasjmo for this solution:

# Quandl package must be installed
library(Quandl)

# Get your API key from quandl.com
quandl_api = "MYAPIKEY"

# Add the key to the Quandl keychain
Quandl.api_key(quandl_api)

quandl_get <- function(sym, start_date = "2017-01-01") {
  require(Quandl)
  # Request the adjusted columns for this symbol;
  # return NULL if the request fails
  tryCatch(
    Quandl(c(
      paste0("WIKI/", sym, ".8"),   # Adj. Open
      paste0("WIKI/", sym, ".9"),   # Adj. High
      paste0("WIKI/", sym, ".10"),  # Adj. Low
      paste0("WIKI/", sym, ".11"),  # Adj. Close
      paste0("WIKI/", sym, ".12")), # Adj. Volume
      start_date = start_date,
      type = "zoo"),
    error = function(e) NULL
  )
}
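As a small illustration of what the function above requests, the sketch below only builds the dataset codes and makes no network call. The helper name wiki_codes is hypothetical and not part of the Quandl package; the column numbers and their labels are taken from the comments in the function above.

```r
# Hypothetical helper (not part of the Quandl package): build the WIKI
# dataset codes that quandl_get() passes to Quandl() for one symbol.
wiki_codes <- function(sym) {
  # Column numbers as labeled in the comments of quandl_get()
  cols <- c(adj_open = 8, adj_high = 9, adj_low = 10,
            adj_close = 11, adj_volume = 12)
  codes <- paste0("WIKI/", sym, ".", cols)
  # paste0() drops names, so attach them again for readability
  names(codes) <- names(cols)
  codes
}

print(wiki_codes("AAPL"))
```

Printing the codes before calling Quandl() is a cheap way to confirm which columns a request will pull.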

Update #7: Google Finance API Down

Because everything I write about breaks, the Google Finance API is now only delivering the most recent year’s data, regardless of what parameters are passed to it.

Update #6: Google Finance API is Hidden but Live

After more research, I discovered that R’s quantmod package is using a hidden version of the Google Finance API that works just fine. Below is our old Yahoo! Finance function re-written to support the Google Finance API. It is actually shorter and more easily read.

# Make sure data.table is installed
if(!'data.table' %in% installed.packages()[,1]) install.packages('data.table')

# Function to fetch google stock data
google <- function(sym, current = TRUE,
                   sy = 2005, sm = 1, sd = 1, ey, em, ed)
{
  if(current){
    system_time <- as.character(Sys.time())
    ey <- as.numeric(substr(system_time, start = 1, stop = 4))
    em <- as.numeric(substr(system_time, start = 6, stop = 7))
    ed <- as.numeric(substr(system_time, start = 9, stop = 10))
  }

  require(data.table)

  google_out = tryCatch(
    suppressWarnings(
      fread(paste0("http://www.google.com/finance/historical",
                   "?q=", sym,
                   "&startdate=", paste(sm, sd, sy, sep = "+"),
                   "&enddate=", paste(em, ed, ey, sep = "+"),
                   "&output=csv"), sep = ",")),
    error = function(e) NULL
  )

  if(!is.null(google_out)){
    names(google_out)[1] = "Date"
  }

  return(google_out)
}

# Test it
google_data = google('GOOGL')

Downloading All S&P 500 Stocks with Google

I believe I encountered some rate limiting when I attempted to do this, but it works nonetheless.

# Load list of symbols (Updated May 2017)
SYM <- as.character(
  read.csv('http://trading.chrisconlan.com/SPstocks_current.csv',
           stringsAsFactors = FALSE, header = FALSE)[,1]
)

# Hold stock data and vector of invalid requests
DATA <- list()
INVALID <- c()

# Attempt to fetch each symbol
for(sym in SYM){
  google_out <- google(sym)

  if(!is.null(google_out)) {
    DATA[[sym]] <- google_out
  } else {
    INVALID <- c(INVALID, sym)
  }
}

# Overwrite with only valid symbols
SYM <- names(DATA)

# Remove iteration variables
rm(google_out, sym)

cat("Successfully downloaded", length(DATA), "symbols.")
cat(length(INVALID), "invalid symbols requested.\n",
    paste(INVALID, collapse = "\n\t"))
cat("We now have a list of data frames of each symbol.")
cat("e.g. access MMM price history with DATA[['MMM']]")

You should now be able to access the historical data of any symbol in the S&P 500 using, for example, DATA[['MMM']] for 3M.
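If rate limiting does cut a run short, one option is to pause and retry a failed symbol a few times before marking it invalid. The sketch below is an assumption about how to do that, not something tested against Google's limits: fetch_with_retry, its tries and wait arguments, and the pause lengths are all hypothetical. It relies only on the convention used above that a fetch function returns NULL on failure.

```r
# Hypothetical wrapper: call a fetch function (one that returns NULL on
# failure, like google() above) up to `tries` times, sleeping `wait`
# seconds after each failed attempt and doubling the wait each time.
fetch_with_retry <- function(fetch_fn, sym, tries = 3, wait = 1) {
  for (attempt in seq_len(tries)) {
    out <- fetch_fn(sym)
    if (!is.null(out)) return(out)
    if (attempt < tries) {
      Sys.sleep(wait)
      wait <- wait * 2
    }
  }
  NULL
}

# Demonstration with a stand-in fetcher that fails twice, then succeeds.
# No network access is involved, and the values are placeholders.
calls <- 0
flaky_fetch <- function(sym) {
  calls <<- calls + 1
  if (calls < 3) NULL else data.frame(Symbol = sym, Close = 100)
}

result <- fetch_with_retry(flaky_fetch, "MMM", tries = 3, wait = 0.01)
print(result)
```

In the download loop, the call google(sym) could then be swapped for fetch_with_retry(google, sym), leaving the rest of the loop unchanged.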

Notes on Performance and Usage

This downloads much faster than the old Yahoo! Finance API.

I believe I encountered rate limiting after downloading about 200 symbols consecutively. This is not necessarily bad. It just means Google is doing a good job of controlling throughput, which suggests the API is better constructed and less likely to be torn down. I found no documentation pointing to issues with multiple requests.

I have found evidence of a Google representative stating that this API is not to be used in the backend of any public application. I am curious about whether or not I will be able to use this code in Automated Trading with R in light of this statement. I believe it will not be a problem since quantmod has been publicly using it for so many years.

The data downloaded here is very similar to that used in Automated Trading with R. I believe the book can be adapted to use a service like this, although it will necessarily be less complex. There is a good amount of information missing from this historical data.

Volume seems to be missing from a good number of symbols, and all prices are returned adjusted in advance. This is helpful computationally but removes a lot of implicit information about dividends and splits.

This API seems to download a strict maximum of 15 years of historic data. This is plenty of data, but I did have to switch the default start date up from 2000 to 2005 to accommodate this.

All of Google’s APIs are good about accepting a variety of date formats (and address formats for Maps), and this is no exception. As long as the dates are in month-day-year order with some delimiter like a dash, space, or plus sign, Google will interpret them correctly.

Altogether I think this is a good replacement for Yahoo! Finance in its ability to provide free data for the purpose of retail trading and education. The trouble is, Yahoo’s API was hidden but still well-known. This API is hidden but does not seem to be very well-known. I will be making a post dedicated to this API for both R and Python in the future with the hope of making it the new educational standard for trading data.
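For readers who keep dates as R Date objects, the month-day-year strings mentioned in these notes can be built as follows. This is a minimal sketch: google_date is a hypothetical helper name, and the plus-sign delimiter simply mirrors the one used in the google() function above.

```r
# Hypothetical helper: turn a Date (or a "YYYY-MM-DD" string) into the
# month+day+year string that google() pastes into its startdate and
# enddate parameters.
google_date <- function(d) {
  d <- as.Date(d)
  paste(as.integer(format(d, "%m")),  # month without leading zero
        as.integer(format(d, "%d")),  # day without leading zero
        as.integer(format(d, "%Y")),  # four-digit year
        sep = "+")
}

print(google_date("2005-01-01"))           # "1+1+2005"
print(google_date(as.Date("2017-05-16")))  # "5+16+2017"
```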

Python Code

See this post: Download Historical Stock Data with Python for the equivalent code in Python.

Update #5: Google Finance API is Dead… Maybe?

I am not sure when this happened, but the Google Finance API is also dead. I know this must have happened recently, because R’s quantmod package used to rely on this as its primary data source. I will keep searching for good free solutions and update this post with what I learn.

This is confusing to me, because R’s quantmod can still use Google as a source for historical price data. For example…

library(quantmod)

# This fails because it defaults to Yahoo
getSymbols("GOOG")

# This fails because it specifies Yahoo
getSymbols("GOOG", src = "yahoo")

# This works, even in light of the above message
getSymbols("GOOG", src = "google")

So, quantmod is still using Google as a source of data even though the finance API is dead, and has been since 2011 according to the above webpage. I will have to dive into quantmod’s source to figure out what is going on here. Updates to come.

Update #4: Yahoo! Finance API is Dead

After much discussion over the last two months, it is safe to say the Yahoo! Finance API will not be returning.

The question we are all trying to answer is, “What do we use now?” At this juncture, the answer will be different depending on your means and goals. A few common situations are detailed below.

As an Author: I need a functional, comprehensive, and free API to support my book, Automated Trading with R. I am leaning towards Google Finance and the other APIs that support the popular R package quantmod. The quantmod package built in a few fail-safes to Yahoo! Finance while still primarily relying on it, and seems for the most part unaffected by the outage. I expect to make some sacrifices regarding the volume and breadth of data while reworking the book, but these are welcome sacrifices in the name of accessibility and education.

As a Trader: I am comfortable paying premium rates for reliable data. Users in the comments of this post have suggested a handful of good options. There are a lot of low-cost services that attempt to be drop-in replacements for Yahoo! Finance, and there are a few high-value services that offer much more than Yahoo! Finance users are used to. Quandl, Bloomberg, and Reuters are examples of these high-value services, which, while costing significantly more, are very easy to scale and worth learning.

As a Researcher and Open-Source Developer: I feel that the Yahoo! Finance API’s death has destroyed a lot of opportunity for creating and sharing reproducible research. While researchers are not allowed to copy and disseminate Yahoo! Finance data, we are allowed to publish research that points to it. In other words, when we publish code, we include the download script in the beginning to allow “sharing” of financial time series data. It is an approach that gives Yahoo! credit where it is due, and still promotes sharing and learning. Now that Yahoo! Finance is gone, there is no standard approach to sharing financial time series data. I hope to make some posts in the near future to standardize a new free API (Google, Bloomberg, etc.) across popular languages.

In summary, we will rebuild.

Update #3: Yahoo! YQL API for Finance Data is also Affected

With hope for a new API but no clear end in sight, I got to work building a YQL-only solution for downloading historical price data. This was to serve as a backup in case we ultimately got no clarity on the CSV API. I discovered fairly quickly that the YQL API for accessing stock data, as it is popularized on the internet and as it is expressed in Automated Trading with R, depends internally on the http://ichart.yahoo.com API and is therefore inaccessible.

In the same way that we see “Could not process this ‘GET’ request” on the CSV API, the YQL API simply returns a JSON list of all of the stocks requested, each with its own individual “Could not process this ‘GET’ request” message. So, the YQL API itself is not down or unhealthy. Rather, it is doing its job very well by alerting us that each individual stock is inaccessible.

Update #2: Yahoo! CSV API is Live but Undocumented

As of May 27th, 2017, we are seeing movement in the Yahoo! Finance CSV API. It appears not all hope is lost. Calls to https://ichart.yahoo.com no longer return “We are working on it” messages. Instead, they return “Could not process this request” messages.

What we appear to have now is a functional but totally undocumented API.

The facts at this point:

Every request type, including GET, POST, PUT, and DELETE, returns a message along the lines of “Description: Could not process this request.”

URLs are still being auto-forwarded to HTTPS.

We no longer have a “Coming Soon” or “Working on it” message.

API calls using the old parameters return this error message, even with alternate HTTP request types.

I cannot find new documentation on this API.

I have emailed Yahoo! about this and hope to receive a response soon. I am pleased that my original prediction of receiving a restructured API seems to be correct, as this will necessitate fewer and less drastic changes to Automated Trading with R.

Please comment below if you have any information to share that is not presented in this post.

Update: Yahoo! API Inactive and Pending Changes

As my luck would have it, the Yahoo! Finance API at ichart.yahoo.com went down within hours of me making this post. As of the afternoon of May 16th, 2017, the API calls in the script below will not download anything meaningful. Fortunately, Yahoo! seems to be making some structural changes and improvements to the API.

If you enter an API call into your web browser, you will see this splash screen:

The Facts

Here is what we know so far:

Countless open-source software packages and educational resources depend on the free Yahoo! Finance API.

Yahoo! has made no official statement about rescinding public access to this data.

The API has been called “secret” and “hidden” because it is difficult to find official Yahoo! documentation on it. It has been mostly taught and passed along by knowledgeable professionals and educational resources.

The splash screen says “Our engineers are working quickly…”

All calls to ichart.yahoo.com are auto-converting to HTTPS regardless of their validity.

Calls to ichart.yahoo.com are setting new cookies that we haven’t seen before.

My Prediction

I am inclined to believe that Yahoo!’s engineers are indeed working to resolve the issue. As I have watched this situation evolve, I have seen the endpoint die, then get a splash screen, then get hooked up to SSL, then set new cookies. All of this points to a significant restructuring effort.

I believe the Yahoo! Finance API will update the structure of its calls, publish some legitimate documentation, and begin requiring authentication to access it. Hopefully that happens quickly, and I can update the scripts below. I will be watching vigilantly for this update, and likely publish a new edition of Automated Trading with R upon its release.

Original Post: Downloading Price History with R

In Automated Trading with R, we build a complete automated trading platform using the R language. In the second chapter, we get 15+ years of daily price data on every stock in the S&P 500 loaded into R using free APIs. The code required to do this is surprisingly brief and straightforward.

Thanks, Yahoo!

I have copied an R script below that will load the historical price data of every S&P 500 stock right into R using the Yahoo! Finance API. In the book, we expand on this, explain how it works, and continue to refine this data.

Hopefully this script will serve as a quick solution to systems traders searching for a significant and reliable data source.

# Load list of symbols (Updated May 2017)
SYM <- as.character(
  read.csv('http://trading.chrisconlan.com/SPstocks_current.csv',
           stringsAsFactors = FALSE, header = FALSE)[,1]
)

# Make sure data.table is installed
if(!'data.table' %in% installed.packages()[,1]) install.packages('data.table')

# Function to fetch yahoo data
yahoo <- function(sym, current = TRUE,
                  a = 0, b = 1, c = 2000, d, e, f, g = "d")
{
  if(current){
    system_time <- as.character(Sys.time())
    f <- as.numeric(substr(system_time, start = 1, stop = 4))
    d <- as.numeric(substr(system_time, start = 6, stop = 7)) - 1
    e <- as.numeric(substr(system_time, start = 9, stop = 10))
  }

  require(data.table)

  tryCatch(
    suppressWarnings(
      fread(paste0("http://ichart.yahoo.com/table.csv",
                   "?s=", sym,
                   "&a=", a,
                   "&b=", b,
                   "&c=", c,
                   "&d=", d,
                   "&e=", e,
                   "&f=", f,
                   "&g=", g,
                   "&ignore=.csv"), sep = ",")),
    error = function(e) NULL
  )
}

# Hold stock data and vector of invalid requests
DATA <- list()
INVALID <- c()

# Attempt to fetch each symbol
for(sym in SYM){
  yahoo_return <- yahoo(sym)

  if(!is.null(yahoo_return)) {
    DATA[[sym]] <- yahoo_return
  } else {
    INVALID <- c(INVALID, sym)
  }
}

# Overwrite with only valid symbols
SYM <- names(DATA)

# Remove iteration variables
rm(yahoo_return, sym)

cat("Successfully downloaded", length(DATA), "symbols.")
cat(length(INVALID), "invalid symbols requested.\n",
    paste(INVALID, collapse = "\n\t"))
cat("We now have a list of data frames of each symbol.")
cat("e.g. access MMM price history with DATA[['MMM']]")
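The single-letter parameters in the URL are easier to follow when mapped to dates. Judging from the defaults above (a = 0, b = 1, c = 2000 for January 1st, 2000, and the "- 1" applied to the current month), a, b, and c appear to be the start month, day, and year, and d, e, and f the end month, day, and year, with months counted from zero. The sketch below builds the same URL from two dates without downloading anything. The helper yahoo_url is hypothetical, and the parameter meanings are inferred from the script rather than taken from official documentation.

```r
# Hypothetical helper: build the ichart.yahoo.com URL used by yahoo()
# from a start and end date. Nothing is downloaded.
yahoo_url <- function(sym, from = "2000-01-01", to = Sys.Date()) {
  from <- as.POSIXlt(as.Date(from))
  to <- as.POSIXlt(as.Date(to))
  paste0("http://ichart.yahoo.com/table.csv",
         "?s=", sym,
         "&a=", from$mon,          # start month, zero-indexed
         "&b=", from$mday,         # start day of month
         "&c=", from$year + 1900,  # start year
         "&d=", to$mon,            # end month, zero-indexed
         "&e=", to$mday,           # end day of month
         "&f=", to$year + 1900,    # end year
         "&g=d",                   # same value as the g default above
         "&ignore=.csv")
}

print(yahoo_url("MMM", to = "2017-05-16"))
```

POSIXlt already stores months as 0 through 11, which is why no manual subtraction is needed in this version.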

Save Data to CSV Files

After running the above, run the following to save each stock’s price history as a CSV file to a folder in your working directory.

# Folder where you are storing the data
setwd('~/Desktop/stockdata')

# Write one CSV file per symbol
for( sym in names(DATA) ){
  write.csv(DATA[[sym]], paste0(sym, '.csv'), row.names = FALSE)
  cat('Wrote', sym, 'data to CSV file.\n')
}

Navigate to the folder you set above to see hundreds of CSV files.
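To load the saved files back into R in a later session, something like the following should work. It is a minimal sketch: read_stock_csvs is a hypothetical helper, and the demonstration writes two placeholder files (made-up symbols and values, not real prices) to a temporary folder so that it runs without the downloaded data.

```r
# Hypothetical helper: read every .csv file in `dir` into a named list,
# using the file name (minus its extension) as the symbol.
read_stock_csvs <- function(dir) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  out <- lapply(files, read.csv, stringsAsFactors = FALSE)
  names(out) <- tools::file_path_sans_ext(basename(files))
  out
}

# Demonstration in a temporary folder with placeholder data
demo_dir <- file.path(tempdir(), "stockdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = "2017-05-16", Close = 1),
          file.path(demo_dir, "AAA.csv"), row.names = FALSE)
write.csv(data.frame(Date = "2017-05-16", Close = 2),
          file.path(demo_dir, "BBB.csv"), row.names = FALSE)

RELOADED <- read_stock_csvs(demo_dir)
print(names(RELOADED))
```

Pointing read_stock_csvs() at the folder set above would rebuild a list shaped like DATA, one data frame per symbol.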

See my study of the S&P 500’s long-term behavior for some analysis on this type of data.

Bonus: Golfing the Download Script

To showcase just how efficient the R Language can be, we have condensed the first script from this post into under 400 characters (about 3 tweets). There is less error checking, and we do not make use of the more efficient data.table package, but the list D is more or less the same as DATA from the first two examples.

l='http://trading.chrisconlan.com/SPstocks_current.csv'
S=read.csv(l,header=FALSE)[,1]
f=function(s){
b=c(0,1,2000)
i=list(c(6,7),c(9,10),c(1,4))
v=c(b,sapply(i,function(v)substr(Sys.time(),v[1],v[2])),'d')
v[4]=as.numeric(v[4])-1
read.csv(paste0("http://ichart.yahoo.com/table.csv?s=",s,paste0('&',letters[1:7],'=',v,collapse=''),'&ignore=.csv'))}
D=list()
for(s in S)D[[s]]=f(s)

Note: Check that ichart.yahoo.com is Healthy

If your code is hanging, visit ichart.yahoo.com and check to make sure it is live. It has been experiencing occasional outages.
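The same check can be scripted so that a long download loop fails fast instead of hanging. The sketch below is one possible approach under stated assumptions: is_reachable is a hypothetical helper, the five-second limit is arbitrary, and it only tests whether a connection can be opened, not whether the response contains usable data.

```r
# Hypothetical helper: TRUE if a connection to `address` can be opened
# within `seconds`, FALSE otherwise.
is_reachable <- function(address, seconds = 5) {
  # Temporarily shorten R's internet timeout, restoring it on exit
  old <- options(timeout = seconds)
  on.exit(options(old), add = TRUE)
  tryCatch({
    con <- suppressWarnings(url(address, open = "r"))
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# Example call (requires network access, so it is left commented out):
# is_reachable("http://ichart.yahoo.com")
```

Calling this once before the download loop, and stopping early when it returns FALSE, avoids waiting on hundreds of requests that cannot succeed.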