Christian Robert is planning a graduate seminar in which students read 15 classic articles of statistics. (See here for more details and a slightly different list.)

Actually, he just writes “classics,” but based on his list, I assume he only wants articles, not books. If he wanted to include classic books, I’d nominate the following, just for starters:

– Fisher’s Statistical Methods for Research Workers

– Snedecor and Cochran’s Statistical Methods

– Kish’s Survey Sampling

– Box, Hunter, and Hunter’s Statistics for Experimenters

– Tukey’s Exploratory Data Analysis

– Cleveland’s The Elements of Graphing Data

– Mosteller and Wallace’s book on the Federalist Papers.

Probably Cox and Hinkley, too. That’s a book that I don’t think has aged well, but it seems to have had a big influence.

I think there’s a lot more good and accessible material in these classic books than in the equivalent volume of classic articles. Journal articles can be difficult to read and are typically filled with irrelevant theoretical material, the kind of stuff you need to include to impress the referees. I find books to be more focused and thoughtful.

Accepting Christian’s decision to choose articles rather than books, what would be my own nominations for “classics of statistics”? To start with, there must be some tradeoff between quality and historical importance.

One thing that struck me about the list supplied by Christian is how many of these articles I would definitely not include in such a course. For example, the paper by Durbin and Watson (1950) does not seem at all interesting to me. Yes, it’s been influential, in that a lot of people use that statistical test, but as an article, it hardly seems classic. Similarly, I can’t see the point of including the paper by Hastings (1970). Sure, the method is important, but Christian’s students will already know it, and I don’t see much to be gained by reading the paper. I’d recommend Metropolis et al. (1953) instead. And Casella and Strawderman (1981)? A paper about minimax estimation of a normal mean? What’s that doing on the list??? The paper is fine–I’d be proud to have written it, in fact I’d gladly admit that it’s better than 90% of anything I’ve ever published–but it seems more of a minor note than a “classic.” Or maybe there’s some influence here of which I’m unaware. And I don’t see how Dempster, Laird, and Rubin (1977) belongs on this list. It’s a fine article and the EM algorithm has been tremendously useful, but, still, I think it’s more about computation than statistics. As to Berger and Sellke (1987), well, yes, this paper has had an immense influence, at least among theoretical statisticians–but I think the paper is basically wrong! I don’t want to label a paper as a classic if it’s sent much of the field in the wrong direction.

For other papers on Christian’s list, I can see the virtue of including them in a seminar. For example, Hartigan and Wong (1979), “Algorithm AS 136: A K-Means Clustering Algorithm.” The algorithm is no big deal, and the idea of k-means clustering is nothing special. But it’s cool to see how people thought about such things back then.
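To give a sense of how little is going on, here is a minimal k-means sketch in Python: a plain Lloyd-style iteration rather than the actual AS 136 routine, with made-up data and a made-up choice of k.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-style k-means: assign points to nearest center, recompute means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random initial centers
    for _ in range(n_iter):
        # assign each point to its nearest center
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # recompute each center as the mean of its assigned points
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# toy example: two made-up clusters in two dimensions
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centers, labels = kmeans(X, k=2)
```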

And Christian also does include some true classics, such as Neyman and Pearson’s 1933 paper on hypothesis testing, Plackett and Burman’s 1946 paper on experimental design, Pitman’s 1939 paper on inference (I don’t know if that’s the best Pitman paper to include, but that’s a minor issue), Cox’s hugely influential 1972 paper on hazard regression, Efron’s bootstrap paper, and classics by Whittle and Yates. Others I don’t really feel so competent to judge (for example, Huber (1985) on projection pursuit), but it seems reasonable enough to include them on the list.

OK, what papers would I add? I’ll list them in order of time of publication. (Christian used alphabetical order, which, as we all know, violates principles of statistical graphics.)

Neyman (1935). Statistical problems in agricultural experimentation (with discussion). JRSS. This one’s hard to read, but it’s certainly a classic, especially when paired with Fisher’s comments in the lively discussion.

Tukey (1972). Some graphic and semigraphic displays. This article, which appears in a volume of papers dedicated to George Snedecor, is a lot of fun (even if in many ways unsound).

Akaike (1973). Information theory and an extension of the maximum likelihood principle. From a conference proceedings. I prefer this slightly to Mallows’s paper on Cp, written at about the same time (but I like the Mallows paper too).

Lindley and Smith (1972). Bayes estimates for the linear model (with discussion). JRSS-B. The methods in the paper are mostly out of date, but it’s worth it for the discussion (especially the (inadvertently) hilarious contribution of Kempthorne).

Rubin (1976). Inference and missing data. Biometrika. “Missing at random” and all the rest.

Wahba (1978). Improper priors, spline smoothing and the problem of guarding against model errors in regression. JRSS-B. This stuff all looks pretty straightforward now, but maybe not so much so back in 1978, back when people were still talking about silly ideas such as “ridge regression.” And it’s good to have all these concepts in one place.

Rubin (1980). Using empirical Bayes techniques in the law school validity studies (with discussion). JASA. Great, great stuff, and many interesting points come up in the discussion. If you only want to include one Rubin article, keep this one and leave “Inference and missing data” for students to discover on their own.

Hmm . . . why are so many of these from the 1970s? I’m probably showing my age. Perhaps there’s some general principle that papers published in year X have the most influence on graduate students in year X+15. Anything earlier seems simply out of date (that’s how I feel about Stein’s classic papers, for example; sure, they’re fine, but I don’t see their relevance to anything I’m doing today, in contrast to the above-noted works by Tukey, Akaike, etc., which speak to my current problems), whereas anything much more recent doesn’t feel like such a “classic” at all.

OK, so here’s a more recent classic:

Imbens and Angrist (1994). Identification and estimation of local average treatment effects. Econometrica.

Finally, there are some famous papers that I’m glad Christian didn’t consider. I’m thinking of influential papers by Wilcoxon, Box and Cox, and zillions of papers that introduced particular hypothesis tests (the sort with names they teach you in a biostatistics class). Individually, these papers are fine, but I don’t see that students would get much out of reading them. If I were going to pick any paper of that genre, I’d pick Deming and Stephan’s 1940 article on iterative proportional fitting. I also like a bunch of my own articles, but there’s no point in mentioning them here!
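For anyone who hasn’t run into iterative proportional fitting, here is a minimal sketch of the idea in Python. This is the usual raking iteration rather than Deming and Stephan’s original least-squares formulation, and the sample table and marginal totals below are made up purely for illustration.

```python
import numpy as np

def ipf(table, row_targets, col_targets, n_iter=50, tol=1e-10):
    """Rake a two-way table so its margins match the given row and column totals."""
    fitted = table.astype(float).copy()
    for _ in range(n_iter):
        # scale each row to match the target row totals
        fitted *= (row_targets / fitted.sum(axis=1))[:, None]
        # scale each column to match the target column totals
        fitted *= (col_targets / fitted.sum(axis=0))[None, :]
        if (np.abs(fitted.sum(axis=1) - row_targets).max() < tol and
                np.abs(fitted.sum(axis=0) - col_targets).max() < tol):
            break
    return fitted

# made-up sample table and known marginal totals (both sum to 150)
sample = np.array([[30., 20., 10.],
                   [10., 30., 50.]])
fitted = ipf(sample,
             row_targets=np.array([80., 70.]),
             col_targets=np.array([60., 50., 40.]))
```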

Any other classics you’d like to nominate (or places where you disagree with me)?