
The last time the National Research Council updated its rankings of U.S. graduate programs, in 1995, it sent an advance copy to C&EN—not an e-mailed file, but a hardcover book. Also, many of today’s graduate students were not yet in high school. Times have changed since then. So have graduate departments in chemistry and chemical engineering.

Up-to-date evaluations of doctoral programs matter, for reasons ranging from graduate student recruitment to funding distribution. So in the 2005–06 academic year, the council embarked on an ambitious new assessment, evaluating more than 5,000 doctoral programs in 62 fields, including chemistry and chemical engineering. Universities have been champing at the bit for the council’s new findings ever since. Late last month, they got what they so eagerly awaited—sort of. The new results are anything but straightforward numerical rankings. They are a complex set of ranking ranges, several for each school, accompanied by a vast collection of data about the programs themselves. And they’ve left universities searching for meaning and direction in the numbers.

The NRC committee that produced the assessment says it deliberately avoided traditional ordinal rankings (C&EN, Oct. 4, page 5). Instead, it emphasized that rankings are never perfectly precise and explained that the immense collection of data can be used to produce ratings that are customized for a given university’s or program’s concerns and priorities.

The committee had its reasons for getting rid of hard-and-fast rankings. The 1995 assessment, which had numerical rankings, was criticized for implying falsely high levels of precision, says Jeremiah Ostriker, an astrophysicist and former Princeton University provost who chaired the NRC report committee. “We attempted to provide a balanced assessment, and that is a perilous enterprise to undertake,” Ostriker said at a public briefing held on the day NRC released the assessment. “There’s no rigorous mathematical formula that ensures a correct set of values” when it comes to ranking anything, he added.

The committee’s explanations didn’t quell criticism of the report—which cost more than $4 million to produce—in the media and online. “It appears that the ranking system just got a lot more complicated while still telling us essentially nothing,” commented Ryan Carroll, who recently graduated with a B.S. degree in chemical biology from the University of California, Berkeley, on a discussion of NRC results at the social news website Reddit. He is considering graduate school in biochemistry.

“NRC chose a particularly opaque way of presenting the data,” says Charles A. Wight, a physical chemist and dean of the graduate school at the University of Utah. “I think what they misjudged was how much people would use their methodology against them.”

“Can you imagine what would happen if the [Associated Press college] football poll were done with ranges?” says Timothy Barbari, a chemical engineer and dean of the graduate school at Georgetown University. “It would drive people crazy.”

Others say NRC’s approach makes sense. “It’s a scientific way of presenting the data,” says Franz M. Geiger, associate chair of the chemistry department at Northwestern University, which achieved high overall ranking ranges of between second and 17th and between third and ninth in the nation in NRC’s new assessment. “Thousands of people provided input. How can that possibly be summarized in one number, without any uncertainty?”

In hard-and-fast rankings, distinctions drawn between a program ranked fifth and a program ranked seventh are negligible at best, says physical organic chemist Marye Anne Fox, chancellor of UC San Diego. “You can easily distinguish a top 10 department from a top 50 department,” she says. Trying to rank with any more precision is a meaningless exercise, she says.

Still others emphasized the value of the NRC’s database. “I think there’s a lot of useful data in there,” says chemist H. Holden Thorp, chancellor of the University of North Carolina, Chapel Hill, where the chemistry department achieved high overall ranking ranges of between fourth and 14th and between eighth and 23rd in the nation in NRC’s report.

C&EN’s analysis of the database shows that, in NRC’s evaluation process, faculty seemed to unconsciously value size as a measure of program quality. Measures of diversity were deemed far less valuable. And public programs were underrepresented in the top 25% of the ranking ranges.

The two overall ranking ranges were calculated by assigning weights to characteristics chemistry faculty said they valued in a graduate program. When faculty were directly asked what made a good graduate program, grants, publications, and citations all came out on top. These were the factors that determined the S, or survey-based, ranking ranges. But when faculty rated programs themselves, program size, in terms of the average number of doctorates awarded per year, was by far the measure most correlated with high marks. These weights went toward calculating R, or regression-based, ranking ranges.
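The effect of the two weighting schemes can be sketched in a few lines of code. Everything here is hypothetical: the program names, metric values, and weights are invented for illustration and are not NRC's actual data or methodology. The point is only that the same programs, scored under stated-preference (S) weights versus regression-derived (R) weights that favor size, can come out in a different order.

```python
# Hypothetical illustration of how S and R weights can reorder the same programs.
# All numbers below are invented, not taken from the NRC study.

def weighted_score(metrics, weights):
    """Weighted sum of standardized program metrics."""
    return sum(weights[k] * metrics[k] for k in weights)

# Standardized (z-score-like) metrics for three hypothetical programs.
programs = {
    "Program A": {"grants": 1.2, "citations": 0.9, "phds_per_year": 0.3},
    "Program B": {"grants": 0.1, "citations": 0.2, "phds_per_year": 1.5},
    "Program C": {"grants": -0.5, "citations": -0.3, "phds_per_year": -0.8},
}

# S-style weights: what faculty said they value (grants, citations dominate).
s_weights = {"grants": 0.5, "citations": 0.4, "phds_per_year": 0.1}
# R-style weights: derived from how faculty actually rated programs,
# where size (doctorates per year) correlated most strongly.
r_weights = {"grants": 0.2, "citations": 0.2, "phds_per_year": 0.6}

def ranking(weights):
    """Programs sorted best-first under the given weight set."""
    return sorted(programs,
                  key=lambda p: weighted_score(programs[p], weights),
                  reverse=True)

print("S ranking:", ranking(s_weights))  # Program A leads on grants/citations
print("R ranking:", ranking(r_weights))  # Program B leads on size
```

Under the S weights the grant- and citation-strong Program A tops the list; under the R weights the larger Program B does, which mirrors the divergence between the two sets of ranking ranges the article describes.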

Faculty didn’t necessarily rate big programs highly because they are big. However, big programs tend to have strengths that fall in line with faculty’s other preferences—C&EN’s analysis of NRC’s database suggests that faculty from bigger programs have on average more publications and more citations than their colleagues in smaller programs.

Strength for a large program “requires lots of world-class faculty. You’re able to cover more areas in chemistry,” says Daniel Neumark, chair of the chemistry department at UC Berkeley, which graduated more chemistry Ph.D.s than any other between 2002 and 2006. “If you took our department and randomly crossed out half the names, the publications per faculty member might be similar, but the department wouldn’t be as good,” he says. “So in that sense, size has a meaning of its own apart from per-faculty metrics.”

“Chemistry deals with laboratories and very sophisticated equipment,” adds Mark Ratner, chair of the Northwestern chemistry department. “To purchase state-of-the-art mass specs and things like that requires a pretty big program to amortize the cost,” he adds.

“Size is an issue, but I don’t think it’s the issue. I think focus is the issue and talent is the issue,” Ratner says.

No matter how the weights are compared, diversity ranked near the bottom: Criteria such as the percentage of underrepresented minority faculty and the percentage of female students were not valued anywhere near as highly as other measures. And in the R-ranking ranges for chemistry and chemical engineering programs, many diversity criteria had negative correlations, meaning that more diverse programs were less well regarded in faculty’s rating exercise.

“It’s self-serving,” says Karen Gale, a professor of pharmacology at Georgetown University who asked the NRC committee about diversity at a public briefing on Sept. 28. Most of the top programs are not diverse, she says. “Why would programs value what they don’t have?”

“There are already many studies that tell us we are not sufficiently diverse,” says University of Oklahoma chemist Donna J. Nelson, who has published extensive surveys on diversity in graduate education. “I don’t know if this one will affect how much attention departments are paying to diversity.”

The importance someone ascribes to diversity may depend on how they frame it, Nelson adds. Diversity might be considered on par with ethics in terms of importance, if one thinks that diversity should be pursued because it’s the right thing to do, she says. “However, if one considers diversity from the practical standpoint, that it is necessary for our country to maintain economic and scientific strength, it could fall in the same degree of importance as patents or technology programs,” she says.

Yet another report won’t matter in making improvements on the diversity front, agrees Gloria Thomas of Xavier University of Louisiana, who was principal investigator of a National Science Foundation grant that funded the Women Chemists of Color Summit symposium held in August at the American Chemical Society national meeting in Boston (“CENtral Science,” The Editor’s Blog, Aug. 25). “The change has to happen with individuals, with their own perceptions and their own personal biases.”

“We still have quite a lot of work to do to make chemistry more inclusive,” Thorp adds. “But I think the quality of research will improve when we do that.”
Beyond the weights that determined the ranking ranges, the ranges themselves contain reason for reflection. For example, public universities are the majority among the 180 programs offering chemistry doctorates that NRC evaluated. C&EN sorted NRC’s database in descending order of S- and R-ranking ranges and determined that public institutions are underrepresented among the top 25% of programs.
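The comparison described above, sorting programs by score and asking whether public institutions appear in the top quartile at a lower rate than in the pool overall, can be sketched as follows. The program names, public/private labels, and scores here are hypothetical placeholders, not NRC data.

```python
# Hypothetical sketch of a top-quartile representation check.
# Names, institution types, and scores are invented for illustration.

programs = [
    ("P1", "public", 92), ("P2", "private", 95), ("P3", "public", 70),
    ("P4", "private", 88), ("P5", "public", 60), ("P6", "public", 55),
    ("P7", "private", 85), ("P8", "public", 50),
]

# Sort best-first by score and take the top 25% of programs.
ranked = sorted(programs, key=lambda p: p[2], reverse=True)
top = ranked[: len(ranked) // 4]

# Compare the public share overall with the public share in the top quartile.
overall_public = sum(1 for _, kind, _ in programs if kind == "public") / len(programs)
top_public = sum(1 for _, kind, _ in top if kind == "public") / len(top)

print(f"Public share overall: {overall_public:.1%}")
print(f"Public share in top quartile: {top_public:.1%}")
```

In this toy data, public programs make up a majority of the pool (5 of 8) but only half of the top quartile, the same pattern of underrepresentation C&EN found in the real database.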

“I think that’s a concern across lots of measures throughout higher education,” Thorp says. Private universities have more flexibility to build departments strategically, and they don’t have to lobby a state legislature for funding, he says. But he notes that change is possible at public universities: UNC Chapel Hill put a plan in place to improve its chemistry program during a recession and then acted to build infrastructure and hire people when cash became available. “You have to be opportunistic when you’re at a public institution,” he says.

Although the report’s biggest strength may be its database, some universities have raised concerns about errors in the data. For instance, the University of Washington, Seattle, reported widespread inaccuracies, including in its chemical engineering department. Daniel T. Schwartz, chair of that department, told C&EN his university’s interpretation of the survey led to vastly overstated numbers of faculty in his department.

He says the inaccuracies could impact universities besides his own for two reasons. First, he says different schools interpreted NRC’s questions about faculty numbers differently, an inconsistency that could confound the many metrics that are on a per-faculty-member basis. And second, his university’s error might be amplified if it was one of the schools used in determining the R-ranking weights, information that NRC has not disclosed. “I’m less worried about our ranking than I am about data integrity,” he says.

The committee has announced a process for evaluating possible mistakes and corrections, which will be posted at sites.nationalacademies.org/PGA/Resdoc/PGA_044475. The committee plans to recalculate ranking ranges in response to updated data, but only if errors are deemed to be on NRC’s part, says Charlotte Kuh, who directed the NRC study.

For now, universities are watching for short-term impacts of NRC’s report and planning to use it for long-term improvements. “We certainly will be watching our applications this year to see whether there’s any noticeable change,” says chemical engineer William B. Russel, dean of the graduate school at Princeton University. Russel is also comparing Princeton with its peers on measures he hasn’t had easy access to before, such as the percentage of women in the sciences and the availability of professional development opportunities for students.

“One of the most important things this study has done is to put the metrics on the table,” Wight adds. When NRC came up with its 20 graduate program criteria, it sent a message that this was data worth collecting regularly, he explains. As a result, Utah has begun keeping track of things it hadn’t done systematically before, such as median time to degree, he adds.

And NRC’s database makes strategizing for long-term improvements easier, Wight adds. “It’s a matter of looking at the data and searching for low-hanging fruit. Sometimes there might be trivially easy things you can do to bring your program in line with top-tier programs,” he says.

As for NRC, the committee hopes to hold workshops this winter to discuss uses of the data with universities. Now that the methodology is in place, they hope to raise enough money to update portions of the ranking ranges in two to three years.

“There’s no shortage of rankings out there, but it’s probably good to have another way of looking at things that is statistically based,” Thorp says. “It’s good for higher education that these data have come out.”