Submitted on June 1, 2011

The recent New York Times article, “Tests for Pupils, but the Grades Go to Teachers,” alerts us to an emerging paradox in education – the development and use of standardized student testing solely as a means to evaluate teachers, not students. “We are not focusing on teaching and learning anymore; we are focusing on collecting data,” says one mother quoted in the article. Now, let’s see: collecting data on minors that is not explicitly for their benefit – does this ring a bell?

In the world of social/behavioral science research, such an enterprise – collecting data on people, especially on minors – would inevitably require approval from an Institutional Review Board (IRB). For those not familiar, an IRB is a committee that oversees research involving people and is responsible for ensuring that studies are designed in an ethical manner. Even to conduct a seemingly harmless interview on political attitudes or to observe a group studying in a public library, the researcher would almost certainly be required to go through a series of steps to safeguard participants and ensure that the norms governing ethical research are observed.

Very succinctly, an IRB’s mission is to see that (1) the risk-benefit ratio of conducting the research is favorable; (2) any suffering or distress that participants may experience during or after the study is understood, minimized, and addressed; and (3) research participants agree to participate freely and knowingly. Usually, subjects are asked to sign an informed consent form, which includes a description of the study’s risks and benefits, a discussion of how confidentiality will be guaranteed, a statement on the voluntary nature of involvement, and a clarification that refusal or withdrawal at any time will involve no penalty or loss of benefits. When the research involves minors, parental consent and sometimes child assent are needed.

In short, IRB procedures exist to protect people. To my knowledge, student evaluation procedures and standardized testing are exempt from this sort of scrutiny. So the real question is: Should they be? Perhaps not.

One would think unethical research is a thing of the past, but the truth is that there are still many lapses and areas that lack sufficient regulation – see here for a discussion of the use of human tissue samples in medical research, à la Henrietta Lacks. Is standardized testing an area that has managed to fly under our ethical/IRB radar?

In my experience, IRBs ask the tough questions that researchers – sometimes too enamored with their work – may be tempted to overlook: Do you really need these data? Why? What is to be gained and learned from the research? What are the downsides? In addition to safeguarding participants, these questions are conducive to rigorous and theory-driven data collection – there’s no “collect data first, ask questions later,” so to speak, but quite the opposite.

This post is not about the appropriate use of student test scores or whether they should play a part in teacher evaluations. This is about looking at these questions from a different angle, using an entirely different framework. If collecting data on kids is what we are doing, then, at a minimum, shouldn’t we be doing so in an ethical manner? This is not to suggest that current practices are unethical; I can’t speak to that. But in the world of social science research – educational tests being one of the few exceptions – any work that involves research on humans is subject to IRB scrutiny and supervision, which acts as a sort of quality assurance that seems to be missing in the case of student testing.

Such a framework should also help structure important questions that many people are already asking about standardized tests – Are they worth my kid’s time? Are they a good measure of student learning? How will they be used to help improve my child’s instruction? What else could my child be doing if he or she weren’t busy taking tests or doing test prep? In what ways might tests be harmful to my kid? Should students be aware that their scores will be used to evaluate their teachers, and why (or why not)? Do parents know that they have the right to opt out of having their children take these tests?

Many of these issues are already being raised; what seems to be missing is a framework under which all of these questions are routinely, inclusively, and simultaneously asked for each individual instance of testing. Only such a schema, I argue here, would be conducive to both improved research designs (i.e., better data for better teacher evaluation models) and a coherent and comprehensive set of answers for parents, kids, and educators.

- Esther Quintero