What Parents Need To Know About Big Data And Student Privacy


INTERACTIVE: What Kinds Of Data Do States Collect?

My first brush with professional journalism — and with violations of student privacy — came when I was a sophomore at Yale. It was 1999, and George W. Bush, a Yale alumnus, was running for president.

A writer for The New Yorker cold-called my dorm room looking for students who might have access to Bush's records. By sheer coincidence, a friend of mine who worked in the dean's office had, out of curiosity, lifted W's transcript from the files. A Deep Throat-style handoff was arranged, anonymity assured, and the candidate's grades ran in the magazine. They were mostly C's.

Today, getting hold of the transcript of a VIP, or of any student, would require less in-person skulduggery and more clever computer searching. That's because student data have largely moved online in just the past few years. Information is being collected and distributed at unprecedented scale, from the time that toddlers enter preschool all the way into the workforce.

And that shift is forcing policymakers and legal experts to improvise new policies and procedures aimed at protecting the privacy of young people. Critics fear the misuse of student data by hackers, marketers and, most worryingly, by the government authorities who themselves are collecting it.

Concern is growing:

This month, a working group on big data and privacy appointed by President Obama released its findings. Alongside recommendations to update the "Consumer Privacy Bill of Rights" and pass national legislation regarding data breaches, the group singled out data collected on students in school as a matter of special concern.

In March, New York became the first state to make it someone's job to oversee this vital issue, creating a position called chief privacy officer in the Education Department. The job description? "Establishing standards for educational agency data security and privacy policies." Translation: providing the state's 698 school districts and over 500 colleges and universities, as well as state agencies, with uniform approaches to managing — and protecting — student data such as test scores, transcripts, health information, even dates of birth, racial or ethnic standing and Social Security numbers.

In that same legislation, New York became the last state to pull the plug on InBloom. The project was supposed to create a shared infrastructure for storing student data and making them available to educational software developers, but it had to shut down after drawing the ire of privacy advocates.

But the potential here is great as well. A report by McKinsey last year singled out education as the sector that could benefit the most from the free exchange of data, adding as much as $1.2 trillion to the economy through more efficient, effective instruction.

And better use of data in education could even prevent tragedy, such as the death of Avonte Oquendo, a 14-year-old autistic boy who wandered off school property in Queens after his mother had warned his teacher he "likes to run."

Student data used to be the pet cause of a small group of lawyers and activists. Now, in part because of the InBloom controversy, the issue is gaining broader attention. This year, 82 bills that address student privacy in some way have been introduced in 32 states.

But what, exactly, is new here? How worried should you be as a parent? And what are the remedies?

What's New?

Schools have always kept records. They need all sorts of information: students' names and addresses, grade books, attendance, transcripts and disciplinary records; information about health and learning disabilities; even family income numbers that determine eligibility for the federal free and reduced-price lunch program. Once, these files were on paper, and later they were stored on hard drives at individual schools and districts.

Until very recently, as students moved from elementary to middle school and high school and college, little more than a one-page transcript followed.

In 2005, things changed. The federal government began awarding grants for the creation of Statewide Longitudinal Data Systems, or SLDS. That marked the entrance of big data into education, enabled by the leaps forward in the ability to store and process information on remote servers "in the cloud." States and schools for the first time could centralize, organize, search and analyze information on millions of students, in the ways that corporations have been doing for decades. And many for-profit companies, like Google, have rushed in to help them do this, providing software to collect and crunch this information.

As the name implies, Statewide Longitudinal Data Systems create unique numbers to identify students and track them from the day they enter kindergarten, or even preschool. In other words, a wealth of information, contained in a single record, can follow a student for 20 years or more. (For a list of exactly what states are tracking, see the interactive graphic "What Kinds Of Data Do States Collect?")

In some states these systems are being extended and shared across state lines, with private and for-profit colleges, and with employers. They're also being used in at least 17 states to track teachers, from their grades in teachers' college to their students' performance in the classroom.

So, you can see why privacy advocates have concerns. "My younger son's records were breached when he was in college," says Sheila Kaplan, a New York student privacy activist who was involved in drafting the legislation creating a chief privacy officer. "I was amazed. Why would you collect records that you can't protect? As more parents become aware of this, they are freaked out."

"Most people, when you say the word 'educational data,' the first thing that comes to mind is a test score," says Aimee Guidera of the Data Quality Campaign, a group backing the data push. "We're helping to redefine what data is in education, so it really is a wide breadth of data points that come together to provide a richer picture."

Here's an example. In 2013, New York City learned with the help of student data tracking that almost four of five public high school graduates needed remediation when they got to city community colleges. That suggests a mismatch between what was on the state high school tests, and what students actually needed to know in college.

Data systems could soon integrate software designed to monitor learning and provide feedback to teachers, schools, students and parents. Programs such as DreamBox Learning, Khan Academy and Scholastic's Math 180 automatically crunch information at split-second intervals, from how many problems a student solved to the time he or she spent doing it. This information can create a detailed picture of student performance, and prompt teacher interventions at just the right moment — an innovation known as "learning analytics."

In 2009, the U.S. Department of Education made creation of an SLDS mandatory for any state that wanted to win funds under its Race to the Top program. "That provided a boost to underscore the work we were doing," Guidera says. "It reinforced what states were already doing, raised the priority and made the data issue a sexy one by calling it out as something we needed to focus on."

Today, all 50 states and the District of Columbia, Puerto Rico and the Virgin Islands have made progress on creating an SLDS. Guidera says 44 states have at least some ability to connect K-12 with postsecondary data.

What The Law Says

The main law that governs data kept by public schools is the 1974 Family Educational Rights and Privacy Act, or FERPA. It gives parents, and students themselves once they turn 18, three rights: to inspect the student's records, to correct those records, and to give consent in writing before the release of those records to any third party.

Well, for the most part. There are two blanket exemptions. One covers the "what" (of student information) and the other the "who" (is authorized to see it).

"The big hole in FERPA is directory information," says Sheila Kaplan, the privacy activist. She explains: FERPA allows schools to release a student's "name, address, telephone number, date and place of birth, honors and awards, and dates of attendance" without first obtaining consent (although they are supposed to disclose the release and allow parents to opt out of directories).

The second hole got much, much wider in the past few years.

FERPA always allowed school officials to release records to other education officials without parental consent. In 2008, that right was expanded to contractors and volunteers, as long as they were under "direct control" of schools. This included for-profit cloud service providers.

"That opened the doors for the Googles and the Microsofts," says Khaliah Barnes, director of the Student Privacy Project at the Electronic Privacy Information Center, or EPIC. In 2011, a second exemption allowed schools to release information to "authorized representatives" of state authorities.

The Data Quality Campaign was among those that pushed for the changes. It argued that individual parental authorization was "impractical" for big-data systems.

In 2012, EPIC sued the Education Department to fight the new regulations. The suit was dismissed for lack of standing.

Privacy advocates say that all of this creates three separate categories of risk — from hackers, marketers and spies.

Hackers

The more information is collected, the more it is centralized, and the longer it is stored, the greater the danger from hackers. "If a school district loses a computer disk with information of about 200 kids on it, it's terrible," says Joel Reidenberg, director of the Center on Law and Information Policy at Fordham Law School. "Now let's say it's a large data set that includes 20,000 students that gets breached. The magnitude is much greater."

There have been several recent examples at major universities where tens of thousands of student records were stolen or accidentally exposed.

Reidenberg notes that school districts may have lower security than large corporations or universities. "You have failures at institutions that are spending millions trying to protect the security of their data. Is there any reason to believe that school systems are going to be more successful?"

Adding to the concern, many apps are designed by their very nature to make student data more shareable than ever before. According to a recent investigation by Politico, apps such as LearnBoost, a free online gradebook program, make it easy for teachers to email student records around the web. Similar concerns arise when teachers start Facebook pages, as many do, exposing classroom discussions to the commercial web.

So far, there have been no reports of large breaches of SLDS, although Reidenberg says that could be because we don't know about them. Last year his center at Fordham Law examined agreements between school districts and their cloud providers and found that most do not require providers to disclose any leaks.

Marketers

The second concern is that student data will be monetized. Reidenberg's study found that fewer than 7 percent of district contracts restricted the sale or marketing of student information by vendors. It did not, however, say how many of the cloud service providers are actually selling that info.

But often the issue is murkier than the outright sale of information. For many cloud services, like Google Apps, the entire business model is based on mining data for marketing. "A quarter of the services are free to the districts; the providers are monetizing it somehow," Reidenberg says. Even the nonprofit Khan Academy allows third parties like YouTube to track students' web usage.

In practice, defining the commercial misuse of student data is tricky. A program such as Pearson's enVisionMATH, a software-based tutoring platform, continuously analyzes millions of data points on student performance in order to improve its products and pitch more relevant products to school systems. That's both an educational and a commercial use.

Spies

The final potential threat comes not from shady hackers or greedy vendors. It's that the very people who create and maintain these databases will misuse the information.

Recall that advocates want students tracked all the way into the workforce. "We have a real disconnect about knowing how well we're preparing students for the world of work," the Data Quality Campaign's Guidera says. But for privacy experts, this raises the specter that a student's suspension in third grade could be used to deny him a job 15 years later. Or what about someone being denied a spot in a public university because big data predict she is overwhelmingly likely to drop out?

"Without any kinds of deletion, we've really created that permanent record," Reidenberg says. "If someone makes an 'adverse decision' about you based on your credit report, they have to inform you." Yet no such protection exists for student records.

What Is To Be Done?

The issues at stake with student data are similar to those involving health records, credit reports and consumer information like credit card numbers. But there's something more: Privacy advocates say the keeping of long-term records about young, unformed people deserves special scrutiny.

EPIC's Khaliah Barnes has drafted a "Student Privacy Bill of Rights." It would enable individual students to:

limit access to their information; and

amend their own records.

It would also:

limit the amount of data collected and how long it is stored;

ban commercial use and other inappropriate reuse; and

mandate transparency and accountability about all of the above.

The Data Quality Campaign, too, acknowledges that, as an advocate of more data collection and use, it must also champion good privacy practices.

"Parents need to understand: Why is this information being collected? How is it going to help my child? What is being done to protect it? Lots of places aren't doing these things," Guidera says. "We're encouraging states and districts to annually publish a list so people can see exactly what's being collected and why." But disclosure is not enough, Guidera says; the responsible use of data means using that data to help students, not to trigger negative consequences. "We're seeing a culture shift in education, from using data as a hammer to hit people over the head, to a flashlight to light their way."