It’s become the norm in America for parents to capture their children’s smiles, tantrums, and impish shenanigans—sometimes cute, sometimes deeply embarrassing—on blogs, YouTube videos, and Twitter feeds. But MIT professor Deb Roy makes even the most obsessive at-home documentarians seem inattentive: He recorded, on video and audio, nearly every waking moment of the first three years of his son’s life—not as an exercise in parental vanity, but in the name of science. His goal was to create as complete a picture as possible of how one child learns a language. For his study, “The Human Speechome Project,” he embedded 11 cameras and 14 microphones in the ceilings of his home, and set them to record for an average of 12-14 hours a day. Now Roy and his team have begun the enormous task of trying to make sense of the data—all 120,000 hours of it.

Roy’s decision to use his own child as a research subject makes people uncomfortable: When the New York Times wrote a story about Roy, the comments were on the outraged side. Recording a child for amusement is one thing, but taking those recordings to the lab for analysis may be quite another. It may seem unethical—perhaps even dangerous to the child’s mental health.

But it’s crucial to realize that while Roy’s using the latest technology, his tactic is not new: Language researchers have long used their children as subjects. All parents feel a sense of wonder as they watch their children piece together their first words, and their first phrases; scientist parents can’t help but feel professional curiosity as well. Because some have given in to the pull of this curiosity and turned their observations into data, we’re a little bit closer to figuring out the mystery of how humans acquire language.

One of the early notable studies of a child by his parent was published by the philosopher Dietrich Tiedemann in 1787. From Tiedemann’s careful notes we learn that his son Friedrich began to communicate by pointing at 8.5 months, that he first said “duck” and “potato” at 23 months, and that he had a much easier time pronouncing p, t, and k than z, w, and sp. In those days there was plenty of philosophical discussion about the nature of children and what they knew when: Were they blank slates, or did they possess innate knowledge? Did they understand concepts before language? For the most part, these issues were debated in armchairs. Tiedemann’s approach was more practical: Here is a child, let’s see what he actually does. He rejected anecdotal evidence and out-of-thin-air wisdom, relying instead on carefully recorded observations. His study didn’t settle the blank-slate debate at the time, but it encouraged others to try an empirical approach to scholarship. Tiedemann’s conclusion—that children do, in fact, possess some pre-linguistic knowledge—has since been borne out by a couple of centuries’ worth of research.

In the 1800s empiricism gained in popularity and many scientists published studies of their own children, Charles Darwin among them. But it wasn’t until the 1900s that parents went beyond jotting down things that struck them as interesting and started keeping thorough journals of everything their kids said. The psychologists Clara and William Stern published their Kindersprache in 1907—a detailed study of the first three years of their children’s lives that catalogued the sounds, words, and parts of speech they used at various stages. Jean Piaget completed studies of his children in the 1920s. In the 1940s, the linguist Werner Leopold published four volumes of notes on every aspect of his two daughters’ bilingual language development (they were raised with German and English), including a meticulous phonetic record of the girls’ pre-lingual babbling stage.

The sheer size of these studies meant they could be mined by scholars for data. They offered enough information to investigate questions like: How many nouns and at what age? What kind of sounds and in which order? Such information had never been catalogued in this way or in this quantity before, and it helped advance the field of linguistics. Roman Jakobson, one of the most influential linguists of the 20th century, pulled from these studies to support his theory that languages are not collections of particular sounds, but systems of contrast. Children acquire language by sorting out the difference between, say, sounds articulated with the lips (labial features) and those articulated with the tip or blade of the tongue (coronal features). Put another way: They’re not figuring out how to pronounce man so much as recognizing the difference between man and ban. (The former word starts nasally, the latter does not.)

Until about 1950, there was a sense in which researchers had to use their own children as subjects. How else would it be possible to get the kind of access needed to collect evidence? But advances in recording technology made it possible to gather data from a child without actually living with him. Scientists started tape-recording interactions between children and their parents for a few hours at a time, either in the home or in the lab. And the birth of cognitive science introduced a new method for looking at child language: the controlled experiment. You didn’t have to look at everything the child did; you could just come up with a specific hypothesis and then test it. In 1958, for example, Jean Berko Gleason developed her famous “Wug Test.” By asking a brief series of fill-in-the-blank questions, such as “This is a wug. These are two …?” she showed that even very young children internalize the word-building rules of language and can produce correct examples of those rules (“wugs”) that they had never heard before.

Yet researchers continued to use their own children as subjects, because the practice has always been, and will always be, more than a matter of convenience. It’s not as if scientists have children in order to test out a theory. Rather, they have children and then find that the experience of watching them acquire language raises all sorts of questions. Then they follow where the children lead them.

In 1962, Ruth Weir published Language in the Crib, a study of the monologues her toddler son produced alone, while drifting off to sleep. A senior colleague of hers had a hard time believing that children really did this—they are learning language from others at that stage; why would they talk to themselves?—so he asked around, and all the mothers he talked to said, “Of course! Children do this all the time!” Any mother could notice this behavior; it took a linguist mother to identify it as an area for research. The monologues of Weir’s son showed that very young children rehearsed and experimented with linguistic structures on their own. And the study of “crib talk” became a new way to find out how toddlers come to understand the world.

In the 1980s, Jeri Jaeger, another linguist mother, decided to write a book on children’s slips of the tongue after a colleague informed her of the “well-known fact” that children didn’t make slips of the tongue until age 7. She subsequently sent him a list of 100 slips that her daughter had made before the age of 3. Kids don’t make slips of the tongue very often, but if you’re around kids all the time (because they live in your house), and pay special attention to their language (because you’re a linguist), you’ll find that there’s a lot of territory between “not often” and “never.”

Parent-child studies helped popularize the use of empirical research in linguistics; they have inspired new theories and exposed facts about language behavior that no one had yet considered. They have, in brief, been good for the profession.

They also don’t seem to have done any harm to children. The point of these studies is to describe nature, and so nature is allowed to take its course—it’s just being observed and documented more closely than it might otherwise be. There is still, of course, the potential hazard of exposing intimate details of a child’s life that he or she might not like to have exposed. It’s lucky that the obscure 1919 study “Parallel learning curves of an infant in vocabulary and in voluntary control of the bladder” was made in the pre-video era. For reasons that they don’t explain very well, the parents of the little girl in this study thought it would be interesting to compare her toilet-training success rate with her rate of word learning. So they logged over 4,600 entries on her potty “successes” and “accidents.” (Their approach didn’t have much of an impact on science. In retrospect, it looks more like a harbinger of the Facebook status update.) But as long as scientist parents follow professional guidelines about subject privacy and leave highly personal information out of their studies, kids are more likely to feel violated by their parents’ YouTube accounts than by their journal articles.

The child’s potential embarrassment is not the true issue here, though. Often the real objection arises from a harder-to-explain feeling that there is something unfair, or just cold, and a little icky, about a parent turning the microscope on his or her own child. The child has no choice; the parent has all the power.

Ironically, however, Roy’s study indicates that the balance of power could shift as technology becomes more sophisticated. With Roy’s Speechome project, it’s possible to analyze not just what the child is doing, but what he hears—which is to say, what Roy and his wife are doing. At the recent conference of the Cognitive Science Society, Roy’s team presented a paper that turned the microscope on the parents, showing how the way they alter the complexity and prosody of their speech influences the way the child learns. In an e-mail, Roy told me that he set out thinking that “language development” described a process that the child went through, but in analyzing the data he came to see it as a process that the parents go through as well. The more all-encompassing and detailed the study of child language becomes, the more we end up looking at what goes on around the child. The parent becomes a subject of his own study.
