Fifty years ago, almost every publisher in the United States was independent. Beginning in the late 1960s, multinational corporations consolidated the industry. By 2007, four out of every five books on bookstore shelves were published by one of six conglomerates: corporate entities that hold businesses from different industries under one governing financial structure. I call this period—from, roughly, RCA’s purchase of Random House at the end of 1965 until the release of the Amazon Kindle and the 2007–8 financial crisis—the conglomerate era.

The conglomerate era was full of prophecies about the coming death of literature, or, on the other hand, its continued flourishing. Literature, said the doomsayers, needed some freedom from commerce to survive. Otherwise we’d be left with only cookbooks and celebrity memoirs. Novelists, especially, rattled their swords. They even convinced the US Senate, in 1980, to hold a hearing about breaking up the conglomerates. E. L. Doctorow argued on behalf of PEN that “the concentration into fewer and fewer hands of the production and distribution of literary work is by its nature constricting to free speech and the effective exchange of ideas and the diversity of opinion.” Publishers countered that—either in spite or because of their consolidations—more and more diverse literature was being published than ever.

The terms of the debate have remained remarkably constant. Literature will die or flourish. Meanwhile, under pressure over time, literature transformed. Into what?

A decade ago, Zadie Smith imagined “Two Paths for the Novel”: lyrical realism or avant-garde. Meanwhile, Mark McGurl published The Program Era, about how creative writing programs changed American literature. The editors of n+1 responded by proposing that the two paths for the novel were MFA or NYC: creative writing programs or New York publishing.

Both essays tell partial truths. By missing corporate conglomeration, they miss the whole. The two paths paved by the period—which subsume and reorient realism or avant-garde, MFA or NYC—were nonprofit or commercial. Two different ways of structuring publishers’ finances created a split within literature, yielding two distinct modes of American writing.

When conglomerates took control of publishers, they installed management practices designed to enforce greater attention to the bottom line. A conglomerate contains businesses from distinct industries. In many cases, publishers became small parts of media conglomerates, such as CBS and News Corp. Fixation on profit raised the hackles of authors and editors, who penned screeds, held protests, and eventually created an alternative system: nonprofit publishing.

If conglomeration chained books to the market, and the market restricted aesthetic freedom, then one solution was to subsidize publishers. To take state and philanthropic money, presses had to become nonprofits. Feminists, antiracists, aficionados of translation, and countercultural printers did, and built a movement. It took a decade to gather momentum. By the 1990s, journalists reporting from the American Booksellers Association’s conference wrote, “It is as if, in 1991, a critical mass has been achieved, with the small press section now crystallized into … a temple to democracy in book publishing.” No longer a world of relatively small independent commercial presses, by the early 1990s the publishing industry had become divided into the Big Six and a swarm of upstart nonprofits.

But had this made any difference for what lay between a novel’s covers? Did conglomeration change the stories we told or how we told them? It’s a big question. Thousands of novels were published each year, far too many for me to read even a fair sample and draw conclusions. For help, I turned to new data science methods.

Nonprofit publishers claim—in grant applications, press releases, interviews—that because they are liberated from the market they can publish literature that distinguishes itself from that of the conglomerates by being more literary.

Literariness is a contested category. Some argue it names the unique style that separates literary from genre or mass-market fiction. Others argue that literariness has more to do with sociology than with the text itself: that is, literariness resides in the extratextual act of claiming separation from the demands of the market. Nonprofits perform the latter; the claim creates its truth in the act of declaring it: literature published by nonprofits becomes, by definition, literary. But is there truth to the former? Is there anything in the text that distinguishes nonprofit fiction from that of the conglomerates?

I needed a method that could distinguish between conglomerate and nonprofit novels solely on the basis of text. Text classification, a method popular in machine learning and artificial intelligence, allows for that.

It works like this. Everyone with email relies on text classification to separate spam from legitimate emails. Email providers train their computational models to recognize the difference by giving them emails labeled “spam” and others labeled “not spam.” They then ask the model to learn the features that most reliably distinguish the two types, which could include a preponderance of all caps or phrases like “free money” or “get paid.” They test the model by giving it unlabeled emails and asking it to classify them. If the model can do it accurately a high percentage of the time, that’s a good spam filter.
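The train-then-test loop described above can be sketched in a few lines of Python. This is a minimal illustration that assumes a naive Bayes model, a common choice for spam filtering but not one the passage specifies. The function names (`tokenize`, `train`, `classify`) and the four toy emails are invented for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(labeled_texts):
    """Count word occurrences per label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labeled_texts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from the share of training emails carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy emails, for illustration only.
training_emails = [
    ("free money get paid now", "spam"),
    ("get paid today free offer", "spam"),
    ("meeting notes attached for tomorrow", "not spam"),
    ("lunch tomorrow with the editors", "not spam"),
]
word_counts, doc_counts = train(training_emails)
print(classify("free money offer", word_counts, doc_counts))
print(classify("notes from the meeting", word_counts, doc_counts))
```

The same pattern carries over to novels: swap the two email labels for two publisher categories and the emails for the texts of the books.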

I built a model and tested whether it could distinguish between conglomerate and nonprofit novels. I trained the model to see novels in terms of diction and syntax. It tracks which words and parts of speech are used, and how often. If the model, on this basis, can’t tell the difference between conglomerate and nonprofit novels, then that doesn’t prove that there’s no difference between them, just that there’s no difference that a computer can discern on the basis of diction and syntax. Based on my experience reading novels from both categories, it was unclear to me whether they were different in ways detectable by a machine. And I could not separate my experience from my knowledge of a novel’s publisher and reputation, a disadvantage the model does not share.

The model should, by random chance, as with a coin flip, be right 50 percent of the time. It is right about 70 percent of the time. This is not an excellent rate, but it is much better than random. “Conglomerate” and “nonprofit,” definite categories for a novel’s provenance, ought to be imprecise as indicators of a novel’s diction and syntax. That my crude model is enough to accurately predict classes 70 percent of the time impressed me. If it were much better than this, I would be concerned. Given the imprecise science of acquisition, the idiosyncrasy of editorial habits, and the circulation of staff and authors between conglomerates and nonprofits, I expected at least some overlap and blurriness.
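To see why 70 percent counts as "much better than random," one can ask how often coin-flip guessing would do at least that well. The sketch below computes that probability with an exact binomial sum. The test-set size of 100 novels is a hypothetical figure chosen for illustration, since the essay does not report one, and the function names are invented.

```python
from math import comb

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def chance_tail_probability(n_correct, n_total, p_chance=0.5):
    """Probability of at least n_correct right answers out of n_total by guessing."""
    return sum(
        comb(n_total, k) * p_chance**k * (1 - p_chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )

# Hypothetical held-out set: 100 novels, 70 of them classified correctly.
n_total, n_correct = 100, 70
p_value = chance_tail_probability(n_correct, n_total)
print(f"accuracy: {n_correct / n_total:.2f}")
print(f"probability of doing this well by coin flip: {p_value:.2e}")
```

The larger the held-out set, the smaller this probability becomes for the same 70 percent rate, so the real figure depends on how many novels were tested.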

The model shows that conglomerate and nonprofit novels differ at the level of text. Is the difference a matter of “literariness”? One affordance of my machine learning model is that it records the features that most reliably distinguish between classes. It gives every word or element of syntax a weight that it uses to classify the novels. Here are the words and elements of syntax that most differentiate the categories:

A glance is enough to notice the coherence of the nonprofit features, as well as how they differ from those of the conglomerates. Nonprofit features are mostly about embodiment: body parts (“shoulders,” “fingers,” “palms”), actions performed with a body (“tear,” “kicking,” “beating,” “jumping”), what a body might perceive (“sweat,” “swollen,” “stiff,” “rhythm”), and “body” itself.

The conglomerate features require more interpretation. I split them into three groups, displayed below with a selection of the most common words in the above graphic. The first group demarcates a world of law (“lawyer,” “expert,” “deal,” “judged,” “injury”), crime (“murdered,” “blackmail,” “suspicion,” “demanded”), and power (“ambition,” “maître,” “affair,” “champagne,” “dominate,” “senior,” “envying”). The second signals bureaucracy (“desk,” “office,” “capacity,” “type,” “supply”) and form in the sense of order, logic, and pattern (“coherent,” “ordering,” “reasonably,” “ruled,” “including,” “fitted,” “type”). The third has dispositions and mores of polite society (“assume,” “mood,” “thank”).
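Feature weights of the kind discussed above can be illustrated with a simple stand-in: the log ratio of a word's smoothed frequency in one category to its frequency in the other. This is an assumption for the sake of example, not the weighting used by the model in this essay, and the snippets below are invented placeholders for full novels.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_weights(texts_a, texts_b):
    """Smoothed log frequency ratio per word: positive leans A, negative leans B."""
    counts_a = Counter(w for t in texts_a for w in tokenize(t))
    counts_b = Counter(w for t in texts_b for w in tokenize(t))
    vocab = set(counts_a) | set(counts_b)
    # Add-one smoothing keeps words absent from one category finite.
    total_a = sum(counts_a.values()) + len(vocab)
    total_b = sum(counts_b.values()) + len(vocab)
    return {
        w: math.log((counts_a[w] + 1) / total_a) - math.log((counts_b[w] + 1) / total_b)
        for w in vocab
    }

# Invented snippets standing in for full novels.
nonprofit = ["her fingers and shoulders ached with sweat", "his body kept the rhythm"]
conglomerate = ["the lawyer closed the deal at his desk", "the office demanded a coherent deal"]

weights = word_weights(nonprofit, conglomerate)
ranked = sorted(weights, key=weights.get)
print("most conglomerate-leaning:", ranked[:3])
print("most nonprofit-leaning:", ranked[-3:])
```

Sorting by weight gives two ends of a list, one per category, which is the shape of evidence the readings above interpret.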

I built this model to investigate whether nonprofits are, as they claim, more literary than conglomerates. The results allow me to extend recent computational studies into literariness and answer yes. These studies have shown, for example, that the “particular nature” of fiction is its commitment to our “perceptual” engagement with the world, “grounded in an appeal to encounter rather than reality.”

Does fictionality equal literariness? In a broad sense, fiction is literature, and what marks it as such ought to be named literary. But literariness is also used to demarcate literary fiction from genre writing. In this sense, nonprofit fiction is literary because it does not mark itself with a clear genre signal but doubles down on what is distinct about fiction: language of embodiment. It winks at doing this with the one word that seems at first an outlier: fiction itself. These results lend credence to a definition of literariness: fiction that emphasizes what distinguishes it as fictional.

The language of conglomerate fiction is different, less about embodiment. One strand, that of law and crime, belongs to detective fiction, with murder, blackmail, and courtroom drama. It is, that is, generic. The rest invite speculation that an allegorizing process is at work—conglomerate books tend to express conglomeration itself. And not just any conditions of conglomeration, but specifically bureaucracy (ruled, ordering, coherent, desk, type); a results-driven professionalism (ambition, champagne, deal); and the language of correspondence, petition, and rejection (thank, goodbye, intend, assume, ignore). We might see these as allegorizing the processes that I have argued elsewhere define the conglomerate era: rationalization, the bottom line, and the rise of the agent.

There is something science-fiction-y about this, as if authorship worked at multiple scales, as if the many minds cooperating within modern bureaucracy to bring a book to print composed, beyond their will, a collective agency. Maybe even a collective authorship. Conglomeration expresses itself as mechanical, nonprofits as fleshy. Machine, body. Contemporary literature has a dualism.

Computational analysis is recursive. Initial results are provisional, an opportunity for reflection and revision. The mechanical intelligence of conglomerate novels must be understood in a context of the history of the firm during capitalism’s long postboom downturn. The literariness of nonprofit novels must be understood in a context in which state and private money carries expectations. According to Ralph Ellison—an original appointee to the advisory board for the National Endowment for the Arts, which provided the grants to establish the nonprofit publishing movement in the 1980s—art does the work of lubricating the otherwise dangerous friction generated by differences internal to American society. “By projecting free-wheeling definitions of the diversity and complexity of American experience it allows for a more or less peaceful adjustment between the claims of ‘inferiors’ and ‘superiors’—a function of inestimable value to a society based, as is ours, upon the abstract ideal of social equality.” Ellison celebrates art for its ability to resolve real inequality with symbolic projections, a boon for a democratic state under capitalism. This, Ellison contends, is the mission of the NEA.

How, then, does the conglomerate-nonprofit division—the mechanical, the embodied—influence representation along axes of inequality, such as gender and race? This question brings me to the cutting edge of computational methods. I am revising my model to ask how white women and writers of color differently express the institutional conditions from which they write. I am researching how, amid the patriarchy and the extraordinary whiteness of the industry, writers leveraged their constrained agency to respond to systemic forces. Authors found themselves responding with newly generative forms, like autobiographical fiction (Chris Kraus’s I Love Dick, Percival Everett’s Erasure) and literary genre fiction (Joan Didion’s The Last Thing He Wanted, Karen Tei Yamashita’s Tropic of Orange). But that’s a story for another day.

This article was commissioned by Richard Jean So.

Featured image: Costco Warehouse - Book Stacks. (2007). Photograph by brewbooks / Flickr