The world’s largest collection of words is now available to the public, for free, through a new Google online database that opens a door to the evolving landscape of language.

The potential of this searchable library is unveiled in today’s issue of the journal Science, where a Harvard University-Google team shows how word usage has waxed and waned over the past two centuries. Their study yields cultural insights as diverse as the spread of innovation, the effects of youth and profession on fame, and even trends in censorship.

The Google Books “Ngram Viewer” and the downloadable raw data set (ngrams.googlelabs.com) achieve what mere mortals can’t: analysis of 500 billion words from 5 million books published over the past four centuries, part of Google’s ambitious book-scanning project.

The search tool makes it possible for anyone to chart usage of a word or phrase over time, using computer technologies that scroll through a sequence of letters. If strung together, the letters would reach to the moon and back, 10 times over.

The resulting word frequencies are charted on a graph, where they can be compared and contrasted. And by clicking on the decades at the bottom of the graph, it’s possible to view the actual titles in which a word appears.
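For readers who download the raw data, the kind of calculation behind such a chart can be sketched in a few lines of Python. This is a minimal illustration, not Google's actual code: it assumes each record is a (phrase, year, count) triple and that the total number of words printed each year is known. The function name, the record layout and the toy numbers are all made up for the example.

```python
from collections import defaultdict


def relative_frequency(records, totals_by_year, phrase):
    """Return {year: share of all words printed that year taken up by `phrase`}.

    `records` is an iterable of (ngram, year, match_count) triples and
    `totals_by_year` maps each year to the total word count for that year.
    Both layouts are assumptions made for this sketch.
    """
    counts = defaultdict(int)
    for ngram, year, match_count in records:
        if ngram == phrase:
            counts[year] += match_count
    # Divide by the yearly total so years with more books are comparable
    # to years with fewer; years with no matches get a frequency of 0.
    return {
        year: counts[year] / total
        for year, total in sorted(totals_by_year.items())
        if total > 0
    }


# Invented toy numbers, for illustration only (not real Ngram data).
records = [
    ("grill", 1965, 2),
    ("fry", 1965, 4),
    ("grill", 2000, 9),
    ("fry", 2000, 4),
]
totals_by_year = {1965: 1000, 2000: 1000}

grill = relative_frequency(records, totals_by_year, "grill")
fry = relative_frequency(records, totals_by_year, "fry")
```

Plotting the two resulting series against the year would give the sort of rising and flat lines described for “grill” and “fry.” Dividing by each year's total matters because far more books were published in recent decades than in earlier ones.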

Just an initial swipe at the data shows, for instance, that the word “grill” has surged since 1965, while “fry” has flattened. “Tofu” exploded in usage, while “hot dog” waned. “Freud” is more deeply ingrained in our literature than “Galileo,” “Darwin,” or “Einstein.”

“God” got a lot of ink in the early 19th century, but now needs a new publicist — use of the word declined as the 20th century became more secular.

The tool “will furnish a great cache of bones from which to reconstruct the skeleton of a new science,” according to the paper by a team of Harvard linguists and Google engineers.

The team, which included Google’s Matthew K. Gray, Dan Clancy and Peter Norvig, says this data analysis is a window into our culture. The team also included researchers from Encyclopaedia Britannica and the American Heritage Dictionary.

“We know nothing can replace the balance of art and science that is the qualitative cornerstone of research in the humanities,” wrote Jon Orwant, Google Books engineering manager. “But we hope the Google Books Ngram Viewer will spark some new hypotheses ripe for in-depth investigation, and invite casual exploration at the same time.”

The Harvard-Google study offers several examples of how quantitative methods can yield insights into topics as diverse as the spread of innovations, the effects of youth and profession on fame, and even trends in censorship. Many of the books in the database, pulled from libraries such as Stanford’s, are protected by copyright. But Google gets around that problem by stripping the words off the page, leaving no context except the date they appeared.

The team’s exploration of what they call “culturomics” (culturomics.org) — the application of massive amounts of data analysis to the study of human culture — includes some startling revelations, such as:

About 8,500 new words enter the English language annually, which fueled a 70 percent growth of the lexicon between 1950 and 2000.

Innovations spread faster than ever. For instance, words describing inventions from the end of the 19th century spread more than twice as fast through the literature as those from the early 1800s.

Modern celebrities are younger and more famous than their 19th-century predecessors, but their fame is shorter-lived. Based on references in literature, celebrities born in 1950 achieved fame at an average age of 29, compared to 43 for celebrities born in 1800. But their fame also disappears faster, with a “half-life” that is increasingly short.

The most famous actors tend to become famous earlier, at around age 30, than the most famous writers and politicians. Based on references to their names, writers become famous at an average age of 40 and politicians after age 50. But patience pays off: Top politicians end up much more famous than the best-known actors.

Trends in censorship can be identified through counts of names. For example, Jewish artist Marc Chagall was mentioned just once in the entire German body of literature from 1936 to 1944, even as his prominence in English-language books grew roughly fivefold. Similar suppression is seen in Russian with regard to Leon Trotsky; in Chinese with regard to Tiananmen Square; and in the U.S. with regard to the “Hollywood Ten,” a group of entertainers blacklisted in 1947.