Measuring Style in the Book of Mormon

This blog post is long (3000 words, 28 images), and it is a quick, poor, and short summary of stylometry relevant to the Book of Mormon. I’ve left out some topics that were only covered with words. I’ve not included one master’s thesis that I was shown more recently that confirms 19th century word/phrase choice for most of the language used in dictating the Book of Mormon, nor examples showing extensive, anachronistic 17th-18th century syntax in the Book of Mormon, but neither of these has significant new impact on what stylometry shows us about the Book of Mormon. It was written by several authors, and we don’t have any other writings from these authors. Enjoy the summary, or head on over here http://www.exploringsainthood.org/measuring-style-in-the-book-of-mormon/ and follow my whole exploration through, or download a PDF of all 22 posts here (4.4 MB PDF).

Data Selection

Breaking up the Book of Mormon is a hard question. Here’s how one researcher did it: Note that all the segments have close to 10,000 words. This means that there are plenty of words to achieve statistical significance for many kinds of tests and comparisons.

Here’s what that author found:

Principal Component Analysis of Five Partially Independent Vocabulary Richness Variables

Holmes claimed Joseph had a personal voice, Isaiah had a voice, and all Mormon scripture had one single, additional voice—the prophetic voice of Joseph.

Reexamination of the Principal Component Analysis

One “prophetic voice”, or multiple authors? Looks like many to me.

Put some normalized numbers on the variability, and it looks like 4-10 authors for Mormon scripture.

What Holmes Didn’t Do

Holmes could have easily predicted a number of expected authors based on the methods he used, like this author did before him:

That worked well.

Here’s what the table would have looked like if Holmes had done all his work:

| Number of Selections | Number of “Observed” Authors | Holmes Conclusion | Expected (Sichel Method) | Names of Observed Authors | Names of Authors Proposed by Holmes |
| --- | --- | --- | --- | --- | --- |
| 1 | 3 | 0 | ? | Lehi, Jacob, Abraham | |
| 2 | 2 | 0 | ? | Alma, Moroni | |
| 3 | 4 | 2 | ? | Isaiah, Joseph Smith, Nephi, Doctrine and Covenants | Isaiah, Joseph Smith |
| 4 | 0 | 0 | ? | | |
| 5 | 1 | 0 | ? | Mormon | |
| 18 | 0 | 1 | ? | | Prophetic Voice |

Next time, finish the work. And that’s poor peer reviewing.

Joseph Smith and the Prophetic Voice

The Prophetic Voice

Joanna Southcott generated different styles that Holmes called her “prophetic voice”. Notice the three different genres:

Even with three genres, Southcott couldn’t match Joseph with just his personal writings and the Doctrine and Covenants. Here’s a look at the numbers with Isaiah used to normalize the measures of variability between the second study (excluding the Book of Mormon) and the first study (including the Book of Mormon):

| Sources | 1st Principal Component Variability | 2nd Principal Component Variability |
| --- | --- | --- |
| Isaiah | 1.0 | 0.4 |
| Joanna Southcott | 4.8 | 3.8 |
| Joseph Smith (personal) | 2.1 | 1.2 |
| Joseph Smith and Doctrine and Covenants | 8.7 | 2.4 |
| Joseph Smith and Mormon Scripture | 9 | 6 |

Joseph produced double the variability in the first two components, not including the third and fourth that Holmes showed were significant for the Book of Mormon (but ignored for Southcott).

Vocabulary Richness and Non-contextual Words

Holmes switched to non-contextual words, later. They are more discriminating. Kind of funny that non-contextual words were the measure used by Larsen et al. and by Hilton and coworkers in the studies of Book of Mormon stylometry that Holmes claimed his results disproved. We’ll take his apology as implicit.
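To make the difference between the two kinds of measurement concrete, here is a minimal sketch in Python. It is not Holmes’s method or anyone else’s: the two richness measures (type-token ratio and the fraction of words used only once) are simple stand-ins for the five variables he used, and the `FUNCTION_WORDS` list and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical short list of non-contextual (function) words.
# Real studies use much longer, carefully chosen lists.
FUNCTION_WORDS = ("the", "and", "of", "that", "to", "in", "it", "unto")

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def vocabulary_richness(tokens):
    """Return two simple richness measures for a list of tokens."""
    counts = Counter(tokens)
    n_types = len(counts)                       # distinct words
    hapax = sum(1 for c in counts.values() if c == 1)  # words used once
    return {
        "type_token_ratio": n_types / len(tokens),
        "hapax_fraction": hapax / n_types,
    }

def function_word_rates(tokens, words=FUNCTION_WORDS):
    """Return occurrences per 1000 tokens for each function word."""
    counts = Counter(tokens)
    return {w: 1000 * counts[w] / len(tokens) for w in words}

# Made-up sample sentence, far too short for real stylometry.
sample = "And it came to pass that the people of Nephi did go forth unto the land"
tokens = tokenize(sample)
richness = vocabulary_richness(tokens)
rates = function_word_rates(tokens)

print(richness)
print(rates)
```

Richness measures summarize how varied the whole vocabulary is, while function-word rates track the small words an author uses without thinking about them. Both only become meaningful on samples of thousands of words.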

Stylometry and Forgeries

Sometimes fraud or imitation fools stylometry, sometimes it doesn’t. Here’s a summary of factors involved in fooling stylometry and how they apply to Book of Mormon studies:

Comparison of Adversarial Authorship Studies with Mormon Scripture Historical and Stylometric Studies

| Fooling Stylometric Measures | Mormon Scripture | Favors/Disfavors Fraud |
| --- | --- | --- |
| 6500 word reference samples | Multiple 10000 word reference samples | Disfavors. Longer reference texts give more information regarding authors’ styles. |
| 500 word samples for classification | 10000 word samples for classification | Disfavors. It is presumably harder to hide your style over longer texts. |
| Simple, familiar topics | Complex, unfamiliar topics | Disfavors. Greenstadt and her coauthors assume it is harder to concentrate on obfuscation or imitation while inventing or remembering complex, new material. |
| Written in short times | Written or dictated in short times | Neutral. Long times and multiple revisions are not necessary to fool some stylometric measurements. |
| “Dumbed down” to obscure personal style | Less rich vocabulary in Mormon scripture than in Joseph Smith’s personal papers. The same is seen for Joanna Southcott’s prophetic voice, although to a lesser degree. | Favors. This was the most common technique to obscure a personal style. |
| Distinctive authorial style for imitation | No historically verified texts being imitated, other than the Bible, which is quoted extensively and not disguised | Disfavors. Joseph Smith apparently created distinct and consistent authorial styles for Nephi, the Doctrine and Covenants, Moroni, and Alma, and nearly consistent for Mormon, all without having any known reference authors to copy. This is the most readily testable question, however, with hundreds of thousands of books from Joseph Smith’s time now available in electronic format. |
| Closed set of authors | Open set of authors | Disfavors. Using a closed set of authors forces the results to select the closest style without allowing for the possibility that none of the styles match. The Pauline Epistles, Sherlock Holmes, and Jane Austen studies used open set methods and were all able to identify authors as different from the imitated author. |
| Adversarial authorship attack known | No direct evidence of fraud | Disfavors. Stylometric methods are demonstrably highly effective at identifying authors when sample sizes are as large as those from Mormon scripture. The only time this is known to be untrue is when authors are deliberately disguising their style or copying another, and even then they often fail to disguise their style. |
| Machine translation doesn’t disguise style | Claimed to be translations | Disfavors. Authors’ styles are preserved through multiple machine translations, consistent with Joseph Smith having “translated” texts by multiple authors. |
| Automated selection of machine identifiable stylometric features | Stylometric features including the very sensitive, noncontextual word pairings | Disfavors. Forecasting a little to papers not yet presented, but two Book of Mormon stylometry papers use noncontextual word pairings to test authorship. This method was employed in the Sherlock Holmes and Jane Austen studies, but not in the adversarial authorship studies. |

Nearly every consideration mentioned in studies on stylometric fraud disfavors the presence of fraud in Book of Mormon styles.

Faulkner’s “Fraudulent” Wordprints

Some LDS researchers went looking for authors who created multiple styles when examined by strong stylometric measures (non-contextual word pairs). They did find one—Faulkner. They also found two more who were explicitly trying, but failed—Mark Twain and Robert Heinlein.

And Faulkner did it by making conscious contextual word pairs that are typically subconscious and non-contextual. He was imitating dialects, and he apparently had a really good ear for it. Add this to the studies on fraudulent stylometry, and it’s getting really hard to argue that Joseph made all the styles himself.

Joseph’s Personal Style

Joseph did a lot of dictations. How did that affect his personal style?

“Most Likely” Authors of Joseph’s Dictations using closed-set stylometry

The following is an excerpt from the first table in a 2013 study by Jockers. It shows the number of the 96 texts dictated by Joseph with the author identified by stylometry that had a style most similar, or second most similar, to the text:

Table 1

| Identified Author | 1st choice | 2nd choice |
| --- | --- | --- |
| Barlow | 1 | 0 |
| Cowdery | 32 | 21 |
| Isaiah/Malachi | 3 | 1 |
| Longfellow | 2 | 4 |
| Pratt | 24 | 12 |
| Rigdon | 12 | 10 |
| Smith | 15 | 25 |
| Spalding | 7 | 23 |

Notice that:

13/96 (Barlow, Longfellow, Isaiah/Malachi, and Spalding) were attributed to authors with no connection to the texts.

32/96 were assigned to Cowdery

24/96 were assigned to Pratt

12/96 were assigned to Rigdon

15/96 were assigned to Smith

13.5 % ‘wrong’ (assigned to controls), 15.6 % ‘right’ (assigned to Smith). How many of the texts assigned to Cowdery, Pratt, and Rigdon were penned by those scribes?

| Scribe | # texts assigned to scribe | # of those texts for which scribe acted as scribe |
| --- | --- | --- |
| Cowdery | 32 | 2 |
| Pratt | 24 | 1 |
| Rigdon | 12 | 0 |

So Rigdon and Spalding yielded a total of 19.8 % false positives. 81.2 % of the passages were objectively misattributed. Whatever this method is, however good it is elsewhere, it’s hopeless for answering questions about Joseph’s dictation—which includes all of Mormon scripture.
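The percentages in this section can be rechecked from the two tables above. Here is a short sketch of the arithmetic. One assumption is mine: a first-choice attribution counts as defensible if it went to Smith or to a scribe who actually penned that text. That assumption reproduces the misattribution figure quoted above.

```python
# First-choice attributions for the 96 dictated texts (Table 1 above).
first_choice = {
    "Barlow": 1, "Cowdery": 32, "Isaiah/Malachi": 3, "Longfellow": 2,
    "Pratt": 24, "Rigdon": 12, "Smith": 15, "Spalding": 7,
}
# Texts assigned to a scribe who really did act as scribe (scribe table above).
scribe_matches = {"Cowdery": 2, "Pratt": 1, "Rigdon": 0}

# Control authors with no connection to the dictated texts.
controls = ("Barlow", "Longfellow", "Isaiah/Malachi", "Spalding")
total = sum(first_choice.values())

def pct(count):
    """Percentage of all dictated texts, left unrounded."""
    return 100 * count / total

control_pct = pct(sum(first_choice[a] for a in controls))
smith_pct = pct(first_choice["Smith"])
rigdon_spalding_pct = pct(first_choice["Rigdon"] + first_choice["Spalding"])
cowdery_not_scribe_pct = pct(first_choice["Cowdery"] - scribe_matches["Cowdery"])

# Assumption: Smith plus correctly matched scribes count as defensible.
defensible = first_choice["Smith"] + sum(scribe_matches.values())
misattributed_pct = pct(total - defensible)

for label, value in [
    ("assigned to controls", control_pct),
    ("assigned to Smith", smith_pct),
    ("Rigdon + Spalding", rigdon_spalding_pct),
    ("Cowdery when not scribe", cowdery_not_scribe_pct),
    ("misattributed", misattributed_pct),
]:
    print(f"{label}: {value:.2f}%")
```

The values are left unrounded on purpose, since two of them fall exactly halfway between tenths and can round either way.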

Stylistic Overlap and Joseph’s Dictations

An Alternate Interpretation

If it’s even worth doing, here’s one way to explain how closed-set stylometry could so badly misattribute Joseph’s writings to others:

All it would take is Joseph’s style being more diffuse than those of the scribes, and their having some overlap in style. Then Joseph’s style would be attributed first to the scribe with a more focused style, and only later to Joseph. And if scribes were imperfect, not catching every word exactly as it was spoken? That could diffuse Joseph’s style even further without implying that Joseph had no personal style.
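Here is a toy sketch of that explanation, reduced to a single made-up “style axis.” Everything in it is hypothetical: the centroid values, the sample values, and the plain nearest-centroid rule, which is much simpler than what the actual study used. It only shows that a diffuse style centered on Smith can still be handed mostly to neighbors with tighter styles.

```python
# Toy 1-D "style axis": each author is a single centroid value (hypothetical).
CENTROIDS = {"Cowdery": -1.0, "Smith": 0.0, "Pratt": 1.0}

def attribute(sample):
    """Closed-set rule: pick the author whose centroid is nearest."""
    return min(CENTROIDS, key=lambda name: abs(sample - CENTROIDS[name]))

def attribution_counts(samples):
    """Count how many samples are attributed to each candidate author."""
    counts = {name: 0 for name in CENTROIDS}
    for s in samples:
        counts[attribute(s)] += 1
    return counts

# A focused style: samples stay close to the Smith centroid.
focused = [-0.3, -0.1, 0.0, 0.1, 0.3]
# A diffuse style: same average, but spread widely around that centroid.
diffuse = [-1.4, -0.9, -0.6, -0.2, 0.2, 0.6, 0.9, 1.4]

focused_counts = attribution_counts(focused)
diffuse_counts = attribution_counts(diffuse)

print("focused:", focused_counts)
print("diffuse:", diffuse_counts)
```

Both sample sets average out to the Smith centroid, yet most of the diffuse samples land closer to Cowdery or Pratt.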

Dual Authored Book of Mormon

Closed-set Stylometry and the Book of Mormon

Here’s one of those charts, again, that had 80% errors on the last problem it was applied to. This is from a 2008 paper by Jockers et al.:

Number of Book of Mormon Chapters Assigned to Author

| Proposed Author | 1st Choice | 2nd Choice |
| --- | --- | --- |
| Rigdon | 93 | 104 |
| Isaiah & Malachi | 63 | 38 |
| Spalding | 52 | 58 |
| Cowdery | 20 | 17 |
| Pratt | 9 | 15 |
| Barlow | 0 | 1 |
| Longfellow | 2 | 6 |

Of the text that isn’t quotes from Isaiah or Malachi:

45.8% assigned to Rigdon

25.6% to Spalding

13.3% to Isaiah/Malachi

5.4% to Pratt/Longfellow

So 18.7% of the text is objectively (by any measure) false positives.

From the 2013 study we saw that Rigdon and Spalding showed up as false positives 19.8% of the time. In addition, Cowdery showed up most often for texts where he wasn’t even the scribe: 31.3% of the total texts. We also saw that the nearest shrunken centroid (NSC) method got it “right” only 15.6% of the time. Something broke between the control tests and application to Joseph Smith’s dictations.

How Closed-sets Generate Misattribution

A closed-set method will always give a positive answer. Here’s a simplified visual representation of what this closed-set study has definitively shown us:

Chapter 1 would be assigned to Cowdery, chapter 5 to Pratt, 6 to Spalding, etc., despite none of the chapters matching the styles of the candidate authors.
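The same point can be sketched in a few lines of Python. This is only an illustration, not the method used in the study: the two-dimensional centroids, the sample points, and the `max_distance` cutoff are all made-up values, and plain Euclidean distance stands in for whatever distance a real classifier would use.

```python
import math

# Hypothetical 2-D style centroids for three candidate authors.
CENTROIDS = {
    "Cowdery": (0.0, 0.0),
    "Pratt": (4.0, 0.0),
    "Spalding": (0.0, 4.0),
}

def nearest(sample, centroids=CENTROIDS):
    """Return (author, distance) for the closest candidate centroid."""
    return min(
        ((name, math.dist(sample, c)) for name, c in centroids.items()),
        key=lambda pair: pair[1],
    )

def closed_set(sample):
    """Always names a candidate, however far away the sample is."""
    return nearest(sample)[0]

def open_set(sample, max_distance=1.5):
    """Names a candidate only if the sample falls within max_distance."""
    name, distance = nearest(sample)
    return name if distance <= max_distance else "unknown"

near_sample = (0.5, 0.5)    # close to the Cowdery centroid
far_sample = (20.0, 30.0)   # far from every candidate

for sample in (near_sample, far_sample):
    print(sample, "closed:", closed_set(sample), "open:", open_set(sample))
```

The closed-set rule still returns a name for the far sample, because “closest” is defined even when nothing is close. The open-set rule needs one extra decision, a cutoff, and in exchange it is allowed to say “unknown.”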

The Book of Mormon Still Has Many Authors

The closed-set method may be poorly applied, but it did assign chapters of the Book of Mormon to 5 different authors.

Opening Authorship Possibilities

Stylometry is a multidimensional statistical problem. You observe all the features in multiple dimensions and look for clustering. Arrows represent the stylometric feature vectors of four hypothetical authors. If the arrows for different authors form different clusters, then you can tell the authors apart with your set of stylometric measurements.

The same thing can be done in classifying tumor cells.

Test a fourth tumor type against three known tumor types in a closed-set problem, and the results tell you it is one of the first three types. Modify the method to be open-set, and the fourth type (+’s) clusters in one corner, while types 1-3 cluster in the middle. You can tell there is a new type.

Sidney Rigdon Wrote the Federalist Papers?

The closed-set methods applied to the Federalist Papers (instead of the Book of Mormon) showed that Sidney Rigdon wrote most of Alexander Hamilton’s Federalist Papers.

The open-set method does much better, telling us there is an unknown author (Hamilton):

Include Hamilton in the closed-set of authors, and the closed-set method works fine:

Closed-set methods are a bad idea if there might be an unknown author.

Book of Mormon Authors Unknown

Use open-set methods on the Book of Mormon and what do you get? The Book of Mormon was written by an unknown author or authors. None of the 19th century authors fit the bill.

The closed-set study had plenty of evidence present in its own results to show that it was misapplied. Only the chapters with values above the line at 1.9 (figure below) could be confidently attributed by the closed-set method, and almost all of those were the Isaiah and Malachi quotes.

Again, the open-set method does better. And you will notice that Book of Mormon styles are all over the map compared with 19th century styles.

One “prophetic voice”? Rigdon and Spalding? Try again.

Nephi is not Alma is not Joseph

Using frequencies of non-contextual word pairs, the same author uses word pairs with almost the exact same frequency over different texts (0-6 rejections, or statistically significant differences in frequencies). Different authors often have 7 or more differences. When avoiding changes in genre, which can confuse stylometry, comparisons can be made between Joseph, Nephi, and Alma. Nephi matches his own style, as does Alma.

When you compare Nephi and Alma with each other they are different:

They are also different from Joseph, Oliver, and Solomon Spalding:

Once again, multiple authors in the Book of Mormon.
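The rejection counting described in this section can be sketched roughly as follows. This is a stand-in under loud assumptions, not the procedure the researchers used: a two-proportion z-test at the 5% level substitutes for their statistical test, and the word pairs and counts are invented. Only the rule of thumb (0-6 rejections for the same author, 7 or more for different authors) comes from the text above.

```python
import math

def two_proportion_z(count_a, total_a, count_b, total_b):
    """z statistic for the difference between two observed proportions."""
    p_a, p_b = count_a / total_a, count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return 0.0 if se == 0 else (p_a - p_b) / se

def count_rejections(counts_a, total_a, counts_b, total_b, z_crit=1.96):
    """Count word pairs whose frequencies differ significantly between texts."""
    return sum(
        1 for pair in counts_a
        if abs(two_proportion_z(counts_a[pair], total_a,
                                counts_b.get(pair, 0), total_b)) > z_crit
    )

def same_author_plausible(rejections, threshold=7):
    """Rule of thumb from the text: 7 or more suggests different authors."""
    return rejections < threshold

# Invented word-pair counts for two 10,000-word texts.
text_a = {"and it": 50, "of the": 120, "in the": 80}
text_b = {"and it": 52, "of the": 60, "in the": 150}

rejections = count_rejections(text_a, 10_000, text_b, 10_000)
print(rejections, same_author_plausible(rejections))
```

A real comparison would run over dozens of word pairs, which is why the threshold of 7 only makes sense with a much longer list than the three shown here.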

The Highly Criticized 1st Book of Mormon Study

While the results of the oldest Book of Mormon stylometry study overstate the evidence for multiple authorship (not intentionally—the field has progressed a lot since 1980), it too found multiple authorship and a lack of 19th century authorship:

The Late War, The Book of Mormon, and Rare n-grams

Stylometric comparisons with The Late War uncovered the presence of many short phrases in the Book of Mormon that are nearly unique to early 19th century pseudo-biblical writings. The Book of Mormon is written in pseudo-biblical, 19th century language. That’s terribly unsurprising, but truly interesting. As for the rest of the comparisons, keep in mind these observations if you choose to sift through them:

The Book of Nullification, published in 1830, has twice as many similarities with the Book of Mormon as the Book of Mormon has with The Late War.

The Johnsons’ second study, including more books and an improved method of comparison, identified three other books (which had never previously been proposed as source material for the Book of Mormon) as being more similar, and thus more closely related, to the Book of Mormon than The Late War.

Their study does not include any texts of fewer than 15,000 words. “Unique” matches with the Book of Mormon may be much less unique than is indicated if the body of smaller texts were included.

The Urantia Book and the Book of Mormon

The Urantia Book is another purportedly multi-authored, revealed text. Some stylometric measures seemingly confirm multiple authorship. However, the differences are plausibly explained by shifts in genre and the passage of time, two criteria that cannot explain shifts in Book of Mormon styles.

Could a Single Author Produce the Variety of Stylometric Features?

| Fooling Stylometric Measures | Mormon Scripture | Urantia Book |
| --- | --- | --- |
| 6500 word reference samples | Multiple reference samples with as many as 10–30,000 words per proposed author | Multiple papers of at least 1000 words |
| 500 word samples for classification | 200–10,000 word samples for classification (method dependent) | At least 1000 words for classification. 1000 is better than 500, but weaker than 2,000–10,000. |
| Simple, familiar topics | Complex, unfamiliar topics. Joseph Smith is reported to have told stories about some topics treated in the Book of Mormon, but had not previously written anything of significant length. | Complex topics of unknown familiarity. Sadler (the Urantia Book recorder) was demonstrably well-read and previously or simultaneously wrote on several topics related to the Urantia Book. |
| Written in short times | Written or dictated in short times. No time available for significant, subconscious shifts in authorial style, and no rewriting. | Written in unknown amounts of time (possibly short), but over many years, thus allowing for observed linear shifts in authorial style. |
| “Dumbed down” to obscure personal style | Less rich vocabulary in Mormon scripture than in Joseph Smith’s personal papers. | Styles changed to match genre, a conscious decision influencing style that even untrained authors can affect. |
| Distinctive authorial style for imitation | No historically verified texts being imitated (except the Bible) | Sadler is known to have read and possessed numerous texts on topics treated in the Urantia Book, however the only clear imitation is the Bible |
| Closed set of authors | Open set of authors | Open set of authors |
| Adversarial authorship attack known | No direct evidence of fraud | No direct evidence of fraud |
| Machine translation doesn’t disguise style | Claimed to be translations, suggesting authorial styles should be preserved. | Claimed to be revelations. We don’t know what to expect stylometrically from revelations from different sources. |
| Genre controlled | Genre controlled for in some studies, revealing multiple authorship. | Genre controlled for in one study, consistent with single authorship. |

One author didn’t write the Book of Mormon. Two didn’t, either. And we don’t have anything else written by the people who did. For me, this is fact. Explain it how you will.