Feedback

Comments on this report should be submitted on this related blog entry. Comments will be open for roughly the next two weeks. Off-topic and off-color comments will be silently filtered and discarded.

Caveat

This analysis makes the assumption that the two media outlets (The Smoking Gun and Gawker) who initially covered Guccifer 2’s first document printed their own PDF files. If we learn that either/both of those media outlets received their PDF’s from Guccifer 2 or another third party, then this report will be updated accordingly.

Background

In a previous report, Did Guccifer 2 Plant his Russian Fingerprints?, we detailed various aspects of the first five Word documents (1.doc, 2.doc, … 5.doc) that Guccifer 2 published on his WordPress.com blog site. It was widely reported at the time, that the first document, 1.doc, displayed “Russian fingerprints” (Russian error messages written in Cyrillic letters). In that report we described the process that embedded those “Russian fingerprints” inside 1.doc. We showed that the following actions and circumstances were necessary to ensure that the Russian error messages were embedded inside 1.doc.

A Word document (the “Trump opposition report”) was chosen as the source document; it is unique in its ability to (1) trigger invalid hyperlink errors, and (2) to have content that is somewhat relevant to the Trump campaign. Out of over 2000 Word (.docx) documents (in the Podesta email collection), only the Trump opposition report met those criteria. In fact only four (4) documents would trigger the error.

Word 2007 was used.

Russian language settings were set in both Windows and in Word.

After opening the original “Trump opposition report”, Guccifer 2 had to confirm twice that he understood that the original Trump opposition report has problems and to direct Word to attempt a recovery.

The Word document (.docx) was initially saved as an intermediate RTF file. When this intermediate RTF file is closed and re-opened, the Russian error messages will be displayed (because Russian language settings were previously set).

Text from this intermediate RTF file was then copied and pasted into an empty Word document. In Guccifer 2’s case, this empty document was a template document that had its original body text removed.

This new document with the copied material was saved again as an RTF file. This final RTF file became 1.doc. This chain of actions created a document with embedded Russian error messages (“Russian fingerprints”).

The sequence of circumstances that created these “Russian fingerprints” is sufficiently complex and unusual to raise the question: Did Guccifer 2 plant those “Russian fingerprints” intentionally?

Overview

In this report, we focus on early media coverage that reported on the “Trump opposition report” (1.doc). We show that an additional sequence of circumstances/coincidences was necessary to produce the PDF’s that became the focus of early mainstream and social media coverage.

Wittingly, or not, the media served a critical role in getting the message out that there were “Russian fingerprints” inside the first document that Guccifer 2 disclosed. The media became Guccifer 2’s assistant by completing the long path from the original Trump opposition report to the final published PDF’s with Russian error messages in them (the so-called “Russian fingerprints”). We elaborate on that claim in this report.

The subtlety of the process which embedded the Russian error messages into 1.doc combined with the journalists’ choice of particular word processing applications will cause confusion among the media outlets who covered Guccifer 2 early on. We discuss a few examples.

In November, 2017 the media took a renewed interest in Guccifer 2’s first document disclosure; they noticed that Guccifer 2 had added the word “confidential” to his version of the Trump opposition report. They suggested that Guccifer 2 did this to make the document more “alluring” – to attract media attention.

We differ with the media’s interpretation of Guccifer 2’s motivation for adding “confidential” to 1.doc’s page footer and watermark. In our previous report, we state:

We see another possible, important, reason for using a mostly empty legacy Word (.doc) file as a template: A copy/paste operation from an RTF document generated by saving the original Trump opposition report as an RTF file (using Word 2007 with Russian language settings) was a necessary step to embed the Russian error messages into the final RTF file (1.doc).

The journalists may not have been aware of Adam Carter’s prior work, where Adam showed that the “confidential” page footer and watermark were injected into the final document by first copying the original document’s text into a template document which was empty, except for its watermark and page footer (both had “confidential” in them). The process of injecting “confidential” into the final document was not a simple matter of “airbrushing” it in, as suggested by AP.

For some readers and researchers, the copy/paste of an intermediate (RTF) copy of the Trump opposition report into a template document might be seen as a (unconventional) method for injecting “confidential” into 1.doc. However, it can also be interpreted as a “cover” for the final copy/paste operation which was a necessary step – it was needed to embed the Russian error messages into the final document (1.doc).

What is news, however, is that the DNC changed their story on how/where Guccifer 2 came by his copy of the original Trump opposition report. In their first announcement (via WaPo) the DNC stated “Russian government hackers penetrated the computer network of the Democratic National Committee and gained access to the entire database of opposition research on GOP presidential candidate Donald Trump”. Yet in the November 2017 AP report, an anonymous ex-DNC official is cited: “The first document Guccifer 2.0 published on June 15 came not from the DNC as advertised but from Podesta’s inbox.” That is a significant divergence, which has gone unnoticed and unreported by the media. To place this in context, recall that Guccifer 2 was implicitly linked to the initial alleged hacking of the DNC and his widely publicized “Russian fingerprints” were important to making that connection.

We also look at the 1.doc timeline, from the time it was created to the time that The Smoking Gun (TSG) and Gawker posted their articles on this first document to be disclosed by Guccifer 2. We point out some anomalies in the timeline.

Summary

The Russian error messages that appeared in the PDF’s published by The Smoking Gun (TSG) and Gawker were there only because those media outlets used Word for Mac and LibreOffice (respectively) to create their PDF’s. If either/both of the media outlets had used the much more widely available Word for Windows application (on a default US/English installation) the error messages would have appeared in English.

Ars Technica was incorrect when they suggested that the Russian error messages probably resulted from the process of converting 1.doc to PDF. In fact, the Russian error messages will appear if either LibreOffice or Word for Mac is used to open 1.doc. They will be visible to the user before the final PDF is printed – their appearance has nothing to do with the process of printing the document as a PDF file.

Neither TSG nor Gawker reported on the presence of Russian error messages in the PDF’s that they included with their articles. They also did not report on the easily viewed “Last Saved By” property, which listed “Феликс Эдмундович” (aka “Iron Felix”) as the user who last saved the document.

We note some anomalies in the 1.doc timeline.

Timeline

The following timeline summarizes some key events and developments as they relate to the analysis of Guccifer 2’s early Word document disclosures. For a much more detailed timeline, consult Adam Carter’s Guccifer 2 timeline.

Analysis

A Quick Look at Guccifer 2’s Document Metadata

Some relevant metadata for Guccifer 2’s first five documents are shown below.

A tab-separated file with the results listed above is here.

The fields highlighted in blue have values that are different from their matching source document.

Note: The “last modified by” value of “user” in 4.doc is different than in the source document – there it is spelled “User”.

The yellow highlighted fields (based on our analysis) were inherited from a file used as a template.

Since this report focuses on 1.doc, keep in mind that the metadata for 1.doc shows “Warren Flood” as the Author, and “Феликс Эдмундович” (aka “Iron Felix”) as the user who last saved the document. This information would have been readily available to anyone (or any journalist) who opened 1.doc in a word processing application.
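To illustrate how accessible these two properties are, here is a minimal Python sketch that pulls them straight out of an RTF file's \info group, where \author holds the Author and \operator holds the person who last saved the document. The function names are ours (hypothetical), the regular expressions are a simplification of real RTF parsing, and we assume that Cyrillic text is stored as \'xx hex escapes in the Windows-1251 code page; a file that uses \u escapes or another code page would need more work.

```python
import re

HEX_RUN = re.compile(r"(?:\\'[0-9a-fA-F]{2})+")


def decode_rtf_text(raw, codepage="cp1251"):
    """Decode runs of \\'xx hex escapes using the assumed code page."""
    def repl(match):
        hex_pairs = re.findall(r"\\'([0-9a-fA-F]{2})", match.group(0))
        data = bytes(int(pair, 16) for pair in hex_pairs)
        return data.decode(codepage, errors="replace")
    return HEX_RUN.sub(repl, raw)


def rtf_info_fields(rtf, codepage="cp1251"):
    """Return the \\author and \\operator values found in an RTF string.

    Simplified: expects flat groups such as {\\author Some Name}.
    """
    fields = {}
    for key in ("author", "operator"):
        match = re.search(r"\{\\" + key + r" ([^{}]*)\}", rtf)
        if match:
            fields[key] = decode_rtf_text(match.group(1), codepage)
    return fields
```

Reading the file with open(path, encoding="latin-1") and passing the text to rtf_info_fields would be one way to apply this to a local copy of 1.doc.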

Follow the Russian Fingerprints – Error Messages in Cyrillic

Ars Technica noticed early on that there were Russian language error messages in the PDF file posted by Gawker. This file was generated when the document (1.doc) was printed to PDF. Here is an excerpt from Ars Technica (emphasis added).

Various researchers took a closer look at those error messages, trying to understand how they might have been generated. IVN reported on this, and had the following to say.

If Guccifer 2’s 1.doc is opened in Word for Windows with default English language settings, IVN is correct that the error message will appear in English; the same is true if the document is then printed to PDF. However, if we enable Russian language settings in Word for Windows, we see the error message appears in its Russian equivalent.

Some researchers mistakenly thought that the Russian error messages only appear in the generated PDF file. Ars Technica contributed to the confusion (emphasis added).

We think that Ars Technica may have seen the error messages in English when they opened 1.doc in Word for Windows, yet they also saw the error messages in Cyrillic (Russian) when they viewed Gawker‘s PDF. They concluded that the appearance of these Russian error messages had something to do with the process of printing to PDF. Their assumption is wrong; we explain why below.

Ars Technica suggests the rather surprising idea that Gawker‘s PDF came directly from Guccifer 2. They suggest that the Russian error messages appeared because Guccifer 2 had Russian language settings enabled when he printed the PDF. If true, it would have been poor form on Gawker‘s part not to inform their readers that they were publishing a PDF provided by Guccifer 2. Further, Ars Technica could have easily confirmed their theory by contacting Gawker. Gawker, for their part could have contacted Ars Technica (post publication) to correct the record.

Ars Technica reported on Gawker’s PDF file; Gawker fortuitously opened the document with LibreOffice when they printed it. If the document had been printed with Microsoft Word for Windows with default English language settings (as Ars Technica apparently tried at first), the errors would have appeared as English text.

Above, we made a point of saying “Word for Windows” and not just “Word”. As detailed in a following section, we observe that Word for Mac behaves in a surprisingly different manner than Word for Windows, when it encounters the empty hyperlinks found inside the 1.doc RTF file.

Garçon, May I See the French Menu?

An important point made in the previous section is that when 1.doc is opened in Word for Windows, the user will see the hyperlink error messages in the language that the user has chosen for Word. If a user in France had opened 1.doc, using Word for Windows, they would have seen the following.

A Closer Look at the TSG and Gawker PDF’s

We next analyze both the TSG PDF [archive] file and the Gawker PDF [archive] file to gain a better understanding of how Gawker‘s PDF file differed from TSG‘s PDF. Their metadata is shown in the table below.

A tab-delimited file with the data shown above, is here.

We can see that Gawker‘s PDF file was generated by LibreOffice (Writer). Gawker’s PDF metadata preserved “Warren Flood” and TSG‘s did not. TSG‘s file was generated by Microsoft Word for Mac.

Gawker’s PDF: The Path Less Traveled

Notice above that Gawker’s page count is 25 pages smaller than TSG’s. We looked at the fonts used by Gawker; they use DejaVu fonts, which are open source fonts typically used on a Linux system. The default LibreOffice install on Windows will use Windows fonts, so the generated PDF will be similar in size to the one created by Microsoft Word and the page counts will be equal.
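For readers who want to repeat this kind of check, the following Python sketch shows one crude way to read a PDF's /Producer and /Creator entries and its base font names. It is a byte-level scan, not a real PDF parser: the helper names are hypothetical, and it assumes the Info dictionary and font dictionaries are stored uncompressed in the file. A dedicated PDF library would be more robust.

```python
import re


def pdf_info_value(data, key):
    """Return the first /<key> string found in raw PDF bytes, or None."""
    pattern = (rb"/" + key.encode("ascii")
               + rb"\s*(\((?:\\.|[^\\)])*\)|<[0-9A-Fa-f\s]*>)")
    match = re.search(pattern, data)
    if match is None:
        return None
    token = match.group(1)
    if token.startswith(b"("):
        # Literal string: drop the parentheses, crudely unescape \( \) \\
        raw = re.sub(rb"\\(.)", rb"\1", token[1:-1])
    else:
        # Hex string: <FEFF0057...>
        hex_digits = re.sub(rb"\s", b"", token[1:-1]).decode("ascii")
        raw = bytes.fromhex(hex_digits)
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be")  # UTF-16 text with byte order mark
    return raw.decode("latin-1")


def pdf_base_fonts(data):
    """Return sorted base font names, without the six-letter subset prefix."""
    names = re.findall(rb"/BaseFont\s*/(?:[A-Z]{6}\+)?([^\s/<>\[\]()]+)", data)
    return sorted({name.decode("latin-1") for name in names})
```

Applied to the bytes of a downloaded PDF, a producer string naming LibreOffice together with DejaVu font names would be consistent with the inference drawn above.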

We surmise that Gawker‘s PDF file was created by LibreOffice running on Linux. This seems a surprising choice of platform for a media outlet – we made a quick search of Gawker‘s job ads and saw no LibreOffice requirement for their reporters. The decision to use LibreOffice will appear prescient in hindsight – LibreOffice will print the Russian error messages without the need to set Russian language settings. Below, we will explain why LibreOffice works that way; it is not obvious – LibreOffice‘s behavior will be the result of how it handles an empty URL, in combination with an artifact found inside the 1.doc RTF file.

The Russian Hyperlink Error Messages are Embedded Inside the 1.doc RTF file

In a companion report, we detailed the sequence of steps that ended with Russian error messages being embedded inside of 1.doc. Below is a screen shot, which shows a “before” and “after” where a hyperlink field’s display text, “VIDEO”, is replaced with “Error! Hyperlink reference not valid.” In Guccifer 2’s 1.doc, that error message is written in Russian.

LibreOffice Doesn’t Speak Russian, but It Can Fake It

We return to the point made earlier – if we open 1.doc in LibreOffice, the Russian error messages are visible. They will be displayed in Russian, independent of the user’s language settings. Why? This behavior derives from the fact that LibreOffice handles these invalid (empty) URL’s differently than Microsoft Word for Windows.

We observe that LibreOffice does not issue an error when it encounters an empty URL inside a HYPERLINK field; it simply prints the text defined by \fldrslt. The \fldrslt value in this case is the display text for the URL, which happens to be the Russian error message. LibreOffice prints that Russian error message independent of the user’s current language setting; it thinks it is simply the URL’s display text.

Word for Mac also Speaks a little Russian

Surprisingly, Word for Mac behaves differently from Word for Windows, when it encounters an empty URL. Word for Mac behaves similarly to LibreOffice; it quietly accepts the empty URL and simply displays the hyperlink text (defined by the \fldrslt control word) inside the document. This text happens to be a Russian error message, written in Cyrillic.

If we edit the hyperlink we can see this in more detail.

Recap: Only Word for Windows Issues Locale Specific Invalid Hyperlink Error Messages when It Encounters Empty URL’s

As we have shown, Word for Windows diagnoses empty URL’s as invalid; it issues a locale specific error message when it encounters the problematic URL’s. These empty URL’s are the result of a bug in Word 2007 (for Windows), which causes Word to mis-handle URL’s with percent-encoded spaces (%20) in them; they are translated into empty URL’s during a “Save as RTF” operation. In addition, after a copy/paste of the first saved document into another document, the (Russian) error message text is embedded into the document as the displayed text string associated with the problematic hyperlinks.

LibreOffice and Word for Mac quietly accept empty URL’s and just display the text associated with the hyperlink. In this case, the text is a Russian error message. Thus, they will always display the Russian error message text independent of the user’s current language settings.
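The difference in behavior can be modeled in a few lines of Python. This is only a sketch of the behavior we observed, not the actual logic of any of these applications: the field syntax is simplified (real RTF nests more groups), the function and table names are hypothetical, and the non-English strings are placeholders standing in for each locale's real error text.

```python
import re

# Placeholder table: what Word for Windows prints for an invalid hyperlink,
# keyed by the user's language setting. Only the English text is real.
WORD_WINDOWS_ERROR = {
    "en-US": "Error! Hyperlink reference not valid.",
    "ru-RU": "<invalid-hyperlink error text in Russian>",
    "fr-FR": "<invalid-hyperlink error text in French>",
}

# Simplified HYPERLINK field: {\field{\*\fldinst HYPERLINK "url"}{\fldrslt text}}
FIELD_RE = re.compile(
    r'\{\\field\{\\\*\\fldinst\s*HYPERLINK\s*"([^"]*)"\s*\}'
    r'\{\\fldrslt\s*([^{}]*)\}\}'
)


def rendered_text(rtf_field, app, ui_locale="en-US"):
    """Return the text a reader would see for one hyperlink field."""
    match = FIELD_RE.search(rtf_field)
    if match is None:
        raise ValueError("not a simplified HYPERLINK field")
    url, display_text = match.group(1), match.group(2)
    if app == "word-windows" and url == "":
        # Word for Windows flags the empty URL and substitutes its own
        # message in the user's current language.
        return WORD_WINDOWS_ERROR[ui_locale]
    # LibreOffice and Word for Mac accept the empty URL and show the
    # stored \fldrslt text unchanged, whatever the user's locale is.
    return display_text
```

In this model, a field with an empty URL whose \fldrslt text is the embedded Russian message comes back unchanged for "libreoffice" and "word-mac", and is replaced by the locale's own message for "word-windows".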

Quite a Coincidence

Based on the PDF metadata discussed earlier, TSG used Word for Mac to generate its PDF and Gawker used LibreOffice; this ensured that their PDF’s displayed “Russian fingerprints” (Cyrillic error messages). Given that Apple’s Mac has about 10% of the desktop market and Linux has about 2%, it is a remarkable coincidence that neither outlet used Word for Windows to generate their PDF files. If either/both of the media outlets had used the ubiquitous Word for Windows application, the error messages in their PDF’s would have appeared in English (the default for US installations).

UPDATE [2018-05-13]: A reader comment suggests that the percentage of journalists using Macs is likely quite a bit higher than 10%. This 2008 informal CES survey indicates 27% of journalists at the convention used Macs; that figure has probably increased quite a bit since then. The reader agrees that the use of Linux by a non-tech media outlet is surprising. We note that both Ars Technica and IVN were apparently using Word for Windows, and Ars Technica is a recognized tech-savvy media outlet.
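As a rough back-of-the-envelope illustration, the figures quoted above can be combined in a short Python calculation. This rests on strong assumptions that we label plainly: that each outlet's choice of platform is independent of the other's, and that a platform's share can stand in for the chance that an outlet opens a document on it. Neither assumption is established, so the result is indicative at best.

```python
def joint_probability(share_a, share_b):
    """Chance of outlet A on one platform and outlet B on another,
    assuming the two choices are independent."""
    return share_a * share_b


LINUX_SHARE = 0.02          # desktop share quoted in the text
MAC_MARKET_SHARE = 0.10     # desktop share quoted in the text
MAC_JOURNALIST_SHARE = 0.27 # 2008 informal CES survey figure

for label, mac_share in (("desktop market share", MAC_MARKET_SHARE),
                         ("CES survey share", MAC_JOURNALIST_SHARE)):
    p = joint_probability(mac_share, LINUX_SHARE)
    print(f"{label}: P(TSG on Mac and Gawker on Linux) = {p:.4f}")
```

Even with the higher Mac figure from the survey, the Linux share keeps the combined estimate small under these assumptions.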

It is possible that the journalists at Gawker were wary of opening a Word document sent from a potentially hostile foreign party. Gawker may have used LibreOffice on a non-Windows platform to minimize their risk of opening 1.doc. This may explain their choice of word processor. Their decision to use LibreOffice proved to be fortuitous because it led to the disclosure of the Russian error messages found in Gawker‘s PDF file.

It seems that when Ars Technica first opened 1.doc, they saw error messages in English. They chose to report on Gawker‘s PDF file instead, presumably because it displayed the Russian error messages (in their reporting, the “Russian fingerprints”). Ars Technica waved this discrepancy off as perhaps being an artifact of the method that Guccifer 2 used to print the PDF file, which is surprising because it suggests that Gawker published a PDF that Guccifer 2 provided to them.

IVN: Confused by the Word for Windows and Gawker PDF Disconnect

IVN probably opened Guccifer 2’s 1.doc in Word for Windows and saw error messages written in English, and then compared that to Gawker’s PDF. In the excerpt below, the reference to the “original leaked Trump dossier” is a link to Guccifer 2’s 1.doc.

As shown below, IVN abandons their theory that the Russian error messages might have been generated by the use of a Russian business report application. They had seen references to the Russian error messages in some of the developer’s online documentation.

We agree with IVN that if Gawker had used a Russian version of Word for Windows, their PDF rendering of 1.doc would have had Russian error messages in it. In that case, we would have to ask: Why was Gawker using a Russian version of Word? (We have explained earlier that Word for Windows will print the error messages according to the user’s current language setting.)

Below, IVN also suggests that at some point in time TSG’s PDF may have had neither English error messages nor Russian error messages in it(!).

We see no evidence of IVN‘s claim above. We wonder if the page count differences between Gawker‘s PDF and TSG‘s PDF caused IVN to refer to an area of TSG‘s PDF that had no error messages in it?

Did Gawker Outsource Their Analysis to Russia?

IVN floats an alarming theory that Gawker may have outsourced their analysis (and their print job) to a sub-contractor in Russia(!).

What IVN is talking about above are links such as the following in Gawker’s PDF. These appear to be artifacts of converting PDF TOC links into URL’s that in turn point to a local mount point named “cloud_crowd”.

IVN expands on their theory further.

Let’s take a closer look. Gawker provides access to their PDF through a (somewhat annoying) document browser widget.

Looking under the hood, we see that this document browser is hosted by documentcloud.org.

At this point, we will skip a few steps and have a look at a DocumentCloud job ad [archive].

We found Cloud Crowd; it is not an outsourcing company. Probably not Russian, either.

Media Re-runs: The Trump Opposition Report Makes a Come Back

In November, 2017 (over a year after Guccifer 2 appeared on the scene), two media outlets developed a new interest in Guccifer 2’s first document (the Trump opposition report). The AP observed [archive] that Guccifer 2’s version of the Trump opposition report (1.doc) had the word CONFIDENTIAL in it and the original document did not.

The addition of “confidential” to Guccifer 2’s 1.doc was also noted by Business Insider [archive].

This is not news to most researchers who have followed prior efforts to analyze Guccifer 2’s documents. Adam Carter spotted the addition of the “CONFIDENTIAL” watermark and “Confidential” page footer, and explained how a template file was used to inject this data. All of that and more is explained in detail on his web site, g-2.space [archive].

AP’s use of the term “air brush” is taking some editorial license; it sounds as if they are suggesting that Guccifer 2 might have used image editing software. Far from it, as we detail in our companion report, Did Guccifer 2 Plant his Russian Fingerprints?

Did Guccifer 2 Photoshop his Screenshots?

Business Insider makes this surprising observation.

When we open 1.doc in Word, we see “CONFIDENTIAL” in the watermark and “Confidential” in the page footer, as shown in Guccifer 2’s screen shots. What was the journalist’s mistake? We think that the answer can be found here.

It seems likely that the author viewed the document with “Full Screen Reading” selected. This will disable the display of the watermark and page headers and footers. We conclude that Guccifer 2 did not alter his screenshots.

When Did Guccifer 2 Contact the Media?

The AP cites the editor-in-chief of The Smoking Gun, who states that Guccifer 2 contacted his organization via email shortly after noon on June 15, 2016 (the day after the DNC announced that it had been hacked). This is the same day that Guccifer 2 made his debut. We will place this fact into the timeline section that is found later in this report.

Narrative Change: Guccifer 2 Did Not Hack the DNC?

Although this brief mention in the AP report [archive] (Nov. 3, 2017) did not receive much attention, it represents a major change in the Guccifer 2 “narrative”.

Compare this to the DNC’s original report [archive] (WaPo: June 14, 2016).

The day after the DNC reported that Russian hackers had taken the Trump opposition report, Guccifer 2 made his debut. Guccifer 2’s version of the Trump opposition report was featured in early reporting by TSG and Gawker. The following day, Ars Technica reported on the “Russian fingerprints” in that document. This created a strong link between Guccifer 2 and the Russian hackers who allegedly took documents from the DNC. This recent report that Guccifer 2’s 1.doc is derived from the Podesta emails undoes a large part of the early Guccifer 2 narrative, yet this divergence has gone unnoticed and unreported.

Can we Source 1.doc to Podesta’s Emails?

Based on our related research, we observe that all five of Guccifer 2’s early document disclosures can be sourced to the Wikileaks Podesta email collection. We show how 1.doc was derived from some version of the Trump opposition report, but do not have enough information to determine if it came from Podesta’s email or from some other version stored on a DNC server. The AP report cites an anonymous former DNC official as saying that Guccifer 2’s version of the Trump opposition report is derived from the version found in Podesta’s emails.

If the version of the Trump opposition report found in Podesta’s emails is the same as the version found on a DNC server, investigators would not be able to differentiate them. As it turns out, the version found in Podesta’s email is likely different from the version found at the DNC. The reason: Podesta received his version of the Trump opposition report via an intermediary, who seems to have made some quick changes to the document before forwarding it to Podesta. Here is the email, showing that it is from Tony Carrk using a hillaryclinton.com email address.

The relevant metadata is shown below (times are EST).

As background, a “save as” operation will reset the revision number to 2. Above, we see a revision number of 3 and a total edit time of 1 minute. It looks like Carrk made a small edit and then saved the document. This change will cause this document to be different than the version sent from the DNC (which was possibly authored by Lauren Dillon).
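The revision number and total edit time discussed above live in two small XML parts inside the .docx package: docProps/core.xml and docProps/app.xml. The following Python sketch reads them with the standard library only. The function name and the returned dictionary keys are our own (hypothetical); the sketch assumes the file is a well-formed .docx that contains both parts.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def docx_edit_metadata(source):
    """Read author, last-modified-by, revision and edit time from a .docx.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        core = ET.fromstring(package.read("docProps/core.xml"))
        app = ET.fromstring(package.read("docProps/app.xml"))

    def text_of(root, path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    return {
        "creator": text_of(core, "dc:creator"),
        "last_modified_by": text_of(core, "cp:lastModifiedBy"),
        "revision": text_of(core, "cp:revision"),
        "total_minutes": text_of(app, "ep:TotalTime"),  # stored in minutes
    }
```

Run against the attachment from the Podesta email, a result with a revision of 3 and a total time of 1 would match the values shown above.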

If we had access to Carrk’s original or the DNC original, we might be tempted to try an RSID comparison, which is the technique that Adam Carter used to demonstrate that a template file had been used to create 1.doc, and that we used to locate the actual template document.

If investigators had access to Podesta’s actual emails (prior to the Wikileaks disclosure) and the DNC’s servers, they should have been able to make this determination quite soon after Guccifer 2 disclosed his version of the Trump opposition report (1.doc). If they had access only to the DNC servers, they should have been able to make this determination soon after Wikileaks published the Podesta email collection (Oct., 2016). Given this, we ask: why did the DNC wait over a year to let this information quietly leak out?

The 1.doc Timeline

In order to understand the progression of media coverage of 1.doc we constructed a timeline.

The table above can be downloaded as a tab-separated file, here.

Below is a different view of the same data, showing each entity in its own column.

The table above can be downloaded as a tab-separated file, here.

NOTE 1: Although the WordPress.com metadata tells us the date/time that Guccifer 2 published his first blog post, we can’t be certain that is when the blog went “live”. For example, if the blog had first been published privately and then made public later, we think it is possible that the publish date/time metadata value would remain unchanged. There were anecdotal reports that the blog went live later in the day (circa 5PM EDT). Countering those reports, TSG published their article circa 2PM and in that article they mention Guccifer 2’s blog; what we don’t know is whether TSG might have updated their article later in the day and then added the reference to Guccifer 2’s blog.

NOTE 2: The “last modified” timestamp on the PDF’s posted by TSG and Gawker tells us when the PDF’s were last posted, but we have no way of knowing whether the same document might have been posted earlier, or whether a different PDF had been posted before it.

In the analysis that follows, we assume that Guccifer 2’s blog went live at 12 PM and that the PDF’s produced by TSG and Gawker were posted at the times shown.

1.doc Timeline Observations

A few observations:

TSG published their article about two hours ahead of Gawker.

TSG and Gawker created their final PDF’s within 6 minutes of each other.

TSG and Gawker posted their final PDF’s within 2 minutes of each other.

TSG initially published their article about an hour before their final PDF was posted. This suggests the possibility that another PDF may have preceded the final PDF; otherwise the link from the article to the PDF would have been broken (or non-existent).

Gawker published their PDF about half an hour before they published their article. This is the usual order; they placed the PDF on their web site before the article was published, so that the article could link to it.



Gawker published their article a full half hour after TSG posted their final PDF.

We tried to construct scenarios to explain the various anomalies mentioned above. In particular, we wanted to see if the timeline might support a theory that either/both outlets received their PDF’s from a third party, or may have been “tipped off” by a third party that they should open 1.doc with a word processing application other than Word for Windows. We decided that potential inaccuracies in the timeline made any analysis too speculative to be useful.

Guccifer 2’s (Long and Busy) First Day at Work

Based on the metadata, we know that Guccifer 2 first created his five Word (RTF) documents at around 2 PM his time (assuming that he works in a GMT+3 time zone); his WordPress blog did not go live until around 7 PM his time. If we allow time for him to find his first five documents out of several thousand (if he chose them only from Podesta email attachments) and to then doctor them up (using a complex series of steps described in our related report, Did Guccifer 2 Plant his Russian Fingerprints?), then he probably got started even earlier.
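The time zone arithmetic behind these figures can be checked with a short Python sketch. It uses the assumptions already stated in this report: that Guccifer 2 works in a GMT+3 time zone, that the blog went live at 12 PM EDT, and that the documents were created at around 2 PM in his local time. The variable names are ours.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")
GMT_PLUS_3 = timezone(timedelta(hours=3), "GMT+3")

# Assumed in the timeline analysis: blog live at 12 PM EDT on June 15, 2016.
blog_live_edt = datetime(2016, 6, 15, 12, 0, tzinfo=EDT)
blog_live_local = blog_live_edt.astimezone(GMT_PLUS_3)

# From the document metadata: created around 2 PM, read as GMT+3 local time.
docs_created_local = datetime(2016, 6, 15, 14, 0, tzinfo=GMT_PLUS_3)
docs_created_edt = docs_created_local.astimezone(EDT)

print("Blog live, his time:  ", blog_live_local.strftime("%H:%M %Z"))
print("Docs created, US East:", docs_created_edt.strftime("%H:%M %Z"))
print("Gap between the two:  ", blog_live_edt - docs_created_local)
```

Under these assumptions the blog went live at 19:00 his time, the documents were created at 07:00 EDT, and about five hours separate the two events.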

When we combine the work of creating and uploading the documents with the work of creating a WordPress blog and communicating with two media outlets, Guccifer 2 was busy that first day.

Given the time demands involved, we found it interesting that Guccifer 2 pursued two separate media outlets: TSG and Gawker. It isn’t surprising that TSG and Gawker accepted a non-exclusive on this big story, but it also isn’t the norm. To their credit, both outlets were able to publish quickly on what may have been very little advance notice. Gawker was generous enough to credit TSG with publishing first. Based on the timeline (and the fact that Gawker was quick to mention TSG), we wonder if Gawker and TSG might have known about each other?

Closing Thought

Courtesy: Goodreads