Note that the bug can only record a character when at least one of the 6 bits is set. As a result, the implant cannot read the character with binary code 000000. Furthermore, the bug cannot 'see' any of the special keys, such as Shift, Space, Backspace, Tab and Carriage Return. Note also that the home position of the 6 latch interposers does not correspond to the home position of the print head, as 5 of the 6 interposers use negative logic. As a result, the hyphen (-) cannot be sensed.



As the bug does not know when Shift is pressed, the characters on the upper-case hemisphere of the print head will be mapped onto those of the lower-case hemisphere. Luckily, the upper-case characters are at the same relative position on the print head as the lower-case ones, just rotated by 180°. Although this will produce some ambiguity in the output, the text will still be readable.





The complete letter mapping is shown above. Note that the hyphen and underscore¹ will be omitted, as these correspond to the default position of the interposers. In practice this means that all text will be in lower case, that the hyphen is missing, that punctuation characters may be different and that all special functions, such as space and backspace, are omitted. An example:

Meeting with "Jerry" at Tulip Hotel (room A-23) on 24 November at 10:00
↓
meetingwith'jerry'attuliphotel8rooma239on24novemberat10/00

This is not the whole story, however. According to the NSA report, the Russians compressed the 6-bit data into a 4-bit frequency select word. Although the report doesn't explain what is meant by this, we can make a few educated guesses. The reason for compressing the data into 4 bits was probably that the Russians only had access to 4-bit digital technology at the time. The problem with 4 bits, however, is that each data word has just 16 possible combinations (2⁴).
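The loss of information in the plaintext example above can be reproduced with a few lines of Python. This is only a sketch: the substitution table is hypothetical and contains just the four pairs that can be read off the example (" to ', ( to 8, ) to 9 and : to /), not the complete mapping of the print head, and the function name is made up for illustration.

```python
# Hypothetical substitutions, inferred only from the example in the text:
# a shifted character is recorded as the unshifted character at the same
# relative position on the print head.
SHIFTED_TO_UNSHIFTED = {'"': "'", "(": "8", ")": "9", ":": "/"}

# Characters the implant cannot register: hyphen and underscore sit at the
# default position of the interposers, the others are special keys.
INVISIBLE = {"-", "_", " ", "\t", "\n", "\b"}


def as_seen_by_bug(typed: str) -> str:
    """Approximate what the implant would record for the typed text."""
    recorded = []
    for ch in typed:
        if ch in INVISIBLE:
            continue  # nothing is sensed for these keys
        ch = SHIFTED_TO_UNSHIFTED.get(ch, ch)
        recorded.append(ch.lower())  # Shift is unknown, so case is lost
    return "".join(recorded)


typed = 'Meeting with "Jerry" at Tulip Hotel (room A-23) on 24 November at 10:00'
print(as_seen_by_bug(typed))
# meetingwith'jerry'attuliphotel8rooma239on24novemberat10/00
```

Any shifted character that is not in the small table is passed through unchanged, so the sketch is only faithful for the characters that appear in the example.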





By examining the frequency of letters in the English language, we see that some letters are used more often than others. If we assign a unique binary combination to each of the 9 most frequently used letters, and group the others, e.g. as shown in the rightmost histogram above, we need just 15 binary combinations, leaving one for the joint use of punctuation characters. If numbers are also needed, more characters could be grouped to free up additional binary combinations, or they could be mapped on top of the letters. In the example below, we have only used letters:
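The grouping idea can be sketched as follows. The grouping used here is hypothetical and is not taken from the histogram or from the NSA report: it assumes that the 9 most frequent letters are E, T, A, O, I, N, S, H and R, and it arbitrarily divides the remaining 17 letters over 6 shared codes, which gives 9 + 6 = 15 codes plus one for punctuation, or 16 in total. The names `UNIQUE`, `GROUPS` and `encode` are made up for illustration.

```python
import string

# Assumption: the nine most frequent English letters, one code each (0-8).
UNIQUE = "etaoinshr"
# Hypothetical grouping of the other 17 letters into six shared codes (9-14).
GROUPS = ["dl", "cu", "mwf", "gyp", "bvk", "jxqz"]
# The one remaining code is shared by all punctuation characters.
PUNCT_CODE = 15

CODE_OF = {letter: code for code, letter in enumerate(UNIQUE)}
for offset, group in enumerate(GROUPS):
    for letter in group:
        CODE_OF[letter] = len(UNIQUE) + offset


def encode(text: str) -> list[int]:
    """Compress recorded text into 4-bit codes (values 0 to 15)."""
    codes = []
    for ch in text.lower():
        if ch in CODE_OF:
            codes.append(CODE_OF[ch])
        elif ch in string.punctuation:
            codes.append(PUNCT_CODE)
        # Digits and anything else are skipped in this letters-only sketch.
    return codes


print(encode("tuliphotel"))
# [1, 10, 9, 4, 12, 7, 3, 1, 0, 9]
```

Because C and U share code 10 in this grouping, `encode("c")` and `encode("u")` return the same value, which is exactly the kind of ambiguity the receiver has to resolve afterwards.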





Although this method of grouping will lead to ambiguity in the recovered data, it will generally be possible to 'guess' which character of a particular group was used, based on probability. For example, in the intercept above, the bigram CU (1) is more likely to occur than UC. Likewise, the bigram TU (2) is more common than TC, leaving us with positions (3) and (4) to try manually. In practice this might have been implemented as a manual or a (partially) automated process.
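An automated version of this guessing step could look like the sketch below. It is not a reconstruction of any known procedure: it assumes the same hypothetical grouping as before (C and U sharing one code, T having its own), counts bigrams in a reference text, and then uses a Viterbi-style search to pick the letter sequence whose bigrams are most common. The tiny sample corpus is a stand-in, and a usable decoder would need counts from a much larger body of English text.

```python
import math
from collections import Counter

# Hypothetical code table: nine unique letters (0-8), six groups (9-14).
UNIQUE = "etaoinshr"
GROUPS = ["dl", "cu", "mwf", "gyp", "bvk", "jxqz"]
LETTERS_OF = {code: letter for code, letter in enumerate(UNIQUE)}
LETTERS_OF.update({len(UNIQUE) + i: group for i, group in enumerate(GROUPS)})


def bigram_counts(corpus: str) -> Counter:
    """Count letter pairs, ignoring spaces and case (as the bug would)."""
    letters = [ch for ch in corpus.lower() if ch.isalpha()]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


def best_guess(codes: list[int], counts: Counter) -> str:
    """Pick the most plausible letter sequence for a list of 4-bit codes."""
    if not codes:
        return ""
    # best[letter] = (score, text) of the best candidate ending in letter
    best = {ch: (0.0, ch) for ch in LETTERS_OF[codes[0]]}
    for code in codes[1:]:
        step = {}
        for ch in LETTERS_OF[code]:
            # Add-one smoothing keeps unseen bigrams possible but unlikely.
            step[ch] = max(
                (score + math.log(counts[prev + ch] + 1), text + ch)
                for prev, (score, text) in best.items()
            )
        best = step
    return max(best.values())[1]


counts = bigram_counts("cucumber culture")  # stand-in corpus, far too small
print(best_guess([1, 10], counts))   # T followed by C or U -> tu
print(best_guess([10, 10], counts))  # C or U twice         -> cu
```

Positions where the bigram counts give no clear preference would still have to be resolved by hand, as described above.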