I got a bad feeling yesterday when I had to include reference information about a 16-bit CRC in a serial protocol document I was writing. And I knew it wasn’t going to end well.

The last time I looked into CRC algorithms was about five years ago. And the time before that… sometime back in 2004 or 2005? It seems like it comes up periodically, like the seventeen-year locust or sunspots or El Niño, except for me and CRCs it is about every five years.

Almost every engineer who has to write communication software runs into this sort of thing at least once, especially C programmers:

You want to send data from point A to point B.

Someone says, “Oh, you should use a CRC so you can detect errors.”

You say “Oh, what’s that?”

You look up information about Cyclic Redundancy Checks.

You read the Ross Williams Painless Guide to CRC Error Detection Algorithms. (I can’t tell you how many times I’ve seen a copy of this, or a Dr. Dobbs article, lying in the output tray of a printer at work, and I think “Uh oh, someone’s trying to lose their CRC virginity.”)

You find source code from somewhere and use it.

And everyone’s happy.

And then the next time you have to write communications software, you either find the code you wrote the last time, or you follow the last two steps.

Except that’s not enough for me. Because I don’t like finding source code from somewhere. Somewhere is a bad place to rely on. Somewhere has bugs and misinformation and it just floats around, like influenza, mostly causing mild havoc but occasionally proving fatal. So I leave the code from somewhere to the amateurs. If I want to sleep well at night, I get my code from a reputable place. And it took me about three trips around the CRC merry-go-round to realize this.

So my great wild goose chase started in 2009 or 2010. I spent maybe the better part of a week going around in circles trying to find a Definitive Source for CRC Algorithms.

Why go to all this trouble?

Here’s the thing. The CRC is a well-known mathematical algorithm (and I highly recommend you read the Painless Guide if this is new to you) that allows you to compute a digest of a series of input bits, using Galois field arithmetic. Its key property is that a small number of bit errors in the input is very unlikely to leave the digest unchanged, and for some error patterns that is outright impossible. You change one single bit from 0 to 1 or 1 to 0 anywhere in the input data, and the digest is guaranteed to change. But the CRC is a parameterized algorithm. The CRC is not unique. Even a CRC of a specific number of bits (e.g. 16-bit or 32-bit CRC) is not unique. In order for you to compute the CRC the same way as someone else, you either need to use the same software, or you need to know several things. Here’s the list that Ross Williams gives in section 14 of his Painless Guide, referring to table-driven CRC algorithms using a particular polynomial:

Width of the poly (polynomial).

Value of the poly.

Initial value for the register.

Whether the bits of each byte are reflected before being processed.

Whether the algorithm feeds input bytes through the register or xors them with a byte from one end and then straight into the table.

Whether the final register value should be reversed (as in reflected versions).

Value to XOR with the final register value.

And not only is it a parameterized algorithm, but there is no standard for what the parameters are. Ross Williams describes them one way, I might describe them another way.
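One common way to pin the parameters down is the set Ross Williams proposes in the Painless Guide: width, poly, init, refin, refout, and xorout. The sketch below is a minimal bitwise implementation in Python 3 under that model, not a reference implementation. The function names `reflect` and `crc_generic` are mine, the code assumes a width of at least 8 bits, and the parameter values in the usage lines are the ones commonly listed in CRC catalogues for these variants.

```python
def reflect(value, width):
    """Reverse the order of the low `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def crc_generic(data, width, poly, init, refin, refout, xorout):
    """Bitwise CRC over `data` (bytes); assumes width >= 8."""
    topbit = 1 << (width - 1)
    mask = (1 << width) - 1
    reg = init
    for byte in data:
        if refin:
            byte = reflect(byte, 8)      # feed each byte LSB first
        reg ^= byte << (width - 8)       # bring the byte into the top of the register
        for _ in range(8):
            if reg & topbit:
                reg = ((reg << 1) ^ poly) & mask
            else:
                reg = (reg << 1) & mask
    if refout:
        reg = reflect(reg, width)        # reverse the final register value
    return reg ^ xorout


data = b"123456789"
# Parameter sets as commonly catalogued (assumed here, not taken from a standard):
print(hex(crc_generic(data, 16, 0x1021, 0x0000, False, False, 0x0000)))  # XMODEM
print(hex(crc_generic(data, 16, 0x1021, 0xFFFF, True, True, 0xFFFF)))    # X.25
```

Two engineers who agree on these six values, plus the byte order in which the result is transmitted, have most of what they need. The catch is that they also have to agree on what each name means, which is exactly where a standard helps.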

In short, it is a real headache. Imagine if you had two engineers located at different sites, who weren’t allowed to communicate with each other, and you had to give them some information (but not source code) so they could write software to compute CRCs in an identical manner. What would you tell them?

Standards?

All right, at this point it’s time to look for a definitive standard. Because if you can just say, “Okay guys, go get a copy of the BLT-867-5309 standard,” that’s a lot easier and clearer than trying to write your own description. And unfortunately, CRC is one of those topics that is strewn with misinformation. Yeah, there are online calculators and tons of articles on the subject, but as a rule they don’t cite their sources, and sometimes they use different terminology, and who’s to say that Ross Williams’s website doesn’t disappear tomorrow? Standards are definitive and durable.

A quick look at the Wikipedia article on CRCs lists a whole bunch of different ones for a 16-bit CRC, and a few for 32-bit CRCs, but one of the 32-bit entries looks promising. It’s the one that includes Ethernet and PKZIP and PNG and a few other things. When I went searching for CRC standards five years ago, I kept reaching dead ends and paywalls (many standards must be purchased) and useless vague information… until I finally ran across the W3C PNG specification, which includes a well-written sample implementation in C. Wow. When I saw this, I felt like I had just found the Holy Grail. The gzip specification in RFC 1952 has the same sample implementation.

So 32 bits? Case closed. 16 bits? For some reason I settled on the CRC16-CCITT polynomial as used in XMODEM, and left it at that.

This week I had to specify a 16-bit CRC. And at first I wrote down CRC16-CCITT and thought I was done. But then I noticed that this just specifies the polynomial. There’s no definitive standard of that name. The XMODEM CRC does have a specification. But the XMODEM CRC has a weakness: leading 00 bytes do not affect it, so it cannot detect 00 bytes dropped from (or added to) the start of the data. That is, these input byte strings all have the same XMODEM CRC of 0xF8E5:

AA 55

00 AA 55

00 00 AA 55

This is because XMODEM uses a CRC variant with an initial value of 0000. The way the math works, if the CRC state ever contains 0000 and you shift in zeros, the state remains at 0000, so dropped zero bits or bytes go undetected. If your protocol requires input data to not start with 00, then you don’t have to worry, but in my case this wasn’t true. So XMODEM was out, at least for my purposes.
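To see the effect for yourself, here is a small Python 3 sketch. The helper name `crc16_msb_first` is mine, and it assumes the usual XMODEM conventions (polynomial 0x1021, most significant bit first, no final XOR). It computes the CRC of the three byte strings twice: once with an initial value of 0x0000, and once with the register preset to 0xFFFF.

```python
def crc16_msb_first(data, init):
    """Bitwise CRC-16, polynomial 0x1021, MSB first, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


messages = [bytes.fromhex(s) for s in ("AA55", "00AA55", "0000AA55")]

# Initial value 0x0000: leading zero bytes leave the register at zero.
xmodem = [crc16_msb_first(m, 0x0000) for m in messages]
print([hex(c) for c in xmodem])   # all three are 0xf8e5

# Same polynomial, register preset to all 1s: the three CRCs now differ.
preset = [crc16_msb_first(m, 0xFFFF) for m in messages]
print(len(set(preset)))           # 3
```

A nonzero initial value does not make the CRC stronger in general. It just ensures that leading zeros change the register state, so they are covered like any other data.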

References to CRC16-CCITT are somewhat conflicting. There’s a CRC16-CCITT/FALSE in one of the CRC catalogues, which claims that it is misidentified as CRC16-CCITT, whereas the CRC used in the KERMIT protocol is supposedly the true CRC16-CCITT, and the CRC used in the X.25 protocol is something else. But the KERMIT CRC, like XMODEM, also has an initial value of 0000, so it cannot detect dropped initial 00 bytes.

When I was researching this article, I finally ran across the ITU (formerly CCITT) standards for V.42 and X.25. Both standards are available as free PDFs, and both specify a 16-bit CRC used as a Frame Check Sequence (V.42 section 8.1.1.6.1 and X.25 section 2.2.7.4). The text in X.25 is a little more verbose, and this standard even gives some sample data strings in an appendix, so I think the X.25 CRC is a good choice when you want to use a standard CRC mechanism:

2.2.7.4 Frame Check Sequence (FCS) field

The notation used to describe the FCS is based on the property of cyclic codes that a code vector such as 1000000100001 can be represented by a polynomial \( P(x) = x^{12} + x^5 + 1 \). The elements of an n-element code word are thus the coefficients of a polynomial of order n – 1. In this application, these coefficients can have the value 0 or 1 and the polynomial operations are performed modulo 2. The polynomial representing the content of a frame is generated using the first bit received after the frame opening flag as the coefficient of the highest order term.

The FCS field shall be a 16-bit sequence. It shall be the ones complement of the sum (modulo 2) of:

the remainder of \( x^k (x^{15}+x^{14}+x^{13}+x^{12}+x^{11}+x^{10}+x^9+x^8+x^7+x^6+x^5+x^4+x^3+x^2+x+1) \) divided (modulo 2) by the generator polynomial \( x^{16} + x^{12} + x^5 + 1 \), where k is the number of bits in the frame existing between, but not including, the final bit of the opening flag and the first bit of the FCS, excluding bits (synchronous transmission) or octets (start/stop transmission) inserted for transparency, and bits inserted for transmission timing (i.e. start or stop bits); and

the remainder of the division (modulo 2) by the generator polynomial \( x^{16} + x^{12} + x^5 + 1 \) of the product of \( x^{16} \) by the content of the frame, existing between but not including, the final bit of the opening flag and the first bit of the FCS, excluding bits (synchronous transmission) or octets (start/stop transmission) inserted for transparency and bits inserted for transmission timing (i.e. start or stop bits).

As a typical implementation, at the transmitter, the initial content of the register of the device computing the remainder of the division is preset to all 1s and is then modified by division by the generator polynomial (as described above) on the address, control and information fields; the ones complement of the resulting remainder is transmitted as the 16-bit FCS. At the receiver, the initial content of the register of the device computing the remainder is preset to all 1s. The final remainder, after multiplication by \( x^{16} \) and then division (modulo 2) by the generator polynomial \( x^{16} + x^{12} + x^5 + 1 \) of the serial incoming protected bits and the FCS, will be 0001110100001111 (\( x^{15} \) through \( x^0 \), respectively) in the absence of transmission errors.

NOTE — Examples of transmitted bit patterns by the DCE and the DTE illustrating application of the transparency mechanism and the frame check sequence to the SABM command and the UA response are given in Appendix I.

And then Appendix I gives these example patterns (note that X.25 bits are transmitted LSB first, so the binary bitstream 1100 0000 is the hex byte 03):

03 3F → FCS = 5B EC

01 73 → FCS = 83 57

01 3F → FCS = EB DF

03 73 → FCS = 33 64
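Stripped of the framing details, the quoted definition condenses to the following, where \( M(x) \) is the polynomial for the k-bit frame content, \( G(x) \) is the generator polynomial, and \( L(x) \) is the all-ones polynomial of degree 15. The symbols \( M \), \( G \), and \( L \) are my shorthand, not the standard’s, and the overbar denotes the ones complement:

\[ G(x) = x^{16} + x^{12} + x^5 + 1, \qquad L(x) = x^{15} + x^{14} + \cdots + x + 1 \]

\[ \mathrm{FCS}(x) = \overline{\left( x^k L(x) \bmod G(x) \right) + \left( x^{16} M(x) \bmod G(x) \right)} \]

The \( x^k L(x) \) term corresponds to presetting the register to all 1s in the “typical implementation” paragraph, and the ones complement corresponds to a final XOR with 0xFFFF.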

So what do we do with all this polynomial nonsense? Well, what it means is that you initialize with 0xFFFF, left-shift in bits from the input least-significant-bit first modulo the polynomial, then XOR with 0xFFFF and interpret the result as two reversed bytes. Here’s some Python code that calculates the X.25 CRC, both using the raw bit shift (calc_bitwise()) and using a table-based method for bytewise update (calc_table()). And we can check that they match the X.25 expected output:

    '''
    Python code to calculate X.25 CRC16 values.

    Copyright 2014 Jason M. Sachs

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
    '''

    class CRC_X25(object):
        def __init__(self):
            poly = 0x1021
            self.poly = poly
            self.brevtable, self.table = self.create_tables()

        def create_tables(self):
            brevtable = []  # Bit-reverse table for bytes
            table = []      # Each element j of the table has a value
                            # of x^16*p{j} mod poly, where p{j} is
                            # the GF(2) polynomial interpretation
                            # of the bits of integer j
            for j in xrange(256):
                bv0 = j
                brev = 0
                bval = j << 8
                for k in xrange(8):
                    bval = self._left_shift(bval)
                    brev <<= 1
                    if bv0 & 1:
                        brev |= 1
                    bv0 >>= 1
                table.append(bval)
                brevtable.append(brev)
            return tuple(brevtable), tuple(table)

        def _left_shift(self, bval):
            bval <<= 1
            if bval & 0x10000:
                bval ^= 0x10000
                bval ^= self.poly
            return bval

        def calc_bitwise(self, data):
            crc = 0xFFFF
            for b in data:
                bval = ord(b)
                for k in xrange(8):
                    crc = self._left_shift(crc)
                    if bval & (1 << k):
                        crc ^= self.poly
            crc ^= 0xFFFF
            return chr(self.brevtable[crc >> 8]) + chr(self.brevtable[crc & 0xff])

        def calc_table(self, data):
            crc = 0xFFFF
            for b in data:
                bval = self.brevtable[ord(b)]
                crc = ((crc << 8) & 0xFFFF) ^ self.table[((crc >> 8) ^ bval) & 0x00FF]
            crc ^= 0xFFFF
            return chr(self.brevtable[crc >> 8]) + chr(self.brevtable[crc & 0xff])

        def __call__(self, data):
            return self.calc_table(data)

    crcalg = CRC_X25()
    for input in ['033F','0173','013F','0373']:
        crc1 = crcalg.calc_bitwise(input.decode('hex')).encode('hex')
        crc2 = crcalg.calc_table(input.decode('hex')).encode('hex')
        print "%s -> %s, %s" % (input, crc1, crc2)

    033F -> 5bec, 5bec
    0173 -> 8357, 8357
    013F -> ebdf, ebdf
    0373 -> 3364, 3364

Hey, we have a match! It’s nice when things work out the way you expect.
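The listing above is Python 2. For anyone on Python 3, here is a compact sketch of the same X.25 FCS in the “reflected” form often seen in practice: the register shifts right and the polynomial is bit-reversed to 0x8408, so no per-byte bit reversal is needed. The names `x25_register`, `x25_fcs`, and `frame_ok` are mine. The receiver check relies on the residue quoted from section 2.2.7.4: 0001110100001111 is 0x1D0F, which reads as 0xF0B8 in this bit-reversed register.

```python
def x25_register(data, reg=0xFFFF):
    """Run the reflected CRC register (polynomial 0x8408) over data."""
    for byte in data:
        reg ^= byte
        for _ in range(8):
            if reg & 1:
                reg = (reg >> 1) ^ 0x8408
            else:
                reg >>= 1
    return reg


def x25_fcs(data):
    """Return the two FCS bytes in transmission order."""
    return (x25_register(data) ^ 0xFFFF).to_bytes(2, "little")


def frame_ok(frame_with_fcs):
    """Receiver side: the register over data plus FCS must equal the fixed residue."""
    return x25_register(frame_with_fcs) == 0xF0B8


for hexstr in ("033F", "0173", "013F", "0373"):
    data = bytes.fromhex(hexstr)
    fcs = x25_fcs(data)
    print(hexstr, "->", fcs.hex(), frame_ok(data + fcs))
```

This prints the same four FCS values as the Appendix I examples, each followed by True. The receiver never has to separate the FCS from the data: it runs everything through the register and compares the result against one constant.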

Algorithm Specifications

The best algorithm specifications have these three elements:

A mathematical description of the algorithm

A reference implementation

Test inputs and outputs

Including all three of these is routine in cryptography algorithms, but I don’t know of any CRC specification that includes all three. The reason to include test inputs and outputs is so that when you implement an algorithm, you immediately have something to test it against, and can gain a high degree of confidence that you have implemented it properly. The reason to include a reference implementation is so that you can demonstrate the algorithm in practice. The reason to include a mathematical description is so that you can unambiguously define the algorithm behavior independent of any computer language.

Most of the CRC algorithm specifications in the standards include only a mathematical description; some also include a reference implementation. I like X.25 because it includes test inputs and outputs (even if it doesn’t have a reference implementation).

During my research for this article, I was shocked to find out that one standard includes only a reference implementation, with neither a mathematical description nor test inputs/outputs. This is IETF Standard 51, consisting of RFCs 1661 and 1662, for PPP, the Point-to-Point Protocol.

Some historical background: In the early 1990s, the residential Internet access methods we commonly use today, DSL and cable modem, were essentially nonexistent. If you wanted to access something out there on the ‘Net, you used a dialup modem. Not only that, but you were limited to whatever software handled the serial stream communication with the server. Typically it was terminal software that gave you access to a remote shell on the server. Newer services like Prodigy and America Online had graphical front ends. In either case, the server had access to the Internet, but your computer didn’t. You could only talk to the server and ask it to interact with the world. Then two standards came along that changed this. They were SLIP and PPP. You still had to dial up, but the dialup servers were no longer necessary for providing content, or even for executing programs like a shell or user interface; they now were essentially just Internet router gateways.

PPP is still commonly used for DSL. So it was surprising to me that the RFC for the PPP standard gave only a reference implementation. This is considered bad form: it is rare to present an algorithm and say, “The spec is what this program does,” because maybe the program has implementation-dependent quirks or, even worse, a mistake. If you have an implementation-as-spec and you want to rewrite it to make it more efficient, there’s no guaranteed way to do it — even most small programs are complex enough that it is essentially impossible to guarantee that another implementation is equivalent for all inputs and outputs. The PPP standards don’t even include the term CRC, they just use Frame Check Sequence and FCS. Now, there are alternate RFCs out there, like RFC 1549 which states:

For more information on the specification of the FCS, see ISO3309 [2] or CCITT X.25 [6].

So RFC 1549 at least says the FCS spec is stated formally in another standard. But RFC 1549 was obsoleted and superseded by RFC 1662, which deleted that text in favor of

For more information on the specification of the FCS, see the Appendices.

And the appendices have only the source code. Gack!!!!

Now clearly this isn’t a practical problem; anyone using equipment that uses PPP just sees it working. But I think it’s sad that such a widespread standard botched what could have been a good specification.

Wrapup

Anyway, don’t you make the same sloppy mistake that PPP does. If you’re going to use a CRC algorithm in your system, make sure you specify the algorithm unambiguously, either by citing a formal standard like X.25 that specifies it unambiguously, or by providing the three elements of a good algorithm specification:

A mathematical description of the algorithm

A reference implementation

Test inputs and outputs

A specification containing only sample code isn’t enough.

© 2014 Jason M. Sachs, all rights reserved.