Poly1305-AES: a state-of-the-art message-authentication code


Poly1305-AES computes a 16-byte authenticator of a message of any length, using a 16-byte nonce (unique message number) and a 32-byte secret key. Attackers can't modify or forge messages if the message sender transmits an authenticator along with each message and the message receiver checks each authenticator.

There's a mailing list for Poly1305-AES discussions. To subscribe, send an empty message to poly1305-subscribe@list.cr.yp.to.

Why is Poly1305-AES better than other message-authentication codes?

Guaranteed security if AES is secure. There's a theorem guaranteeing that the security gap is extremely small (n/2^102 per forgery attempt for 16n-byte messages) even for long-term keys (2^64 messages). The only way for an attacker to break Poly1305-AES is to break AES.

Cipher replaceability. If anything does go wrong with AES, users can switch from Poly1305-AES to Poly1305-AnotherFunction, with an identical security guarantee.

Extremely high speed. My Poly1305-AES software takes just 3843 Athlon cycles, 5361 Pentium III cycles, 5464 Pentium 4 cycles, 4611 Pentium M cycles, 8464 PowerPC 7410 cycles, 5905 PowerPC RS64 IV cycles, 5118 UltraSPARC II cycles, or 5601 UltraSPARC III cycles to verify an authenticator on a 1024-byte message. Poly1305-AES offers consistent high speed, not just high speed for one favored CPU.

Low per-message overhead. My Poly1305-AES software takes just 1232 Pentium 4 cycles, 1264 PowerPC 7410 cycles, or 1077 UltraSPARC III cycles to verify an authenticator on a 64-byte message. Poly1305-AES offers consistent high speed, not just high speed for long messages. Most competing functions are designed for long messages and don't pay attention to short-packet performance.

Key agility. Poly1305-AES can fit thousands of simultaneous keys into cache, and remains fast even when keys are out of cache. Poly1305-AES offers consistent high speed, not just high speed for single-key benchmarks. Almost all competing functions use a large table for each key; as the number of keys grows, those functions miss the cache and slow down dramatically.

Parallelizability and incrementality. Poly1305-AES can take advantage of additional hardware to reduce the latency for long messages, and can be recomputed at low cost for a small modification of a long message.

No intellectual-property claims. I am not aware of any patents or patent applications relevant to Poly1305-AES.
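To make the per-message overhead concrete, here is a rough back-of-the-envelope sketch using the two Pentium 4 figures quoted above (5464 cycles for 1024 bytes, 1232 cycles for 64 bytes). It assumes that cost grows as a straight line in message length, which is a simplification for illustration and not a measured result; the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Pentium 4 verification costs quoted in the text above. */
#define P4_CYCLES_1024 5464.0  /* 1024-byte message */
#define P4_CYCLES_64   1232.0  /* 64-byte message   */

/* Fit cycles = overhead + per_byte * len through two measurements.
   The straight-line model is an assumption made for illustration. */
void fit_line(double len_a, double cyc_a, double len_b, double cyc_b,
              double *per_byte, double *overhead)
{
  assert(len_a != len_b);
  *per_byte = (cyc_b - cyc_a) / (len_b - len_a);
  *overhead = cyc_a - *per_byte * len_a;
}

/* Print the fitted per-byte cost and fixed per-message cost. */
void print_p4_estimate(void)
{
  double per_byte, overhead;
  fit_line(64.0, P4_CYCLES_64, 1024.0, P4_CYCLES_1024, &per_byte, &overhead);
  printf("about %.2f cycles/byte plus about %.0f cycles per message\n",
         per_byte, overhead);
}
```

Under that assumption the Pentium 4 numbers work out to roughly 4.4 cycles per byte plus roughly 950 cycles of fixed cost per message, which is why short-packet performance deserves its own benchmark.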

Guaranteed security, cipher replaceability, and parallelizability are provided by the standard polynomial-evaluation-Wegman-Carter-MAC framework. Within that framework, hash127-AES achieved extremely high speed at the expense of a large table for each key. The big advantage of Poly1305-AES is key agility: extremely high speed without any key expansion.
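For orientation, the polynomial-evaluation structure can be summarized in one formula. The notation is borrowed from the Poly1305-AES paper and does not appear elsewhere on this page: c_1, ..., c_q are the 16-byte chunks of the message (each with a 1 byte appended and read as a little-endian integer), r is the clamped 16-byte half of the key read as an integer, k is the AES key half, and n is the nonce. Treat this as a summary; the paper is the authoritative definition.

```latex
\[
\mathrm{Poly1305}_r\bigl(m, \mathrm{AES}_k(n)\bigr) =
\Bigl( \bigl( (c_1 r^{q} + c_2 r^{q-1} + \cdots + c_q r^{1}) \bmod (2^{130} - 5) \bigr)
       + \mathrm{AES}_k(n) \Bigr) \bmod 2^{128}
\]
```

The polynomial in r is the part that can be evaluated in parallel and updated incrementally; AES is used only once per message, on the nonce.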

Other standard MACs are slower and less secure than Poly1305-AES. Specifically, HMAC-MD5 is slower and doesn't have a comparable security guarantee; CBC-MAC-AES is much slower and has a weaker security guarantee. Both HMAC-MD5 and CBC-MAC-AES are breakable within 2^64 messages. I'm not saying that anyone is going to carry out this attack; I'm saying that everyone satisfied with the standard CBC security level should be satisfied with the even higher security level of Poly1305-AES.

How do I use Poly1305-AES in my own software?


To get started, download and unpack the poly1305aes library:

    wget http://cr.yp.to/mac/poly1305aes-20050218.tar.gz
    gunzip < poly1305aes-20050218.tar.gz | tar -xf -

Compile the library, adding whatever compiler options your platform needs (for example -m64):

    cd poly1305aes-20050218
    env CC='gcc -O2' make

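Copy the library files into your project:

    cp `cat FILES.lib` yourproject/
    cat Makefile.lib >> yourproject/Makefile

For any C program that will use Poly1305-AES, include poly1305aes.h in the program, and modify your Makefile so that the program is linked with poly1305aes.a and depends on poly1305aes.a and poly1305aes.h.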

Inside the program, to generate a 32-byte Poly1305-AES key, start by generating 32 secret random bytes from a cryptographically safe source: kr[0], kr[1], ..., kr[31]. Then call

    poly1305aes_clamp(kr)

to turn them into a 32-byte Poly1305-AES secret key kr[0], kr[1], ..., kr[31].
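Here is a minimal sketch of the key-generation step, assuming a Unix-like system where /dev/urandom is an acceptable random source. To keep the block self-contained, sketch_clamp is a hypothetical stand-in that clears the bits of r (the second half of kr) that the Poly1305-AES definition requires to be zero; a real program should call poly1305aes_clamp from the library instead. sketch_generate_key is likewise a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for poly1305aes_clamp: kr[0..15] is the AES key k,
   kr[16..31] is r. Clear the top 4 bits of r[3], r[7], r[11], r[15] and
   the bottom 2 bits of r[4], r[8], r[12]. */
void sketch_clamp(unsigned char kr[32])
{
  unsigned char *r = kr + 16;
  r[3] &= 15;  r[7] &= 15;  r[11] &= 15;  r[15] &= 15;
  r[4] &= 252; r[8] &= 252; r[12] &= 252;
}

/* Fill kr with 32 bytes from /dev/urandom (an assumed random source),
   then clamp. Returns 0 on success, -1 on failure. */
int sketch_generate_key(unsigned char kr[32])
{
  FILE *f;
  size_t got;
  assert(kr != NULL);
  f = fopen("/dev/urandom", "rb");
  if (!f) return -1;
  got = fread(kr, 1, 32, f);
  fclose(f);
  if (got != 32) return -1;   /* never use a partially filled key */
  sketch_clamp(kr);
  return 0;
}
```

The important design point is to fail loudly: if the random source cannot deliver all 32 bytes, the function returns an error rather than producing a weak key.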

Later, to send a message m[0], m[1], ..., m[len-1] with a 16-byte nonce n[0], n[1], ..., n[15] (which must be different for every message!), call

    poly1305aes_authenticate(a,kr,n,m,len)

to compute a 16-byte authenticator a[0], a[1], ..., a[15].
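One simple way to keep nonces different for every message is to derive them from a message counter. The sketch below is an illustration under that assumption; sketch_nonce_from_counter is a hypothetical helper, not part of the poly1305aes library, and the byte layout (little-endian counter in n[0..7], zeros in n[8..15]) is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encode a 64-bit message counter as a 16-byte nonce.
   Bytes n[0..7] hold the counter, least significant byte first;
   bytes n[8..15] are zero. */
void sketch_nonce_from_counter(unsigned char n[16], uint64_t counter)
{
  int i;
  assert(n != NULL);
  memset(n, 0, 16);
  for (i = 0; i < 8; ++i) {
    n[i] = (unsigned char) (counter & 255);
    counter >>= 8;
  }
}
```

The sender would increment the counter after every message and then pass n to poly1305aes_authenticate(a,kr,n,m,len). The counter must never repeat under the same key, so it has to survive restarts (for example by storing it on disk before use).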

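After receiving an authenticated message a[0], a[1], ..., a[15], n[0], n[1], ..., n[15], m[0], m[1], ..., m[len-1], call

    poly1305aes_verify(a,kr,n,m,len)

to check the authenticator. Accept the message only if poly1305aes_verify reports that the authenticator is valid; otherwise throw the message away.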
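On the receiving side, the three parts have to be separated before verification. Here is a sketch assuming a hypothetical wire format in which the authenticator, the nonce, and the message are simply concatenated in that order; the library does not prescribe any format, and struct sketch_packet and sketch_split_packet are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wire format (an assumption, not defined by the library):
   16-byte authenticator, then 16-byte nonce, then the message bytes. */
struct sketch_packet {
  const unsigned char *a;  /* authenticator, 16 bytes */
  const unsigned char *n;  /* nonce, 16 bytes */
  const unsigned char *m;  /* message, len bytes */
  size_t len;
};

/* Split buf into its parts without copying.
   Returns 1 on success, 0 if buf is too short to hold a and n. */
int sketch_split_packet(struct sketch_packet *p,
                        const unsigned char *buf, size_t buflen)
{
  assert(p != NULL && buf != NULL);
  if (buflen < 32) return 0;
  p->a = buf;
  p->n = buf + 16;
  p->m = buf + 32;
  p->len = buflen - 32;
  return 1;
}
```

After a successful split, the receiver would pass p.a, p.n, p.m, and p.len together with kr to poly1305aes_verify, and must not act on the message contents before that check succeeds.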

Do not generate or accept messages longer than a gigabyte. If you need to send large amounts of data, you are undoubtedly breaking the data into small packets anyway; security requires a separate authenticator for every packet.
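The length rule is easy to enforce in code. The sketch below assumes that "a gigabyte" means 2^30 bytes, which is an interpretation rather than something stated here; both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the one-gigabyte limit is taken to be 2^30 bytes. */
#define SKETCH_MAX_MESSAGE ((size_t) 1 << 30)

/* Returns 1 if a message of len bytes is within the limit, else 0. */
int sketch_length_ok(size_t len)
{
  return len <= SKETCH_MAX_MESSAGE;
}

/* Number of packets needed to carry total bytes when each packet holds
   at most packet_size bytes. Every packet gets its own nonce and its
   own authenticator. */
size_t sketch_packet_count(size_t total, size_t packet_size)
{
  assert(packet_size > 0 && packet_size <= SKETCH_MAX_MESSAGE);
  return total / packet_size + (total % packet_size != 0);
}
```

A sender would check sketch_length_ok before authenticating, and a receiver would apply the same check before verifying, so that oversized input is rejected on both ends.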

Please make sure to set up a Googleable web page identifying your program and saying that it is ``powered by Poly1305-AES.''

How does the Poly1305-AES implementation work?

The simplest C implementation of Poly1305-AES is poly1305aes_test, which relies on GMP and OpenSSL. I suggest starting from the top: read poly1305aes_test_verify.c and work your way down.

Test implementations in other languages:

C++ (only the Poly1305 part): poly1305_gmpxx.h, poly1305_gmpxx.cpp. This is easier to read than the C version.

Python: Ken Raeburn has contributed some sample code with my four ``Appendix A'' tests.

Perl: Tony Betts has contributed some sample code with the same tests.

You can then move on to the serious implementations:

poly1305aes_sparc, published 31 January 2005.

poly1305aes_aix, published 6 February 2005.

poly1305aes_macos, published 7 February 2005.

poly1305aes_ppro, published 14 February 2005.

poly1305aes_athlon, published 18 February 2005.

If you're trying to achieve better speed, make sure you understand all the different situations covered by my speed tables. You might want to start with my essay on the difference between best-case benchmarks and the real world. I designed the Poly1305-AES software, and the underlying Poly1305-AES function, to provide consistent high speed in a broad range of applications. A slight speedup in some situations often means a big slowdown in others; a Poly1305-AES implementation making that tradeoff might be useful for some applications, but it will be at best an alternative, not a replacement.

Where can I learn more about Poly1305-AES?