By Adam Taylor

In the last blog we finished the AES algorithm explanation and understood the steps needed for encryption, decryption, and key expansion. This blog looks at the C code needed to implement the AES algorithm in software, so that we can initially baseline the software code prior to offloading it to the PL (programmable logic) side of the Zynq SoC. Because key expansion is only performed once, when the key is changed, I am not going to accelerate the key expansion. I am only planning to accelerate the forward encryption path.

To ensure that the algorithm is implemented correctly, I will use the worked examples from the NIST AES standard (FIPS 197) and compare them against the results I obtain on the Zynq SoC.
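As a minimal sketch of that check, the code below holds the AES-128 example vector from FIPS 197 Appendix C.1 and compares a 16-byte result against the published ciphertext. The names fips_key, fips_plain, fips_cipher and check_against_fips are my own for illustration and are not taken from the project source; the encryption call itself is assumed to exist elsewhere.

```c
#include <stdint.h>
#include <stdio.h>

/* AES-128 example vector from FIPS 197, Appendix C.1. */
const uint8_t fips_key[16] = {
    0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
    0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
};
const uint8_t fips_plain[16] = {
    0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
    0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff
};
const uint8_t fips_cipher[16] = {
    0x69, 0xc4, 0xe0, 0xd8, 0x6a, 0x7b, 0x04, 0x30,
    0xd8, 0xcd, 0xb7, 0x80, 0x70, 0xb4, 0xc5, 0x5a
};

/* Hypothetical helper: returns 1 if result matches the published
 * ciphertext, otherwise reports the first differing byte and returns 0. */
int check_against_fips(const uint8_t result[16])
{
    for (int i = 0; i < 16; i++) {
        if (result[i] != fips_cipher[i]) {
            printf("Mismatch at byte %d: got 0x%02x, expected 0x%02x\n",
                   i, result[i], fips_cipher[i]);
            return 0;
        }
    }
    return 1;
}
```

The idea is to encrypt fips_plain with fips_key on the target and pass the output to check_against_fips before taking any timing measurements.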

Resultant Encryption Code

To ensure we can accelerate the encryption part of the AES code within the PL side of the Zynq SoC, we must develop the code from day one with this objective in mind (see the coding rules here). The first thing to consider is the architecture of the algorithm. We need to segment it properly. AES lends itself well to this approach because we can write functions for each of the steps and then call them as required. Then we can add acceleration one function at a time.
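To illustrate the segmentation, here is a skeleton of the AES-128 round structure as defined in FIPS 197, with one function per step. The step bodies are deliberately placeholders that only count how often they are called; the function names, the calls counter and aes_enc_skeleton are assumptions made for this sketch, not the code used in this series.

```c
#include <stdint.h>

#define NR 10 /* number of rounds for AES-128 */

/* Call counters so the skeleton's structure can be inspected. */
typedef struct {
    int sub_bytes;
    int shift_rows;
    int mix_columns;
    int add_round_key;
} call_counts_t;

call_counts_t calls;

/* Placeholder steps: a real implementation would transform the state. */
static void sub_bytes(uint8_t state[16])   { (void)state; calls.sub_bytes++; }
static void shift_rows(uint8_t state[16])  { (void)state; calls.shift_rows++; }
static void mix_columns(uint8_t state[16]) { (void)state; calls.mix_columns++; }
static void add_round_key(uint8_t state[16], const uint8_t *round_key)
{
    (void)state;
    (void)round_key;
    calls.add_round_key++;
}

/* Round sequence: initial key addition, NR-1 full rounds,
 * then a final round without the mix columns step. */
void aes_enc_skeleton(uint8_t state[16], const uint8_t round_keys[16 * (NR + 1)])
{
    add_round_key(state, round_keys);

    for (int round = 1; round < NR; round++) {
        sub_bytes(state);
        shift_rows(state);
        mix_columns(state);
        add_round_key(state, round_keys + 16 * round);
    }

    sub_bytes(state);
    shift_rows(state);
    add_round_key(state, round_keys + 16 * NR);
}
```

Keeping each step in its own function means any one of them, or the whole top-level function, can later be moved to hardware without restructuring the rest.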

We must also write the function to be accelerated within its own file. Our SW architecture will look like the following:

main.c – This file contains the key expansion algorithm, the encryption key, and the plain-text input along with the call to the AES encryption function.

aes_enc.c – This file performs the encryption. Each of the steps will be coded as its own function so that it can be called as required for the AES round. To keep the design in line with typical processor implementations, we use look-up tables for the MixColumns step’s multiplications.

aes_enc.h – This file includes the declaration of the AES encryption function and the parameters used to determine the size (e.g. nk, nb and nr).

sbox.h – This file includes the substitution box used for the substitute bytes step, the look-up table of Rcon values used during key expansion, and the multiplication look-up tables used for the mix columns step.
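To make the size parameters and the multiplication look-up tables more concrete, here is a small sketch of the mix columns step for AES-128. The macro and function names (NB, NK, NR, mul2, mul3, init_mul_tables, mix_columns) are illustrative assumptions. Note also that this sketch fills the tables at start-up for brevity, whereas the files described above store them as constant tables in a header.

```c
#include <stdint.h>

#define NB 4  /* columns in the state; fixed at 4 by FIPS 197 */
#define NK 4  /* key length in 32-bit words: 4 for AES-128 */
#define NR 10 /* number of rounds: 10 for AES-128 */

/* Look-up tables for multiplication by 2 and by 3 in GF(2^8). */
static uint8_t mul2[256];
static uint8_t mul3[256];

/* Multiply by 2 in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1. */
static uint8_t xtime(uint8_t b)
{
    return (uint8_t)((b << 1) ^ ((b & 0x80) ? 0x1b : 0x00));
}

/* Fill the tables once before the first encryption. */
void init_mul_tables(void)
{
    for (int i = 0; i < 256; i++) {
        mul2[i] = xtime((uint8_t)i);
        mul3[i] = (uint8_t)(mul2[i] ^ i); /* 3*b = 2*b XOR b */
    }
}

/* Mix each column of the state. The state is assumed to be stored
 * column by column, so column c occupies state[4*c] .. state[4*c + 3]. */
void mix_columns(uint8_t state[4 * NB])
{
    for (int c = 0; c < NB; c++) {
        uint8_t *col = &state[4 * c];
        uint8_t a0 = col[0], a1 = col[1], a2 = col[2], a3 = col[3];

        col[0] = (uint8_t)(mul2[a0] ^ mul3[a1] ^ a2 ^ a3);
        col[1] = (uint8_t)(a0 ^ mul2[a1] ^ mul3[a2] ^ a3);
        col[2] = (uint8_t)(a0 ^ a1 ^ mul2[a2] ^ mul3[a3]);
        col[3] = (uint8_t)(mul3[a0] ^ a1 ^ a2 ^ mul2[a3]);
    }
}
```

Using tables replaces the conditional shift-and-XOR of the field multiplication with a single indexed read per byte, which is the usual choice on a processor.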

With this structure, we can select the AES encryption function as the one we wish to accelerate just as simply as we did in the last example: by right-clicking on the function and selecting Toggle HW/SW.

To determine the baseline performance and the savings we get from accelerating the function, we need to be able to time its execution. To do this, we will use the same timing function we used previously: sds_clock_counter, declared in sds_lib.h. After I had written the source code (I will make it available on GitHub once this example is complete), I recorded a time of 36662 processor cycles when the AES algorithm was executed in software running on the ARM Cortex-A9 processor on the PS side of the Zynq SoC.
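The timing measurement can be sketched as follows: read the counter before and after the call and subtract. The helper time_cycles and the placeholder work_under_test are hypothetical names for this sketch. The fallback based on clock() is only a stand-in so that the code also builds on a host without SDSoC, and it assumes the __SDSCC__ macro is defined when building with the SDSoC compiler; it does not count processor cycles.

```c
#include <stdint.h>
#include <stdio.h>

#ifdef __SDSCC__
#include "sds_lib.h" /* provides sds_clock_counter() under SDSoC */
#else
#include <time.h>
/* Host-side stand-in (assumption for this sketch only): returns
 * clock() ticks, not processor cycles. */
static unsigned long long sds_clock_counter(void)
{
    return (unsigned long long)clock();
}
#endif

/* Hypothetical placeholder for the function being timed,
 * for example the AES encryption call. */
volatile unsigned int work_calls = 0;

void work_under_test(void)
{
    work_calls++;
    for (volatile int i = 0; i < 1000; i++) {
        /* burn a little time */
    }
}

/* Run fn once and return the elapsed counter ticks. */
unsigned long long time_cycles(void (*fn)(void))
{
    unsigned long long start = sds_clock_counter();
    fn();
    unsigned long long stop = sds_clock_counter();
    return stop - start;
}
```

A call such as time_cycles(work_under_test) returns the elapsed count, which can then be printed and compared between the software and accelerated builds.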

PS execution time

To establish a baseline for the default accelerated performance, I quickly enabled acceleration of the AES encryption function with no optimisation pragmas. To my surprise, the result was very similar to the initial software timing: 36010 processor cycles (yes, I did check the reports to ensure that the function had been accelerated). This is unusual. In the previous matrix multiplication example (p1, p2, p3), the accelerated function initially took three times as long as the PS execution, before we added optimisation pragmas.

PL execution time

Over the next week I will optimize the design and report back on just how much acceleration I can get from the AES example, and what I did to achieve it.

Please see the previous entries in this MicroZed Chronicles series by Adam Taylor:

Adam Taylor’s MicroZed Chronicles Part 96: SDSoC In-Depth Example Part 3

Adam Taylor’s MicroZed Chronicles Part 95: SDSoC In-Depth Example Part 2

Adam Taylor’s MicroZed Chronicles Part 94: SDSoC In depth Example Part 1

Adam Taylor’s MicroZed Chronicles Part 93: SDSoC Debugging with Linux Part 9

Adam Taylor’s MicroZed Chronicles Part 92: SDSoC Verification & Build Issues Part 8

Adam Taylor’s MicroZed Chronicles Part 91: More on High-Level Synthesis and SDSoC, Part 7

Adam Taylor’s MicroZed Chronicles Part 90: Introduction to High-Level Synthesis and SDSoC, Part 6

Adam Taylor’s MicroZed Chronicles Part 89: SDSoC Optimization, Part 5

Adam Taylor’s MicroZed Chronicles Part 88: SDSoC Part 4—a look under the hood

Adam Taylor’s MicroZed Chronicles Part 87: Getting SDSoC up and running Part 3

Adam Taylor’s MicroZed Chronicles Part 86: Getting SDSoC up and running

Adam Taylor’s MicroZed Chronicles Part 85: SDSoC—the first instalment

Adam Taylor’s MicroZed(ish) Chronicles Part 84: Simple Communication Interfaces Part 4

Adam Taylor’s MicroZed(ish) Chronicles Part 83: Simple Communication Interfaces Part 3

Adam Taylor’s MicroZed(ish) Chronicles Part 82: Simple Communication Interfaces Part 2

Adam Taylor’s MicroZed(ish) Chronicles Part 81: Simple Communication Interfaces

Adam Taylor’s MicroZed Chronicles Part 80: LWIP Stack Configuration

Adam Taylor’s MicroZed Chronicles Part 79: Zynq SoC Ethernet Part III

Adam Taylor’s MicroZed Chronicles Part 78: Zynq SoC Ethernet Part II

Adam Taylor’s MicroZed Chronicles Part 77 – Introducing the Zynq SoC’s Ethernet

Adam Taylor’s MicroZed Chronicles Part 76: Constraints for Relatively Placed Macros

Adam Taylor’s MicroZed Chronicles, Part 75: Placement Constraints – Pblocks

Adam Taylor’s MicroZed Chronicles, Part 74: Physical Constraints

Adam Taylor’s MicroZed Chronicles, Part 73: Working with other Zynq-Based Boards

Adam Taylor’s MicroZed Chronicles, Part 72: Multi-cycle Constraints

Adam Taylor’s MicroZed Chronicles, Part 71: Constraints—Clock Relationships and Avoiding Metastability

Adam Taylor’s MicroZed Chronicles, Part 70: Constraints—Introduction to timing and defining a clock

Adam Taylor’s MicroZed Chronicles Part 69: Zynq SoC Constraints Overview

Adam Taylor’s MicroZed Chronicles Part 68: AXI DMA Part 3, the Software

Adam Taylor’s MicroZed Chronicles Part 67: AXI DMA II

Adam Taylor’s MicroZed Chronicles Part 66: AXI DMA

Adam Taylor’s MicroZed Chronicles Part 65: Profiling Zynq Applications II

Adam Taylor’s MicroZed Chronicles Part 64: Profiling Zynq Applications

Adam Taylor’s MicroZed Chronicles Part 63: Debugging Zynq Applications

Adam Taylor’s MicroZed Chronicles Part 62: Answers to a question on the Zynq XADC

Adam Taylor’s MicroZed Chronicles Part 61: PicoBlaze Part Six

Adam Taylor’s MicroZed Chronicles Part 60: The Zynq and the PicoBlaze Part 5—controlling a CCD

Adam Taylor’s MicroZed Chronicles Part 59: The Zynq and the PicoBlaze Part 4

Adam Taylor’s MicroZed Chronicles Part 58: The Zynq and the PicoBlaze Part 3

Adam Taylor’s MicroZed Chronicles Part 57: The Zynq and the PicoBlaze Part Two

Adam Taylor’s MicroZed Chronicles Part 56: The Zynq and the PicoBlaze

Adam Taylor’s MicroZed Chronicles Part 55: Linux on the Zynq SoC

Adam Taylor’s MicroZed Chronicles Part 54: Peta Linux SDK for the Zynq SoC

Adam Taylor’s MicroZed Chronicles Part 53: Linux and SMP

Adam Taylor’s MicroZed Chronicles Part 52: One year and 151,000 views later. Big, Big Bonus PDF!

Adam Taylor’s MicroZed Chronicles Part 51: Interrupts and AMP

Adam Taylor’s MicroZed Chronicles Part 50: AMP and the Zynq SoC’s OCM (On-Chip Memory)

Adam Taylor’s MicroZed Chronicles Part 49: Using the Zynq SoC’s On-Chip Memory for AMP Communications

Adam Taylor’s MicroZed Chronicles Part 48: Bare-Metal AMP (Asymmetric Multiprocessing)

Adam Taylor’s MicroZed Chronicles Part 47: AMP—Asymmetric Multiprocessing on the Zynq SoC

Adam Taylor’s MicroZed Chronicles Part 46: Using both of the Zynq SoC’s ARM Cortex-A9 Cores

Adam Taylor’s MicroZed Chronicles Part 44: MicroZed Operating Systems—FreeRTOS

Adam Taylor’s MicroZed Chronicles Part 43: XADC Alarms and Interrupts

Adam Taylor’s MicroZed Chronicles Part 42: MicroZed Operating Systems Part 4

Adam Taylor’s MicroZed Chronicles Part 41: MicroZed Operating Systems Part 3

Adam Taylor’s MicroZed Chronicles Part 40: MicroZed Operating Systems Part Two

Adam Taylor’s MicroZed Chronicles Part 39: MicroZed Operating Systems Part One

Adam Taylor’s MicroZed Chronicles Part 38 – Answering a question on Interrupts

Adam Taylor’s MicroZed Chronicles Part 37: Driving Adafruit RGB NeoPixel LED arrays with MicroZed Part 8

Adam Taylor’s MicroZed Chronicles Part 36: Driving Adafruit RGB NeoPixel LED arrays with MicroZed Part 7

Adam Taylor’s MicroZed Chronicles Part 35: Driving Adafruit RGB NeoPixel LED arrays with MicroZed Part 6

Adam Taylor’s MicroZed Chronicles Part 34: Driving Adafruit RGB NeoPixel LED arrays with MicroZed Part 5

Adam Taylor’s MicroZed Chronicles Part 33: Driving Adafruit RGB NeoPixel LED arrays with the Zynq SoC

Adam Taylor’s MicroZed Chronicles Part 32: Driving Adafruit RGB NeoPixel LED arrays

Adam Taylor’s MicroZed Chronicles Part 31: Systems of Modules, Driving RGB NeoPixel LED arrays

Adam Taylor’s MicroZed Chronicles Part 30: The MicroZed I/O Carrier Card

Zynq DMA Part Two – Adam Taylor’s MicroZed Chronicles Part 29

The Zynq PS/PL, Part Eight: Zynq DMA – Adam Taylor’s MicroZed Chronicles Part 28

The Zynq PS/PL, Part Seven: Adam Taylor’s MicroZed Chronicles Part 27

The Zynq PS/PL, Part Six: Adam Taylor’s MicroZed Chronicles Part 26

The Zynq PS/PL, Part Five: Adam Taylor’s MicroZed Chronicles Part 25

The Zynq PS/PL, Part Four: Adam Taylor’s MicroZed Chronicles Part 24

The Zynq PS/PL, Part Three: Adam Taylor’s MicroZed Chronicles Part 23

The Zynq PS/PL, Part Two: Adam Taylor’s MicroZed Chronicles Part 22

The Zynq PS/PL, Part One: Adam Taylor’s MicroZed Chronicles Part 21

Introduction to the Zynq Triple Timer Counter Part Four: Adam Taylor’s MicroZed Chronicles Part 20

Introduction to the Zynq Triple Timer Counter Part Three: Adam Taylor’s MicroZed Chronicles Part 19

Introduction to the Zynq Triple Timer Counter Part Two: Adam Taylor’s MicroZed Chronicles Part 18

Introduction to the Zynq Triple Timer Counter Part One: Adam Taylor’s MicroZed Chronicles Part 17

The Zynq SoC’s Private Watchdog: Adam Taylor’s MicroZed Chronicles Part 16

Implementing the Zynq SoC’s Private Timer: Adam Taylor’s MicroZed Chronicles Part 15

MicroZed Timers, Clocks and Watchdogs: Adam Taylor’s MicroZed Chronicles Part 14

More About MicroZed Interrupts: Adam Taylor’s MicroZed Chronicles Part 13

MicroZed Interrupts: Adam Taylor’s MicroZed Chronicles Part 12

Using the MicroZed Button for Input: Adam Taylor’s MicroZed Chronicles Part 11

Driving the Zynq SoC's GPIO: Adam Taylor’s MicroZed Chronicles Part 10

Meet the Zynq MIO: Adam Taylor’s MicroZed Chronicles Part 9

MicroZed XADC Software: Adam Taylor’s MicroZed Chronicles Part 8

Getting the XADC Running on the MicroZed: Adam Taylor’s MicroZed Chronicles Part 7

A Boot Loader for MicroZed. Adam Taylor’s MicroZed Chronicles, Part 6

Figuring out the MicroZed Boot Loader – Adam Taylor’s MicroZed Chronicles, Part 5

Running your programs on the MicroZed – Adam Taylor’s MicroZed Chronicles, Part 4

Zynq and MicroZed say “Hello World”-- Adam Taylor’s MicroZed Chronicles, Part 3

Adam Taylor’s MicroZed Chronicles: Setting the SW Scene

Bringing up the Avnet MicroZed with Vivado