Data acquisition and signal processing are ubiquitous across a range of applications, from consumer and industrial to automotive, aerospace, and beyond. A common trend across these wide-ranging applications is the need for the signal processing system to support an increasing number of sensor modalities, with higher sampling rates and faster response times.

Delivering faster response times while handling larger quantities of diverse data places significant demands on the processing architecture.

High Level Synthesis

The growing complexity of data acquisition and signal processing requirements is motivating the migration of processing from the PC to the embedded space, free from the constraints of a traditional PC platform. These embedded applications employ dedicated hardware capable of meeting their demanding performance needs.

This is where FPGAs can offer a significant advantage, as they can implement a truly parallel signal processing path thanks to the inherently parallel nature of logic. To boost computation performance even further, modern FPGAs provide dedicated resources in the logic fabric, such as DSP blocks and Block RAMs. FPGA development toolchains have also matured to support the creation of complex signal processing chains.

One modern toolchain trend is to generate RTL descriptions from a higher-level language such as C or C++, so-called High Level Synthesis or HLS. HLS allows the developer to work at a higher level of abstraction, focusing on algorithmic behaviour and performance while the HLS tool creates the lower-level implementation. Used properly, HLS can therefore allow for a much faster development flow than traditional RTL-based development. An additional advantage is that verification can be performed in the native implementation language, without the special fixturing that a “port” to an HDL would otherwise require.
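As a rough illustration of the kind of source an HLS flow starts from, the following is a minimal sketch of a fixed-point FIR filter written as ordinary C++. It is not the code from the reference design: the tap count, the Q15 moving-average coefficients, and the names FirState, fir_step, and fir_block are all assumptions made for this example. Tool-specific directives are left out so that the same file builds and runs with any C++ compiler, which is what allows verification in the native language.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative parameters, not taken from the reference design.
constexpr std::size_t kTaps = 4;
// Q15 coefficients for a 4-tap moving average: 0.25 * 2^15 = 8192.
constexpr std::int16_t kCoeffs[kTaps] = {8192, 8192, 8192, 8192};

// Delay line holding the most recent kTaps input samples.
struct FirState {
    std::int16_t delay[kTaps] = {};
};

// Filter one sample. An HLS tool would typically be guided here with
// tool-specific directives (for example loop pipelining or unrolling);
// they are omitted so the code builds with any C++ compiler.
std::int16_t fir_step(FirState& state, std::int16_t x) {
    // Shift the delay line and insert the new sample at the front.
    for (std::size_t i = kTaps - 1; i > 0; --i) {
        state.delay[i] = state.delay[i - 1];
    }
    state.delay[0] = x;

    // Multiply-accumulate in 32 bits so the sum of products cannot overflow.
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < kTaps; ++i) {
        acc += static_cast<std::int32_t>(kCoeffs[i]) * state.delay[i];
    }
    // Scale the Q15 result back to the sample format (truncates toward zero).
    return static_cast<std::int16_t>(acc / 32768);
}

// Filter a block of n samples from in[] into out[].
void fir_block(FirState& state, const std::int16_t* in, std::int16_t* out,
               std::size_t n) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = fir_step(state, in[i]);
    }
}
```

Feeding a constant input of 1000 into a freshly initialised FirState produces 250, 500, 750, and then 1000 as the delay line fills, which is easy to check on a PC before any hardware is involved.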

HLS uses the same languages commonly used for PC-based algorithms in signal processing, machine learning, and image processing. Thanks to this common language support, these algorithms can be re-targeted to an FPGA-based solution with comparatively little effort. When these algorithms are accelerated in an FPGA-based solution, the user can gain significant improvements in throughput, determinism, and latency.

Of course, using an HLS based approach may not be as efficient as handcrafted RTL. However, device densities and timing performance have evolved over FPGA generations to a point where for many data flow applications HLS provides a “good enough” solution with a significantly reduced development time.

FrontPanel + HLS

To address the range of markets which require data acquisition and signal processing solutions, Opal Kelly provides production-ready FPGA modules aimed at three primary application areas: integration, evaluation, and acceleration. Using production-ready modules enables the solution developer to focus on their core competencies and delegate the FPGA hardware to a proven, cost-effective, off-the-shelf solution.

One of the key elements which accelerates development with these integration modules is the FrontPanel SDK and API. FrontPanel greatly reduces the development time needed to create high-speed interfaces from a PC to the embedded application.

To demonstrate the ease with which FrontPanel and High Level Synthesis can be combined, the Opal Kelly engineering team recently created an HLS reference design. This reference design fuses FrontPanel with an HLS implementation of an FIR filter, allowing signals to be downloaded to any FrontPanel-enabled module, processed, and uploaded for later analysis.

To integrate the HLS FIR filter with FrontPanel, several different FrontPanel endpoints are deployed in the XEM7320 FPGA design. These endpoints are used to enable:

Transfer of data between the PC and the XEM7320 – PipeIn / PipeOut. These endpoints support multi-byte transfers.

Control and monitoring of logic functions within the XEM7320, e.g. asserting reset and monitoring the state machine – WireIn / WireOut. These endpoints transfer simple state information.

Triggering the HLS FIR filter to run – TriggerIn. This endpoint delivers a one-shot signal.
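The way these endpoints work together on the host side can be sketched as follows. This is an illustration under stated assumptions rather than the reference design's host code: the endpoint addresses and bit assignments are hypothetical, and MockDevice is a stand-in written for this example so the sequence can run without hardware. Its method names are modeled on the FrontPanel API's okCFrontPanel class, but the signatures are simplified and should be checked against the FrontPanel documentation. The mock's "filter" simply passes data through.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical endpoint addresses; the real design defines its own.
constexpr int kWireInCtrl  = 0x00;  // bit 0: reset
constexpr int kWireOutStat = 0x20;  // bit 0: filter done
constexpr int kTrigInStart = 0x40;  // bit 0: start filter
constexpr int kPipeInData  = 0x80;
constexpr int kPipeOutData = 0xA0;

// Stand-in for a FrontPanel device (assumption for illustration only).
class MockDevice {
public:
    void SetWireInValue(int ep, std::uint32_t val, std::uint32_t mask = 0xffffffffu) {
        assert(ep == kWireInCtrl);
        wire_in_ = (wire_in_ & ~mask) | (val & mask);
    }
    void UpdateWireIns() {  // wire values take effect only on update
        if (wire_in_ & 1u) { in_.clear(); out_.clear(); done_ = false; }
    }
    long WriteToPipeIn(int ep, long length, unsigned char* data) {
        assert(ep == kPipeInData);
        in_.insert(in_.end(), data, data + length);
        return length;
    }
    void ActivateTriggerIn(int ep, int bit) {  // one-shot start pulse
        assert(ep == kTrigInStart && bit == 0);
        out_ = in_;  // pass-through placeholder for the FIR filter
        done_ = true;
    }
    void UpdateWireOuts() { wire_out_ = done_ ? 1u : 0u; }
    std::uint32_t GetWireOutValue(int ep) const {
        assert(ep == kWireOutStat);
        return wire_out_;
    }
    long ReadFromPipeOut(int ep, long length, unsigned char* data) {
        assert(ep == kPipeOutData);
        for (long i = 0; i < length && i < static_cast<long>(out_.size()); ++i) {
            data[i] = out_[static_cast<std::size_t>(i)];
        }
        return length;
    }
private:
    std::uint32_t wire_in_ = 0, wire_out_ = 0;
    bool done_ = false;
    std::vector<unsigned char> in_, out_;
};

// Reset, download, trigger, wait for completion, upload.
template <typename Device>
std::vector<unsigned char> run_filter(Device& dev, std::vector<unsigned char> input) {
    dev.SetWireInValue(kWireInCtrl, 1u);  // assert reset
    dev.UpdateWireIns();
    dev.SetWireInValue(kWireInCtrl, 0u);  // release reset
    dev.UpdateWireIns();

    dev.WriteToPipeIn(kPipeInData, static_cast<long>(input.size()), input.data());
    dev.ActivateTriggerIn(kTrigInStart, 0);

    // Poll the status wire a bounded number of times.
    for (int tries = 0; tries < 1000; ++tries) {
        dev.UpdateWireOuts();
        if (dev.GetWireOutValue(kWireOutStat) & 1u) break;
    }

    std::vector<unsigned char> output(input.size());
    dev.ReadFromPipeOut(kPipeOutData, static_cast<long>(output.size()), output.data());
    return output;
}
```

Because run_filter is a template, the same sequence could in principle be pointed at a real device object that offers compatible methods, while the mock keeps the control flow testable on a PC.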

The resulting architecture of the implemented example design is shown below.

Test data for the example is generated using Octave scripts. Using Octave also allows the ideal performance of the filter to be modeled with the generated test data.

When the generated Octave data was applied to the example design running in the XEM7320, a comparison was performed against the ideal filter implementation.

The error observed between the floating-point Octave implementation and the fixed-point HLS implementation is small. This error is in line with what would be expected when a floating-point model is quantized to a fixed-point implementation.
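To give a feel for the size of this kind of quantization error, here is a minimal sketch that runs the same FIR filter in double precision and in a Q15 fixed-point model, then reports the largest difference. The Q15 format, the rounding and saturation choices, and the function names are assumptions for illustration; the reference design's actual word lengths may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Double-precision reference FIR: y[n] = sum_k h[k] * x[n-k].
std::vector<double> fir_float(const std::vector<double>& h,
                              const std::vector<double>& x) {
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t n = 0; n < x.size(); ++n) {
        for (std::size_t k = 0; k < h.size() && k <= n; ++k) {
            y[n] += h[k] * x[n - k];
        }
    }
    return y;
}

// Quantize a value in [-1, 1) to Q15 with rounding and saturation.
std::int16_t to_q15(double v) {
    double scaled = std::round(v * 32768.0);
    if (scaled > 32767.0) scaled = 32767.0;
    if (scaled < -32768.0) scaled = -32768.0;
    return static_cast<std::int16_t>(scaled);
}

// Fixed-point model: Q15 inputs and coefficients, 64-bit accumulator,
// output rounded to Q15 and converted back to double for comparison.
std::vector<double> fir_fixed(const std::vector<double>& h,
                              const std::vector<double>& x) {
    std::vector<std::int16_t> hq, xq;
    for (double c : h) hq.push_back(to_q15(c));
    for (double s : x) xq.push_back(to_q15(s));

    std::vector<double> y(x.size(), 0.0);
    for (std::size_t n = 0; n < xq.size(); ++n) {
        std::int64_t acc = 0;  // holds Q30 products
        for (std::size_t k = 0; k < hq.size() && k <= n; ++k) {
            acc += static_cast<std::int64_t>(hq[k]) * xq[n - k];
        }
        double out_q15 = std::round(static_cast<double>(acc) / 32768.0);
        y[n] = out_q15 / 32768.0;
    }
    return y;
}

// Largest absolute difference between two equally sized sequences.
double max_abs_error(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        worst = std::fmax(worst, std::fabs(a[i] - b[i]));
    }
    return worst;
}
```

With coefficients that are exactly representable in Q15 and whose absolute values sum to 1, the input quantization and the output rounding each contribute at most half a least significant bit, so the worst-case error in this model stays around one Q15 step (about 3e-5).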

While this example outlines a simple method for integrating HLS with FrontPanel, the methodology would be similar for larger, more complex designs such as machine vision, digital communications, and machine learning.

Combining HLS and FrontPanel in this manner enables developers to get quickly to the heart of the matter and focus on their value-added activities. Two of the main time-consuming development activities are addressed: HLS allows the focus to be on the performance of the algorithm, while FrontPanel allows data to be moved on and off the integration module quickly and easily.

This fusion of HLS and FrontPanel also allows algorithms which have previously been implemented using PC based solutions to be accelerated with ease.

More Information

HLS Example: https://github.com/opalkelly-opensource/frontpanel-hls

FrontPanel: https://www.opal-kelly.local/products/frontpanel/

XEM7320 Documentation: https://docs.opalkelly.com/display/XEM7320/XEM7320