By Adam Taylor

The Xilinx Zynq UltraScale+ MPSoC is good for many applications including embedded vision. It’s APU with two or four 64-bit ARM Cortex-A53 processors, Mali GPU, DisplayPort interface, and on-chip programmable logic (PL) give the Zynq UltraScale+ MPSoC plenty of processing power to address exciting applications such as ADAS and vision-guided robotics with relative ease. Further, we can use the device’s PL and its programmable I/O to interface with a range of vision and video standards including MIPI, LVDS, parallel, VoSPI, etc. When it comes to interfacing image sensors, the Zynq UltraScale+ MPSoC can handle just about anything you throw at it.

Once we’ve brought the image into the Zynq UltraScale+ MPSoC’s PL, we can implement an image-processing pipeline using existing IP cores from the Xilinx library or we can develop our own custom IP cores using Vivado HLS (high-level synthesis). However, for many applications we’ll need to move the images into the device’s PS (processing system) domain before we can apply exciting application-level algorithms such as decision making or use the Xilinx reVISION acceleration stack.

The original MicroZed Evaluation kit and UltraZed board used for this demo

I thought I would kick off the fourth year of this blog with a look at how we can use VDMA instantiated in the Zynq MPSoC’s PL to transfer images from the PL to the PS-attached DDR Memory without processor intervention. You often need to make such high-speed background transfers in a variety of applications.

To do this we will use the following IP blocks:

Zynq MPSoC core – Configured to enable both a Full Power Domain (FPD) AXI HP Master and FPD HPC AXI Slave, along with providing at least one PL clock and reset to the PL fabric.

VDMA core – Configured for write only operations, No FSync option and with a Genlock Mode of master

Test Pattern Generator (TPG) – Configurable over the AXI Lite interface

AXI Interconnects – Implement the Master and Slave AXI networks

Once configured over its AXI Lite interface, the Test Pattern Generator outputs test patterns which are then transferred into the PS-attached DDR memory. We can demonstrate that this has been successful by examining the memory locations using SDK.

Enabling the FPD Master and Slave Interfaces

For this simple example, we’ll clock both the AXI networks at the same frequency, driven by PL_CLK_0 at 100MHz.

For a deployed system, an image sensor would replace the TPG as the image source and we would need to ensure that the VDMA input-channel clocks (Slave-to-Memory-Map and Memory-Map-to-Slave) were fast enough to support the required pixel and frame rate. For example, a sensor with a resolution of 1280 pixels by 1024 lines running at 60 frames per second would require a clock rate of at least 108MHz. We would need to adjust the clock frequency accordingly.

Block Diagram of the completed design

To aid visibility within this example, I have included three ILA modules, which are connected to the outputs of the Test Pattern Generator, AXI VDMA, and the Slave Memory Interconnect. Adding these modules enables the use of Vivado’s hardware manager to verify that the software has correctly configured the TPG and the VDMA to transfer the images.

With the Vivado design complete and built, creating the application software to configure the TPG and VDMA to generate images and move them from the PL to the PS is very straightforward. We use the AXIVDMA, V_TPG, Video Common APIs available under the BSP lib source directory to aid in creating the application. The software itself performs the following:

Initialize the TPG and the AXI VDMA for use in the software application Configure the TPG to generate a test pattern configured as below Set the Image Width to 1280, Image Height to 1080 Set the color space to YCRCB, 4:2:2 format Set the TPG background pattern Enable the TPG and set it for auto reloading Configure the VDMA to write data into the PS memory Set up the VDMA parameters using a variable of the type XAxiVdma_DmaSetup – remember the horizontal size and stride are measured in bytes not pixels. Configure the VDMA with the setting defined above Set the VDMA frame store location address in the PS DDR Start VDMA transfer

The application will then start generating test frames, transferred from the TPG into the PS DDR memory. I disabled the caches for this example to ensure that the DDR memory is updated.

Examining the ILAs, you will see the TPG generating frames and the VDMA transferring the stream into memory mapped format:

TPG output, TUSER indicates start of frame while TLAST indicates end of line

VDMA Memory Mapped Output to the PS

Examining the frame store memory location within the PS DDR memory using SDK demonstrates that the pixel values are present.

Test Pattern Pixel Values within the PS DDR Memory

You can use the same approach in Vivado when creating software for a Zynq Z-7000 SoC iinstead of a Zynq UltraScale+ MPSoC by enabling the AXI GP master for the AXI Lite bus and AXI HP slave for the VDMA channel.

Should you be experiencing trouble with your VDMA based image processing chain, you might want to read this blog.

The project, as always, is on GitHub.

If you want E book or hardback versions of previous MicroZed chronicle blogs, you can get them below.

First Year E Book here

First Year Hardback here.

Second Year E Book here

Second Year Hardback here