Over the last year we’ve had several posts about the Lattice Semiconductor iCEstick, which is shown below. The board looks like an overgrown USB stick with no case, but it is really an FPGA development board. The specs are modest and there is a limited amount of I/O, but the price (about $22, depending on where you shop) is right. I’ve wanted to do a Verilog walkthrough video series for a while, and decided this would be the right target platform. You can experiment with a real FPGA without breaking the bank.

In reality, you can learn a lot about FPGAs without ever using real hardware. As you’ll see, a lot of FPGA development occurs with simulated FPGAs that run on your PC. But if you are like me, blinking a virtual LED just isn’t as exciting as making a real one glow. However, for the first two examples I cover, you don’t need any hardware beyond your computer. If you want to get ready, you can order an iCEstick and maybe it’ll arrive before Part III of this series is published.

I’m not going to directly try to teach Verilog. If you know C you can pick it up quickly and if you don’t there are plenty of text-based and video-based tutorials to choose from (I’ll add a few at the end of this post). However, I will point out a few key areas that trip up new FPGA designers and by following the example code, you’ll be up to speed in no time.

Choose a Verilog Simulator

For part I, you need a Verilog simulator. I’m going to use EDAPlayground because it is free and it will run in your browser. No software to set up and no worry if you use some crazy operating system (like Windows). If you have a modern browser, you are all set.

I know some people don’t want to work on the Web or don’t want to create an account (honestly, though, this will just be tutorial code and making up a disposable e-mail address is easy enough). If you just can’t bear it, you can run all the examples on your desktop with Icarus Verilog (using GTKWave to display the results). I just won’t be talking about how to do that. You can read the Icarus introduction if you want to go that route. I still suggest you stick with EDAPlayground for the tutorial.

For the FPGA tools used in Part III, I’m using the open source Icestorm tools. I tried using the Lattice tools and it was heartbreakingly difficult to get them installed and licensed. I’ll have more to say about that in part III.

About the Target Hardware

The iCEstick has a modest FPGA onboard. From its manual, it has the following features:

High-performance, low-power iCE40HX1K FPGA

FTDI 2232H USB device allows iCE device programming and UART interface to a PC

Vishay TFDU4101 IrDA transceiver

Five user LEDs (one green and four red)

2 x 6 position Digilent Pmod compatible connector enables many other peripheral connections

Discera 12MHz MEMS oscillator

Micron 32Mbit N25Q32 SPI flash

16 LVCMOS/LVTTL (3.3V) digital I/O connections on 0.1” through-hole connections

The FPGA isn’t huge, but it is big enough to host a simple CPU (we covered the CPU earlier). We aren’t going to start with a CPU, though. We’ll start with something much simpler.

Let’s Build an Adder

There are two main kinds of circuits you build on any FPGA: combinatorial and sequential. The difference is simple: combinatorial logic is all logic gates. The past state of the circuit doesn’t matter. Given a certain set of inputs, the outputs will be the same. A sequential circuit (which almost always has a flip flop in it) has some memory of a previous state that changes the output. I wanted to show examples of both and how you map them to the board.
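To make the distinction concrete, here is a minimal sketch of one circuit of each kind, plus a tiny testbench. This is not the code from the walkthrough; the module names (comb_and, sticky_flag, tb_kinds) and their ports are made up for illustration.

```verilog
// Combinatorial: the output depends only on the current inputs.
module comb_and(input a, input b, output y);
  assign y = a & b;
endmodule

// Sequential: a flip-flop remembers that d was ever high.
module sticky_flag(input clk, input reset, input d, output reg q);
  always @(posedge clk)
    if (reset)  q <= 1'b0;   // forget on reset
    else if (d) q <= 1'b1;   // once set, stay set until reset
endmodule

// Hypothetical testbench tying the two together.
module tb_kinds;
  reg clk = 0, reset = 1, a = 0, b = 0;
  wire y, q;

  comb_and    u_comb(.a(a), .b(b), .y(y));
  sticky_flag u_seq (.clk(clk), .reset(reset), .d(y), .q(q));

  always #1 clk = ~clk;      // free-running simulation clock

  initial begin
    #4 reset = 0;            // release reset
    #4 a = 1; b = 1;         // y goes high, q is captured on the next clock edge
    #4 a = 0;                // y drops right away, q should stay high
    #4 $display("y=%b q=%b", y, q);   // expect y=0 q=1
    $finish;
  end
endmodule
```

The combinatorial output y follows the inputs immediately, while q keeps its value after the inputs go away. That memory is what makes the second circuit sequential.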

The example circuit we’ll build has three major parts. The first is an adder that takes two one-bit inputs and shows the binary sum and carry on two of the board’s five LEDs. This is a simple combinatorial circuit. It doesn’t make use of the onboard 12MHz clock. The other two portions will. The first sequential circuit will be a simple memory that latches true if a carry has ever been generated by the adder (after a reset, of course). The other sequential circuit is a set of counters that combine to provide a 1/2 second pulse from the 12MHz clock.

Verilog Versus Schematic Entry

For simple circuits, it is tempting to just draw a schematic like the one above and either machine translate that to the FPGA or hand translate it to Verilog. Some tools support this and you may think that’s the way to go. I know I did when I got started.

The truth is, though, that after you move away from simple things, the schematics can be very painful. For example, think of a seven segment decoder. If you took a few minutes, you could probably work out the AND, OR, and NOT gates required to perform the function (that is, convert a four-bit binary number to a seven segment display). But it would take a few minutes.

If you use Verilog, you can take a simple approach and just write out the gates you want. That will work, but it is usually not the right answer. Instead, you should describe the circuit behavior you want and the Verilog compiler will infer what circuits it takes to create what you need. For the seven segment decoder this could be as simple as:

    always @(*)
      case (number)
        4'h0: dispoutput <= 7'b1111110;
        4'h1: dispoutput <= 7'b0110000;
        4'h2: dispoutput <= 7'b1101101;
        . . .

I promised I’d point out some of the stranger points of Verilog, so let’s look at that in a little more detail. The always @(*) statement tells Verilog that the code following should execute whenever any of the inputs you use in it change. This infers a combinatorial circuit since there is no clock. The case statement is like a switch statement in C. The funny looking numbers are four-bit hex (4'h1) and seven-bit binary (7'b1101101). So the code instructs the FPGA (or, more accurately, the Verilog compiler) to examine the number and set dispoutput based on the input.

The <= operator, by the way, is a non-blocking assignment. You could also use an equal sign here to create a blocking assignment. For now, the difference doesn’t matter, but we’ll revisit that topic when working with a sequential design.

From this description of what you want, the Verilog compiler will infer the right gates and may even be able to perform some optimizations. A key difference between an FPGA and building things on a microcontroller has to do with parallelism. If you wrote similar C code on, say, an Arduino, every copy of it would take some execution time. If you had, for example, 50 decoders, the CPU would have to service each one in turn. On the FPGA you’d just get 50 copies of the same circuit, all operating at once.
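Here is a sketch of what a complete decoder might look like, with two copies running side by side. The first three patterns match the snippet above; the rest are my assumption that the bits are ordered {a,b,c,d,e,f,g} with 1 meaning a lit segment, so check them against your own display. The module names seg7 and tb_seg7 are made up. I used the blocking = here, which is the usual habit for combinatorial blocks and behaves the same in this simple case.

```verilog
// Sketch of a full hex-to-seven-segment decoder.
// Assumed bit order {a,b,c,d,e,f,g}, 1 = segment lit.
module seg7(input [3:0] number, output reg [6:0] dispoutput);
  always @(*)
    case (number)
      4'h0: dispoutput = 7'b1111110;
      4'h1: dispoutput = 7'b0110000;
      4'h2: dispoutput = 7'b1101101;
      4'h3: dispoutput = 7'b1111001;
      4'h4: dispoutput = 7'b0110011;
      4'h5: dispoutput = 7'b1011011;
      4'h6: dispoutput = 7'b1011111;
      4'h7: dispoutput = 7'b1110000;
      4'h8: dispoutput = 7'b1111111;
      4'h9: dispoutput = 7'b1111011;
      4'hA: dispoutput = 7'b1110111;
      4'hB: dispoutput = 7'b0011111;
      4'hC: dispoutput = 7'b1001110;
      4'hD: dispoutput = 7'b0111101;
      4'hE: dispoutput = 7'b1001111;
      default: dispoutput = 7'b1000111;  // 4'hF
    endcase
endmodule

// Two decoders: on an FPGA these are two separate circuits
// working at the same time, not one routine called twice.
module tb_seg7;
  reg  [3:0] n0, n1;
  wire [6:0] d0, d1;

  seg7 u0(.number(n0), .dispoutput(d0));
  seg7 u1(.number(n1), .dispoutput(d1));

  initial begin
    #1 n0 = 4'h2; n1 = 4'h7;   // drive both inputs at once
    #1 $display("%h -> %b   %h -> %b", n0, d0, n1, d1);
    $finish;
  end
endmodule
```

Scaling this to 50 displays would just mean 50 instances (or a generate loop). Each one costs logic cells, not execution time.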

Key Verilog Point #1: Verilog isn’t Executable (Except when it is)

That’s a really important point. With an FPGA, the circuitry that drives each display just works all the time. Sure, there is a small delay through the gates (probably picoseconds) but that’s true even with discrete circuitry. It isn’t because the FPGA is executing lines of Verilog code or some equivalent structure. The Verilog becomes connecting wires that wire up circuit elements just as though you had a sea of gates on a PCB and you connected them with wire wrap.

There is an exception to this. During simulation, Verilog does act like a programming language, but it has very specific rules for keeping the timing the same as it will be on the FPGA. However, it also allows you to write constructs that would not be transferable to the FPGA. For example, a subroutine call doesn’t make sense in hardware, but you can do it during simulation. In general, you want to avoid non-synthesizable Verilog except when writing your testbench (the driver for your simulation; I’ll talk more about it in a minute).

Look back at the adder schematic. The sum is a simple XOR gate and the carry is an AND gate. I can express that in Verilog, if I want to, like this:

    assign carry=inA&inB;
    assign sum=inA^inB;

It is smarter, though, to let Verilog figure that out. I can make a two-bit net like this (it has to be a wire, not a reg, because an assign statement will drive it):

    wire [1:0] fullsum;

Then I could say:

    assign fullsum={1'b0, inA} + {1'b0, inB};
    assign carry=fullsum[1];
    assign sum=fullsum[0];

The braces are Verilog’s concatenation operator. Here they put a zero bit in front of the one-bit wires inA and inB to turn them into two-bit quantities, so the sum has room for the carry. In this simple example, I might have actually stuck to the first method, but if you think back on the seven segment decoder, you’ll see it makes sense to use this inferring style where possible.
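If you want to convince yourself that both styles describe the same circuit, a quick simulation will do it. This is only a sketch: the wrapper modules (adder_gates, adder_infer) and the testbench name are hypothetical and are not part of the demo design.

```verilog
// Style 1: spell out the gates.
module adder_gates(input inA, input inB, output sum, output carry);
  assign carry = inA & inB;
  assign sum   = inA ^ inB;
endmodule

// Style 2: describe the arithmetic and let the tools infer the gates.
module adder_infer(input inA, input inB, output sum, output carry);
  wire [1:0] fullsum;                        // wire, since assign drives it
  assign fullsum = {1'b0, inA} + {1'b0, inB};
  assign carry   = fullsum[1];
  assign sum     = fullsum[0];
endmodule

// Try all four input combinations and compare the two versions.
module tb_adders;
  reg inA, inB;
  wire s1, c1, s2, c2;
  integer i;

  adder_gates g(.inA(inA), .inB(inB), .sum(s1), .carry(c1));
  adder_infer f(.inA(inA), .inB(inB), .sum(s2), .carry(c2));

  initial begin
    for (i = 0; i < 4; i = i + 1) begin
      {inA, inB} = i[1:0];
      #1;                                    // let the outputs settle
      if ({c1, s1} !== {c2, s2})
        $display("MISMATCH at inA=%b inB=%b", inA, inB);
      else
        $display("inA=%b inB=%b -> carry=%b sum=%b", inA, inB, c1, s1);
    end
    $finish;
  end
endmodule
```

The run should print four lines and no MISMATCH, with the carry set only when both inputs are 1.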

Modules and Definitions

When you watch the video below or browse the code, you’ll notice there are a few minor things I glossed over. For one, all of this code lives in a module. You can think of a module loosely as a subroutine or, better, a C++ class. Other modules can create copies of a module and map different signals to its inputs and outputs. There are also definitions of all the nets used (we already talked about wires and regs):

    module demo(
        output LED1,
        output LED2,
        output LED3,
        output LED4,
        output LED5,
        input PMOD1,  // input A
        input PMOD2,  // input B
        input PMOD3,  // run/stop
        input PMOD4   // reset
        );

        // Alias inputs
        wire inA;
        wire inB;
        assign inA=PMOD1;
        assign inB=PMOD2;

Note that I wanted the signals to have names associated with the physical hardware (like LED1 and PMOD2) but then later I wanted to use more meaningful names like inB. The assign statement makes this connection. This is a simple use of that statement. If you recall, one way to build the adder was to assign two bits using an expression. That kind of usage is far more common.
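To show what "creating copies of a module" looks like, here is a small sketch that instantiates one module twice and maps different signals to each copy by port name. The half_adder and full_adder modules are a textbook illustration I picked for the purpose; they are not part of the demo design.

```verilog
// A small reusable module.
module half_adder(input a, input b, output sum, output carry);
  assign sum   = a ^ b;
  assign carry = a & b;
endmodule

// Two copies of half_adder, each wired to different signals.
// The .port(signal) form maps by name, so the order doesn't matter.
module full_adder(input a, input b, input cin, output sum, output cout);
  wire s1, c1, c2;

  half_adder ha1(.a(a),  .b(b),   .sum(s1),  .carry(c1));
  half_adder ha2(.a(s1), .b(cin), .sum(sum), .carry(c2));

  assign cout = c1 | c2;
endmodule

module tb_full_adder;
  reg a, b, cin;
  wire sum, cout;

  full_adder dut(.a(a), .b(b), .cin(cin), .sum(sum), .cout(cout));

  initial begin
    a = 1; b = 1; cin = 1;
    #1 $display("1+1+1 -> cout=%b sum=%b", cout, sum);  // expect cout=1 sum=1
    $finish;
  end
endmodule
```

Each instance (ha1, ha2) becomes its own piece of hardware, which is why the class analogy fits better than the subroutine one.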

A Test Bench Makes The Simulation Possible

Before you commit your design to an FPGA, you’ll probably want to simulate it. Debugging is much easier during simulation because you can examine everything. When the Verilog simulator runs, it follows rules about timing that take into account how everything runs at the same time, so the behavior should be exactly what your FPGA will do.

The only thing many simulators won’t do is account for things like timing on the chip itself (although with the right tools, you can simulate that too). For example, your design may depend on an input changing before a clock edge (the setup time on the flip flop, for example) but because of the routing on the chip, the input won’t change in time.

This kind of timing violation is a real problem with large chips and high speeds. For this sort of small circuit, it shouldn’t be an issue. For now, we can assume if the simulation works, the FPGA should behave in the same way.

To test our code, we need a testbench, which is just a piece of Verilog code that acts like the outside world to our unit under test (in this case, the whole design). The testbench will never be synthesized, so we can use strange Verilog features that we don’t normally use in our regular code.

The first thing to do is create a module for the testbench (the name isn’t important) and create an instance of the module we want to test:

`default_nettype none

    module tb;
        reg a, b;
        wire led1, led2, led3, led4, led5;
        demo dut(led1,led2,led3,led4,led5,a,b);

Note that there is a reg for each input we want to feed the device under test and a wire for each output it will drive. That means all of those reg variables need to be set up to our test conditions.

The variables need to be initialized. Verilog provides an initial block that is usually not valid for synthesis, but will be the main part of most test benches. Here’s the first part of it:

    initial begin
        $dumpfile("dump.vcd");
        $dumpvars(0, dut);

        a=1'b0;
        b=1'b0;

The two $ statements tell the testbench to dump variables from the device under test to a file called dump.vcd (this is where EDAPlayground looks for it, too, so don’t change it unless you are using your own Verilog simulator). We’ll be able to examine anything that gets dumped. You can also print things using $display, but I didn’t do that in this test.

The next thing you need is some test case stimulus. In the case of the counters, you don’t need anything other than the clock. But the adder-related circuitry needs some values:

        #2 a=1'b1;
        #4 b=1'b1;
        #4 a=1'b0;
        #4 b=1'b0;
        #4 $finish;
    end

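The delays are in simulation time units, not clock cycles, and each one counts from the previous statement. So at first, a=0 and b=0. After 2 time units, a=1 and b=0. Then after 4 more time units, a=1 and b=1. After another 4, a=0 and b=1. Then a=0, b=0. The $finish statement causes the simulation to end. Without this, the clock generator will cause the simulation to keep going forever.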
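If you would rather see text than waveforms, here is a self-contained sketch of the same stimulus with printing added. To keep it runnable on its own, it tests a stand-in module (adder_only, a made-up name with made-up ports) and not the full demo design, and it uses $monitor, a relative of $display that prints whenever one of its arguments changes.

```verilog
// Stand-in for the design under test: just the adder part.
// Hypothetical module and port names, for illustration only.
module adder_only(input inA, input inB, output sum, output carry);
  assign sum   = inA ^ inB;
  assign carry = inA & inB;
endmodule

module tb_print;
  reg a, b;
  wire sum, carry;

  adder_only dut(.inA(a), .inB(b), .sum(sum), .carry(carry));

  initial begin
    $dumpfile("dump.vcd");      // waveform file, as in the article
    $dumpvars(0, dut);

    // Print one line each time a, b, carry, or sum changes.
    $monitor("t=%0t a=%b b=%b carry=%b sum=%b", $time, a, b, carry, sum);

    a = 1'b0; b = 1'b0;         // time 0
    #2 a = 1'b1;                // time 2
    #4 b = 1'b1;                // time 6
    #4 a = 1'b0;                // time 10
    #4 b = 1'b0;                // time 14
    #4 $finish;                 // time 18
  end
endmodule
```

With Icarus Verilog, this should print one line at times 0, 2, 6, 10, and 14, which makes it easy to see that each delay is relative to the statement before it.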

You can find the code and the testbench on EDAPlayground (you can even run the simulation from there). When you run the simulation, a waveform will appear (see below). If you want to know more about how it works, check out the video below and I’ll walk through it step by step.

Next Time

In tomorrow’s installment of this series, I’ll show you how to add sequential (clocked) logic to the design and the testbench. Using clocks is an important part of making practical digital designs, as you’ll soon see. I’ll also have a few more Verilog key points.

Selected Tutorials

If you are looking for a detailed Verilog tutorial, try these:

You can also read the next post in this series.