We build the RISC-V software tools from their Git repositories and create & verify assembly instructions against the open source ISA specification for an RV32IM core.

Overview

We're evaluating the RISC-V open source ISA and various cores, including our own in-progress implementation. We need a software toolchain to easily create assembly instructions and sequences that we can execute in both a simulator and an FPGA. This page documents some of the getting-started steps required to work with the software tools and produce assembly. Our goal is to implement a 32-bit microcontroller for integer operations inside an FPGA, combined with our Private Island™ project.

A good resource for reviewing the available RISC-V software tools is the RISC-V Software Ecosystem Overview page. In the steps shown below, we'll be working with the RISC-V toolchain repos found on the GitHub page RISC-V GNU Compiler Toolchain.

Build the Toolchain

The following steps are performed on an Ubuntu 18.04 machine and closely follow the documentation available on the aforementioned GitHub page. Also, refer to this page for a list of required packages for Ubuntu (e.g., libtool).

We first recursively clone the suite of open source GNU tools for RISC-V:

    $ cd /build
    $ git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
    Cloning into 'riscv-gnu-toolchain'...
    ...
    Submodule 'qemu' (https://git.qemu.org/git/qemu.git) registered for path 'qemu'
    Submodule 'riscv-binutils' (https://github.com/riscv/riscv-binutils-gdb.git) registered for path 'riscv-binutils'
    Submodule 'riscv-dejagnu' (https://github.com/riscv/riscv-dejagnu.git) registered for path 'riscv-dejagnu'
    Submodule 'riscv-gcc' (https://github.com/riscv/riscv-gcc.git) registered for path 'riscv-gcc'
    Submodule 'riscv-gdb' (https://github.com/riscv/riscv-binutils-gdb.git) registered for path 'riscv-gdb'
    Submodule 'riscv-glibc' (https://github.com/riscv/riscv-glibc.git) registered for path 'riscv-glibc'
    Submodule 'riscv-newlib' (https://github.com/riscv/riscv-newlib.git) registered for path 'riscv-newlib'
    ...
    Submodule path 'riscv-binutils': checked out '2cb5c79dad39dd438fb0f7372ac04cf5aa2a7db7'
    Submodule path 'riscv-dejagnu': checked out '4ea498a8e1fafeb568530d84db1880066478c86b'
    Submodule path 'riscv-gcc': checked out '22b1bd36b05772863fd55d4056dbc739ff591942'
    Submodule path 'riscv-gdb': checked out 'fec47beb8a1f0a6c4a6b0c548cded5711d0c27da'
    Submodule path 'riscv-glibc': checked out '7395b0964db9cc4dd544926414960e9a16842180'
    Submodule path 'riscv-newlib': checked out 'f289cef6be67da67b2d97a47d6576fa7e6b4c858'

Let's make sure we understand that we just cloned a repository of repositories:

    $ cd riscv-gnu-toolchain/
    $ git log riscv-binutils
    commit 1af07f51a090ddb9e62ef26475b16503c1aa0358
    Author: Nelson Chu
    Date:   Tue Jul 28 21:39:21 2020 -0700

        Bump binutils to 2.35.

    $ cd riscv-binutils
    $ git log
    commit 2cb5c79dad39dd438fb0f7372ac04cf5aa2a7db7 (HEAD, origin/riscv-binutils-2.35, origin/HEAD, riscv-binutils-2.35)
    Author: Nick Clifton
    Date:   Fri Jul 24 10:36:01 2020 +0100

        2.35 Release

Next we configure our build in a separate subdirectory to produce a toolchain for a 32-bit RISC-V core (RV32IM):

RV32I: Base Integer Instruction Set

M: Instructions that multiply and divide values held in two integer registers

    $ cd /build/riscv-gnu-toolchain/
    $ mkdir build; cd build
    $ ../configure --help | grep abi
      --with-abi=lp64d        Sets the base RISC-V ABI, defaults to lp64d
    $ ../configure --prefix=/opt/riscv32 --with-arch=rv32im --with-abi=ilp32
    checking for gcc... gcc
    ...
    config.status: creating Makefile
    config.status: creating scripts/wrapper/awk/awk
    config.status: creating scripts/wrapper/sed/sed

Note that ilp32 specifies that int, long, and pointers are all 32 bits wide.
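As a rough illustration of what the two configure options encode, the Python sketch below splits a simple arch string into its register width and extension letters and looks up the data model behind an ABI name. The helper names parse_arch and abi_widths are hypothetical, and the parser only handles single-letter extensions such as those in rv32im; it is not how the toolchain itself parses these values.

```python
import re

# Integer data models: widths in bits of int, long, and pointers.
DATA_MODELS = {
    "ilp32": {"int": 32, "long": 32, "pointer": 32},
    "lp64": {"int": 32, "long": 64, "pointer": 64},
}


def parse_arch(arch):
    """Split a simple arch string such as 'rv32im' into (xlen, extensions)."""
    match = re.fullmatch(r"rv(32|64)([a-z]+)", arch)
    if match is None:
        raise ValueError(f"unsupported arch string: {arch!r}")
    return int(match.group(1)), list(match.group(2).upper())


def abi_widths(abi):
    """Return the int/long/pointer widths for an ABI name such as 'ilp32'."""
    for base, widths in DATA_MODELS.items():
        # A trailing f or d (e.g. lp64d) only changes floating-point
        # argument passing, not the integer data model.
        if abi.startswith(base):
            return widths
    raise ValueError(f"unknown ABI: {abi!r}")


print(parse_arch("rv32im"))  # (32, ['I', 'M'])
print(abi_widths("ilp32"))   # {'int': 32, 'long': 32, 'pointer': 32}
```

The same lookup shows why the default lp64d would not suit a 32-bit core: its long and pointer types are 64 bits wide.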

After configure is complete, we can build the toolchain with make. Note that make also installs the tools into the path specified by --prefix: /opt/riscv32.

    $ make
    $ ls
    build-binutils-newlib    build-gcc-newlib-stage2  build-newlib       config.log     install-newlib-nano  scripts
    build-gcc-newlib-stage1  build-gdb-newlib         build-newlib-nano  config.status  Makefile             stamps

Let's take a look at what we built & installed:

    $ cd /opt/riscv32
    $ tree -L 3 -d
    .
    ├── bin
    ├── include
    │   └── gdb
    ├── lib
    │   └── gcc
    │       └── riscv32-unknown-elf
    ├── libexec
    │   └── gcc
    │       └── riscv32-unknown-elf
    ├── riscv32-unknown-elf
    │   ├── bin
    │   ├── include
    │   │   ├── bits
    │   │   ├── c++
    │   │   ├── machine
    │   │   ├── newlib-nano
    │   │   ├── rpc
    │   │   ├── ssp
    │   │   └── sys
    │   └── lib
    │       └── ldscripts
    └── share
        ├── gcc-10.1.0
        │   └── python
        ├── gdb
        │   ├── python
        │   ├── syscalls
        │   └── system-gdbinit
        ├── info
        ├── locale
        │   ├── bg
        │   ├── ca
        │   ├── da
        │   ├── de
        ...
        ...
        │   ├── vi
        │   ├── zh_CN
        │   └── zh_TW
        └── man
            ├── man1
            ├── man5
            └── man7

Next we set up an env-riscv32 script in /opt/riscv32 that we can source when we need to work with our toolchain. Later we'll add environment variables like CFLAGS to it.

export PATH=/opt/riscv32/bin:$PATH

Let's make sure we can execute our tools:

    $ mkdir -p ~/Projects/riscv
    $ source /opt/riscv32/env-riscv32
    $ riscv32-unknown-elf-gcc --version
    riscv32-unknown-elf-gcc (GCC) 10.1.0
    ...
    $ riscv32-unknown-elf-objcopy --version
    GNU objcopy (GNU Binutils) 2.35

Great, we see that we're ready to go with our compiler and binutils. However, before we move on, let's do some inspection of our compiler to see how it's been configured:

    $ riscv32-unknown-elf-gcc -dumpmachine
    riscv32-unknown-elf
    $ riscv32-unknown-elf-gcc -print-sysroot
    /opt/riscv32/riscv32-unknown-elf
    $ riscv32-unknown-elf-gcc -print-libgcc-file-name
    /opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/libgcc.a
    $ riscv32-unknown-elf-gcc -print-search-dirs
    install: /opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/
    programs: =/opt/riscv32/libexec/gcc/riscv32-unknown-elf/10.1.0/:/opt/riscv32/libexec/gcc/riscv32-unknown-elf/10.1.0/:/opt/riscv32/libexec/gcc/riscv32-unknown-elf/:/opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/:/opt/riscv32/lib/gcc/riscv32-unknown-elf/:/opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/../../../../riscv32-unknown-elf/bin/riscv32-unknown-elf/10.1.0/:/opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/../../../../riscv32-unknown-elf/bin/
    libraries: =/opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/:/opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/../../../../riscv32-unknown-elf/lib/riscv32-unknown-elf/10.1.0/:/opt/riscv32/lib/gcc/riscv32-unknown-elf/10.1.0/../../../../riscv32-unknown-elf/lib/:/opt/riscv32/riscv32-unknown-elf/lib/riscv32-unknown-elf/10.1.0/:/opt/riscv32/riscv32-unknown-elf/lib/:/opt/riscv32/riscv32-unknown-elf/usr/lib/riscv32-unknown-elf/10.1.0/:/opt/riscv32/riscv32-unknown-elf/usr/lib/

Let's confirm that the newlib-nano libraries (the *_nano.a files and nano.specs) were installed alongside the standard newlib libraries:

    $ ls /opt/riscv32/riscv32-unknown-elf/lib
    crt0.o     libc_nano.a  libgloss_nano.a  libm_nano.a  libstdc++.a         libsupc++.a   nosys.specs
    ldscripts  libg.a       libg_nano.a      libnosys.a   libstdc++.a-gdb.py  libsupc++.la  sim.specs
    libc.a     libgloss.a   libm.a           libsim.a     libstdc++.la        nano.specs

Build a simple function and analyze it against the specification

Shown below is a very simple C program with a multiply function, mult(), that lets us obtain the RV32IM instructions used to multiply two integers. This is certainly something we want to do in our FPGA with our RISC-V core.

    int mult()
    {
        int a=1000,b=3;
        return a*b;
    }

    int main()
    {
        mult();
    }

We build the simple C application "tst.c" with our new RISC-V GCC compiler:

    $ export PATH=/opt/riscv32/bin/:$PATH
    $ riscv32-unknown-elf-gcc -g tst.c -o tst
    $ file tst
    tst: ELF 32-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, with debug_info, not stripped
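To see where the "32-bit LSB executable, UCB RISC-V" summary comes from, here is a minimal Python sketch that decodes the same fields from the first 20 bytes of an ELF file. The function names describe_elf and describe_elf_file are our own for illustration; the constants follow the ELF specification, where e_machine 243 is assigned to RISC-V and e_type 2 marks an executable.

```python
import struct

EM_RISCV = 243  # e_machine value assigned to RISC-V
ET_EXEC = 2     # e_type value for an executable file


def describe_elf(header):
    """Decode the ELF header fields that `file` summarizes."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "bits": {1: 32, 2: 64}[ei_class],
        "endianness": "LSB" if ei_data == 1 else "MSB",
        "executable": e_type == ET_EXEC,
        "riscv": e_machine == EM_RISCV,
    }


def describe_elf_file(path):
    """Read just enough of the file at `path` to describe its header."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))
```

Calling describe_elf_file("tst") on the binary built above should report a 32-bit, LSB, RISC-V executable, in line with the output of file.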

Since our ELF executable isn't stripped, it still contains its symbol table, so objdump can label the disassembly with function names such as mult:

    $ riscv32-unknown-elf-objdump -d tst
    ...
    00010150 <mult>:
       10150:   fe010113    addi    sp,sp,-32
       10154:   00812e23    sw      s0,28(sp)
       10158:   02010413    addi    s0,sp,32
       1015c:   3e800793    li      a5,1000
       10160:   fef42623    sw      a5,-20(s0)
       10164:   00300793    li      a5,3
       10168:   fef42423    sw      a5,-24(s0)
       1016c:   fec42703    lw      a4,-20(s0)
       10170:   fe842783    lw      a5,-24(s0)
       10174:   02f707b3    mul     a5,a4,a5
       10178:   00078513    mv      a0,a5
       1017c:   01c12403    lw      s0,28(sp)
       10180:   02010113    addi    sp,sp,32
       10184:   00008067    ret
    ...

We can see in the dump of mult() above that our two operands are placed into register a5 using load immediate (li) and stored to the stack frame with store word (sw), then read back using load word (lw) into registers a4 and a5. This round trip through memory is expected because we compiled without optimization. The actual multiply operation is performed by the mul instruction.
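The stack offsets in the listing can be checked against the raw instruction words. The Python sketch below extracts the sign-extended 12-bit immediate from the I-type (addi, lw) and S-type (sw) formats defined in the base ISA; the helper names are ours, chosen for illustration.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def i_type_imm(word):
    """Immediate of an I-type instruction (addi, lw): bits 31-20."""
    return sign_extend(word >> 20, 12)


def s_type_imm(word):
    """Immediate of an S-type instruction (sw): bits 31-25 and bits 11-7."""
    return sign_extend(((word >> 25) << 5) | ((word >> 7) & 0x1F), 12)


print(i_type_imm(0xFE010113))  # addi sp,sp,-32  -> -32
print(s_type_imm(0xFEF42623))  # sw a5,-20(s0)   -> -20
print(s_type_imm(0xFEF42423))  # sw a5,-24(s0)   -> -24
print(i_type_imm(0xFEC42703))  # lw a4,-20(s0)   -> -20
print(i_type_imm(0xFE842783))  # lw a5,-24(s0)   -> -24
```

The matching offsets confirm that each lw reads back exactly the slot the preceding sw wrote.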

We can find the definition of these instructions in the Unprivileged ISA specification. Specifically, the mul instruction is defined in Chapter 7. This is the "M" extension for our RV32IM core.

We can see that the "mul a5,a4,a5" instruction is encoded as 0x02f707b3. Keep in mind that RISC-V is a little-endian system, especially when working with debuggers and viewing memory.
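To make the little-endian point concrete, this short Python sketch shows the order in which the four bytes of the mul instruction word would appear in memory, lowest address first. The function name memory_bytes is hypothetical.

```python
import struct


def memory_bytes(word):
    """Return the 32-bit instruction word as it is laid out in memory."""
    return struct.pack("<I", word)  # "<I" = little-endian unsigned 32-bit


print(memory_bytes(0x02F707B3).hex(" "))  # b3 07 f7 02
```

A byte-wise memory dump in a debugger therefore shows b3 first, even though objdump prints the word as 02f707b3.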

Referring to Chapter 25 (RISC-V Assembly Programmer’s Handbook) of the Unprivileged ISA Specification, we find that registers a2 through a7 are considered function argument registers and are mapped to x12 through x17. Therefore, a4-a5 are registers x14-x15 respectively.
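For reference, the full mapping from ABI register names to x register numbers in that chapter can be written as a small lookup table. The Python sketch below builds it; the function name is our own.

```python
def abi_register_names():
    """Map each integer ABI register name to its x register number."""
    names = {"zero": 0, "ra": 1, "sp": 2, "gp": 3, "tp": 4,
             "t0": 5, "t1": 6, "t2": 7, "s0": 8, "fp": 8, "s1": 9}
    names.update({f"a{i}": 10 + i for i in range(8)})      # a0-a7  -> x10-x17
    names.update({f"s{i}": 16 + i for i in range(2, 12)})  # s2-s11 -> x18-x27
    names.update({f"t{i}": 25 + i for i in range(3, 7)})   # t3-t6  -> x28-x31
    return names


regs = abi_register_names()
print(regs["a4"], regs["a5"])  # 14 15
```

Note that s0 and fp are two names for x8, which is why the prologue in the listing saves s0 before using it as the frame pointer.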

Next, let's refer to Chapter 24 (RV32/64G Instruction Set Listings) and compare our instruction's encoded value against what is shown for MUL:

                      31-25     24-20   19-15   14-12   11-7    6-0
    MUL               0000001   rs2     rs1     000     rd      0110011
    mul a5,a4,a5                15      14              15
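The comparison can also be done mechanically. The following Python sketch splits an R-type instruction word into the bit fields listed above and rebuilds the MUL encoding from register numbers; decode_r_type and encode_mul are illustrative names, not part of any toolchain.

```python
def decode_r_type(word):
    """Split a 32-bit R-type instruction word into its bit fields."""
    return {
        "funct7": (word >> 25) & 0x7F,  # bits 31-25
        "rs2": (word >> 20) & 0x1F,     # bits 24-20
        "rs1": (word >> 15) & 0x1F,     # bits 19-15
        "funct3": (word >> 12) & 0x7,   # bits 14-12
        "rd": (word >> 7) & 0x1F,       # bits 11-7
        "opcode": word & 0x7F,          # bits 6-0
    }


def encode_mul(rd, rs1, rs2):
    """Assemble MUL from register numbers: funct7=0000001, funct3=000."""
    return (0b0000001 << 25) | (rs2 << 20) | (rs1 << 15) | (rd << 7) | 0b0110011


fields = decode_r_type(0x02F707B3)
print(fields["rs2"], fields["rs1"], fields["rd"])  # 15 14 15
print(hex(encode_mul(rd=15, rs1=14, rs2=15)))      # 0x2f707b3
```

Decoding 0x02f707b3 yields funct7 = 0000001, funct3 = 000, and opcode = 0110011, with x15, x14, and x15 in the rs2, rs1, and rd fields.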

So, now we have confirmed that the master branch of the GNU RISC-V tools does indeed generate machine code that matches the RV32IM specification (at least for MUL). Perhaps it's now a little easier to envision some of the stages of a RISC-V pipeline (e.g., instruction/data fetch and instruction decode).

Some natural next steps for an FPGA-based microcontroller include creating a bare-metal software build environment and linker script, along with the proper startup code and exception table.

We'll continue to update this page as we progress with testing, developing, and integrating a RISC-V core.

Common RISC-V Terms and Acronyms

Terms

Chisel: Constructing Hardware in a Scala Embedded Language

Hart: Hardware Thread

SemVer: Semantic Versioning

Tile: (Rocket) Core + Private Caches

Acronyms

AEE: Application Execution Environment

DSL: Domain-Specific Language

FESVR: Front End Server

FIRRTL: Flexible Intermediate Representation for RTL

HTIF: Host-Target Interface

IR: Intermediate Representation

MTVEC: Machine Trap-Vector Base-Address Register

SEE: Supervisor Execution Environment

Additional RISC-V and Embedded Programming References