An introduction to RISC-V

Please consider subscribing to LWN Subscriptions are the lifeblood of LWN.net. If you appreciate this content and would like to see more of it, your subscription will help to ensure that LWN continues to thrive. Please visit this page to join up and keep LWN on the net.

LWN has covered the open RISC-V ("risk five") processor architecture before, most recently in this article. As the ecosystem and tools around RISC-V have started coming together, a more detailed look is in order. In a series of two articles, I will look at what RISC-V is and follow up with an article on how we can now port Linux distributions to run on it.

The words "Free and Open RISC Instruction Set Architecture" are emblazoned across the web site of the RISC-V Foundation along with the logos of some possibly surprising companies: Google, hard disk manufacturer Western Digital, and notable ARM licensees Samsung and NVIDIA. An instruction set architecture (ISA) is a specification for the instructions or machine code that you feed to a processor and how you encode those instructions into a binary form, along with many other precise details about how a family of processors works. Modern ISAs are huge and complex specifications. Perhaps the most famous ISA is Intel's x86 — that specification runs to ten volumes.

More importantly, ISAs are covered by aggressive copyright, patent, and trademark rules. Want to independently implement an x86-compatible processor? Almost certainly you simply cannot do that without making arrangements with Intel — something the company rarely does. Want to create your own ARM processor? You will need to pay licensing fees to Arm Holdings up front and again for every core you ship.

In contrast, open ISAs, of which RISC-V is only one of the newest, have permissive licenses. RISC-V's specifications, covering user-space instructions and the privileged instructions are licensed under a Creative Commons license (CC BY 4.0). Furthermore, researchers have determined that all RISC-V instructions have prior art and are now patent-free. (Note this is different from saying that implementations will be open or patent-free — almost certainly the highest end chips will be closed and implementations patented). There are also several "cores" — code that compiles to Verilog and can be programmed into an FPGA or (with a great deal more effort) made into a custom chip — licensed under the three-clause BSD.

Unlike earlier open ISAs, RISC-V's main features are that it is scalable and that it is primarily a specification that allows for multiple implementations. RISC-V starts with a choice of 32-, 64- or 128-bit integer-only specifications that we call "RV32I", "RV64I", or "RV128I". (I'm not going to cover the 128-bit ISA any further in this article because it is still in the design phase and there is only one software implementation, written by the inimitable Fabrice Bellard.) The "I" stands for "integer" and includes the basic processor features like loads, stores, jumps, and integer arithmetic. The architecture however is scalable and other extensions are common. Most Linux-capable RISC-V chips will be "RV32IMAFDC" or "RV64IMAFDC" where the letters mean:

I Integer and basic instructions M Multiply and divide A Atomics F IEEE floating point (single precision) D IEEE floating point (double precision) C Compressed instructions

For convenience "IMAFD" can be written "G" (for "general purpose") and so you will more commonly see those chips described as "RV32GC" or "RV64GC".

Most Linux-capable designs have skipped 32-bit variants entirely; in the second article I will describe Fedora on RISC-V, which is entirely concentrating on RV64GC. For completeness I should also say there is a cut-down embedded specification called "RV32E" that has half the number of general-purpose registers but is otherwise identical to RV32I. Since RV32E machines are likely to have only a few kilobytes of RAM and lack a "supervisor" mode, they are unlikely to ever run Linux.

RISC-V has 31 general purpose registers (15 for RV32E), approximately double the number visible to the programmer on x86-64. This simple unoptimized loop counting to 1000 demonstrates some features of the instruction set:

- binary - - mnemonic - fe042623 sw zero,-20(fp) # store zero into stack slot a031 j L2 # compressed jump L1: fec42783 lw a5,-20(fp) # load stack slot into a5 2785 addiw a5,a5,1 # compressed increment fef42623 sw a5,-20(fp) # store back to stack L2: fec42783 lw a5,-20(fp) # load stack slot into a5 0007871b sext.w a4,a5 # sign extend a5 into a4 3e700793 li a5,999 # load immediate fee7d5e3 ble a4,a5,L1 # compare and branch

Registers are named x1 through x31 (with x0 being logically wired to zero), but the assembler provides a set of names like a0 - a7 for function arguments and return values, t0 - t6 for temporaries, fp for the frame pointer, sp for the stack pointer, zero for the zero register, and others. These are just aliases for the x -names. The floating-point extensions (if present) add 32 more registers, and it is expected that future extensions like vectorization will add more.

Instructions are variable length, with the basic length being 32 bits. Many common instructions can be compressed to 16 bits when using the compressed extension (that is expected to be present in all Linux-class chips). Longer instructions are possible too, with the more obscure extensions expected to use them. Unlike x86, variable length does not have to mean "horribly complex to decode". The encoding ensures that the processor can easily see the length of every instruction in its prefetch queue by decoding a few bits in a uniform location. This is even the case where the code is using extensions that the processor does not understand (e.g. for handing them off to a co-processor or to trap and emulate them).

Although the architecture is (by design) simple, boring, and similar to others that have gone before, one interesting area is the approach to complex instructions such as specialized instructions for string handling, video decoding, or encryption. Some of these may be implemented in future extensions. For others, the designers have expressed a preference not to add complex instructions to the specification but instead to rely on macro-op fusion for performance. (Note there is a patent claim on a limited version of this technique, although it expires in 2022.) Processors are expected to detect sequences of simpler instructions that together perform some complex operation (e.g. copying a string) and fuse them together at run time into a single more efficient macro operation. How this wish will meet reality is yet to be seen, but it does mean that, for now, writing a RISC-V emulator is relatively easy because there are only simple instructions.

To make a real computer you need a lot more than just a core, and RISC-V is at least beginning to supply more of those pieces. Code is available for an L1 cache, a cache-coherence and inter-core communication protocol called TileLink, ChipLink, which is an inter-socket version of TileLink, an external hardware debugging interface, and the beginnings of an interrupt controller. But there are many missing pieces: everything from DDR4 interfaces for memory, to ethernet, to GPUs. In the first silicon, and perhaps for a long time to come, these will all be proprietary even if paired with open-source CPUs.

Linux kernel 4.15 added basic RISC-V support, which is sufficient to boot but not much else (there are no interrupts and hence no significant device support). For now you have to use the out-of-tree riscv-linux kernel, although it is expected that most things will be upstream by 4.17. GCC and binutils support has been upstream for over a year, but you are recommended to use at least GCC 7.3.1 and binutils 2.30.

The final missing piece for Linux was a stable glibc ABI, which was added in February 2018 with glibc 2.27. This allows Linux distributions to start to compile packages, knowing that we won't have to recompile everything from scratch if there's a change to the glibc ABI.

And finally, where can you get RISC-V hardware to run Linux on? At the time of this writing almost no hardware is available. A few lucky people have SiFive's HiFive Unleashed development board that has four 64-bit application cores (RV64GC) plus a power management core (RV32IMAC), but costs at least $999. However there is QEMU support in 2.12 that can be used to run Fedora. There are also plenty of FPGA implementations, although you will find that they run much more slowly than QEMU and have limited RAM and device support.

It's expected that the hardware landscape will change quickly in the coming year, with much cheaper iterations of the HiFive Unleashed and several other companies announcing hardware. One surprise though: you might have a RISC-V chip in your PC in the near future. Western Digital has announced that it will transition the cores used in its hard disks and other storage devices to RISC-V; currently it ships over a billion cores each year.

Look for the second article in this series, where I will cover how Fedora was ported to RISC-V.