Resurrecting the SuperH architecture

Did you know...? LWN.net is a subscriber-supported publication; we rely on subscribers to keep the entire operation going. Please help out by buying a subscription and keeping LWN on the net.

Processor architectures are far from trivial; untold millions of dollars and many thousands of hours have likely gone into the creation and refinement of the x86 and ARM architectures that dominate the CPUs in Linux boxes today. But that does not mean that x86 and ARM are the only architectures of value, as Jeff Dionne, Rob Landley, and Shumpei Kawasaki illustrated in their LinuxCon Japan session "Turtles all the way down: running Linux on open hardware." The team has been working on breathing new life into a somewhat older architecture that offers comparable performance to many common system-on-chip (SoC) designs—and which can be produced as open hardware.

The architecture in question is Hitachi's SuperH, whose instruction set was a precursor to one used in many ARM Thumb CPUs. But the patents on the most important SuperH designs have all expired—and more will be expiring in the months and years to come—which makes SuperH a candidate for revival. Dionne, Landley, and Kawasaki's session [PDF] outlined the status of their SuperH-based "J2" core design, which can be synthesized in low-cost FGPAs or manufactured in bulk.

Dionne started off the talk by making a case for the value of running open-source software on open hardware. That is a familiar enough position, of course, but he went on to point out that a modern laptop contains many more ARM and MIPS processors than it does x86 processors. These small processors serve as USB and hard-drive controllers, run ACPI and low-level system management services, and much more. Thus, the notion of "taking control of your hardware" has to include these chips as well.

He then asked what constitutes the minimal system that can run Linux. All that is really needed, he said, is a flat, 32-bit memory address space, a CPU with registers to hold instructions, some I/O and storage (from which the kernel and initramfs can be loaded), and a timer for interrupts. That plus GCC is sufficient to get Linux running—although it may not be fast, depending on the specifics. One does not even need a cache, floating-point unit, SMP, or a memory-management unit (MMU).

At this point, Landley chimed in to point out that Dionne had been the maintainer of uClinux, which was an active project maintaining Linux on non-MMU systems up through 2003, when Dionne handed off maintainership to others where, unfortunately, development slowed down considerably. The requirements for running Linux are quite low, though; many of the open-hardware boards popular today (such as the Raspberry Pi) throw in all sorts of unnecessary extras.

That brings us to SuperH, which Dionne said was developed with a "massive research and development outlay." The SuperH SH2 was a highly optimized design, employing a five-stage Harvard RISC architecture with an instruction-set density considerably ahead of its contemporaries. That density is a common way to measure CPU efficiency, he explained; a dense architecture requires fewer instructions and thus fewer clock cycles to perform a given task. Most of a CPU's clock cycles are spent waiting for something, he said; waiting for instructions is such a bottleneck that if you can get them fast enough, "it almost doesn't matter what your clock speed is."

The SuperH architecture is so dense that a 2009 research paper [PDF] plotted it ahead of every architecture other than x86, x86_64, and CRIS v32. ARM even licensed the SuperH patent portfolio to create its Thumb instruction set in the mid-1990s.

Fortunately, the patents are now expiring. The last of the SH2 patents expired in 2014, with more to come. The SH2 processor was, he said, used in the Sega Saturn game console; the SH4 (found in the Sony Sega Dreamcast) will have the last of its patents expire in 2016. Though they are older chips, they were used in relatively powerful devices.

In preparation for this milestone, Dionne, Landley, and others have been working on J2, a clean-room re-implementation of the SH2 that is implemented as a "core design kit." The source for the core is written in VHDL, and it can be synthesized on a Xilinx Spartan6 FPGA. The Spartan6 is a low-cost platform (boards can be purchased for around $50), but it also contains enough room to add additional synthesized components—like a serial controller, memory controller, digital signal processor, and Ethernet controller. In other words, a basic SoC.

The other main advantage of the J2 project is that the work for implementing SuperH support is already done in the kernel, GCC, GDB, strace, and most other system components. By comparison, there are a few other open CPU core projects like OpenRISC and RISC-V, but those developers must write all of their code from scratch—if the CPU core designs ever become stable enough to use. As Landley then added, "we didn't have to write new code; we just had to dig some of it up and dust it off."

The project has thus "inherited" an excellent ISA, and has even been in contact with many of the former Hitachi employees that worked on SuperH. But that is of little consequence if a $50 FPGA is the only hardware target. The Spartan6 is cheap as FPGAs go, but still more than most customers would pay for an SoC. So the J2 build chain not only generates a Xilinx bitstream (the output which is then synthesized onto the FPGA); it also generates an RTL circuit design that can be manufactured by an application-specific integrated circuit (ASIC) fabrication service.

Chip fabrication is not cheap if one shops around for the newest and smallest process, Dionne said—but, in reality, there are many ASIC vendors who are happy to produce low-cost chips on their older equipment because the cost of retooling a plant is exorbitant. A 180nm implementation of the J2 design, he said, costs around three cents per chip, with no royalties required. "That's disposable computing at the 'free toy inside' level."

As of today, the J2 is sufficient to build low-end devices, but the roadmap is heading toward more complex designs as more SuperH patents expire. In 2016, the next iteration, called J2+, will add SMP support and an array of DSPs that will make it usable for signal-processing applications like medical devices and Internet-of-Things (IoT) products like the oft-cited home electricity monitor. A year or so further out, the J4 (based on the SH4 architecture) will add single instruction, multiple data (SIMD) arrays and will be suitable for set-top boxes and automotive computing.

Landley and Dionne then did a live demonstration, booting Linux on a J2 core that they had synthesized onto an off-the-shelf Spartan6 board purchased the day before in Tokyo's Akihabara district. The demo board booted a 3.4 kernel—though it took several seconds—and started a bash prompt. A small victory, but it was enough to warrant a round of applause from the crowd. Dionne noted that they do have support in the works for newer kernels, too. Landley said that he was still in the process of setting up the two public web sites that will document the project. The nommu.org site will document no-MMU Linux development, he said (hopefully replacing the now-defunct uClinux site), while 0pf.org will document the team's hardware work.

In an effort to reduce the hardware cost and bootstrap community interest, the team is also planning a Kickstarter campaign that will produce a development board—hopefully with a more powerful FPGA than the model found on existing starter kits—in a Raspberry-Pi–compatible form factor. By including a larger FPGA, these boards should be compatible with the J4 SMP design; the Lx9 version of Spartan6 (which was used for the J2 development systems) simply does not have enough logic gates for SMP usage.

At the end of the talk, an audience member voiced concern that SuperH was old enough that support for it is unmaintained in a lot of projects. He suggested that the J2 team might need to act quickly to stop its removal. Landley noted that, indeed, the latest buildroot release did remove SuperH support, "but we're trying to get them to put it back now." Luckily, Dionne said, there are other projects keeping general no-MMU support in the kernel up-to-date, such as Blackfin and microblaze. The team has been working on getting no-MMU support into musl and getting some of the relevant GCC build tools "cleaned up" from some minor bit rot.

Another audience member asked whether or not the SuperH ISA was getting too old to be relevant. In response, Dionne handed the microphone over to Kawasaki, who had remained off to the side for the entire presentation up to that point. Kawasaki was one of the original SH2 architects and is now a member of the J2 project. There have been some minor additions, he said: the J2 adds four new instructions. One for atomic operations, one to work around the barrel shifter, "which did not work the way the compiler wanted it to," and a couple that are primarily of interest to assembly programmers. There are always questions about architecture changes, he said, but mostly the question is whether to make the changes mandatory or simply provide them as VHDL overlays. For the most part, though, the architecture already had everything Linux needs and works well, despite its age.

As of today, the nommu.org site is online and has an active mailing list, although the Git repository Landley promised is not yet up and running. The 0pf.org site is also up and running, and contains much more in the way of documentation. While the project is still in its early stages, it seems to be generating considerable interest, and with several more iterations of open CPU designs still to come.

[The author would like to thank the Linux Foundation for travel assistance to attend LCJ 2015.]

