Opening up the GnuBee open NAS system

GnuBee is the brand name for a line of open hardware boards designed to provide Linux-based network-attached storage. Given the success of the crowdfunding campaigns for the first two products, the GB-PC1 and GB-PC2 (which support 2.5 and 3.5 inch drives respectively), there appears to be a market for these devices. Given that Linux is quite good at attaching storage to a network, it seems likely they will perform their core function more than adequately. My initial focus when exploring my GB-PC1 is not the performance but the openness: just how open is it really? The best analogy I can come up with is that of a door with rusty hinges: it can be opened, but doing so requires determination.

A mainline kernel on the GnuBee?

Different people look for different things when assessing how open or free some device is, so I should be clear about what my own metrics are. I am interested primarily in the pragmatics of openness: whether I can examine, understand, and modify the behavior of my device with no informational, technological, or legal impediments — cognitive and temporal impediments I'll take responsibility for myself. A good first measurement is: can I run the latest upstream kernel on the device? I can, but there is plenty of room for improvement.

The heart of the GB-PC is the MT7621 SoC (System On a Chip) from Mediatek. It provides a dual-core, 32-bit MIPS processor together with controllers for memory, flash ROM, serial ports, USB, SD cards, audio, Ethernet, and most of the connections you might expect. It doesn't control SATA drives directly, but it provides a PCI Express interface to which the GnuBee board connects an ASM1061 SATA controller. This SoC is mostly used in WiFi routers and similar network hardware and it is supported by Linux distributions focused on those devices, such as OpenWrt, LEDE, and libreCMC, but this support is only partially upstream.

There are some specifications and documentation available for the MT7621 on the web, but most of the PDFs have a watermark saying "Confidential", so their legality seems unclear and, while useful, they are incomplete. The main source of driver code for this SoC appears to be a software development kit (SDK) from Mediatek. The OpenWrt-based distribution that GnuBee provides as one source for a bootable kernel builds Linux from a GitHub repository provided by mqmaker Inc. This is not a clone of the upstream Git tree with patches added, but is rather a code-dump for Linux 3.10.14 with lots of changes. It seems a reasonable guess that this code was part of the Mediatek SDK. This code appears to be completely functional; all the MT7621 hardware works as expected when using this kernel. It is a little old though.

GnuBee also provides a 4.4.87-based kernel as part of a libreCMC distribution. This contains the MT7621 support broken out as individual patches — 83 in total, though several of those are not specific to the hardware. This is a much easier starting point when aiming to use the latest kernel and John Crispin, author of many of the patches, deserves thanks. He has not been idle and several of these patches have already landed upstream; they are not enough to boot a working kernel, but it is a useful start. One remaining weakness with this set of patches is that the driver for the MMC interface, which is used to access the microSD card, isn't reliable. It can read data from the card, but it can also fail to read. Given that the 3.10.14 code is reliable, this should be fixable given time and patience.

It should be possible to start with the latest mainline kernel, apply those patches from libreCMC that seem relevant and haven't already been applied upstream, fix merge conflicts and compiler errors, and get a working kernel. Unfortunately this didn't quite work. The resulting kernel did nothing — nothing on the console at all, so no indication of what might be wrong, just that something was wrong.

The standard approach to analyzing this sort of problem is to use git-bisect. I had a 4.4 kernel that worked and a 4.15-rc kernel that didn't. I just needed to try the kernel in the middle, then continue searching earlier or later depending on how things turned out. While git-bisect is an invaluable tool, it can be a painful process even when working with the upstream kernel. When you have a pile of patches to apply to each kernel before testing, and when that set is different for each kernel (as some have already been included upstream at different points), it requires real determination.

I was lucky and tested 4.5 early and found that it didn't work, thus narrowing my search space more quickly than I had any reason to expect. I eventually discovered commit 3af5a67c86a3 ("MIPS: Fix early CM probing"). This commit makes a tiny change that probably makes sense to people who understand what the code is supposed to do, but which breaks booting of the GnuBee board. Reverting this patch gave me a kernel that booted enough to print useful error messages about the next problem it hit, which was then easy to fix (something to do with maximum transfer sizes for the SPI bus driver).
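The search itself is just a binary search between a known-good and a known-bad kernel. The sketch below is a simplification: it walks a flat list of release tags, whereas git-bisect works on the real commit graph, and the `boots` function is a hypothetical stand-in for "apply the right patches, build, boot, and watch the console".

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], boots: Callable[[str], bool]) -> str:
    """Return the earliest failing version.

    Assumes versions[0] is known to boot and versions[-1] is known not to.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if boots(versions[mid]):
            good = mid   # the breakage is later than mid
        else:
            bad = mid    # the breakage is at mid or earlier
    return versions[bad]


def boots(version: str) -> bool:
    # Hypothetical test result: in this toy model only v4.4 boots.
    return version == "v4.4"


# v4.4, v4.5, ..., v4.15
releases = ["v4.4"] + [f"v4.{n}" for n in range(5, 16)]
print(first_bad(releases, boots))  # prints "v4.5"
```

Each call to `boots` is the expensive part, which is why a lucky early guess shortens the work so much.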

With a 4.15-rc kernel that boots and a minimal initramfs that can find the device with the root system, I only have two kernel issues to fix. One is the MMC controller that is still unreliable. The other is the Ethernet interface.

The Ethernet interface in the MT7621 comes with an integrated 6-port switch. One port connects to the processor and the other five can be wired externally (the GB-PC1 only provides connectors for two, the GB-PC2 has three). The switch can understand VLANs so different ports on the switch can be on different logical networks, and the MT7621 port can send packets to any VLAN.

When the SDK was created, Linux didn't have an API for integrated switches, so Mediatek used an out-of-tree implementation called "swconfig". Now Linux has "switchdev", which was introduced in late 2014. When Crispin posted his patches in early 2016 to support the Mediatek Ethernet controller, though with minimal support for the switch and no switchdev integration, he hit a roadblock. Dave Miller made it quite clear that they could not be accepted without proper switchdev integration, and reminded readers that help was available from various people who were quite familiar with the inner workings of switchdev. No further progress on these patches can be found.

I could simply include Crispin's latest patches and I suspect I could get the network working with minimal pain, but I don't think that is the right way forward. If I do ever dig into those patches, it will be as part of learning switchdev and creating a proper upstreamable solution. For now, I dug through my drawer and found a USB-attached Ethernet adapter that only needed a little bit of convincing to work. This sits nicely beside my USB-attached card reader that holds the micro-SD card and my root filesystem. This isn't the most elegant solution, but hopefully it will be temporary.

How about mainline U-Boot?

So I have the door open now, but it still squeaks a bit. I can run a mainline kernel (important if I want to benefit from the latest filesystem developments, or avoid the more well-known exploits) and I have code and some documentation, which should be enough to develop and test new kernels. The only remaining barrier is that testing new kernels isn't quite as easy as I would like. My bar for "easy" is rather high here. When I'm doing a git-bisect to find the cause of a regression (and experience assures me I'll need to do that again one day), I need every step to be as smooth and automatic as possible: if I have to do anything manually I will get it wrong occasionally and end up late for dinner.

The easiest method for installing a new kernel is to copy the kernel file onto a USB storage device as "gnubee.bin", plug that in and turn on the board. The U-Boot firmware will notice this, write the kernel to flash memory, then ask you to unplug and power-cycle. When you do that the new kernel will boot. This is conceptually easy enough, but I don't really want to write the kernel to flash (which takes several seconds), I just want to boot it. It is possible to do this by interrupting the boot (type "4" on the console) and issuing a couple of commands to the U-Boot CLI (which I can copy/paste with the mouse instead of typing), but this is still manual interaction which I would like to avoid. U-Boot can also load a new kernel over the network (it even supports a simple HTTP server for uploading firmware) but all this still requires manual interaction.

The obvious answer would be to replace U-Boot — it is open source after all. There is a difference with updating U-Boot though: the fear of turning my NAS into a brick.

U-Boot, for those not familiar with the term, is a suite of code designed to fill a similar role to the BIOS on a traditional PC. It is stored in flash memory that the processor can read directly (or at least can copy directly to RAM) and it performs all early configuration (such as enabling and mapping the DRAM, and setting up various clock signals etc.), and then finds a storage device to load your kernel from. Modern U-Boot can be quite sophisticated; it is able to read from USB, IDE, SATA, or the network, and can understand a variety of filesystems (FAT, ZFS, ext4, etc.) and network protocols (such as TFTP or NFS).

The U-Boot that is provided with the GnuBee looks like it was probably part of the Mediatek SDK. It is fairly old, contains a lot of hackish code, and looks like a separate development path; the HTTP server it contains is not present in the mainline code, for example. The ideal way forward would be to extract all the hardware drivers from the old U-Boot installation and add them upstream. There will undoubtedly be bugs at first, which, just as with the kernel, will be hard to analyze. A bad kernel can easily be replaced because U-Boot is still working. If you break U-Boot, instead, your hardware becomes what we like to call a "brick". There is no easy mechanism to replace that broken U-Boot code.

By "easy" here I mean easy for the home hacker. A professional with a fully kitted-out lab will have a device that can use a JTAG port to take control of the processor and load anything into memory directly. Even a fairly sophisticated home hacker might be able to attach a secondary flash ROM to the board and boot from that. But as I prefer working with software, I like a software solution. I need the main U-Boot to normally jump straight into a secondary U-Boot. If that is missing, or if the reset button is held (for example), it can continue to the default behavior, which doesn't need to be sophisticated; it just needs to work.

The U-Boot in the GnuBee does have something that looks just enough like this functionality that it might be usable. The "uImage" file (a container format used by U-Boot) that it expects to load the kernel from can, instead, hold a standalone program. This will be copied from flash into RAM and run. It even has access to some of the functionality of the original U-Boot so that I don't have to get new code working for everything at once.
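To make the "kernel or standalone program" distinction concrete, here is a minimal sketch of reading the 64-byte header of a legacy uImage. It assumes the GnuBee's U-Boot uses the same legacy header layout as mainline U-Boot (big-endian fields, image type 1 for standalone and 2 for kernel); the helper names, the `demo` image name, and the zero load and entry addresses are illustrative placeholders, not values taken from the board.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
# magic, header CRC, time, size, load, entry, data CRC, os, arch, type, comp, name
HEADER = struct.Struct(">7I4B32s")  # 64 bytes, big-endian
IMAGE_TYPES = {1: "standalone", 2: "kernel"}


def build_image(payload: bytes, image_type: int, load: int = 0, entry: int = 0,
                os_id: int = 5, arch: int = 5, name: bytes = b"demo") -> bytes:
    """Wrap payload in a legacy uImage header (os 5 = Linux, arch 5 = MIPS)."""
    raw = HEADER.pack(UIMAGE_MAGIC, 0, 0, len(payload), load, entry,
                      zlib.crc32(payload), os_id, arch, image_type, 0, name)
    # The header CRC is computed with its own field set to zero.
    hcrc = zlib.crc32(raw)
    return raw[:4] + struct.pack(">I", hcrc) + raw[8:] + payload


def image_type(image: bytes) -> str:
    """Validate the header and payload CRCs, then name the image type."""
    (magic, hcrc, _time, size, _load, _entry, dcrc,
     _os, _arch, itype, _comp, _name) = HEADER.unpack_from(image)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage")
    zeroed = image[:4] + b"\0\0\0\0" + image[8:HEADER.size]
    if zlib.crc32(zeroed) != hcrc:
        raise ValueError("bad header CRC")
    if zlib.crc32(image[HEADER.size:HEADER.size + size]) != dcrc:
        raise ValueError("bad data CRC")
    return IMAGE_TYPES.get(itype, f"type {itype}")


print(image_type(build_image(b"\x00" * 16, image_type=1)))
```

In practice the `mkimage` tool from U-Boot would build the container; the point here is only that a single byte in the header tells the loader what kind of payload it is holding.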

There are a few challenges with this approach. One of the more frustrating so far is that it doesn't work for small test programs, only larger programs. Like many processors, MIPS has separate instruction and data caches. If you write some bytes to memory, they will be stored in the d-cache. If you try to execute code, it will be read from memory through the i-cache. If you don't flush the d-cache out to main memory between writing out the code and running the code, the wrong code will be run. The old U-Boot on the GnuBee doesn't do that flush, so I spent quite a few hours wondering what could possibly be wrong. Fortunately I don't have enough hair for it to be worth pulling out. This was eventually fixed by the simple expedient of adding 32K of zeros to the end of the test program, this being the size of the d-cache.
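The padding workaround is simple enough to script. A plausible reading of why it works is that copying the trailing zeros pushes the real code out of the d-cache and into main memory, which would also explain why larger programs were never affected. The sketch below assumes the padding is applied to the raw program before it is wrapped in a uImage, so that the size and checksum in the header cover the padded data; the function name is made up for illustration.

```python
DCACHE_SIZE = 32 * 1024  # d-cache size as given in the text


def pad_for_dcache(program: bytes, dcache_size: int = DCACHE_SIZE) -> bytes:
    """Append one d-cache's worth of zero bytes to a raw program image."""
    return program + bytes(dcache_size)


padded = pad_for_dcache(b"\x01\x02\x03\x04")
print(len(padded))  # 4 + 32768 = 32772
```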

Another challenge is that the USB code in the main U-Boot is a bit unreliable and I cannot seem to get it to work from a separate standalone program, so booting straight from USB is not an option until I can include my own USB driver. Similarly, booting directly from an SD card won't work as U-Boot doesn't have an MMC driver. That leaves the network. I suspect it will be possible to write a standalone program, loaded by U-Boot, which uses the TFTP functionality from U-Boot to load a kernel and then boot it. This will allow me to build a kernel on my workstation, then boot it by simply turning the GnuBee on. I'm not yet desperate enough to have arranged for a remotely controllable power switch, but I have thought about it.
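TFTP is attractive here partly because the protocol is tiny. As a sketch of what the workstation's TFTP server would see on the wire, the following builds and parses the basic packets defined in RFC 1350. The file name is only a placeholder borrowed from the USB procedure, and none of this reflects how the GnuBee's U-Boot implements its client.

```python
import struct
from typing import Tuple

OP_RRQ, OP_DATA, OP_ACK = 1, 3, 4
BLOCK_SIZE = 512  # a DATA packet carrying less than this ends the transfer


def read_request(filename: str) -> bytes:
    """Build a read request for binary ("octet") transfer."""
    return struct.pack(">H", OP_RRQ) + filename.encode("ascii") + b"\0octet\0"


def parse_data(packet: bytes) -> Tuple[int, bytes]:
    """Split a DATA packet into its block number and payload."""
    opcode, block = struct.unpack_from(">HH", packet)
    if opcode != OP_DATA:
        raise ValueError(f"unexpected opcode {opcode}")
    return block, packet[4:]


def ack(block: int) -> bytes:
    """Build the acknowledgement for a received block."""
    return struct.pack(">HH", OP_ACK, block)


print(read_request("gnubee.bin"))
```

The client sends the read request, then acknowledges each numbered DATA packet in turn until a short one arrives.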

Is this open enough?

While I would like proper SoC documentation that didn't have "Confidential" watermarks and which was reasonably complete, code that works reliably and isn't deliberately obfuscated is at least a good start. Working code and sketchy documentation is probably enough to keep me happy. Having an easy path to experiment with U-Boot code without the risk of creating a brick would be ideal, but I suspect I can get close enough with what I have. And who knows, maybe one day I'll be brave enough to try flashing a whole new U-Boot.

Certainly I can run a current mainline kernel (with just a few patches) and I can run the Linux distribution of my choice, so that is open enough for now.