Subtle interactions in the embedded world - what bugs can teach us

Please consider subscribing to LWN Subscriptions are the lifeblood of LWN.net. If you appreciate this content and would like to see more of it, your subscription will help to ensure that LWN continues to thrive. Please visit this page to join up and keep LWN on the net.

I must admit that I love bugs. I also hate bugs which leads to a very conflicted relationship with the blighters, but for now I just want to focus on the positive - I love bugs. The bugs I am talking about are, of course, the incorrect behavior of software systems due to incorrect code and the reason that I love them is that they are so educational. There are a number of reasons for this, the significant one for the present being that they provide motivation and focus to read and study new code.

One of the freedoms that free and open source software provides is the freedom to study the source code. However having that freedom doesn't mean it's easy to use it. When faced with a large body of code such as the Linux kernel it can be hard to know where to start, or when to move on. There is no story-line, no curriculum, no plot to guide your study. This is where a bug comes in: it can provide a story line. This may be a particular sequence of events or a particular combination of features, or just a starting point to spiral out from. But by providing a clear focus and a clear finishing point, it makes study easier and more rewarding.

More than a phone

I have recently been spending some time trying to make the Linux kernel run well on the GTA04 "Phoenux" replacement motherboard for the Openmoko "Freerunner" and "Neo 1973" mobile phones. This is an attempt to revitalize the Openmoko effort to provide an open (or at least "as open as possible") platform for a mobile computing device. It fits in the same case, uses the same display and speakers, but provides more memory, faster processor, and faster networking plus a few other extras. During this (ongoing) effort I have discovered a number of bugs and missing features, most of which only came with smaller learning experiences. One, though, stands out for leading me on a somewhat longer path of discovery than most, and so I would like to share that path and those discoveries so that others might benefit. I should note first that the path is still open. I have not actually fixed the bug, so others could follow the path themselves. Anyone keen to do that should stop after the next paragraph, get a GTA04, and start hunting themselves. Others can read on.

The symptom of the bug is easy to describe. Like most computer systems the GTA04 has an RTC (real time clock) which has an alarm function. When the time in the clock matches the time in the alarm, it generates an interrupt. The problem is that, though this interrupt works perfectly while the system is awake, it does not reliably wake the device when it is asleep (i.e. in suspend-to-RAM state). Other interrupts configured much the same way (for example data received on the serial console) wake the system with 100% reliability. However the RTC alarm almost always fails. Understanding this bug leads on a path through the interrupt management code, the suspend management code, and even the USB code. But it must start with a brief description of the hardware.

In the GTA04 there are two particular integrated circuits (ICs) that are relevant. One is the OMAP3 SoC (system-on-a-chip) from TI. The other is the PMIC (power management IC) which is referred to as "twl4030" in the various drivers - various because, like the OMAP3, it is a multi-function chip combining diverse functions such as battery charging, keyboard control, and the USB electrical interface. There are a number of connections between these chips but only two are relevant for now. These are a single interrupt line and an I2C bus.

The interrupt line signals the interrupt controller (INTC) in the OMAP3 that something has happened. The I2C (inter-integrated-circuit) bus is a simple 2-wire bus that allows the CPU to read and write registers in the PMIC. In particular it can be used to find out which of the several functions generated the interrupt, and why. It can also be used to clear the source of the interrupt.

Having one interrupt line that indicates any of a number of possible events like this is not uncommon. For example the UARTs (Universal Asynchronous Receiver Transmitter - a serial port) in the OMAP3 each have a single interrupt, but it can be generated by a character arriving, a character transmit having completed, or an error condition. As with most such devices there is a register that can be inspected which has one bit for each possible interrupt source.

When all the interrupt sources are related to a single device and handled by a single driver this is all quite easy to manage. However when the different interrupt sources are related to logically separate devices, having to manage all these behind a single interrupt could get clumsy.

So many interruptions

To address this situation, Linux has an abstraction called an " irq_chip ". An irq_chip represents a set of interrupt sources each of which is assigned an interrupt number (as listed in /proc/interrupts ). It provides functions to enable or disable each interrupt, to allow the interrupt to wake the device from suspend, to set the trigger type (edge or level), and various other functions. It also arranges that the interrupt handler for each interrupt will be called when appropriate.

irq_chip s with sources

Using this, the driver for the INTC in the OMAP3 defines an irq_chip for all the interrupts that it knows about directly, and the driver for the PMIC can define a separate irq_chip which describes the various interrupt sources in the PMIC. Each irq_chip will provide very different functions to configure and control the interrupt sources, but will provide a uniform interface to all device drivers. When the single PMIC interrupt arrives at the INTC, it will call an interrupt handler which will examine the PMIC registers and then call the appropriate handler that was registered with the second irq_chip . This way the drivers for the various parts of the PMIC don't need to know too much about each other - the irq_chip mediates between them.

In our PMIC there is a "Primary Interrupt Status Register" which has one bit for each module that can generate interrupts. Each of these modules has their own "Secondary Interrupt Status Register" to report what actually caused the interrupt, much like the UART described earlier. There is normally no need for a tertiary irq_chip to represent these registers as the one driver manages the whole module and all of its interrupts.

There are two exceptions, only one of which interests us. There is a "power interrupts" module in the PMIC which combines interrupts from a diverse range of sources: press of the power button, insertion of USB cable, over-temperature warning, and the RTC alarm. These are sufficiently varied that a separate irq_chip makes sense here. The twl4030-irq driver manages all this quite elegantly, providing a secondary irq_chip for the device as a whole, and supporting tertiary irq_chip s, including for the Power Interrupts module.

So when the RTC alarm triggers, the primary interrupt is handled by the INTC. This calls an interrupt handler which inspects the PMIC, determines that a power interrupt is active and calls the relevant handler. It in turn examines the PMIC again and finds that the RTC alarm has fired, and so calls the handler that was registered for that particular interrupt. This all works perfectly. But it doesn't seem to wake the device from suspend. At least, not always. It seems that something happens on the way to suspend that interferes with this.

A funny thing happened on the way to suspend

One of the many things that happens on the way to suspend is that each individual interrupt gets disabled - unless it is flagged as IRQF_NO_SUSPEND in which case it is left alone. However this doesn't mean exactly what it sounds like it means. Being "disabled" just means that the handler routine will not be run, it doesn't mean that the interrupt will not be generated. We have a different word for that, which is "masking". When an interrupt is masked the originating source of the interrupt is told to never post that interrupt.

Linux uses a lazy scheme for disabling interrupts. When the disable request is made, the fact is recorded in internal data structures, but that is all. If the interrupt is subsequently delivered, only then might the interrupt be masked. This can be a useful optimization as masking an interrupt can take a lot longer than just setting a flag in memory.

So, on the way to suspend, interrupts are disabled but not masked. If the interrupt does actually arrive before we reach full suspend, the fact is recorded. If it was an interrupt that should wake from suspend, this is detected in check_wakeup_irqs() and suspend aborts. If the interrupt doesn't arrive before full suspend, then it is still unmasked and will successfully wake up the device, which will resume and then handle the interrupt. This might all seem a bit complex, but once it is fully understood it is actually quite neat and it works well ... except for my RTC alarm.

Who turned out the lights?

As promised we will need to glimpse at the USB code as well and now is the time to bring that in. Like most modern phones, the GTA04 can be attached to a computer or a charger via a USB port. When the USB interface detects 5 volts on the power line, it will route this to the battery charger (BCI) which will use it to charge the battery and power the device.

This is relevant because - when tracing is added to the twl4030-irq driver - we see that during suspend we get an interrupt from the PMIC which turns out to be due to the battery charger losing power and saying "Hey, just thought you should know that I'm running on battery again now". What is happening is that when the USB port is told to go to sleep it turns off its various power supply regulators. One of these (USB3V1) also powers the voltage comparator which determines if 5V is present. As soon as that presence is no longer signaled, the current stops being routed to the BCI, and the BCI justifiably complains.

This results in the interrupt line from the PMIC being asserted which causes the interrupt management code to run. This notices that the interrupt was disabled, and so masks it. When we get to check_wakeup_irqs() , the PMIC IRQ is pending, but as it has not been set for wakeup it is just ignored. The net result here is that all interrupts from the PMIC are ignored during suspend, so the RTC alarm doesn't work, the power button doesn't wake the device up, and plugging or unplugging the USB cable has no effect either.

So now that we understand the problem ...

So, what to do? We could just unplug the USB cable when testing the RTC alarm. Then the BCI would not notice the current disappearing and so wouldn't generate an interrupt. This certainly works - once. It doesn't seem to work a second time, but we'll leave that issue for the moment.

We could tell the USB controller not to turn off USB3V1 when the BCI is active and power is available on the USB bus. This is certainly a good idea and would mean that the battery can charge during suspend. It also brings up the interesting issue of how these separate drivers should communicate, but it doesn't really solve the general problem, just this specific one.

It could be that an interrupt occurs that we really do want to ignore during suspend, that we have even explicitly disabled (but due to lazy-masking has not been masked). So we want to be able to mask those interrupts without masking the main interrupt. A few moments reflection on what we have learned so far suggests that setting IRQF_NO_SUSPEND on the intermediate interrupts should help. That way they will not be disabled during suspend so the interrupt management will trace through the chain of irq_chip s far enough to find the originating interrupt source and will mask just that. The other interrupt source will still be available.

This may sound convincing, but reality has a way of getting in the way: testing shows that just setting IRQF_NO_SUSPEND is not enough. Diving back into the code and the data sheets reveals an interesting and important fact. The PMIC allows each individual interrupt source to be masked, but it does not allow a whole module as a set of interrupt sources to be masked. So when an interrupt arrives from the BCI (one of the modules) the IRQ manager notices that interrupt is disabled and so the secondary irq_chip is asked to mask it. But as the irq_chip cannot mask it (the hardware doesn't support that), it simply ignores the request. You can imagine what happens next - an unending stream of attempts to mask the BCI interrupt, none of which are effective and the interrupt keeps re-firing.

The root problem seems to be that while the TWL4030 PMIC appears to have an interrupt structure that lends itself to being represented as a small tree of irq_chip s, this isn't actually the case. The details of the behavior of suspend makes it essential that a working "mask" function be provided, and that cannot be provided for modules in the twl4030, only for individual interrupts. So it seems that a complete fix requires some change to the structure of the irq_chip s in the twl4030 driver. Possibly we could make the structure simpler (no tree), or possibly we could make the implementation more clever in some way. The important point is that each interrupt that is visible to the Linux IRQ management code must support being masked in the hardware in some way.

Another funny thing on the way to resume

As mentioned, after one successful wake-from-alarm, the RTC alarm in the GTA04 doesn't work a second time. In fact it stops working completely, even when not suspended. It seems that something has gone wrong on the way to resume.

During resume all the interrupts that were disabled by suspend are re-enabled. The three interrupts which relate to the RTC - the base PMIC interrupt, the interrupt for the "power interrupts" module, and the actual RTC interrupt - are enabled in that order with PMIC first. As this interrupt is still asserted (as it is what triggered resume) the PMIC interrupt handler runs as soon as it is enabled. It reads the interrupt status register to see which module was responsible, and triggers the relevant interrupt for that module.

This secondary interrupt is still disabled so nothing happens. In particular the interrupt source is not cleared, nor is it masked, so we again get a repeated sequence of attempts to handle an interrupt which cannot be handled. This one isn't unending as the loop that enables all the interrupts is still running and it eventually gets to the one for the target module (the power interrupts module). This is triggered again and this time it actually does something. It reads the final status register (thus clearing the interrupt) and calls the interrupt handler for the RTC alarm. Unfortunately that is still disabled so nothing else happens.

Eventually the RTC interrupt is enabled but by now it is too late. In some cases an interrupt that fires while disabled gets the IRQS_PENDING flag set so that when it is re-enabled, the interrupt is resent. But that is not the case when the twl4030-irq interrupt controller tries to handle a tertiary IRQ. Whether this is a bug or a misunderstood feature is as yet unclear. Another journey of discovery would be needed to fully understand that issue. Meanwhile, because the interrupt is delivered but the interrupt service routine isn't run, the handshaking between the driver and the device gets confused and no other interrupt is ever seen from the RTC.

We can work around this problem by a simple expedient. When re-enabling the interrupts during resume, do it in the reverse order. This ensures that the RTC interrupt is enabled before it is called, and everything works. This even feels like the "right" thing to do - it is balanced for "enable" to happen in the reverse order of "disable". But even if it is "right", it is not really a fix. There would still be bugs lurking and ready to spring out again.

Journey's end

The point of this little tour is not to point out bugs in the code or to shame the relevant developers. In fact the code is in general quite elegant, well structured, largely very effective, and the developers are only to be congratulated. Rather, the point is to highlight the complexity of the task being tackled by developers working the embedded space:

There are complex interactions between distinct hardware. Here the USB interface and the battery charger have important interdependencies that need to be reflected in the drivers. Some interdependencies already are but there are subtleties that are easy to miss.

The requirements of an " irq_chip " are not really documented anywhere, and given the current rate of development there is a good chance that such documentation would be out of date. Extracting the requirements from the code is far from trivial. In this case a seemingly correct and obvious implementation was only found to be deeply wrong by a fairly obscure test case.

" are not really documented anywhere, and given the current rate of development there is a good chance that such documentation would be out of date. Extracting the requirements from the code is far from trivial. In this case a seemingly correct and obvious implementation was only found to be deeply wrong by a fairly obscure test case. The IRQF_NO_SUSPEND flag is clearly important, but not easy to understand. It is fairly obvious what it does, but much less obvious why you would want to do that. I imagine that as a developer my approach would be to not set it until I found a bug which could be fixed by setting it. Then I would set it and hope it didn't break anything. This is not really a robust way to do development.

With complex hardware interactions and complex software dependencies it is not surprising that we seem to see a new class of problems in the embedded space. As we have observed, there are often ways to work around the bug rather than fix it, and that is usually much simpler and quicker. They may be tempting, but for this developer at least they are nowhere near as satisfying. Because if you follow the bug all the way to the root cause, and if you understand all that is happening and find the right fix, then you have not only helped other future developers, but have learned a whole lot in the process. And that is why I love bugs.