There are literally hundreds, if not thousands, of subtle concepts that contribute to high-quality software design. Many of them are well known, and can be found in books or on the Internet. I’m going to highlight a few of the ones I think are important and often overlooked.

But first let’s start with a short diversion. I’m going to make a bold statement: unless you’re a novice, there’s at least one thing in computer programming about which you’ve picked up a bad habit.

Whether you’ve been programming for 4 months or 40 years, once you’ve gotten past the initial stages of learning how to program, you feel confident in your abilities, and that is a double-edged sword. On the one hand, you can stop focusing on every little thing and concentrate on the task at hand. On the other hand, that means some of your work is your mind running on autopilot. The human mind is lousy at being disciplined. It likes shortcuts. It’s lazy. Why should it bother trying to be perfect, when a computer will squawk when it sees something wrong? Let the computer do the dirty work. And while a computer is merciless at rejecting outright the smallest of syntax errors, it accepts anything that gets past its basic standards. At a low level, static analysis tools like lint and FindBugs will help catch more nuanced mistakes that get programmers into trouble. (Classic example in C: if (age = 13) { printf("Congrats! You're a teenager!\n"); } Can you spot the error?) These tools won’t help you deal with high-level software design errors, however.

So there’s really no substitute for learning — and re-learning — good programming practices to drive out the bad ones and help you stay on top of your game. Books like Code Complete, The Pragmatic Programmer, Effective C++, and Effective Java are some of the best at pointing out good programming techniques. And find coworkers to learn from! A book is a great way to expose you to good practices, but there’s no substitute for having a person who can look at your code and give you constructive feedback.

Now onto our featured presentation.

Idempotence

Okay, we need to sit down and have a serious talk about idempotence.

No, I said idempotence! It’s a fancy word that just means that there’s no difference between performing an operation once or twice or five times in a row.

Imagine you are an absent-minded professor. You have a horrible memory. In fact, if you want to make sure you get things done, you have to write them down. And even then, sometimes you forget. Here is your list of things for today:

Group A

Sell 100 shares of Glyptocratic Incorporated (GLYP)

Wait for one of your students to call and spend 20 minutes helping them with this week’s homework; let the rest of them suffer

Group B

There should be \$1180 left in bank account; if there is, withdraw \$60

Have the car painted red

Remove stickers from car windshield



You’ve got to get these things done today. It’s important. And whenever you look at that list, you forget whether you did some of them. So if you’re not sure, just assume they need to be done.

At the end of the day, here’s what happened:

You sold 300 shares of GLYP

You spent an hour helping 3 students with their homework

There’s \$1120 in your bank account — the first time you went to the bank, you took out \$60, and the second time you went to the bank, you left feeling somewhat unfulfilled

Your car is red, and you have two bills from Ed’s Body Shop (and you’re not sure if your car was red yesterday, too)

Your car windshield is free of stickers — the ones that were there are now in the trash after an hour of grueling work in the garage, and the three additional times you went out to your car, you were so glad to see that you didn’t have to do anything after all.

Okay, that’s kind of a contrived example. But I hope you get the point. The tasks in Group B are idempotent: it doesn’t matter whether you do them once or twice or several times, in the end the result is the same. (Well, basically the same. Too bad you had to pay twice to have your newly-painted red car painted red again.) The tasks in Group A are not.

In programming, this sort of thing happens with distributed systems. And before you say that distributed systems programming is one of those fancy things you don’t have to deal with, anytime you work with a network socket, or a serial port, or a SPI / I2C chip — anywhere it’s not 100% guaranteed that your data arrived properly, that’s an unreliable communications channel, and you have to be prepared to retry any communications transaction.

The classic bad example of how NOT to handle unreliable communications: that Internet commerce web page you’ve all seen, the one that says “Do not click SUBMIT more than once” and promises dire consequences if you do, like mistakenly ordering two Ronco Salad Shooters because your finger slipped on the mouse button. It’s a bad example because it’s so easy to fix. The non-idempotent approach is to trigger a new purchase order every time your customer clicks a button. The web browser sends a request to the server (“ORDER 1 RONCO SALAD SHOOTER FOR JAMES Q MCGILLICUDDY”), and each time it does, the server honors it and takes action by initiating an order for another Ronco Salad Shooter. Click twice, order twice.

There are good and bad solutions to this problem.

The bad solutions involve trying to design webpages in a way that treats this as a stupid customer problem, where the customer should be given visual cues to make it less likely for them to double-click on a SUBMIT button, or where the merchant must proactively look out for duplicate transactions on the customer’s behalf. Another one that’s less problematic, but still not great, is to disable the SUBMIT button after the first click so the customer can’t click SUBMIT twice, which is just a simple solution… except if the customer has momentary Internet connection problems, has to reload the webpage, and doesn’t know whether the order has been placed, so he clicks SUBMIT, and as a result gets two Ronco Salad Shooters.

The correct solution is to design a communications protocol which is idempotent, and avoids depending on the customer or the web browser to be error-free. We’ll assign a unique ID for the transaction early on in the ordering process. When the customer wishes to finalize the order, we use a request with a transaction ID, like “TRANSACTION 123456 ORDER 1 RONCO SALAD SHOOTER FOR JAMES Q MCGILLICUDDY”, and design the server so that it will never execute a given transaction more than once. It doesn’t matter if the customer submits the order once or 17 times. The first one that is successfully received will be executed, and subsequent orders will be ignored, or perhaps returned to the customer with a message saying “We’ve already received your order.”
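Here is a rough sketch of what the server side of that might look like, assuming an in-memory table of transaction IDs that have already been executed. All of the names here (order_server_t, submit_order, MAX_TRANSACTIONS) are made up for illustration, and a real server would need persistent storage and some care around concurrent requests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRANSACTIONS 64   /* arbitrary capacity for this sketch */

typedef enum {
    ORDER_EXECUTED,    /* first time this transaction ID was seen */
    ORDER_DUPLICATE,   /* already executed earlier; nothing was done */
    ORDER_REJECTED     /* no room left to remember another ID */
} order_result_t;

typedef struct {
    uint32_t seen_ids[MAX_TRANSACTIONS];  /* IDs already executed */
    size_t   seen_count;
    unsigned shooters_ordered;            /* stand-in for real fulfilment */
} order_server_t;

/* Linear search is fine for a sketch; a real system would index this. */
static bool already_executed(const order_server_t *srv, uint32_t txn_id)
{
    for (size_t i = 0; i < srv->seen_count; i++) {
        if (srv->seen_ids[i] == txn_id) {
            return true;
        }
    }
    return false;
}

/* Executes the order at most once per transaction ID. */
order_result_t submit_order(order_server_t *srv, uint32_t txn_id,
                            unsigned quantity)
{
    if (already_executed(srv, txn_id)) {
        return ORDER_DUPLICATE;
    }
    if (srv->seen_count >= MAX_TRANSACTIONS) {
        return ORDER_REJECTED;
    }
    srv->seen_ids[srv->seen_count++] = txn_id;
    srv->shooters_ordered += quantity;
    return ORDER_EXECUTED;
}
```

Submitting transaction 123456 once or 17 times leaves shooters_ordered at the same value; every submission after the first comes back as ORDER_DUPLICATE, which is the natural place to show the “We’ve already received your order” message.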

And if you think this is just an issue with HTTP client-server applications, think again. Idempotence shows up even in really simple embedded systems, like microwave ovens or garage door openers. Keys and push buttons have evolved so they’re now using capacitive touch sensors — on the plus side, no mechanical parts to wear out, but on the minus side, no mechanical feedback to let you know that you’ve successfully engaged the push button. And this makes it hard to tell whether you’ve pressed a button once or twice or not at all. While garage door openers still seem to use the old-fashioned snap-action pushbuttons with tactile feedback, at the heart of it they are still communication devices. Sometimes I have to press mine two or three times before the sensor in the garage picks it up and opens the door. And sometimes if I press it more than once, I miss seeing the door open and my next press causes the garage door to reverse directions and close. ARGH!

Here again, there’s a simple solution, at least in theory. Replace the pushbutton with a toggle switch.

When you flip the toggle switch, the remote no longer sends a toggle message to the garage door mechanism (meaning “OPEN DOOR IF IT IS CLOSED, OR CLOSE DOOR IF IT IS OPEN”). Instead, if the toggle switch is up, the message sent to the garage door mechanism is “OPEN THE GARAGE DOOR”, and if the toggle switch is down, the message is “CLOSE THE GARAGE DOOR”. Of course, this would require the garage door mechanism to know whether the door is up or down (I’m not sure there’s a sensor in there), and would require a change both in the transmitter hardware (toggle switches would seem kind of weird to use in a garage door opener) and software (the remote would have to do something like send repeated messages for 5 seconds after the switch has flipped states). And if you had more than one transmitter, you’d have to figure out how to treat conflicting inputs… so it’s probably not very practical. But the system implications of a toggle switch are at least worth considering.
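To make the difference concrete, here is a small sketch of the two message handlers on the receiving side. The types and function names are hypothetical, and the door is modeled as a simple two-state value, which assumes the mechanism really can sense whether the door is open or closed.

```c
typedef enum { DOOR_CLOSED, DOOR_OPEN } door_state_t;

/* Non-idempotent: every received toggle message flips the door, so a
   repeated or retried message undoes the previous one. */
door_state_t handle_toggle_message(door_state_t door)
{
    return (door == DOOR_OPEN) ? DOOR_CLOSED : DOOR_OPEN;
}

/* Idempotent: the message carries the desired end state. If the door
   is already there, nothing happens; otherwise it moves to the goal. */
door_state_t handle_goal_message(door_state_t door, door_state_t goal)
{
    if (door == goal) {
        return door;   /* already in the requested state */
    }
    return goal;       /* in a real opener: run the motor until we get there */
}
```

Receiving the goal message three times in a row leaves the door in the same state as receiving it once, whereas three toggle messages leave it in the opposite state from two.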

The next time you have to design a device with inputs, think carefully before you add a push button. Knobs and toggle switches are very effective devices that don’t have a problem with idempotence. You can’t mistakenly turn a light switch on twice, or mistakenly turn a thermostat dial twice to the same position.

I’ve just mentioned two solutions that are idempotent: one is to send messages with a transaction ID, and the other is to send a message with a goal state that describes the desired end result. A third idempotent technique, which lends itself to concurrent programming, is to send a message with a “before” and an “after” state: include the assumed state before the operation in question, in addition to the desired end result. This approach is used in the compare-and-swap operation used in implementing semaphores.
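As an illustration of the “before and after” technique, here is a minimal sketch using the C11 atomic compare-and-exchange call. The wrapper name change_state is made up, and the state is just an integer for simplicity.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Change *state from 'before' to 'after', but only if it currently
   equals 'before'. Returns true if the change was made. */
bool change_state(atomic_int *state, int before, int after)
{
    int expected = before;   /* overwritten with the actual value on failure */
    return atomic_compare_exchange_strong(state, &expected, after);
}
```

If the same request is delivered twice, the first call makes the change and the second finds that the state no longer matches the “before” value, so it does nothing. Either way the state ends up at the “after” value exactly once.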

These three techniques, and minor variants, show up in good programming practices all the time. As a recap:

Non-idempotent operations:

increment a variable

toggle a state

advance to the next state

Idempotent operations:

If the current state matches X, change it to Y

Set a variable to Y

Execute the transaction with unique ID X if it has not already been executed



Finally, don’t forget to take advantage of idempotence when you can. The most common use is just to perform redundant writes to increase system reliability. Here’s an example I ran into a few years ago:

We had a circuit board with a microcontroller and a SPI DAC. Pretty easy, just clock out some bits and the DAC updates its analog outputs. The problem was that the DAC did not provide a serial out pin (MISO), so there was no way to read back any data, aside from feeding the DAC signals into a spare ADC channel, which we did not have the option of doing.

Why is this a problem? Because communications over SPI, even if the DAC and the microcontroller are right next to each other, are not guaranteed to be error-free, and the SPI protocol for this DAC didn’t have any checksum. Checksums and the ability to read back the contents of a transaction are ways of enhancing overall system reliability. In our case there was maybe a 4-inch distance between the chips, and there were switching power electronics nearby. So let’s say there’s a 10⁻⁹ probability on each DAC write that the upper 3 bits of the value received by the DAC were not the same as in the value sent by the microcontroller. What this means is that about once every billion DAC writes, you will get a bad DAC value that does something unintended. The duration and effect of that bad DAC value will depend on the frequency of writing.

If you are writing the DAC once an hour, that’s a really small failure rate, a one in 114,080 chance during any given year that you’ll see a bad DAC value. But it will be present for an hour, which is a pretty long time in an embedded system.

If you’re writing the DAC once a second, the numbers turn into a one in 31.7 chance during a given year you’ll see an error, but it will last a second.

If you’re writing the DAC once a millisecond, it’s now pretty likely over the course of a year that you’ll see an error (mean time between errors is 11.57 days), but it will only last a millisecond.
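The arithmetic behind those three figures can be sketched as follows. This assumes a one-in-a-billion error probability per write and a mean year of 365.2425 days (my assumption; a 365-day year gives slightly different numbers), and it treats the expected number of errors per year as the “one in N” chance, which is a reasonable approximation when that number is small. The function names are made up.

```c
#define SECONDS_PER_DAY 86400.0
#define DAYS_PER_YEAR   365.2425   /* mean Gregorian year; an assumption */

/* Expected number of bad writes in a year, given the per-write error
   probability and the time between writes in seconds. */
double expected_errors_per_year(double error_probability,
                                double write_interval_s)
{
    double writes_per_year =
        DAYS_PER_YEAR * SECONDS_PER_DAY / write_interval_s;
    return writes_per_year * error_probability;
}

/* Mean time between bad writes, in days. */
double mean_days_between_errors(double error_probability,
                                double write_interval_s)
{
    return write_interval_s / error_probability / SECONDS_PER_DAY;
}
```

Taking the reciprocal of expected_errors_per_year(1e-9, 3600.0) gives roughly 114,080, and for a one-second interval roughly 31.7. mean_days_between_errors(1e-9, 1e-3) comes out to about 11.57 days.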

In our case, the system design only required us to set the DAC at powerup. But that small risk of failure would have had an effect that persisted for days, so we decided to write the DAC periodically every few milliseconds. I would rather have used a DAC that had a readback feature, but we overlooked that when we designed the circuit board, so instead we used idempotence to reduce the overall system risk from errors.
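A periodic refresh like that might look something like the sketch below. This is not our actual code: spi_write_dac is a hypothetical stand-in for the real SPI driver call, and here it just writes to a simulated register so the example can run without hardware.

```c
#include <stdint.h>

/* Simulated DAC input register, standing in for the real chip. */
static uint16_t simulated_dac_register;

/* Hypothetical SPI write; on real hardware this would clock the word
   out to the DAC, with no way of confirming that it arrived intact. */
static void spi_write_dac(uint16_t word)
{
    simulated_dac_register = word;
}

/* Last value the application asked for. */
static uint16_t dac_target;

/* Called when the application wants a new output value. */
void dac_set(uint16_t value)
{
    dac_target = value;
    spi_write_dac(value);
}

/* Called from a periodic tick, every few milliseconds. Rewriting the
   same value is harmless because "set the output to Y" is idempotent,
   and it overwrites any value that was corrupted on an earlier write. */
void dac_refresh(void)
{
    spi_write_dac(dac_target);
}
```

If one write is corrupted in transit, the bad value only lasts until the next call to dac_refresh, rather than until the next powerup.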

And that wraps up today’s topic.

Next concept: immutability…

© 2014 Jason M. Sachs, all rights reserved.