Source Citing: Making Examples Work

“Example isn’t another way to teach, it is the only way to teach.” (Albert Einstein)

If the example is correct, that is. In every release. So don’t take chances. Cite your examples from automated tests.

We’ve all used products where the examples in the documentation simply didn’t work. On quickly evolving products, this is almost a given. At best, we’re annoyed, but continue to use the product; at worst, we stop using it.

People learn best from examples. They’ll scan documentation for example code, read it, and only resort to reading the surrounding text when the code isn’t clear to them by itself. They’ll often copy/paste this example code directly into their own projects and expect it to compile. Examples are a crucial part of a user’s early experience with your product.

When working on a hot, fast-moving project, how can we developers ensure our documentation will help build trust and acceptance instead of putting people off?

Why Examples Go Bad

Let’s take a look at why examples in documentation go bad:

Some examples never were good; this is the example code we type directly into the document – right out of our heads. Such code – no matter how short – is rarely correct because we have come to rely on sophisticated IDEs to catch typos and other minor errors immediately. While these IDEs free us to look at the bigger picture, we are no longer trained to automatically spot minor errors ourselves. We have to actively look for them. That is not what we do when we are – rightly – focused on writing good documentation.

Assuming the example code was both syntactically and semantically correct at some point, it can go out of date for a number of reasons:

The underlying API has changed so that the example code won’t even compile anymore. Given the ease with which modern IDEs let us refactor, this happens much more quickly than you’d think – especially on fast-moving projects. The “release early, release often” mantra implies that we must also document early, which means documenting code that is still in the early stages of development. This kind of code goes through API changes often (“refactor constantly”). Even in later stages APIs can still change, though less drastically. For example, if you deprecate elements within the API, you’d like your examples to use the non-deprecated API, wouldn’t you?

The semantics have changed so that the example code still compiles, but produces incorrect results at runtime. These changes in semantics were intentional. All the tests have been updated to reflect this. It is just the example code that got out of date.

In both of the cases above it may even be the API or semantics of a third party library, which changed when you updated to a newer version.

Quality Assurance

Of course, in a project with ideal quality assurance, these problems are detected and corrected. Right. Maybe in some well-funded and well-managed projects. In fact, the ideal API needs no documentation at all. You should continue reading only if your project is not perfect.

In smaller, developer-driven projects, documentation often flounders. There may have been a laudable initial attempt at good documentation, but keeping it current is just not a top priority – despite its critical effect on our potential users’ first impression. This is particularly true for many open source projects.

Why do we let errors in our documentation slip through? Because checking it quickly becomes time-consuming and tedious, and this is not the kind of task programmers typically excel at. Often it is downright impractical to maintain by hand. Why is that?

Snippets of example code from the documentation must be put into a larger, compilable and testable context for verification.

To test semantic correctness, the surrounding text must be read to see precisely what the snippet under test was supposed to do.

The documentation may have grown to the point where even just proofreading everything becomes prohibitive.

Our ever shorter development iterations, while otherwise beneficial, exacerbate the above problems even further.

A Note On Sample Applications

An approach often taken with example code is to provide complete sample applications for users to study. These, however, suffer from a number of problems:

They often bury the salient points in lots of surrounding code.

They are not well suited to the inclusion of accompanying explanatory text.

Depending on the programming language, it is difficult to structure the example code for didactic effect.

A bunch of sample apps looks much less finished than well structured, nicely formatted documentation with embedded sample code.

Though complete sample apps are a worthy addition, they are only an addition. And though they ostensibly compile on their own and can be included in release builds, they are still prone to failure if semantics change without accompanying syntactical changes.

It should be obvious by now. What automated testing (and unit testing in particular) did for code quality, it will also have to do for documentation quality. Only in this way can we keep the blessings of short iterations, constant refactoring and frequent releases without sacrificing the quality of our documentation.

So, how do we automatically test documentation? For a start, we can test the examples. And that is likely the most important test we should have in our entire product anyway. It doesn’t matter if a test suite of thousands of tests runs “all green” if our users can’t use the product because the examples don’t work.

For the moment, let’s confine this discussion to API documentation and code examples – we’ll expand it later. Consider the following piece of documentation:

The separation of the Engine and Computation interfaces allows the concurrent use of an engine across multiple threads. Each thread simply creates its own computations on the central engine. Engines are thus fully thread-safe and non-blocking.

```java
Engine engine = getEngine();
Inputs inputs = new Inputs( 4, 40 );
Outputs outputs = (Outputs) engine.newComputation( inputs );
double result = outputs.getResult();
```

How do we write an automated test for this fragment of code? By not writing it as a fragment at all. Instead, we write a full test, like any other test we write, using our testing framework of choice. Then we cite the relevant lines from this test in the documentation. Since we write the test in our IDE, we enjoy all of its benefits when writing the code. Here’s the full test method:

```java
public void testSample1() throws EngineError
{
    // ---- sample1
    Engine engine = getEngine();
    Inputs inputs = new Inputs( 4, 40 );
    Outputs outputs = (Outputs) engine.newComputation( inputs );
    double result = outputs.getResult();
    // ---- sample1
    assertEquals( 160, result );
}
```

JCite

In this example, I use JCite, an open source Java source code citation tool. It relies on markers, `// ---- sample1` in the example above, to extract code fragments. The corresponding documentation source is:

The separation of the `Engine` and `Computation` interfaces allows the concurrent use of an engine across multiple threads. Each thread simply creates its own computations on the central engine. Engines are thus fully thread-safe and non-blocking. [jc:ch.arrenbrecht.myproject.tests.SampleTests:---- sample1]

During release builds, JCite automatically replaces the last line with the source snippet cited from the actual test source code, properly formatted using Java2Html.
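To make the idea concrete, here is a minimal sketch of how marker-based extraction might work. This is a hypothetical illustration, not JCite’s actual implementation; the class and method names are assumptions for this example only.

```java
import java.util.ArrayList;
import java.util.List;

public class SnippetExtractor {

    // Returns the lines strictly between the first pair of lines whose
    // trimmed content ends with the given marker (e.g. "---- sample1").
    static List<String> extract( List<String> sourceLines, String marker ) {
        List<String> snippet = new ArrayList<>();
        boolean inside = false;
        for (String line : sourceLines) {
            if (line.trim().endsWith( marker )) {
                if (inside) return snippet; // closing marker: snippet complete
                inside = true;              // opening marker: start collecting
                continue;
            }
            if (inside) snippet.add( line );
        }
        return snippet; // no closing marker found; return what we collected
    }

    public static void main( String[] args ) {
        List<String> source = List.of(
            "public void testSample1() throws EngineError {",
            "    // ---- sample1",
            "    Engine engine = getEngine();",
            "    double result = outputs.getResult();",
            "    // ---- sample1",
            "    assertEquals( 160, result );",
            "}" );
        List<String> snippet = extract( source, "---- sample1" );
        System.out.println( snippet.size() );          // prints 2
        System.out.println( snippet.get( 0 ).trim() ); // prints "Engine engine = getEngine();"
    }
}
```

A real tool adds more on top of this, of course: qualified class-name lookup, indentation stripping, and HTML formatting. But the core contract – the cited text is always read from the live, tested source – is this simple.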

Let’s look at what we have achieved so far:

No more silly errors in example code which was written right in the documentation.

Instead, much better comfort when writing the code inside the IDE.

No more non-compiling code after refactorings or other API changes.

No more non-working code after semantic changes.

Elevation of usage examples to the same status as actual product code, meaning they get tested automatically before every release.

Examples As Tripwire

What remains is that the text of the documentation may still get out of sync. Imagine, in the code example shown above, that we renamed `Engine` to `Computer`. In modern IDEs, this refactoring would automatically change our sample test method as well. But the references to the word Engine in the documentation text would not be affected.

It’s unlikely that we will ever be able to reliably automate such changes in text. But the example code can act as a tripwire, alerting us when changes might affect surrounding text. If we use examples liberally, sprinkling our documentation with fragments from actual code, chances are high that we will catch all outdated text without having to reread the whole documentation before every release. And since people learn well from examples, this is a good thing anyway.

For this scheme to work, the citation tool must archive a copy of the cited source after every run. Then, when it is run again, it compares the new source to the archived version. If the source has changed, it alerts us. Only when we have confirmed that all changes are OK, meaning that we have checked the text surrounding every change, do we tell the tool to go ahead and overwrite the archived versions with the new ones. (As of release 1.9, JCite supports this.)
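The archive-and-compare cycle can be sketched in a few lines. Again, this is a hypothetical illustration of the mechanism, not JCite’s actual code; the class and method names are my own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CitationTripwire {

    // Compares a freshly cited snippet against its archived copy.
    // Returns true if a human must re-check the surrounding documentation text.
    static boolean needsReview( Path archive, String currentSnippet ) throws IOException {
        if (!Files.exists( archive )) return true; // never archived: review once
        String archived = Files.readString( archive );
        return !archived.equals( currentSnippet );
    }

    // Called only after a human has confirmed the surrounding text is still
    // accurate; overwrites the archived copy with the current snippet.
    static void acceptChanges( Path archive, String currentSnippet ) throws IOException {
        Files.createDirectories( archive.getParent() );
        Files.writeString( archive, currentSnippet );
    }
}
```

The important design point is that `acceptChanges` is a deliberate, manual step: the build can fail loudly on a changed citation, but only a person can decide that the prose around it is still correct.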

Testing Non-Code Examples

Not all products are driven by source code accessing APIs. How do we ensure the correctness of examples given in the documentation for

a command-line tool,

a web application, or

a GUI application?

Simply put, in order for automated testing, and especially unit testing, to work, we had to learn to make our products easily testable. This led to less monolithic design and better internal APIs.

Citability

Now we shall have to learn to make our products citable. While not identical, citability is closely related to testability because both rely on the system under inspection being able to supply snapshots of defined states, and to be brought from state to state using code or scripts, that is, automation.

For a command-line tool, this will be fairly easy. A web application, with its clear demarcation of requests and responses, should also be relatively straightforward to handle.
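For the command-line case, a test might run the tool, assert on its output, and format the exchange as a shell-style transcript that a citation tool could then embed in the documentation. The sketch below is an assumption about how such a helper could look, not an existing tool; it assumes a Unix-like environment where `echo` is available.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class TranscriptCitation {

    // Runs a command, captures its combined stdout/stderr, and formats a
    // shell-style transcript suitable for citing in documentation.
    static String citeTranscript( String... command ) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder( command ).redirectErrorStream( true );
        Process p = pb.start();
        String output;
        try (InputStream in = p.getInputStream()) {
            output = new String( in.readAllBytes(), StandardCharsets.UTF_8 );
        }
        p.waitFor();
        return "$ " + String.join( " ", command ) + "\n" + output;
    }

    public static void main( String[] args ) throws Exception {
        // A trivial stand-in for a real command-line tool under test.
        System.out.print( citeTranscript( "echo", "hello" ) );
    }
}
```

A test would first assert that the captured output is correct, then hand the transcript to the citation markers, so the documentation shows exactly what a verified run of the tool prints.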

GUI applications will likely offer more resistance. GUI test tools will have to support automated generation of partial screenshots during test runs. Screenshot annotations (like lines, arrows, labels, and flyouts) will likely have to be maintained as a separate layer on top of the generated screenshot.

Nevertheless, we should strive to achieve this goal. While a programmer may often be able to infer a correction for broken examples of API usage, users of GUI products may be less forgiving.

JDemo is one project I know that is working towards making Swing GUI applications citable. And since I am working on a project that involves Excel sheets as user input, I am considering writing a sibling to JCite for Excel sheets.

Conclusion

The approach I describe here (let’s call it source citing) is certainly not new, nor is it especially complex or sophisticated. Like most of the disciplines of agile methodologies, it is fairly straightforward. And, like these other methodologies, it must be part of a concerted approach. It’s a natural extension of the automated testing discipline, extended to product documentation.

A crucial advantage of source citing is that it leverages the best of our existing tools – IDEs for code, editors for text and diagrams – while being as non-intrusive as possible. Given broader acceptance, integration into IDEs – as happened for unit testing – might even further improve the usability and efficiency of citing source. In particular, visiting triggered tripwires in the documentation could certainly benefit from integration.

More and more developers regard with suspicion projects and job offerings where automated testing of code is not part of the daily routine and release builds. It’s time we started to have equal misgivings about projects where documentation is denied this automated support.

To comment, please join the discussion over at javalobby.org.

With many thanks to Marco von Ballmoos for editing this article.