The US National Security Agency has released a case study showing how to develop zero-defect code in a cost-effective manner. The researchers of the project conclude that, if adopted widely, the practices advocated in the case study could help make commercial software programs more reliable and less vulnerable. I examined a small part of the case study's code, and was not impressed.

The Tokeneer case study involves an NSA-funded project carried out by the U.K.-based Praxis High Integrity Systems and SPRE Inc. The project's materials (requirements, security target, specifications, designs, and proofs) and its code are now available online.

According to an article in Government Computer News, Tokeneer meets or exceeds the Common Criteria Evaluation Assurance Level (EAL) 5. A related paper presented at the 1st IEEE International Symposium on Secure Software Engineering concludes that the case study has shown that software-based security products can be built so that they are reliable, verifiable, and cost-effective against Common Criteria guidelines, thus raising the bar for both procurers and suppliers.

I'm not the most qualified person to judge the project's requirements analysis, the formal specification, and the formal refinement of the specification. However, intrigued by the statements regarding the project's software quality, and aware of the legendary reputation of Praxis's work, I decided to download and have a look at the released code. I examined one of the largest source code files (tis_release/code/core/auditlog.adb), and in it I found difficult-to-maintain, error-prone code, a poor naming choice, inconsistent code formatting, and even what I think is a logic error (I sincerely hope somebody proves me wrong on the last one). My judgment criteria are relatively strict, but I think they are commensurate with the extraordinary claims made by the study.

Difficult to Maintain and Error-Prone Code

Here are excerpts of the code used for mapping the number of a log file to its name.

subtype LogFileIndexT is LogFileCountT range 1 .. MaxNumberLogFiles;

subtype FileNameI is Positive range 1 .. 16;
subtype FileNameT is String (FileNameI);

type LogFileNamesT is array (LogFileIndexT) of FileNameT;

LogFileNames : constant LogFileNamesT := LogFileNamesT'(
    1 => "./Log/File01.log",
    2 => "./Log/File02.log",
    3 => "./Log/File03.log",
    4 => "./Log/File04.log",
    5 => "./Log/File05.log",
    6 => "./Log/File06.log",
    7 => "./Log/File07.log",
    8 => "./Log/File08.log",
    9 => "./Log/File09.log",
   10 => "./Log/File10.log",
   11 => "./Log/File11.log",
   12 => "./Log/File12.log",
   13 => "./Log/File13.log",
   14 => "./Log/File14.log",
   15 => "./Log/File15.log",
   16 => "./Log/File16.log",
   17 => "./Log/File17.log");

See how more than 20 lines of code are used for what I could express in C or Java with a couple of lines, like the following.

static String logFileName(int n) {
    Formatter f = new Formatter();
    return f.format("./Log/File%02d.log", n).toString();
}

There are a number of problems with this part of the study's code. First, the code violates the DRY (don't repeat yourself) principle, making it error-prone and difficult to change and maintain. Each array element initialization must be separately inspected by hand to verify that it matches the corresponding array index. The code also violates the single point of truth principle. Each time MaxNumberLogFiles changes, an entry must also be added to or removed from the array initialization. If the log file's name or location changes, it must be changed in 17 places, and if the string's length changes, one must also adjust the magic number 16 appearing in the declaration of FileNameI, which is the length of the file name used for initializing the array. Finally, the large number of repetitive lines inflates the productivity figure cited in the study (10,000 lines of code in 260 person-days, or about 38 lines of code per day).
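For what it's worth, DRY and static allocation are not mutually exclusive. Here is a sketch in Java (with names of my own invention, not the study's API) that keeps a fixed-size table built once at class-load time, so the naming rule lives in exactly one place while storage stays statically sized:

```java
// Hypothetical sketch (my names, not the study's API): a fixed-size
// name table filled once at class-load time. The naming rule appears
// in exactly one place, yet the storage remains statically sized.
final class LogFileNames {
    static final int MAX_LOG_FILES = 17;
    private static final String[] NAMES = new String[MAX_LOG_FILES];

    static {
        for (int i = 0; i < MAX_LOG_FILES; i++) {
            NAMES[i] = String.format("./Log/File%02d.log", i + 1);
        }
    }

    // n is 1-based, matching the Ada array's 1 .. MaxNumberLogFiles range.
    static String name(int n) {
        return NAMES[n - 1];
    }
}
```

Changing the maximum number of files, or the name pattern, now requires touching a single line each.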

A Poor Name Choice

A data structure named and documented as a list is actually what is commonly called a circular or ring buffer.

------------------------------------------------------------------
-- NextListIndex
--
-- Description:
--    Returns the next index, wrapping if necessary.
--
-- Implementation Notes:
--    None.
------------------------------------------------------------------
function NextListIndex (Value : LogFileIndexT) return LogFileIndexT
is
   Result : LogFileIndexT;
begin
   if Value = LogFileIndexT'Last then
      Result := LogFileIndexT'First;
   else
      Result := Value + 1;
   end if;
   return Result;
end NextListIndex;

The above 21 lines implement what is commonly written inline using a simple modulo division.
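In Java or C the same wrap-around is usually a single expression; because the index range here is 1-based (1 .. MaxNumberLogFiles), the modulo form needs a one-off adjustment. A sketch with made-up names:

```java
// Hypothetical sketch (my names): next index in a 1-based ring of
// MAX_LOG_FILES slots. MAX_LOG_FILES wraps to 1; everything else
// advances by one.
final class RingIndex {
    static final int MAX_LOG_FILES = 17;

    static int nextListIndex(int value) {
        // value is assumed to lie in 1 .. MAX_LOG_FILES, as the Ada
        // subtype guarantees at the language level.
        return (value % MAX_LOG_FILES) + 1;
    }
}
```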

Inconsistent Code Formatting

The code's developers, despite the development environment they use, seem to have trouble maintaining a consistent formatting style for spacing the code's elements. As an example, notice the spacing after LogFiles. We have cases where there is a space before the bracket, after the bracket, and no space at all.

LogFilesStatus(Index) := Free;
LogFiles (Index) := TheFile;
FileH := LogFiles( I );

A Logic Error?

A function, SystemFaultOccurred, is documented to return "True exactly when a critical system fault has occurred while attempting to maintain the audit log." This matches the corresponding specification (tis_release/docs/50_2_INFORMED_Design/50_2.pdf p. 43): "The operation SystemFaultOccurred indicates whether or not a critical system fault has occurred whilst writing to the log."

This is implemented by having a global variable, AuditSystemFault , set very conservatively to true whenever something goes wrong in the system. Here are the cases in the code where AuditSystemFault is set. (OK is a variable set by various system functions.)

AuditSystemFault := AuditSystemFault or not OK;
AuditSystemFault := AuditSystemFault or not OK;
AuditSystemFault := AuditSystemFault or not OK;
AuditSystemFault := AuditSystemFault or not OK;
AuditSystemFault := True;
AuditSystemFault := True;
AuditSystemFault := True;
AuditSystemFault := True;
AuditSystemFault := not OK;

However, when a log file is deleted, the previous value of AuditSystemFault is ANDed instead of ORed with the operation's result, thus clearing it if the deletion is successful, and failing to set it if the deletion fails but no fault was detected before.

File.Delete(TheFile => TheFile, Success => OK);
AuditSystemFault := AuditSystemFault and not OK;

The following table illustrates this strange behavior.

AuditSystemFault   OK      not OK   AuditSystemFault and not OK
False              False   True     False
False              True    False    False
True               False   True     True
True               True    False    False
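The difference between the two connectives can also be checked mechanically. Here is an illustrative Java sketch (names are mine, not the project's) contrasting the code's "and" with the "or" the specification seems to call for:

```java
// Hypothetical sketch (my names) contrasting the two update rules
// for the fault flag after a deletion attempt.
final class FaultFlag {
    // The code's AND: a successful delete (ok == true) clears a
    // previously recorded fault.
    static boolean updateAsWritten(boolean fault, boolean ok) {
        return fault && !ok;
    }

    // The OR the specification appears to require: once set, the
    // flag can never revert to false.
    static boolean updateAsSpecified(boolean fault, boolean ok) {
        return fault || !ok;
    }
}
```

With a prior fault and a successful deletion, updateAsWritten(true, true) yields false, silently discarding the recorded critical fault; updateAsSpecified keeps it true.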

If the implemented behavior is indeed correct (I doubt it), at the very least what looks like a strange divergence from the specification should have been documented in the code.

I found these problems in less than an hour, in the second source code file I looked at (the first was extremely short). At the very least, my findings indicate that formal methods are not a substitute for, or a guarantee of, good programming practices.

Comments

Even the Mil standard for software from 1994 was ridiculously bad (be sure to check off that you've done stress testing and notified a supervisor that you've written complete unit tests, etc.)

Posted by: Jack9 on Sun Oct 19 03:44:34 2008 UTC — permalink

Interesting and valuable analysis.

What this shows, of course, is that there's much more to "good" code than being error-free. We can see the same thing by noting that quality in code is about much more than just reliability, even though far too many see quality as only reliability.

I am not surprised that code sought to be error-free falls short in other respects. Typically, when we concentrate heavily on one quality criterion, the others may suffer.

Plus your analysis also illustrates that one person's good code is another person's trash. I agree with your analysis of the flaws in this code, but others might not; I can remember some of my own code that I thought exceptionally brilliant which others thought was truly awful. "To each his own," "different strokes for different folks"...

Posted by: Robert L. Glass on Sun Oct 19 06:07:34 2008 UTC — permalink

Your concerns are pretty orthogonal to theirs. You haven't really established that their code will do something other than what it is supposed to do... which is exactly what formal methods are supposed to verify will not happen. I am not aware of any automated formal methods, or any other highly reliable software engineering practice, that stops bad or misleading comments. To do that in an automated or semi-automated fashion would probably take some form of AI, and the current wetware AIs we have right now can't even get universal agreement on whether many comments are bad or not. Your statement that formal methods do not ensure good software engineering practice is probably the most accurate and telling about the code. I am guessing (without proof, the irony) that the multiple file names come from some kind of loop unwinding, either manual or automated. Loops and recursion are extra work for the kinds of proof assistants that were probably used on this code, and by strategically eliminating them you can avoid having to do more complicated proofs for the code.

Posted by: CS Student on Sun Oct 19 06:15:15 2008 UTC — permalink

Just a quick comment about your analysis of the filename mapping. The purpose of listing out each mapping explicitly is that it's much easier to prove the correctness of the code in that format (the mapping itself is the proof). Using a helper function requires you to also prove the correctness of that function. Simplicity in this case is reducing complexity, and you can't get any less complex than a direct 1-to-1 mapping.

Posted by: decibel on Sun Oct 19 06:16:40 2008 UTC — permalink

I apologize for the above run-on post; I have JavaScript turned off, which I'm guessing fixes things like automatically inserting br tags. Also, thank you for writing Code Reading. Excellent book.

Posted by: CS Student on Sun Oct 19 06:23:17 2008 UTC — permalink

I think you are wrong about the logical error. Maybe a system fault is not defined to be critical if the log file could be deleted without error. Given the very strict rules put on this code I would be highly surprised if it actually was a real error. Also, your part about poor naming is stupid. You don't even mention what you think is bad about the name(s). Perhaps rename that section?

Posted by: rustupid on Sun Oct 19 06:23:40 2008 UTC — permalink

Go search for safety-critical programming on Google for more on this subject.

In general, code must do what was specified, ONLY what was specified, and must be completely deterministic even when passed bad data. As such, your example logging function does not serve as an equivalent implementation to theirs. Among other things, your example does not have a known maximum path length, which could itself be a fatal mistake on an embedded system. It also risks overflowing a simple filesystem's allowed number of directory entries.

The big difference necessary for zero-defect/safety-critical coding is to expressly specify all behavior. This is orthogonal to the normal rules of programmer-efficient programming, where the one critical expense is programmer-hours. It is also orthogonal in many cases to processor-efficient programming, in part due to all the repetitive range-and-value checks that are necessary to guarantee that each module exactly implements its specified behavior.

This kind of software development almost always results in behavior specifications that are even larger than the resulting code. It may not be suitable for Word, but it's a good idea for aircraft engine management computers, pacemakers, or nuclear warhead triggers.

Posted by: Robert on Sun Oct 19 07:25:20 2008 UTC — permalink

It may be that the log file naming code is satisfying a requirement for little or no dynamic memory allocation. This is a typical USGov requirement for embedded systems. If all memory must be statically allocated at initialization, then your code would not meet the spec. Or, they could be following Ken Thompson's dictum: "When in doubt, use brute force".

Posted by: Lurker on Sun Oct 19 08:18:36 2008 UTC — permalink

Your log file version won't trap errors at compile time - the Ada version will - that's why it's done that way, to ensure that *only* those items in the array can be used. One of the main advantages of Ada (and SPARK) is that most run-time errors can be eliminated at compile time. You can't do that in Java or C, unless you run it through something like PolySpace first.

Posted by: Dave on Sun Oct 19 09:45:47 2008 UTC — permalink

I'm not sure it is a logic error. It's only supposed to be set if it's a critical system error. If the error working with the file happened before it was deleted, but the deletion went ok, then things can carry on as normal.

For example, say the file was too big and couldn't be written to. This is a critical error. If the file is then deleted, it's not a problem any more.

It *could* be a logic error, but I haven't traced the code flow, so I don't know if it could overwrite an important error.

Posted by: Ian Calvert on Sun Oct 19 10:08:32 2008 UTC — permalink

First I see 1..16, and then 1 to 17 log file names. If that is correct it sure isn't very obvious why. At least to me.

Posted by: Tommy Hallgren on Sun Oct 19 10:23:30 2008 UTC — permalink

Re: I think you are wrong about the logical error. Maybe a system fault is not defined to be critical if the log file could be deleted without error.

I did not find this in the specifications. It's a plausible guess, though. Another thought that occurred to me is that deletions may initially fail before the log files are created, and the expression is an attempt to work around that.

Re: Using a helper function requires you to also prove the correctness of that function.

I would think that having a repertoire of helper functions proved to be correct would be an excellent way to increase the software's reliability.

Re: your part about poor naming is stupid. You don't even mention what you think is bad about the name(s).

Given that the so-called list wraps around, NextRingIndex would be more appropriate.

Re: It may be that the log file naming code is satisfying a requirement for little or no dynamic memory allocation.

I agree, but the log file number could be patched over a statically allocated string.

Posted by: Diomidis Spinellis on Sun Oct 19 11:02:29 2008 UTC — permalink

Hi, thank god so many people already realized that the data-structure-bound 1:1 mapping is easier to verify. This is THE goal of such projects. Also, problems within the formatter might be a problem for verification (e.g., buffer overflows in sprintfs and the like). To be comparable, you would have to at least apply sanity checks w.r.t. the integer parameter n.

Regarding the location changes: this would not affect the system if the "Log" directory were a symlink, so a change could be done externally without even the need for recompiling the code at hand. I have seen many people doing this and it totally makes sense to me. (Whereas in this scenario I think that this is specified somewhere, which in government scenarios probably comes second to engravement in stone...)

I fully agree with the naming problems in your second statement; if for some reason this had to fit with some naming scheme, it should at least have been noted in the documentation. Same with the third problem, though, given that comments suggest meaningful alternatives to your interpretation, at least a hint in the source code would have been necessary.

Posted by: jacques on Sun Oct 19 12:19:54 2008 UTC — permalink

Re: code must do what was specified, ONLY what was specified, and must be completely deterministic even when passed bad data.

Bad data can come from outside the application or from within it. It is reasonable for the application to assume (or, better, prove) that data passed to a function from within the application is always correct. For instance, the NextListIndex function assumes that the value passed is less than or equal to LogFileIndexT'Last. This is not even specified in a precondition. If the value is larger than LogFileIndexT'Last, then the function will return a wrong result.

Similarly, in the case of mapping a file number to a file name, assuming that the log file's number will not be "bad" is a reasonable assumption, one that could be proved statically. The original code also doesn't ensure that the log files will not overflow a directory with a limited number of entries, if that number is less than 17. One way I can think of meeting such a constraint is to add a constant in the code (MaxNumberOfAllowedDirectoryEntries), and add a compile-time assertion (MaxLogFileNumber <= MaxNumberOfAllowedDirectoryEntries).
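As an illustrative sketch of that last idea: Java lacks a true compile-time assertion (C's _Static_assert would be the direct analogue), so the nearest equivalent is a check that fails as soon as the class is loaded. Both constants below are hypothetical names of my own, not anything from the study:

```java
// Hypothetical sketch; both constants are made up for illustration.
// Java has no compile-time assertion, so the nearest analogue is a
// guard that fails the moment the class is initialized.
final class LogConfig {
    static final int MAX_LOG_FILE_NUMBER = 17;
    static final int MAX_ALLOWED_DIRECTORY_ENTRIES = 64;

    static {
        if (MAX_LOG_FILE_NUMBER > MAX_ALLOWED_DIRECTORY_ENTRIES) {
            throw new AssertionError("log files would overflow the directory");
        }
    }
}
```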

Posted by: Diomidis Spinellis on Sun Oct 19 13:59:49 2008 UTC — permalink

The code you have presented as being "corrections" would not result in a verified zero-defect deliverable. Formal methods such as these were developed, in part, to counter arrogant programmers such as yourself who have the hubris to look at verified zero-defect code, then, believing themselves to be "improving it", introduce potential defects. Secondly, please see the definition of "defect". Naming conventions and the other petty whinges you made about this code are not considered defects.

Posted by: Kurt on Sun Oct 19 14:02:34 2008 UTC — permalink

subtype LogFileIndexT is LogFileCountT range 1 .. MaxNumberLogFiles;
-- ...
function NextListIndex (Value : LogFileIndexT) return LogFileIndexT

Re: "For instance, the NextListIndex function assumes that the value passed is less than or equal to LogFileIndexT'Last. This is not even specified in a precondition. If the value is larger than LogFileIndexT'Last, then the function will return a wrong result."

NextListIndex takes a LogFileIndexT as its argument and returns a LogFileIndexT. LogFileIndexT is a constrained integer type. You don't need an explicit runtime check there; the language takes care of that.

Posted by: Ilmari on Sun Oct 19 14:43:39 2008 UTC — permalink

Re: Naming conventions and the other petty whinges you made about this code are not considered defects

As somebody pointed out above, my concerns are orthogonal to those of the study. Maintainability (which includes appropriate naming and correct formatting), flexibility, and developer productivity are not optional attributes of software development. In most environments they are as important as zero defects. If a development method doesn't address them (and I think I've shown that this study falls short in this respect), then people who can choose will use the more functional and up-to-date software, even if it is full of bugs and vulnerabilities. The end result is the nightmare of millions of Windows zombie machines we're currently facing.

In environments where people aren't allowed to choose they will actively try to circumvent dated and spartan zero-defect software, using their own laptops and iPhones to get their job done, again at the cost of their organization's overall security and reliability.

Posted by: Diomidis Spinellis on Sun Oct 19 14:43:55 2008 UTC — permalink

Re: LogFileIndexT is a constrained integer type

Thanks for pointing this out. My C background shows!

Posted by: Diomidis Spinellis on Sun Oct 19 14:45:42 2008 UTC — permalink

I worked on a zero-defect project. It underwent 2 years of statistical testing with a group of 3 people. It was the kind of project where a screw up could kill someone. The testing never uncovered a single defect, and the device is deployed in Iraq today in an automated "doc-in-a-box". Live testing involved shooting pigs and hooking them up and seeing if the device could extend their lives when shot and bleeding to death. I came into the project half way through. Various proofs were constructed about the code.

The code did none of the best practices you mention above. It performed flawlessly--it was zero defect and measurably 6-sigma (real statistical 6-sigma, not the let's-make-4.5-and-call-it-6 of the b...s... 6-sigma movement). I've always wondered since then just how good the best practices are, when they don't necessarily lead to less defective code. Programmers have this great ability to argue about what's good and bad to the end of project billable time, but no evidence that it actually improves the code.

That being said, when it came time to make a change to the above code base (a change in requirements), it was god-awful difficult on top of redoing the zero-defect process. A lot of the things you mention above got changed.

Posted by: Shawn Garbett on Sun Oct 19 15:32:08 2008 UTC — permalink

Learn Ada. Once you do that, it will make a lot more sense. Ease of writing does not equal quality.

Posted by: bd on Sun Oct 19 15:54:53 2008 UTC — permalink

There are also many interesting comments on a related thread at reddit.

Posted by: Diomidis Spinellis on Sun Oct 19 17:16:00 2008 UTC — permalink

The code is intentionally written in a strict subset of Ada called SPARK that is amenable to verification. The code probably doesn't use the call to 'format' because it cannot be readily verified, and if the number of log files changes, the verification process will probably detect that. I doubt the logical problem you found is a real error, given the simplicity of detecting simple logical errors like that with a verification framework, and given Praxis' famously low error rate generally.

Posted by: Greg on Sun Oct 19 18:39:50 2008 UTC — permalink

Re: I doubt the logical problem you found is a real error, given the simplicity of detecting simple logical errors like that with a verification framework

I would think the same, and this is why I went to the specifications, trying to find a statement saying that critical errors can be cleared when a log file is deleted. I didn't find anything like that in the specification of the corresponding function. Also, I would expect that such an error would be uncovered during testing, but testing the code's error handling is a notoriously difficult task.

Posted by: Diomidis Spinellis on Sun Oct 19 18:47:36 2008 UTC — permalink

Comparing VHDL to Java isn't really fair.

Posted by: Sean on Sun Oct 19 21:44:44 2008 UTC — permalink

Really... what's with the 1..16 and 1..17 that Tommy Hallgren mentioned?

Posted by: Doc on Mon Oct 20 10:39:05 2008 UTC — permalink

Re: First I see 1..16, and then 1 to 17 log file names.

1..16 specifies the characters in the file name string, while 1..17 specifies the various log files.

Posted by: Diomidis Spinellis on Mon Oct 20 11:05:15 2008 UTC — permalink

Re: Comparing VHDL to Java isn't really fair.

I'm not a Java fanboy (but some of my research students are). I code more often in C/C++, and when I thought about the problem of generating a log file name based on its ordinal number I realized that writing a demonstrably correct function in C/C++ was not trivial. So, as an intentionally provocative (partly to myself) example I included the Java code.

I'm quite familiar with the restrictions placed on dynamic memory by various standards. I first heard about them as a student in the 1980s, and I also mention them in my book Code Quality (p. 287). However, I think it's time to think rationally about this issue and move forward. Dynamic memory allocation can lead to non-determinism and errors, but it is also a big help in creating systems that are flexible, robust, and efficient. The concerns over dynamic memory allocation might have been valid in the 1980s, when memory was scarce and the experience, tools, libraries, and runtime environments to manage dynamic memory were limited. This picture has now changed, however, and, I believe, the benefits of dynamic memory allocation outweigh its problems. Besides, modern systems have many other sources of non-determinism, like I&D caching, branch prediction, and PCI enumeration.

Posted by: Diomidis Spinellis on Mon Oct 20 11:24:35 2008 UTC — permalink

I'd like to address Diomidis' points in order, but firstly I'd like to address the use of the phrase "zero defects" in relation to the Tokeneer system. This has caused lots of confusion in the past, and reading the posts here and over at reddit, it seems this still needs some clarification.

Let me be clear: we have never claimed that there actually are "zero defects" in the TIS Core software. That would, rather obviously, be a ridiculous thing to do.

In contrast, we have reported the number of defects found by a) SPRE Inc, during their initial independent testing of the system back in 2003, and b) the number of defects reported by the NSA between delivery of the system in 2003 and August 2008. Both of these numbers are "zero". (These data beg the question "Well..how hard did they look?" which is a good point, of course.) One thing we're hoping for from the public release of this material is a much larger body of people to look much harder than that...it seems Diomidis has already made a good start.

While preparing the material for release in August 2008, I found one defect - a potential for an overflow in an Integer multiplication. This is documented in the Overview and Reader's Guide that accompanies the release.

OK...moving on to Diomidis' specific points:

Difficult to Maintain and Error-Prone Code

In this section, Diomidis makes several fair points about the style of this code. His solution in Java is undoubtedly the right way to do it in Java, but the crucial thing here is that this isn't Java - it's SPARK - which is a fundamentally different beast.

A quick intro to some of the design goals of SPARK - I don't want to turn this into a tutorial on SPARK, but this might be necessary to explain what's going on here.

SPARK is mostly used in embedded, hard real-time systems, which are either security- or safety-critical. It also tends to be used in systems where the "Size" of the problem domain is known in advance and on target hardware with well-known and fixed limits on resources such as CPU time and physical RAM. In this example, we're exploiting the fact that the maximum number of log files is a well-defined constant to constrain the problem.

SPARK's design goals include the provision of a sound verification framework based on Hoare-Logic, and allowing for a runtime model with basically zero overhead, "footprint", COTS or libraries of any kind. This might seem nuts if you're used to Java with its myriad of libraries, but it makes sense for us folk writing stuff that has to get past certification with the FAA, evaluation by the NSA or any other stringent regulatory regime. For these reasons, SPARK currently excludes Ada95's "&" (concatenation) operator for Strings, the 'Image attribute (which turns a scalar value into a String equivalent), and all sorts of complexity to do with anonymous and dynamic subtypes. It's also worth noting that Ada's "&" and 'Image can drag in a _lot_ of run-time library code since they both involve returning an object whose size isn't known at compile-time.

The TIS Core software is written in SPARK as if it could be run on a bare-board machine with little or no operating system, runtime library or anything. In particular, the software requires no dynamic memory allocation at all, and is also designed to be amenable to the static analysis of worst-case memory usage and execution time - so avoiding the use of "&" and 'Image makes sense in that context.

So...with that in mind, I hope the original SPARK version of the code makes more sense. Here we see the development team favouring the simplicity of a repetitive-but-simple static lookup table over a more complex (but shorter) dynamic approach like you would write in Java.

Note also the use of static sub-types in this code to restrict the range of indexes and the length of strings to be wholly static - this is a very common (and vitally important) programming style in SPARK, but often seems foreign to those coming from a background in C, C++ or Java. I can't help noting that Diomidis' solution takes an "int" type parameter, for example... what happens in his code if I pass a negative value for the argument? I'm afraid my knowledge of Java's Formatter class isn't up to knowing what would happen in that case.

A Poor Name Choice

Diomidis points out this function could be implemented with a simple modulo operator. It would come out in SPARK as:

return (Value mod MaxNumberLogFile) + 1;

turning something like 5 logical lines of code into 1 (or 21 physical lines into 16 if you prefer to count that way.)

(Note that the range of LogFileIndexT starts at 1, not zero, so some real care is needed here to make sure that the range of the result is in 1 .. 17, not 0 .. 16).

Is this easier to read? Is it more obviously correct? I guess that's a matter for readers to decide for themselves. I accept the point that the name could be better to reflect the modular/circular nature of this abstraction.

Inconsistent Code Formatting

Good point. We normally use gcc's "-gnaty" switches to enforce this kind of basic style checking. (See the gcc docs for what these switches actually do.) I looked at this while preparing the release, and had it on my "to do" list, but I basically ran out of time, since we had a deadline to release the material in time for the VSTTE conference in Toronto earlier this month.

Don't know why the development team didn't use these checks during the original project - I will ask and see what they have to say.

A Logic Error?

This one will take some more time to check. I will ask Dr Janet Barnes - the Tokeneer project leader - for her view on this. This might take a little while longer, though, since Janet is working on another project, and, besides, her work on Tokeneer was over 5 years ago, so her memory may not be perfect.

I hope to write more on this topic soon.

Thanks to you all for having the patience to read this far...

Yours, Rod Chapman, SPARK Team, Praxis

Posted by: Rod Chapman on Mon Oct 20 14:22:35 2008 UTC — permalink

Rod, many thanks for your response; I think it clarifies matters and this blog's readers will benefit from it.

I know you people have real work to do, and I hope you've not regretted releasing the code. At the very least it's generating positive publicity. There are over 100 comments on this subject at reddit.com, and the majority support your views. More importantly, the release is also a valuable resource for educators and the software engineering community. I've already included a pointer in my lecture notes.

Posted by: Diomidis Spinellis on Mon Oct 20 14:58:22 2008 UTC — permalink

This seems like a classic "clash of cultures" situation. The good news is that both communities have a lot to learn from each other. The bad news is that there is a period of negotiation first, to establish common ground. I've been impressed with the SPARK programmers I've talked with. We had a lot of values and principles in common, including the importance of concrete feedback from frequent integrations. One thing I notice in the programming style of the code is that many things are expressed imperatively that could be expressed declaratively. The lesson I take from this is that languages like SPARK could be more powerful with more declarative features. Simply relying on cut-paste-edit to declare the 17 log files is error prone. A declarative expression of the same intention would be less susceptible to errors and easier to change. My final observation is that languages like Java try to encourage change while languages like SPARK try to encourage reliability. There is enough experience with both to design a language that loses none of the reliability of SPARK, but adds enough declarative features to make its programs amenable to change.

Posted by: Kent Beck on Mon Oct 20 18:33:26 2008 UTC — permalink

The interesting thing is that a declarative style of programming is typically associated with fewer errors and amenability to formal proofs. So there's definitely room for innovation.

Posted by: Diomidis Spinellis on Mon Oct 20 21:04:21 2008 UTC

A Logic Error? (Revisited...)

OK... having completed our analysis, let's get straight to the point: it's a bug. Clearly a typing slip by the developer, substituting "and" where "or" was required. I've looked at the printed code review records for that unit (yes... we really do keep that stuff), and it was missed by the reviewer as well.

Could the proof system have helped? Well... yes, if we'd gone as far as attempting a more complete partial-correctness proof for that unit. The original project only did "just enough" partial-correctness work to verify security property 1 before we basically ran out of money.

The bug can be revealed by considering what a sensible invariant for AuditSystemFault might be. As Diomidis points out, this flag should basically have the property that once set True, it can never return to being False. This can be expressed as a post-condition on any subprogram that modifies AuditSystemFault:

--# post AuditSystemFault~ -> AuditSystemFault;

where "~" means "the initial value of", and "->" is the Boolean implication operator. You can find all such subprograms by simply scanning the global annotations for "in out AuditSystemFault". I added the above post-condition to all such subprograms. The Theorem-Prover then reveals exactly ONE undischarged verification condition for our buggy procedure.

Moral: if you introduce state at implementation time, then it's worth also spending the time to derive and verify an invariant to go with it.

Yours, Rod Chapman, SPARK Team

Posted by: Rod Chapman on Fri Oct 24 09:52:42 2008 UTC
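Rod's "sticky flag" invariant can also be mimicked outside SPARK with a run-time assertion. The following is a hedged C sketch, not the Tokeneer code: the names and update logic are illustrative, and the assert checks dynamically the same old -> new implication that the SPARK post-condition expresses statically.

```c
#include <assert.h>
#include <stdbool.h>

/* Once this fault flag becomes true it must never revert to false:
   the invariant is old_value -> new_value (Boolean implication). */
static bool audit_system_fault = false;

/* Record the outcome of some audit operation (hypothetical helper). */
static void record_operation_result(bool operation_ok)
{
    bool old = audit_system_fault;

    /* The bug discussed in the thread was using "and" where "or" was
       required; "or" is what keeps the flag sticky. */
    audit_system_fault = audit_system_fault || !operation_ok;

    /* Post-condition old -> new, written as !old || new.  SPARK proves
       this statically; here it is merely checked at run time. */
    assert(!old || audit_system_fault);
}
```

Had the buggy "and" been written here, the assertion would fire the first time a successful operation followed a fault, which is the run-time shadow of the single undischarged verification condition Rod describes.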

I agree that the restrictions on dynamic memory allocation are overkill. The B-2 bomber had the entire flight control system restart _during_ takeoff in Palmdale. Yet the crew noticed no change in flying qualities or loss of control, because the system restart was designed to cover this case.

So what is the worst that could happen if you allowed malloc? If your system is safety-critical, you're going to be dealing with far more remote and worse sources of error anyway. So just deal with it and move on.

Diomidis is correct; when used properly, dynamic memory management can produce a _more_ reliable and flexible system.

None of which, unfortunately, helps when your contract specifically requires conformance to a standard which bans malloc.

Posted by: Lurker on Sat Oct 25 17:03:59 2008 UTC

I think the major concern with dynamic memory allocation is not that it has nondeterministic execution time (that can be avoided with an object-pool allocator), but that the use of dynamic allocation implies you don't know how much memory your code is going to need at run time, and so you can't guarantee that it doesn't crash. That is, dynamic memory allocation has effectively nondeterministic behavior: it nondeterministically works correctly or crashes, depending on the amount of memory remaining. With enough cleverness, of course, you can modify a program that uses dynamic allocation to have provable space bounds and construct a proof of those bounds, but that's not necessarily trivial. And once you do, you can transform it into a program that internally allocates entries out of arrays big enough to meet the program's proven space requirements.

Maybe Rod can correct me about whether that's the concern SPARK (or TIS Core?) is addressing with its ban on dynamic allocation.

I think Diomidis's statements about I/D caching and branch prediction are true in a restricted subset of "modern systems". 8051s, PICs, 8-bit AVRs, and a lot of ARM7TDMIs don't have caches or branch prediction or do PCI enumeration, etc., in part because they are marketed to folks who really care a lot about things like deterministic execution times. And there are still a lot more of those processors being sold than there are PowerPCs and 386 clones.

Now, I've never worked in this deeply-embedded space in any serious way. But I suspect that the tradeoff between the costs and benefits of dynamic memory allocation is still situation-dependent. Maybe there are some programs where an out-of-memory crash and restart will take long enough, or lose enough information, to wreck an expensive workpiece being machined, or a million-dollar CNC machine tool, or a jet engine propelling a passenger airplane, or lock up the wheels on a minivan and send it into a skid. Maybe the benefits of dynamic allocation in those programs aren't great enough to justify those risks. Maybe someone with experience in that area can comment?

Posted by: Kragen Javier Sitaker on Sun Oct 26 09:03:49 2008 UTC
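The object-pool allocator Kragen mentions can be sketched as a fixed-capacity free list in C. This is a minimal illustrative sketch (all names and the capacity are assumptions): allocation and deallocation are O(1), the worst-case memory use is the pool size known at compile time, and exhaustion yields a deterministic NULL instead of an unpredictable run-time failure.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAPACITY 8  /* worst-case footprint, fixed at compile time */

typedef struct Entry { int value; } Entry;

/* All storage is statically allocated; malloc is never called. */
static Entry  pool[POOL_CAPACITY];
static Entry *free_list[POOL_CAPACITY];  /* stack of free entries */
static size_t free_count;

static void pool_init(void)
{
    free_count = POOL_CAPACITY;
    for (size_t i = 0; i < POOL_CAPACITY; i++)
        free_list[i] = &pool[i];
}

static Entry *pool_alloc(void)
{
    /* Deterministic failure mode: NULL when the pool is exhausted. */
    return free_count ? free_list[--free_count] : NULL;
}

static void pool_free(Entry *e)
{
    assert(free_count < POOL_CAPACITY);  /* guard against double free */
    free_list[free_count++] = e;
}
```

The design choice is exactly the trade-off discussed above: you give up flexibility (the capacity must be proven sufficient in advance) in exchange for bounded space and time.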

I have worked on C/C++ embedded systems where dynamic memory allocation was forbidden by company policy. This rule may or may not have been inspired by the US Government requirement for embedded systems. It meant that we knew we would not fail on a memory error, but it also meant that we were memory-constrained in a lot of situations where we did not really need to be.

The rule also led to a lot of reinventing the wheel. For instance, I had to write my own LZ77 + adaptive Huffman coding implementations that worked with pre-allocated memory segments. We would have liked to use zlib instead, but with no allocator implemented we had no other option. The end result was a slower implementation, and, since the resulting format was not compatible with any existing standard, we also had to maintain a version that ran on desktop computers. One advantage was that we could look at a memory dump of the ELF file and see where all the memory was going, and, when memory was getting tight, figure out where we needed to optimize.

Blanket rules like "no dynamic memory allocation" are a double-edged sword. They can limit some kinds of errors, but they can also force you to write extra code that is likely to add to the bug count, rather than use well-tested existing code. In our case we were not using formal methods to prove the correctness of the system, and my comments are directed specifically at using such rules in an attempt to improve the reliability of software *not* using formal methods, in situations that are not safety-critical.

Posted by: Christopher Lambacher on Mon Oct 27 23:46:40 2008 UTC

aye, there's the rub

Kent Beck's post hit the nail on the head: different engineering approaches for different systems and requirements, with lots that each camp can learn from the other. You know, 25 years ago my university lecturer fell off his chair laughing after he looked at my design for a microwave oven controller. My only "mistake" was to employ multitasking in my design. Not a good thing, considering that the smallest multitasking system at the time was the size of a refrigerator. I argued that I was simply way ahead of my time.

Posted by: Thanasi on Wed Oct 29 07:53:44 2008 UTC

"The interesting thing is that a declarative style of programming is typically associated with fewer errors and amenability to formal proofs." Partly academic bullshit. Declarative programming introduces its own set of bugs (I've programmed in Prolog), and they are more difficult to track down.

"Formal verification" (disgustingly ambiguous terminology) is like yet another type-checking system. The assertions are incomplete, easiest to construct when you need them least, and local. It is possible to create a declarative programming language based on assertions, but of course it doesn't make the programs bug-free. By the way, have you ever thought how bogus mathematical proofs must be, given that they have not been formally verified?

"Formal verification" is from the 1980s and was superseded by specification languages, which provided faster and easier ways to improve the quality of software. Specification languages attack the locality deficit of assertions: a bunch of assertions doesn't guarantee that they make sense globally. It also happens that specifications themselves can have holes, be mindless, or be irrelevant to the problem at hand. Linking "formal verification" to "zero-defect code" is as dishonest as claiming that nobody foresaw the subprime bomb. And I think that Gödel had his objections too.

It seems that not much progress has occurred in the last 20 years. The guys doing formal verification still don't know how to program. They should learn that first, and then develop formal verification, if they ever learn to program. It takes just ten years, and few ever learn.

As for removing malloc: it makes sense socially and on average. Most programmers are not very good at programming, and malloc is dangerous in their incapable hands. For social reasons, some explanation other than "we don't give this tool to idiots" had to be invented.

The commercial programs of today do not even intend to be bug-free. They aim for a statistically satisfactory purchaser experience. When the payer and the user of the program are different people, the user experience is certainly crappy. Send a beta to the users, collect the most frequent crash points in the code, fix them, and call it the final release.

Posted by: anonymous on Fri Oct 31 16:41:41 2008 UTC

A paper presented at the 2010 Embedded Real Time Software and Systems conference details the results of a more extensive verification effort.

Posted by: Diomidis Spinellis on Mon Dec 13 17:21:17 2010 UTC
