The Astrée Static Analyzer

Astrée stands for Analyseur statique de logiciels temps-réel embarqués (real-time embedded software static analyzer). The development of Astrée started from scratch at the Laboratoire d'Informatique of the École Normale Supérieure (LIENS), initially supported by the ASTRÉE project, the Centre National de la Recherche Scientifique, the École Normale Supérieure and, since September 2007, by INRIA (Paris-Rocquencourt).

Objectives of Astrée

Astrée is a static program analyzer aiming at proving the absence of Run Time Errors (RTE) in programs written in the C programming language. On personal computers, such errors, commonly found in programs, usually result in unpleasant error messages and the termination of the application, and sometimes in a system crash. In embedded applications, such errors may have graver consequences.

Astrée analyzes structured C programs with complex memory usage, but without dynamic memory allocation or recursion. This encompasses many embedded programs as found in ground transportation, nuclear energy, medical instrumentation, aeronautics, and aerospace applications, in particular synchronous control/command such as electric flight control [30], [31] or space vessel maneuvers [32].

Industrial Applications of Astrée

The main applications of Astrée appeared two years after starting the project. Since then, Astrée has achieved the following unprecedented results on the static analysis of synchronous, time-triggered, real-time, safety-critical, embedded software written or automatically generated in the C programming language:

Astrée was able to prove completely automatically the absence of any RTE in the primary flight control software of the Airbus A340 fly-by-wire system, a program of 132,000 lines of C analyzed in 1h20 on a 2.8 GHz 32-bit PC using 300 MB of memory (and 50 min on a 64-bit AMD Athlon™ 64 using 580 MB of memory).

Astrée was then extended to analyze the electric flight control codes then in development and test for the A380 series. The operational application by Airbus France at the end of 2004 was just in time before the A380 maiden flight on Wednesday, 27 April 2005.

Astrée was also able to prove completely automatically the absence of any RTE in a C version of the automatic docking software of the Jules Verne Automated Transfer Vehicle (ATV) enabling ESA to transport payloads to the International Space Station [32].

Exploitation license of Astrée

Since Dec. 2009, Astrée has been available from AbsInt Angewandte Informatik (www.absint.de/astree/).

Theoretical Background of Astrée

The design of Astrée is based on abstract interpretation, a formal theory of discrete approximation applied to the semantics of the C programming language. The informal presentation Abstract Interpretation in a Nutshell aims at providing a short intuitive introduction to the theory. A video introduces program verification by abstract interpretation (in French: « La vérification des programmes par interprétation abstraite » ). More advanced introductory references are [1], [2] and [3].



Briefly, program verification, including finding possible run-time errors, is undecidable: there is no mechanical method that can always answer truthfully whether programs may or may not exhibit given runtime properties, including the absence of any run-time error. This is a deep mathematical result dating from the works of Church, Gödel and Turing in the 1930s. When faced with this mathematical impossibility, the choice has been to design an abstract interpretation-based static analyzer that will automatically:

signal all possible errors (Astrée is always sound);

occasionally signal errors that cannot really happen (false alarms on spurious executions, e.g. when hypotheses on the execution environment are not taken into account).

Of course, the goal is to be precise, that is, to minimize the number of false alarms. The analysis must also be cost-effective, e.g. cost a small fraction of what running all tests of the program costs. In the context of safety-critical reactive software, the goal of zero false alarms was first attained when proving the absence of any RTE in the primary flight control software of the Airbus A340.
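To make the notion of a false alarm concrete, here is a minimal sketch of a sound but imprecise analysis. It assumes the simplest numeric abstraction, one interval per variable, and analyzes the two statements `y = x - x; z = 1 / (y + 1);` for x in [0, 10]. The type and function names are hypothetical and are not taken from Astrée.

```c
#include <assert.h>
#include <stdbool.h>

/* A closed integer interval [lo, hi]: the abstract value of one variable. */
typedef struct { long lo, hi; } interval;

/* Sound abstract subtraction: the result contains every a - b
   with a in x and b in y. It ignores that x and y may be the same variable. */
interval itv_sub(interval x, interval y) {
    interval r = { x.lo - y.hi, x.hi - y.lo };
    return r;
}

/* Sound abstract addition of a constant. */
interval itv_add_const(interval x, long c) {
    interval r = { x.lo + c, x.hi + c };
    return r;
}

/* True when the analysis cannot exclude a zero divisor. */
bool may_be_zero(interval d) {
    assert(d.lo <= d.hi);
    return d.lo <= 0 && 0 <= d.hi;
}

/* Analyze: y = x - x; z = 1 / (y + 1); with x in [0, 10]. */
bool alarm_on_division(void) {
    interval x = { 0, 10 };
    interval y = itv_sub(x, x);        /* [-10, 10], although y is always 0 */
    interval d = itv_add_const(y, 1);  /* [-9, 11], although y + 1 is always 1 */
    return may_be_zero(d);             /* an alarm is raised */
}
```

The alarm is sound (no real division by zero could be missed) but spurious here, since the divisor is always 1. Removing such alarms requires a more precise abstraction, not a less sound one.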

Astrée is based on the theory of abstract interpretation [1,2,3] and so proceeds by effectively computing an overapproximation of the trace semantic properties of analyzed programs and then proving that these abstract properties imply the absence of runtime errors. The program analysis is iterative [5], structural [10] (by induction on the program syntax), interprocedural and context-sensitive for procedures [6], and extremely precise for memory [24]. It combines several abstractions of a trace semantics [7,19] with a weak form of reduced product [7,26]. The basic general-purpose abstractions are either non-relational (like intervals [4,5]) or weakly relational (like octagons [16]) with uniform interfaces [23]. Astrée's precision comes from a clever handling of disjunctions [12,14,19] and domain-specific abstract domains [13,17] for control/command. Most abstractions are infinitary, which requires convergence acceleration with widening/narrowing [5,9]. The soundness of the abstractions is based on Galois connections [5,7] or, in the absence of a best abstraction, is concretization-based [8].
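The role of widening and narrowing can be illustrated on the interval abstraction. The sketch below is a textbook-style illustration, not Astrée's iterator: it computes a loop invariant for `i = 0; while (i < 100) i = i + 1;`. The names `range`, `loop_step` and `analyze_loop` are hypothetical, and LONG_MAX and LONG_MIN stand in for the infinite bounds.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define POS_INF LONG_MAX   /* stands for +infinity */
#define NEG_INF LONG_MIN   /* stands for -infinity */

typedef struct { long lo, hi; } range;

/* Least upper bound of two ranges. */
range range_join(range a, range b) {
    range r = { a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

/* Inclusion test: a is contained in b. */
bool range_leq(range a, range b) {
    return b.lo <= a.lo && a.hi <= b.hi;
}

/* Widening: any unstable bound jumps to infinity, forcing termination. */
range range_widen(range old, range next) {
    range r = { next.lo < old.lo ? NEG_INF : old.lo,
                next.hi > old.hi ? POS_INF : old.hi };
    return r;
}

/* Narrowing: only infinite bounds are refined. */
range range_narrow(range old, range next) {
    range r = { old.lo == NEG_INF ? next.lo : old.lo,
                old.hi == POS_INF ? next.hi : old.hi };
    return r;
}

/* One abstract iteration at the loop head:
   entry value [0, 0] joined with the effect of the body under i < 100. */
range loop_step(range inv) {
    range guarded = { inv.lo, inv.hi < 99 ? inv.hi : 99 };
    assert(guarded.lo <= guarded.hi);  /* the guard is satisfiable here */
    range after = { guarded.lo == NEG_INF ? NEG_INF : guarded.lo + 1,
                    guarded.hi + 1 };
    range entry = { 0, 0 };
    return range_join(entry, after);
}

/* Iterate with widening until stable, then apply one narrowing step. */
range analyze_loop(void) {
    range inv = { 0, 0 };
    for (;;) {
        range next = loop_step(inv);
        if (range_leq(next, inv)) break;
        inv = range_widen(inv, next);
    }
    return range_narrow(inv, loop_step(inv));
}
```

Without widening, the iteration would climb through [0, 1], [0, 2], and so on for a hundred steps. With it, the upper bound jumps to infinity after one step and the narrowing step brings it back to 100.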

Which Program Run-Time Properties are Proved by Astrée?

Astrée aims at proving the absence of:

Any use of C defined by the international norm governing the C programming language (ISO/IEC 9899:1999) as having an undefined behavior (such as division by zero or out of bounds array indexing),

Any use of C violating the implementation-specific behavior of the aspects defined by ISO/IEC 9899:1999 as being specific to an implementation of the program on a given machine (such as the size of integers and arithmetic overflow),

Any potentially harmful or incorrect use of C violating optional user-defined programming guidelines (such as no modular arithmetic for integers, even though this might be the hardware choice), and also

Any violation of optional, user-provided assertions (similar to assert diagnostics for example), to prove user-defined run-time properties.
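As an illustration of these categories, a few of the conditions involved can be written as ordinary C predicates. This is only a sketch with hypothetical helper names. A static analyzer proves that such conditions hold on every execution instead of evaluating them at run time.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Undefined behavior: integer division is undefined for a zero divisor
   and for INT_MIN / -1, whose result is not representable. */
bool div_is_defined(int a, int b) {
    return b != 0 && !(a == INT_MIN && b == -1);
}

/* Undefined behavior: indexing an array of n elements outside 0 .. n-1. */
bool index_in_bounds(size_t i, size_t n) {
    return i < n;
}

/* Arithmetic overflow: would a + b leave the range of int?
   The test itself is written so that it never overflows. */
bool add_overflows(int a, int b) {
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* User-provided assertion: the property to prove is stated in the code. */
int checked_div(int a, int b) {
    assert(div_is_defined(a, b));
    return a / b;
}
```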


Three Simple Examples ... Hard to Analyze in the Large

The examples below show typical difficulties in statically analyzing control/command programs. Of course, the real difficulty is to scale up!

Booleans

Control/command programs, in particular synchronous ones, manipulate thousands of boolean variables. Analyzing which program run-time properties hold when each such boolean variable is either true or false rapidly leads to a combinatorial explosion of the number of cases to be considered, that is, to prohibitive analysis costs in time and memory.

For example, the analysis of the following program by Astrée has to relate B and X to show that the division by X can never be a division by zero:

/* boolean.c */
typedef enum {FALSE = 0, TRUE = 1} BOOLEAN;
BOOLEAN B;
void main () {
  unsigned int X, Y;
  while (1) {
    /* ... */
    B = (X == 0);   /* B is TRUE exactly when X is zero */
    /* ... */
    if (!B) {
      Y = 1 / X;    /* safe: !B implies X != 0 */
    };
    /* ... */
  };
}

Astrée has been shown to handle thousands of boolean variables successfully, with just enough precision to avoid both false alarms and combinatorial explosion [12].
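One way to picture how a relation between B and X can be kept is to store one numeric range for X per value of B. The sketch below illustrates that idea on the statements `B = (X == 0); if (!B) Y = 1 / X;`. It is an assumption made for illustration, with hypothetical names, and does not claim to reproduce the abstract domain described in [12].

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* A range of unsigned values, possibly empty. */
typedef struct { unsigned lo, hi; bool empty; } urange;

/* Abstract state after B = (X == 0): one range for X per value of B. */
typedef struct { urange x_when_false, x_when_true; } bool_partition;

bool contains_zero(urange r) {
    return !r.empty && r.lo == 0;
}

/* Abstract effect of B = (X == 0) on an X known to lie in x. */
bool_partition assign_b_eq_zero(urange x) {
    bool_partition p;
    urange none = { 0, 0, true };
    urange zero = { 0, 0, false };
    assert(x.empty || x.lo <= x.hi);
    /* B true: X is exactly 0, if 0 was possible at all. */
    p.x_when_true = contains_zero(x) ? zero : none;
    /* B false: 0 is removed from the range of X. */
    if (x.empty || (x.lo == 0 && x.hi == 0)) {
        p.x_when_false = none;
    } else {
        p.x_when_false = x;
        if (x.lo == 0) p.x_when_false.lo = 1;
    }
    return p;
}

/* Without the relation: inside "if (!B)", X still has its full range. */
bool division_alarm_flat(urange x) {
    return contains_zero(x);
}

/* With the relation: inside "if (!B)", only the B-false range applies. */
bool division_alarm_partitioned(urange x) {
    return contains_zero(assign_b_eq_zero(x).x_when_false);
}
```

Keeping such a partition for every combination of thousands of booleans would be exponential, which is why the choice of which booleans to relate to which numeric variables matters.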

Floating point computations

Command programs controlling complex physical systems are derived from mathematical models designed with real numbers whereas computer programs perform floating point computations. The two computation models are completely different and this can yield very surprising results, such as:

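/* float-error.c */
#include <stdio.h>
int main () {
  float x, y, z, r;
  x = 1.000000019e+38;
  y = x + 1.0e21;   /* 1.0e21 is absorbed when rounding to float */
  z = x - 1.0e21;
  r = y - z;
  printf("%f\n", r);
}

% gcc float-error.c
% ./a.out
0.000000
%

/* double-error.c */
#include <stdio.h>
int main () {
  double x; float y, z, r;
  /* x = ldexp(1.,50)+ldexp(1.,26); */
  x = 1125899973951488.0;
  y = x + 1;        /* rounded up to the next float */
  z = x - 1;        /* rounded down to the previous float */
  r = y - z;
  printf("%f\n", r);
}

% gcc double-error.c
% ./a.out
134217728.000000
%

In real arithmetic, the results would be 2.0e21 and 2.0, respectively.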

Astrée handles floating point computations precisely and safely. For example, Astrée proves the following program free of run-time error (but for a "Loop never terminates" warning) when running on a machine with floats on 32 bits:

/* float.c */
void main () {
  float x,y,z;
  if ( (((*((unsigned*)&x) & 0x7f800000) >> 23) != 255 ) /* not NaN */
       && (x >= -1.0e38) && (x <= 1.0e38) ) {
    while (1) {
      y = x+1.0e21;
      z = x-1.0e21;
      x = y-z;
    }
  } else
    return;
}

Astrée is sound for floating point computations in that it takes all possible rounding errors into account (and there might be cumulative effects in programs computing for hours) [12, 13].
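A simple way to see what "taking rounding errors into account" can mean is interval arithmetic with outward rounding: each computed bound is moved one float outward, so the interval still contains the exact real result. The sketch below is a generic illustration of that technique, not Astrée's floating-point domain. It assumes IEEE 754 binary32 floats, round-to-nearest, and float expressions evaluated in float precision. All names are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct { float lo, hi; } finterval;

/* Next representable float above f (assumes IEEE 754 binary32 layout). */
float next_up(float f) {
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    if (!(f <= FLT_MAX)) return f;   /* NaN or +infinity: unchanged */
    memcpy(&bits, &f, sizeof bits);
    if (f == 0.0f) bits = 1u;        /* smallest positive subnormal */
    else if (f > 0.0f) bits += 1u;   /* larger magnitude */
    else bits -= 1u;                 /* smaller magnitude, so larger value */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Next representable float below f. */
float next_down(float f) {
    return -next_up(-f);
}

/* Sound addition: a round-to-nearest result is within one float of the
   exact sum, so stepping one float outward encloses the exact sum. */
finterval fi_add(finterval a, finterval b) {
    float lo = a.lo + b.lo, hi = a.hi + b.hi;
    finterval r = { next_down(lo), next_up(hi) };
    return r;
}

/* Sound subtraction, same argument. */
finterval fi_sub(finterval a, finterval b) {
    float lo = a.lo - b.hi, hi = a.hi - b.lo;
    finterval r = { next_down(lo), next_up(hi) };
    return r;
}

/* r = (x + c) - (x - c) with x = 1.0e38f and c = 1.0e21f:
   the float result is 0, yet the interval contains the real value 2c. */
bool encloses_real_result(void) {
    finterval x = { 1.0e38f, 1.0e38f };
    finterval c = { 1.0e21f, 1.0e21f };
    finterval r = fi_sub(fi_add(x, c), fi_sub(x, c));
    double exact = 2.0 * (double)c.lo;
    return (double)r.lo <= exact && exact <= (double)r.hi;
}
```

The resulting interval is wide, because one float step near 1.0e38 is about 1.0e31, but it contains both the computed value 0 and the real value.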

Digital filters

Control/command programs perform lots of digital filtering, as shown by the following example:

/* filter.c */
typedef enum {FALSE = 0, TRUE = 1} BOOLEAN;
BOOLEAN INIT;
float P, X;

void filter () {
  static float E[2], S[2];
  if (INIT) {
    S[0] = X;
    P = X;
    E[0] = X;
  } else {
    P = (((((0.5 * X) - (E[0] * 0.7)) + (E[1] * 0.4)) + (S[0] * 1.5)) - (S[1] * 0.7));
  }
  E[1] = E[0];
  E[0] = X;
  S[1] = S[0];
  S[0] = P;
}

void main () {
  X = 0.2 * X + 5;
  INIT = TRUE;
  while (1) {
    X = 0.9 * X + 35;
    filter ();
    INIT = FALSE;
  }
}

The absence of overflow (and more precisely that P is in [-1327.05, 1327.05] as found by Astrée) is not obvious, in particular because of 32/64 bits floating point computations. The situation is even more inextricable in the presence of boolean control, cascades of filters, etc.

Astrée knows enough about control theory to make precise analyses of filters [12, 13].
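For comparison with the bound quoted above, the filter can simply be run. The sketch below restates filter.c with the filter state in a struct, so that it can be called as a function, and records the largest |P| seen on one trajectory. It assumes X starts at 0, as a zero-initialized global would, and the names `filter_state` and `max_abs_output` are hypothetical. This is testing, not verification: it covers a single execution, whereas the interval found by Astrée holds for all of them.

```c
#include <assert.h>

/* The static arrays E and S of filter.c, gathered in one struct. */
typedef struct { float E[2], S[2]; } filter_state;

/* One call to filter(): same arithmetic as in filter.c, returns P. */
float filter_step(filter_state *st, float X, int init) {
    float P;
    if (init) {
        st->S[0] = X;
        P = X;
        st->E[0] = X;
    } else {
        P = (((((0.5 * X) - (st->E[0] * 0.7)) + (st->E[1] * 0.4))
              + (st->S[0] * 1.5)) - (st->S[1] * 0.7));
    }
    st->E[1] = st->E[0];
    st->E[0] = X;
    st->S[1] = st->S[0];
    st->S[0] = P;
    return P;
}

/* Run the main loop of filter.c for a number of steps and
   return the largest absolute value taken by P. */
float max_abs_output(int steps) {
    filter_state st = { { 0.0f, 0.0f }, { 0.0f, 0.0f } };
    float X = 0.0f;      /* assumption: zero-initialized global */
    float max = 0.0f;
    int init = 1;
    int i;
    assert(steps >= 0);
    X = 0.2 * X + 5;
    for (i = 0; i < steps; i++) {
        float P, a;
        X = 0.9 * X + 35;
        P = filter_step(&st, X, init);
        init = 0;
        a = P < 0.0f ? -P : P;
        if (a > max) max = a;
    }
    return max;
}
```

On this trajectory X converges towards 350 and P follows it, well inside [-1327.05, 1327.05]. A run like this says nothing about other inputs or about accumulated rounding errors, which is what the static analysis has to cover.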

Astrée is sound, automatic, efficient, domain-aware, parametric, modular, extensible and precise

Some static analyzers consider only some of the possible run-time errors while others sort out the most probable ones. The aim is then static testing (that is to find out the most frequent bugs) rather than verification (that is to prove the absence of any run-time error). In contrast Astrée is sound. Astrée will always exhaustively consider all possible run-time errors and never omit to signal a potential run-time error, a minimal requirement for safety critical software.

Some static analyzers (e.g. using theorem provers) require programs to be decorated with inductive invariants. In contrast, Astrée is fully automatic, that is, it never needs to rely on the user's help.

Some static analyzers have high computational costs (typically hours of computation per 10,000 lines of code) while others may never terminate or terminate out of memory. In contrast, Astrée has been shown to be efficient and to scale up to real-size programs as found in industrial practice. Since 2005, Astrée can run on multicore parallel or distributed machines [21].

General-purpose static analyzers aim at analyzing any program written in a given programming language and so can only rely on programming language-related properties to point at potential run-time errors.

Specialized static analyzers put additional restrictions on the programs considered and so can take specific program structures into account. In contrast, Astrée is domain-aware and so knows facts about application domains that are indispensable to make sophisticated proofs. For example, Astrée takes the logic and functional properties of control/command theory into account as implemented in control/command programs [12, 13].

Moreover, Astrée is parametric. This means that the trade-off between the cost and the precision of the analysis can be fully adapted to the needs of Astrée's end-users thanks to parameters and directives tuning the abstraction.

Astrée is modular. It is made of pieces (so-called abstract domains) that can be assembled and parameterized to build application-specific analyzers [27], fully adapted to a domain of application or to end-user needs. Astrée is written in OCaml, and its modularization is made easy thanks to OCaml's modules and functors.

Finally, Astrée is extensible. In case of false alarms, it can be easily extended by introducing new abstract domains enhancing the precision of the analysis.

A consequence of generality may be low precision. Typical rates of false alarms (i.e. spurious warnings on potential errors that can never occur at runtime) are from 10% to 20% of the C basic operations in the program.

Specialized static analyzers achieve better precision (e.g. less than 10% of false alarms).

Even a high selectivity rate of 1 false alarm over 100 operations with potential run-time errors leaves a number of doubtful cases which may be unacceptable for very large safety-critical or mission-critical software (for example, a selectivity rate of 1% yields 1000 false alarms on a program with 100 000 operations). In contrast, Astrée, being modular, parametric and domain-aware, can be made very precise and has been shown to be able to produce no false alarms, that is, fully automated correctness proofs.

Theoretical work was done on locating the origin of alarms [20, 22].

Rapid overviews of Astrée are given in [14] and [18].

Presentations of Astrée

Astrée Flyer

Introductory Bibliographic References on Abstract Interpretation

Abstract Interpretation foundations of Astrée

Bibliographic References on Astrée

Bibliographic References on the Industrial Use of Astrée

News on Astrée in the press

Support of Astrée

The development of the Astrée Analyzer was supported in part by the French exploratory project ASTRÉE of the Réseau National de recherche et d'innovation en Technologies Logicielles (RNTL, now Agence Nationale de la Recherche, ANR) (2002-2006). The final review of the ASTRÉE project was on July 7th, 2006.

Pictures of Astrée

New York City, 6 Jan. 2007

Astrée poster, 9 Oct. 2007

Presentation by AbsInt at ESOP'2011
