Comparing Development Costs of C and Ada

March 30, 1995
Stephen F. Zeigler, Ph.D.
Rational Software Corporation

Programming languages often incite zealotry because of theoretical advantages, whether from market acceptance or from intrinsic features. Practical comparisons of languages are more difficult. Some projects, notably prominently failing ones, cite choice of language and tools as a reason for their failure. Analysis is complex, however: big projects aren't done twice in parallel just to see which language/tool choice is better, and even if they were, there would be questions about the relative talent, teamwork, and fortunes of the efforts. This article discusses one case where most variables were controlled well enough to compare the development costs of C with those of Ada.

Two companies, Verdix and Rational, merged in 1994 to form Rational Software Corporation. In the process of due diligence, records were examined to determine the value of the two companies and also to judge the best technologies and methodologies to carry into the future of the merged company. During that process, the extensive and complete update records of the Verdix Aloha development showed that there might be a quantitative basis for comparing the success of two languages, C and Ada, under the fairest known conditions. A cursory pre-merger evaluation resulted in the following assertion:

"We have records of all changes ever made in the development of Verdix products; those records show us that for the lifecycle Ada seems to be about 2x or better for cost effectiveness than C..."

We have now completed a much more extensive evaluation. Our data applies more to development costs than to whole lifecycle costs. Even so, the above cost estimates are substantiated in comparing C and Ada development.

Verdix existed for eleven years, from March 1983 until its merger in March 1994, primarily as a vendor of Ada-related development tools. It entered the Ada market convinced that Ada would form the best language basis for developing reliable software, and that reliability would become the most important concern of large software in the 1990s. The resulting VADS product line of compilers, builders, runtimes, and debug tools became recognized in the industry as one of the finest available. Verdix, and its merger partner Rational, now sell a wide range of development tool products for Ada, C, C++ and Ada95. The Rational software development environment, called Apex, has been merged with the VADS technology described here. The result, Apex2.0, is obsoleting a substantial portion of the C-based code in VADS.

This article is based on internal VADS data, sanitized to protect individual developers. This data was and is used during development to understand code and its history, and for accounting purposes with Rational's certified auditor, Ernst & Young, for the purpose of software capitalization.

The VADS product base was begun in March of 1983. Records of every change were kept from that time, but were not annotated with explanatory notes and categorizations until early in 1986. This study and the relatively low problem rates of the VADS line are in part due to these records. Prior to this 1993-4 study, these records were not examined for comparing C and Ada, nor was there ever an intent to make such a comparison beyond idle speculation.

VADS was begun in the C language. Ada was not used until 1986 since no good Ada compilers were available. As time went on, Ada was used increasingly with the general rule similar to that of the DoD "Ada Mandate": use Ada if more than 30% of a project will be new code. By mid-1991, the amount of Ada and the amount of C in VADS had reached approximate parity.

VADS is built in a common source baseline, meaning that all VADS products are composed from the same source. VADS version 6.2 for the Sun SPARC is about the same as VADS v6.2 for the DEC Alpha, except for architecture-specific and operating-system-specific optimizations.

The development team for VADS has preserved a relatively free-form style that encourages engineers to cooperate and move among the different functional areas of the line. Thus a person might add to an Ada-based tool when they first join, then learn how to build and test an entire tool set, and then move towards an area of specific interest such as "trace tools"; their work in trace tools might have them adding code to the Ada-based runtime system and linker, to the C-based code generators, and to the Ada-based test frameworks. Of the 62 contributors measured in this study, only one did no Ada updates and only one did no C updates; in each case these employees had been with the team for less than six months. Of the 62 contributors, only six are no longer working on the code as of mid-1994. Hiring has been mostly steady with growth of about 5 people per year.

Of the team members, most have Master's Degrees from good Computer Science schools. Most were considered excellent students. The more experienced contributors tend to work on the C parts of VADS because the C parts were begun first and because of the dictum that developers continue responsibility for any code they write.

The VADS tools, supporting both C and Ada, are used for their own development. The host platform C compiler and linker were used for C builds, but the developers would not normally be aware of this since these foreign tools were hidden within the common build apparatus (vmake), the common source code control system (dsc) and the common debugger (a.db). Contributors therefore see about the same debug/test/edit capability in each C/Ada section, and indeed may often be debugging both C and Ada at the same time. The same design methods were used regardless of language. The test apparatus also applied equally to C and Ada.

In summary, the C and Ada areas are worked by about the same people (with a slight advantage to C) and using the same tools under approximately the same conditions.

(Up to Oct. 15, 1994)

                       C_FILE     ADA_FILE  SCRIPT_FILE   OTHER_FILE       TOTALS
all_lines:            1925523      1883751       117964       604078      4531316
SLOC:                 1508695      1272771       117964       604078      3503508
files:                   6057         9385         2815         3653        21910
updates:                47775        34516        12963        12189       107443
new_features:           26483        23031         5594         6145        61253
Fixes:                  13890         5841         4603         1058        25392
Fixes/feature:            .52          .25          .82          .17          .41
Fixes/KSLOC:             9.21         4.59        39.02         1.75         7.25
devel_cost:       $15,873,508   $8,446,812   $1,814,610   $2,254,982  $28,389,856
cost/SLOC:             $10.52        $6.62       $15.38        $3.72        $8.10
defects:                 1020          122          (A)          (B)         1242
defects/KSLOC:           .676         .096          (A)          (B)         .355

Definitions:

C_FILE: Files of C-based source.

ADA_FILE: Files of Ada-based source.

SCRIPT_FILE: Files of scripts for Make, VMS-DCL, and internal tools such as "vmake" used for virtual make. The Make files are used primarily for C since Ada tools have automated build manager tools, but since release tools and VMS require work for Ada as well, these results are not just lumped in with C.

OTHER_FILE: Files used for documentation, or otherwise indeterminate purpose - all classifications were done automatically.

all_lines: Results from the Unix "wc" command. These include comments and blank lines for both C and Ada.

SLOC: Source Lines Of Code, that is, non-blank, non-comment lines of code. Comments and blank lines were not measured for scripts and "other" files.

files: Unix files. C and Ada are distinguishable by their unique suffixes. Script files are sometimes indistinguishable and are therefore under-reported, with the balance in "OTHER_FILE".

updates: The source code control system tracks every change updated into each baseline (in this case, the main development "dev" baseline). This row lists all updates of any kind into the dev baseline.

new_features: The subset of updates that added new features to the "dev" baseline. Features are similar to "function points" as defined in [].

Fixes: The subset of updates that fixed bugs in the "dev" baseline. More than 90% of these Fixes were found during unit test, before beta product release, and are called "internal fixes." Defects (customer-discovered bugs) are discussed later.

Fixes/feature: This row gives the ratio of the "Fixes" row to the "new_features" row. It means that overall, we could expect to find .52 Fixes in each feature added in C, but only .25 in each feature added in Ada. Features took about 80 lines of additional code, on average.

Fixes/KSLOC: The measure of internal Fixes per 1000 SLOC over all time.

devel_cost: This is the approximate burdened cost for the people spending time on these various projects. The figure is approximated from the base salaries of the people involved. Since the base salary figures do not take into account salary changes during the development period, nor inflation, nor exact assignments, nor time spent on other (e.g. sales and mentoring) activities, nor many other burden costs, we calculate the burdened devel_cost by multiplying base salary information by the company's maximum burden factor of 2.0. It is meant only as a rough guideline giving an approximation of the bias of different salaried people. Dollars are adjusted for inflation to approximately 1992 values.

cost/SLOC: This gives the approximate burdened cost of each SLOC. This value is based on the approximate devel_cost above.

defects: This row is condensed from customer support records. There were 16,440 customer interactions (called "tickets"), which produced about 5072 possible defects (called "CR"s), which eventually produced 2004 deficiency reports (called "DR"s), which eventually resulted in about 1242 actual bugs (called "defects").

defects/KSLOC: The classic measure of visible defects (customer-reported bugs) per 1000 SLOC.
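As a quick check on the definitions above, the ratio rows can be recomputed from the raw counts. The sketch below does this for the C_FILE and ADA_FILE columns; the dictionary layout and the function name `derived` are illustrative choices, not part of the original tooling, and the results should be expected to agree with the published rows only up to rounding.

```python
# Raw counts copied from the summary table (C_FILE and ADA_FILE columns).
table = {
    "C":   {"sloc": 1508695, "new_features": 26483, "fixes": 13890, "defects": 1020},
    "Ada": {"sloc": 1272771, "new_features": 23031, "fixes": 5841,  "defects": 122},
}

def derived(row):
    """Recompute the ratio rows from one column of raw counts."""
    ksloc = row["sloc"] / 1000  # thousands of source lines of code
    return {
        "fixes_per_feature": row["fixes"] / row["new_features"],
        "fixes_per_ksloc": row["fixes"] / ksloc,
        "defects_per_ksloc": row["defects"] / ksloc,
    }

ratios = {lang: derived(row) for lang, row in table.items()}
for lang, r in ratios.items():
    print(lang, {name: round(value, 3) for name, value in r.items()})
```

Keeping the raw counts separate from the derived ratios makes it easy to repeat the same check for the other columns.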

1) This summary does not include obsoleted directories, such as for end-of-life'd products (e.g., the VADS for the Mil-STD 1750a 16 bit military computer) or for replaced components (e.g., the original C-based optimizer and runtime).

2) This summary does not include source from the infrastructure support baselines (e.g., for our internal source code control and documentation) nor from our very large test baselines (e.g., ACVC tests or regression test suites) nor from our user documentation baselines. Also, it includes only the "dev" baseline, rather than released version baselines; these baselines occasionally have separate work done for them, but normally receive a subset of the fixes and a small subset of the features added to the dev baseline.

3) As explained below, fixes for makefiles and others could not be automatically distinguished. (A)+(B) together is 100.

This data is collected automatically by the source code control system in the same way regardless of language. The source code control system does not take effect until developers make an update. That is, only updated code is tracked. It is expected that unit tests and a base automated test suite would pass before updates. In practice, some developers update readily and show more fixes, and some update infrequently with (normally) fewer fixes. The decision to update is influenced by how many other people might need or benefit from the results, how many others might be developing and therefore making updates in the same files, and a variety of other factors. Updates cannot be completely tested because the complexity of the product and its many switches, variants, hosts, targets, add-ons and usages make complete testing impossible.

The process of updating requires several inputs from the developer, including a categorization of the general reason for the update. Developer update records are code-reviewed by peers. Even so, the "fixed bug" categories may be underreported because some developers do several work items at once and update several fixes in amongst other feature addition-type changes; most of these updates seem to be recorded as "new feature", probably because of the minor stigma of recording bugs. On the other hand, at least one developer marks everything as a bug fix unless it is very clearly new development.

On the surface, Ada appears more cost-effective than does C. Ada lines cost about half as much as C lines, produce about 70% fewer internal fixes, and produce almost 90% fewer bugs for the final customer. But there are many variables that might explain these effects. We start with the question of whether a C line and an Ada line are comparable.

A first observation is that Ada has rigid requirements for making entities such as subprograms and variables visible globally. This leads to a separation of Ada code into specifications or "specs" and bodies. Could it be that there really wasn't that much Ada as far as functional lines, and that many are repeated specifications in the specs?

                     C_SPEC   ADA_SPEC     C_BODY   ADA_BODY
lines:               205087     781921    1720436    1101830
SLOC:                158911     453782    1349784     818989
files:                 2252       5443       3805       3942
updates:               6541      15886      41234      18630
new_features:          4844      11253      21639      11778
fixes:                  890       1894      13000       3947

The above data supports the conclusion that Ada has more "specification" file lines than C. Are these "redundant" lines? ADA_SPEC files often provide the body, as in the case of packages with inlined definitions or containing library subprograms, so over half of the above ADA_SPEC lines are actually ADA_BODY lines. In addition, the types, variables, and in fact all non-subprogram definitions are not redundant since their single definition in the ADA_SPEC serves all users.
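The spec/body split can be cross-checked against the summary table: each C_SPEC plus C_BODY count should equal the corresponding C_FILE total, and likewise for Ada. The following sketch performs that check and also reports what fraction of each language's SLOC sits in specification files; the names `totals`, `split`, `mismatches` and `spec_share` are illustrative, not from the original tooling.

```python
# Counts copied from the summary table (C_FILE and ADA_FILE columns).
totals = {
    "C":   {"lines": 1925523, "sloc": 1508695, "files": 6057,
            "updates": 47775, "new_features": 26483, "fixes": 13890},
    "ADA": {"lines": 1883751, "sloc": 1272771, "files": 9385,
            "updates": 34516, "new_features": 23031, "fixes": 5841},
}
# Counts copied from the spec/body table.
split = {
    "C_SPEC":   {"lines": 205087, "sloc": 158911, "files": 2252,
                 "updates": 6541, "new_features": 4844, "fixes": 890},
    "C_BODY":   {"lines": 1720436, "sloc": 1349784, "files": 3805,
                 "updates": 41234, "new_features": 21639, "fixes": 13000},
    "ADA_SPEC": {"lines": 781921, "sloc": 453782, "files": 5443,
                 "updates": 15886, "new_features": 11253, "fixes": 1894},
    "ADA_BODY": {"lines": 1101830, "sloc": 818989, "files": 3942,
                 "updates": 18630, "new_features": 11778, "fixes": 3947},
}

def mismatches(totals, split):
    """List (language, row) pairs where spec + body differs from the total."""
    bad = []
    for lang, rows in totals.items():
        for row, total in rows.items():
            if split[f"{lang}_SPEC"][row] + split[f"{lang}_BODY"][row] != total:
                bad.append((lang, row))
    return bad

def spec_share(lang):
    """Fraction of a language's SLOC that lives in specification files."""
    return split[f"{lang}_SPEC"]["sloc"] / totals[lang]["sloc"]

print(mismatches(totals, split))  # an empty list means every row adds up
print(round(spec_share("C"), 3), round(spec_share("ADA"), 3))
```

On these figures every row adds up, and specification files hold roughly a tenth of the C SLOC against roughly a third of the Ada SLOC.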

C bodies also contained significant redundant code. C allows entities such as variables and subprograms to be imported either by definition in a ".h" C_SPEC file, or by an explicit "extern" definition in the C_BODY. When new entities are added to C bodies, some developers choose to avoid changing their C_SPEC .h file because compilers may recompile many files. Thus C_BODY files collect some redundant lines for shared variables and constants. This is not recommended coding practice and is no longer allowed since smarter recompilation and faster machines rebuild quickly even with .h file changes. However, for the purposes of these statistics, we must consider that some C_BODY files have some inflation.

The relationship between comments, blank lines and SLOC reveals a consistent pattern:

                     C_SPEC   ADA_SPEC     C_BODY   ADA_BODY
comments/KSLOC:         186        483        169        181
blanks/KSLOC:           143        261        137        179

We see that Ada specification files are consistently more commented and have more white space. This effect is not by requirement. It appears to result from the use of Ada specification files for "understanding" the programs; developers seem to add comments to specification files because the subprogram prototypes have to stand alone without code, so readers can't fall back on reading code as they would with C. In contrast, C header files are not normally used to navigate C programs; developers tend to go right to the actual subprograms and read code.
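To make the comment and blank-line measures concrete, here is a minimal sketch of a line classifier for Ada and C source text. It is not the tool used in the study, whose counting rules are not described; in particular, it assumes that a line holding both code and a comment counts only as code, that blank lines inside a C block comment count as comment lines, and that comment markers never appear inside string literals.

```python
def classify_lines(text, language):
    """Return (sloc, comment_only, blank) for "ada" or "c" source text.

    Simplifying assumptions (not the study's rules): comment markers inside
    string literals are not recognized, and a line with both code and a
    comment is counted as code only.
    """
    sloc = comment_only = blank = 0
    in_block = False  # True while inside a C /* ... */ comment
    for raw in text.splitlines():
        line = raw.strip()
        if language == "ada":
            if not line:
                blank += 1
            elif line.startswith("--"):
                comment_only += 1
            else:
                sloc += 1
            continue
        # C: remove block comments, carrying the open-comment state across lines.
        code = ""
        had_comment = in_block
        while line:
            if in_block:
                end = line.find("*/")
                if end < 0:
                    line = ""
                else:
                    line = line[end + 2:]
                    in_block = False
            else:
                start = line.find("/*")
                if start < 0:
                    code += line
                    line = ""
                else:
                    code += line[:start]
                    line = line[start + 2:]
                    in_block = True
                    had_comment = True
        if code.strip():
            sloc += 1
        elif had_comment:
            comment_only += 1
        else:
            blank += 1
    return sloc, comment_only, blank

ada_src = "-- header\n\nprocedure P is\nbegin\n   null; -- trailing\nend P;\n"
c_src = "/* header\n   more */\n\nint x; /* note */\nint f(void) { return x; }\n"

for name, src, lang in (("Ada", ada_src, "ada"), ("C", c_src, "c")):
    sloc, comments, blanks = classify_lines(src, lang)
    print(name, sloc, comments, blanks, round(1000 * comments / sloc))
```

The last printed value is comment lines per 1000 SLOC for each toy sample, the same shape of measure as the comments/KSLOC row.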

Comparing aggregate bug fixes, we see:

                     C_SPEC   ADA_SPEC     C_BODY   ADA_BODY
fixes/feature:          .18        .16        .60        .33
fixes/KSLOC:           5.60       4.17       9.63       4.81

This table indicates that even when comparing only C_BODY and ADA_BODY, fix rates for Ada remain about half those for C. As expected, the avoidance of C specification changes reduced the number of C header file changes, while the presence of real code in ADA_SPECs increases the fix rate of Ada and decreases the apparent fix rate of C.
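The body-to-body comparison can be made explicit by dividing the C_BODY fix rates by the ADA_BODY fix rates. This is a small sketch using the counts from the spec/body table; the helper name `fix_rates` is illustrative.

```python
# Body-file counts copied from the spec/body table.
bodies = {
    "C_BODY":   {"sloc": 1349784, "new_features": 21639, "fixes": 13000},
    "ADA_BODY": {"sloc": 818989,  "new_features": 11778, "fixes": 3947},
}

def fix_rates(row):
    """Fixes per feature and fixes per 1000 SLOC for one column."""
    return {
        "per_feature": row["fixes"] / row["new_features"],
        "per_ksloc": row["fixes"] / (row["sloc"] / 1000),
    }

c_body = fix_rates(bodies["C_BODY"])
ada_body = fix_rates(bodies["ADA_BODY"])

# How many times higher the C body fix rate is than the Ada body fix rate.
ratio_per_ksloc = c_body["per_ksloc"] / ada_body["per_ksloc"]
ratio_per_feature = c_body["per_feature"] / ada_body["per_feature"]
print(round(ratio_per_ksloc, 2), round(ratio_per_feature, 2))
```

Measured per KSLOC the C body rate is almost exactly double the Ada body rate; measured per feature the gap is somewhat smaller, at roughly 1.8 times.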

We can understand the effective SLOC in C and Ada a bit more by studying the cost, in lines, SLOC and files, of implementing features. Once again the reluctance to add to C_SPEC files reduces the SLOC in C header files, while body files come out relatively comparable. It is surprising, however, that Ada generally takes more lines to implement features than does C.

                     C_SPEC   ADA_SPEC     C_BODY   ADA_BODY
lines/feature:           42         69         79         93
SLOC/feature:            32         40         62         69
files/feature:          .46        .48        .17        .33

This raw data indicates that Ada is slightly more verbose either in SLOC or including all lines. Although a strength of Ada is its high-level, powerful features such as tasking and exceptions, VADS was designed before such features as tasks were available or effective. VADS makes less use of these powerful features, and therefore derives less of the full benefits possible with Ada.

                          C        ADA
cost/feature:          $299       $183

In this estimate we amend our cost estimates by measuring feature-to-feature. On a feature basis, for every dollar we spent on Ada features we spent $1.63 on C.

The feature-by-feature costs seem more reliable, but may be complicated by automated code generation and code reuse.

In both C and Ada we have tried to make use of reusable and automatically generated code. Some of this code is kept in the same baseline sections as active code. By auto-generated lines we mean lines counted in our line counts that are produced automatically by other programs. By reused code we mean sources that were obtained from partners or as public property but that are then taken over and used for our purposes, including modifications and repairs. Note that about 70% of all code is used in more than one product, as for example the core compiler pieces are all reused for every compiler variant; we don't need to compensate for direct reuse because the source code control system already accounts for this kind of reuse of identical internal code.

                            C_FILE   ADA_FILE
reused lines:               134844     175856
auto-generated lines:       276442       7802

These reused and autogenerated lines change the statistics for SLOC in different ways. Reused lines are entered into the normal source/change tracking systems, showing up as a single "feature" with a lot of code associated with it; it may then have repairs and enhancements as would any other active code. Auto-generated code does not show up as a feature addition nor does it ever have any fixes or enhancements.

The presence of reused and auto-generated lines underscores the importance of considering features rather than SLOC. It is difficult to adjust for these lines, since they represent work yet do not participate equally with other lines. Since C has by far the greater number of autogenerated lines, its figures will certainly show improved fixes/line ratios as well as higher apparent productivity.
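The direction of this bias can be illustrated with a rough calculation. The sketch below assumes that the auto-generated lines are included in the all_lines counts of the summary table (the text says only that they are "counted in our line counts"), and compares fixes per 1000 lines with and without them. Note that this uses all_lines rather than SLOC, so the values are not the same measure as the Fixes/KSLOC row.

```python
# Counts copied from the tables in this article.
data = {
    "C":   {"all_lines": 1925523, "fixes": 13890, "auto_generated": 276442},
    "Ada": {"all_lines": 1883751, "fixes": 5841,  "auto_generated": 7802},
}

def auto_share(row):
    """Fraction of all lines that were produced by other programs."""
    return row["auto_generated"] / row["all_lines"]

def fixes_per_kline(row, exclude_auto):
    """Fixes per 1000 lines, optionally leaving out auto-generated lines.

    ASSUMPTION: auto-generated lines are part of all_lines and never
    receive fixes, so removing them only shrinks the denominator.
    """
    lines = row["all_lines"] - (row["auto_generated"] if exclude_auto else 0)
    return row["fixes"] / (lines / 1000)

for lang, row in data.items():
    print(lang,
          round(auto_share(row), 4),
          round(fixes_per_kline(row, exclude_auto=False), 2),
          round(fixes_per_kline(row, exclude_auto=True), 2))
```

Under that assumption, about 14% of the C lines are auto-generated against well under 1% of the Ada lines, so excluding them raises the C fix rate noticeably while leaving the Ada rate almost unchanged.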

The cost per line of scripts was the highest of any category, even when measured with automated cost distribution. A more detailed analysis would likely reveal that script costs were higher still, since scripts have few tools to support their organization or debugging, and since their effects are often widespread yet their accounting here reflects only the cost of a repair, not the cost to developers who are impeded by the bugs. Scripts have the highest bug rates of any category.

Our cost/feature figure above counts only the cost of developing the code itself, and does not account for the cost of managing C's makefiles and build apparatus (among other things). If we assume that at least half of the script costs are unique to C, then we can calculate cost per equivalent feature as:

                     C_FILE   ADA_FILE
cost/feature:          $316       $183
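The article does not show the arithmetic behind these per-feature figures. One reading that happens to reproduce them, offered here purely as an assumption, is that cost/feature is taken from devel_cost before the 2.0 burden factor is applied, truncated to whole dollars, with half of the SCRIPT_FILE cost added to the C column for the adjusted figure. The sketch below follows that reading; the constant and function names are illustrative.

```python
# Figures copied from the summary table.
C_DEVEL_COST = 15_873_508
ADA_DEVEL_COST = 8_446_812
SCRIPT_DEVEL_COST = 1_814_610
C_FEATURES = 26_483
ADA_FEATURES = 23_031

BURDEN_FACTOR = 2.0        # from the devel_cost definition
SCRIPT_SHARE_FOR_C = 0.5   # "at least half of the script costs are unique to C"

def cost_per_feature(burdened_cost, features):
    # ASSUMPTION: per-feature cost is stated before the burden factor.
    return burdened_cost / BURDEN_FACTOR / features

c_code_only = cost_per_feature(C_DEVEL_COST, C_FEATURES)
ada_code_only = cost_per_feature(ADA_DEVEL_COST, ADA_FEATURES)
c_with_scripts = cost_per_feature(
    C_DEVEL_COST + SCRIPT_SHARE_FOR_C * SCRIPT_DEVEL_COST, C_FEATURES)

# Whole-dollar values, truncated rather than rounded.
print(int(c_code_only), int(ada_code_only), int(c_with_scripts))
# Dollars spent on C per dollar spent on Ada, before and after the adjustment.
print(round(c_code_only / ada_code_only, 2),
      round(c_with_scripts / ada_code_only, 2))
```

Truncated to whole dollars, the three values come out as 299, 183 and 316, matching the two cost/feature tables. With the script adjustment the C-to-Ada ratio rises from about 1.63 to about 1.73.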

We do not again include costs for C's makefiles in figures that follow. For more accurate cost measures of our historic development, we could take the costs beyond the simple code itself. Complex C programs like ours have become dependent on hand-crafted makefiles; Ada, with compilation order, elaboration order, exceptions, generics and real-time features, was considered too complex to link by hand, and so Ada tools have auto-build capabilities. With ANSI C and C++, "make" complexity is much higher for these languages as well, so makefiles are of reduced importance in the future of C/C++.