The Annotated Annotated C Standard

C.D.W.Feather

This is a review of The Annotated ANSI C Standard, annotated by Herbert Schildt.

This review is made possible by the generosity of Raymond Chen <raymondc@microsoft.com>, who provided the review copy of the book, and is dedicated to the Dream Inn, Santa Cruz, CA, whose staff supplied uncounted cups of coffee while I wrote this review.

This version was modified on 1995-03-03. Thanks to the following for pointing out errors:

Stan Brown <brown@ncoast.org>

Jutta Degener <jutta@pobox.com>

Mark-Jason Dominus <mjd@saul.cis.upenn.edu>

Sue Meloy <suem@hprpcd.rose.hp.com>

Christopher R Volpe <volpe@ausable.crd.ge.com>

Alan Watson <alan@bernie.sal.wisc.edu>

Introduction

Since The Annotated ANSI C Standard first appeared, many people have commented on errors in the book. After reading several of these, I obtained a copy of the book and have read it in its entirety.

Many of these comments might appear to be relatively trivial. In response to this, I can only point out that the book is commenting on a very carefully designed document, and one that has to be read precisely. If the annotator cannot get things right, then the book is not just useless, but is a positive danger to those who do not have the time to read and analyse every word of the standard. In other contexts, such as a tutorial on C, some of the errors in this book could be allowed to pass, but not in this.

When I state that no mention is made of a topic, this indicates that I feel that the topic is at least as important as ones that were commented on; quite often this refers to the features of the standard which are less easy to understand.

Text quoted directly from the book is indicated by ## in the left margin.

General comments

Quite often, the book gives the impression that annotations were omitted because they couldn't be fitted into the format of "standard on the left, comments on the right". Whilst many pages of the standard have no annotations at all, there are no pages with annotation but no standard. I note at least one case below where I believe that a function was not annotated because the comments on the previous section took up too much space.

The front cover of the book shows, amongst much clutter and someone's half-eaten muffin, page 147 of the standard. It is intriguing to note not only that this is the obsolete ANSI standard rather than the ISO standard, but also that it corresponds to half of page 146 in the book.

The major divisions of the standard are referred to as "Part 1", "Part 2", etc. In actual fact, they are "clause 1", "clause 2", and so on. One has to wonder about an author who can't even get that right.

For a year after first writing this review, I believed that at least the left hand pages (the extracts of the Standard) were correct. It turns out that even this isn't the case! [See 6.1.3.1.]

Specific comments

Numbers at the start of each comment are the ISO subclause numbers of Schildt's annotations, which are not always the same as the subclause actually being annotated.

Introduction

3.10, 3.16, 3.17

3.13

## However, this limits the total character set to 255 characters.



3.14

## An object is either a variable or a constant that resides at a

## physical memory address.



5.1.1.3

## The standard requires that a compiler issue error messages when

## an error in the source code is encountered.



5.1.2.2

## You are therefore free to declare main() as required by your

## program.



void main (void)

struct foo { int i; double d; } main (double argc, struct foo argv)

Most of the examples in the book declare main() as void. I won't bother to point them out individually.

5.1.2.2.1

## Though most compilers will automatically return 0 when no other

## return value is specified (even when main() is declared as

## void), you should not rely on this fact because it is not

## guaranteed by the standard.



main()

void

5.1.2.3

5.2.1.2

## Therefore, a multibyte character is a character that requires

## more than one byte.



## First, the null character may not be used except in the first

## byte of a multibyte sequence.



strcpy

There was an opportunity here to explain multibyte characters and how to use them, something that most books omit. Unfortunately, this one omits it as well.

5.2.3

## In other words, one copy of a library function in memory may

## not be used by two or more currently executing programs.



What this section of the standard is talking about is re-entrancy. The functions in the library are not guaranteed to be re-entrant, and so may not be called from within themselves. For example:

qsort() cannot be called from within the compare function passed to qsort();

if a signal can be raised within a library function (perhaps by an external event such as the user pressing a BREAK key), then the signal handler must not call that library function.
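
One way to stay within the signal rule is for the handler to call no library function at all and merely record that the signal arrived. The following is a minimal sketch of that approach, not something taken from the book or the standard; the names break_seen, on_break, install_break_handler and break_was_seen are hypothetical, and SIGINT is assumed to be the signal of interest.

```c
#include <signal.h>

/* Hypothetical flag: volatile sig_atomic_t is the kind of object a
   handler can safely store to. */
static volatile sig_atomic_t break_seen = 0;

/* The handler calls no library function, so it cannot re-enter one
   that was interrupted; it only records that the signal arrived. */
static void on_break(int sig)
{
    (void) sig;
    break_seen = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_break_handler(void)
{
    return signal(SIGINT, on_break) == SIG_ERR ? -1 : 0;
}

/* Ordinary code polls the flag and does the real work outside the handler. */
int break_was_seen(void)
{
    return break_seen != 0;
}
```

The main program would call break_was_seen() at convenient points and do any I/O or memory allocation there, rather than inside the handler.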

malloc

malloc

5.2.4.1

## A compound statement is a block of code.



5.2.4.2

## First, notice that a character is defined as 8 bits (1 byte).

## All other types may vary in size, but in C a character is always

## 1 byte long.



The assumption that 1 byte = 8 bits occurs at several other points in the book. I won't always bother to point it out.

6.1

6.1.1

## No other keywords are allowed in a conforming program.



Of course, no other keywords are allowed in a strictly conforming program.

6.1.2

6.1.2.1

## * File scope begins with the beginning of the file and ends with

## the end of the file

## * Block scope begins with the opening { of a block and ends with

## its associated closing }.



/* Line 1 */ {
/* Line 2 */     int i = 10;
/* Line 3 */     {
/* Line 4 */         int j = i;
/* Line 5 */         int i = 5;
/* Line 6 */         printf ("i = %d, j = %d\n", i, j);
/* Line 7 */     }
/* Line 8 */ }

In particular, the "i" on line 4 refers to the one in the outer block, and so j has the value 10, not 5.

6.1.2.2

## Identifiers with external linkage are accessible by your entire

## program



6.1.2.3

6.1.2.5

## An unsigned integer expression cannot overflow. This is because

## there is no way to represent such an overflow as an unsigned

## quantity.



UINT_MAX+1

unsigned int

ULONG_MAX+1

unsigned long

6.1.3.1

## fractional-constant:

## digit-sequence[opt] . digit-sequence

## digit-sequence

6.1.3.4

## x = 'A'; /* give x the value 65 */



This, plus the comments assuming 8-bit bytes, and use of the terms "high byte" and "low byte" of integers later on, makes me wonder whether a better title for the book is: The ANSI C Standard annotated for some MSDOS compilers :-).

6.1.4

## In other words, the executable version of a C program contains

## a table that contains the string literals used by the program.



## Further, the effect of changing the string literal table is

## implementation dependent. The best practice is to avoid

## altering the string table.



6.2.1.2

## In the most general terms, when you convert from a larger

## integer type to a smaller type, high-order bytes are lost.



A simpler way to state what this section means is:

If the source value can be represented in the destination type, it is unaltered.

Otherwise, if the destination type is unsigned, reduce the value modulo U<type>_MAX+1.

Otherwise the destination type is signed and the value is implementation defined.
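
The three rules can be illustrated with a short sketch. The function names to_uchar and to_schar_checked are hypothetical, and unsigned char and signed char are chosen only because their limits are easy to name; the same reasoning applies to the other integer types.

```c
#include <limits.h>

/* Convert to an unsigned type: defined for every input.  A value that
   fits is unaltered; any other value is reduced modulo UCHAR_MAX + 1. */
unsigned char to_uchar(long value)
{
    return (unsigned char) value;
}

/* Convert to a signed type: only portable when the value is known to
   fit, because an out-of-range result is implementation-defined.
   Returns 1 and stores the result on success, 0 if it would not fit. */
int to_schar_checked(long value, signed char *out)
{
    if (value < SCHAR_MIN || value > SCHAR_MAX)
        return 0;
    *out = (signed char) value;
    return 1;
}
```

Note that nothing here mentions bytes: the rules are stated entirely in terms of values, which is why they hold whatever the width of the types involved.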

6.2.1.4

## When converting a larger [floating] type into a smaller one, if

## the value cannot be represented, information content may be lost.



6.2.1.5

## these automatic conversions are also intuitive.



6.2.2.1

## First, an array name without an index is a pointer to the first

## element of the array and is not an lvalue.



6.2.2.3

6.3

## The standard states that when an expression is evaluated, each

## object's value is modified only once. In theory, this

## means the compiler will not physically change the value of a

## variable in memory until the entire expression has been

## evaluated. In practice, however, you may not want to rely

## on this.



i = ++i + 1

i += 2

As anyone who has survived the " i = i++ " thread on comp.lang.c knows, this is not only nonsense, but dangerous nonsense. The correct way to discuss this part of the standard is to point out what can and can't be done in a strictly conforming program, and leave it at that. Suggesting that such code can ever have a defined answer is asking for trouble.

## The rest of this section formally defined what type of lvalue

## can refer to an object.



char *cp;
int *ip;

void f (double *d)
{
    *d = 3.14159;
    *cp = 1;
    *ip = 2;
}

*cp

*d

*ip

*d

*ip

6.3.2.2

## When no prototype for a function exists, it is not an error if

## the types and/or number of parameters and arguments differ.

## The reason for this seemingly strange rule is to provide

## compatibility with older C programs in which prototypes do not

## exist.



6.3.2.3

6.3.6

6.3.7

## When right-shifting a negative value, generally, ones are

## shifted in (thus preserving the sign bit), but this is

## implementation dependent.



6.3.13

6.3.16.2

+=

a += b

a = a + b

*a++ *= 2

*a++ = *a++ * 2

6.3.17

6.5

## In simple language, a declarator is the name of the object being

## declared.



static int *p[5];

*p[5]

6.5.1

## A variable declared using extern is not a definition.



extern int count = 10;

## In essence, a static local variable is a global variable with

## its scope restricted to a single function.



## When static is applied to a global variable or function, it

## causes that variable or function to have file scope



static

static

## The register specifier is only a request to the compiler, which

## may be completely ignored.



register

6.5.2.1

## This padding must occur at the end, not at the beginning, of the

## object.



6.5.3

## (Many compilers display a warning about this fragment, but still

## accept it.)

## const int i = 10;

## int *p;

## p = &i;

## *p = 0; /* modify a const object through p */



6.5.4

## The information and constraints in this section are mostly

## applicable to compiler implementors.



6.5.4.3

6.5.5

unsigned char *v[5];

unsigned char *[5];

6.5.7

## The general form of an initialization is

## type var = initializer;



int a [5] = { 1, 2, 3, 4, 5 };

6.6.4.2

default

break

I would also have appreciated a warning that ordinary labels are still allowed within the body of a switch statement, so:

switch (i)
{
    /* ... */
    defualt:
        j = 0;
        break;
}
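
To make the consequence concrete, here is a small self-contained sketch (the function classify is hypothetical). Because the misspelt word is an ordinary label rather than the default keyword, the compiler need not reject it, and the statement after it is never selected by the switch.

```c
/* Returns the value j ends up with for a given i. */
int classify(int i)
{
    int j = -1;

    switch (i) {
    case 1:
        j = 1;
        break;
    defualt:        /* misspelt on purpose: an ordinary label, not "default" */
        j = 0;
        break;
    }
    return j;
}
```

With no case matching and no real default, control skips the whole body of the switch, so classify(5) leaves j at its initial value. A compiler may warn about the unused label, but it is not required to.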

6.7.1

## To understand the difference between the modern and old forms,

## here is the same function defined using both forms:

##

## /* Modern function definition. */

## float f (int a, char c)

## {

## /* ... */

## }

## /* Old-form function definition. */

## float f (a, c)

## int a;

## char c;

## {

## /* ... */

## }



Unfortunately, these two aren't exactly the same. With the modern function definition, the argument corresponding to c is converted to type char and passed to the function. With the old-form definition, it is converted to int, passed to the function as an int, and then converted to char.

Why does this matter, you may ask? Well, it matters when we're trying to write a prototype for the function. The prototype for the new form definition is:

float f (int a, char c);

whereas the prototype compatible with the old-form definition is:

float f (int a, int c);

6.8.2

## The #include statement has these two forms:



6.8.3

printf ("%d ", ABS (((-20) < 0 ? -(-20) : (-20)));

printf ("%d ", ((-20) < 0 ? -(-20) : (-20)));

6.8.6

#pragma

#ifdef

7.1.2

## All conforming C compilers will supply all of the functions

## described here.



7.1.3

## Frankly, many C programmers are not aware of the rules described

## in this section.



7.1.4

## If errno is zero, then no error has been detected.



7.1.6

offsetof()

offsetof()

7.3

## x is alphanumeric



7.4.1.1

## The setlocale() function sets all or a specified portion of

## those items described in the lconv structure



setlocale()

strcoll()

lconv

7.6

setjmp()

## result = setjmp (jumpbuf);



setjmp

while (setjmp (jumpbuf))

while (setjmp (jumpbuf) < 42)

while (!setjmp (jumpbuf))

setjmp (jumpbuf);

while

if

switch

for

The standard also puts limitations on what can be done with local variables in functions that call setjmp(). I am surprised to find no mention of these limitations at all.
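
The limitation on local variables is that an automatic object modified between the setjmp() call and the longjmp() has an indeterminate value afterwards unless it is declared volatile. The following is a minimal sketch of code that respects both this rule and the restriction on where setjmp() may appear; the names give_up and attempts_needed are hypothetical.

```c
#include <setjmp.h>

static jmp_buf jumpbuf;

/* Abandon the current attempt and return to the setjmp() call. */
static void give_up(void)
{
    longjmp(jumpbuf, 1);
}

/* Simulate an operation that fails the given number of times before
   succeeding; returns the number of attempts that were needed. */
int attempts_needed(int failures)
{
    /* volatile, because it is modified between setjmp() and longjmp()
       and its value is needed afterwards. */
    volatile int attempts = 0;

    /* setjmp() compared with an integer constant, as the whole
       controlling expression of an if: one of the permitted contexts. */
    if (setjmp(jumpbuf) != 0) {
        /* Reached again through longjmp(); attempts keeps its value. */
    }
    attempts++;
    if (attempts <= failures)
        give_up();
    return attempts;
}
```

Without the volatile qualifier, a compiler would be entitled to keep attempts in a register that longjmp() restores to its old contents, and the count would be unreliable.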

7.7

sig_atomic_t

7.9

## The type fpos_t is some type of an unsigned integer.



fpos_t

unsigned long

fpos_t

7.9.2

## Thus, it is permissible for a text stream to treat all

## characters as part of one long, uninterrupted line, if it

## so chooses.



The standard states that an implementation may treat spaces at the end of lines in text files specially, and may add and remove zero bytes at the end of binary files. Neither of these rules is mentioned.

7.9.5.2

fflush(NULL)

fflush

7.9.6.1

## Note that if stream is a pointer to stdout,



stdout

stream

Here, and in many other places, printf() is called with a format of "%lf" and a corresponding argument which is a double. Unfortunately, the standard states that "%f" is the correct format for a double, and "%lf" is undefined. This is a particularly bad sin because the description of the "l" flag is missing (left page 132 of the book is a repeat of page 131).
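
The confusion probably arises because the scanf() family does need the "l": there "%f" expects a pointer to float and "%lf" a pointer to double. A minimal sketch of the two cases follows; the function names format_double and parse_double are hypothetical, and the caller is assumed to supply a buffer large enough for the formatted value.

```c
#include <stdio.h>

/* Format a double with "%f".  A float argument would be promoted to
   double anyway, so printf-family functions need no length modifier.
   Returns the number of characters written. */
int format_double(char *buf, double value)
{
    return sprintf(buf, "%f", value);
}

/* Parse a double with "%lf".  Here the l is required, because the
   argument is a pointer and no promotion takes place.
   Returns 1 on success, 0 otherwise. */
int parse_double(const char *text, double *out)
{
    return sscanf(text, "%lf", out) == 1;
}
```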

While I cannot of course just copy the missing text, I have summarised what has been lost separately.

7.9.6.2

fflush(stdin)

## /* clear crlf from input buffer */

7.9.7

unsigned char

The first example calls fgetc() and assigns the result to a char variable. This means that an error or end-of-file will cause the program to loop forever.
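
The usual remedy is to hold the result of fgetc() in an int, which can represent every unsigned char value as well as EOF, and to test against EOF before doing anything else with it. A minimal sketch, with the hypothetical function name count_chars:

```c
#include <stdio.h>

/* Count the characters remaining on fp.  Returns -1 on a read error. */
long count_chars(FILE *fp)
{
    int ch;          /* int, not char: must hold every character and EOF */
    long count = 0;

    while ((ch = fgetc(fp)) != EOF)
        count++;
    return ferror(fp) ? -1L : count;
}
```

Because the comparison is made on the int, a byte whose value happens to equal (unsigned char) EOF is counted like any other and does not end the loop early.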

7.9.10.2

## The following fragment illustrates how files are commonly read:

## do {

## ch = fgetc (fp);

## /* ... */

## } while (!feof (fp));



feof()

EOF

"feof

(fp)"

ch

EOF

EOF

feof()

7.9.10.3

## Also, for files opened for binary operations, EOF is a valid

## binary value and does not necessarily indicate an error or

## end-of-file condition.



char

int

fgetc()

fgetc()

EOF

It is true that EOF, cast to the type unsigned char, is identical to a value that can be read from a binary file (or even a text file). However, this is just the effect of bad programming; anyone with experience in C file handling should be aware of this.

7.10.1

## Also, remember that if the string does not contain a valid

## numeric value as defined by the function, then 0 is returned.

## Although strtod(), strtol(), and strtoul() set errno when an

## out-of-range condition exists, there is no requirement that

## errno be set when the string does not contain a number. Thus,

## if this is important to your program, you must manually

## check for the presence of a number before calling one of the

## conversion functions.



*endptr

nptr
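
A minimal sketch, assuming the point at issue is the end-pointer argument of the conversion functions: when no conversion can be performed, the standard requires the value of nptr to be stored in the object that endptr points to, so a missing number can be detected after the call rather than by checking beforehand. The function name parse_long is hypothetical.

```c
#include <stdlib.h>

/* Returns 1 and stores the value if text begins with a number,
   or 0 if strtol() could not perform any conversion. */
int parse_long(const char *text, long *out)
{
    char *end;
    long value = strtol(text, &end, 10);

    if (end == text)      /* nothing was converted */
        return 0;
    *out = value;
    return 1;
}
```

This distinguishes the string "0", which converts successfully to zero, from a string containing no number at all, even though strtol() returns 0 in both cases.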

7.10.2

time_t

time()

long

double

time()

unsigned int

time_t

#include <string.h>
#include <time.h>

unsigned int random_from_time (time_t t)
{
    unsigned int i, j, k;
    char *p;

    i = 0;
    p = (char *) &t;
    /* Divide t up into pieces each the size of an unsigned int */
    for (k = 0; k + sizeof j <= sizeof t; k += sizeof j)
    {
        /* Copy the bits of the piece into j and add the value to i */
        memcpy ((char *) &j, p + k, sizeof j);
        i += j;
    }
    /* Do the same with any remnant (e.g. if j is 4 bytes and t is 11) */
    if (k < sizeof t)
    {
        j = 0;
        memcpy ((char *) &j, p + k, sizeof t - k);
        i += j;
    }
    return i;
}

7.10.7 and 7.10.8

## Since multibyte characters are implementation-specific, you

## should refer to your compiler's user manual for details.



7.11.4

strxfrm()

strxfrm

The example compares two arrays of floats using memcmp. While such a comparison is strictly conforming, it is not useful - the result of the comparison depends on the details of the encoding of floats, and is in no way related to which number is greater or smaller. (For example, it is possible to have an encoding in which 0 < 2, but 2 > 3, as far as this comparison works. In the same way, comparing integers with memcmp is equally useless on a little-endian system.)
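
If an ordering of two float arrays is actually wanted, the elements have to be compared as numbers. The following is a minimal sketch of one way to do that; the function name compare_float_arrays is hypothetical, and the arrays are assumed to contain no NaN values.

```c
#include <stddef.h>

/* Compare two arrays of n floats element by element, by numeric value.
   Returns a negative, zero or positive result in the manner of memcmp(). */
int compare_float_arrays(const float *a, const float *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i] < b[i])
            return -1;
        if (a[i] > b[i])
            return 1;
    }
    return 0;
}
```

The result depends only on the values stored, not on how the implementation happens to encode them.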

7.12.2.3

mktime

mktime