Article 7844 of comp.lang.c:

From: dmr@alice.UUCP

Newsgroups: comp.lang.c

Subject: noalias comments to X3J11

Message-ID: <7753@alice.UUCP>

Date: 20 Mar 88 08:37:58 GMT

Organization: AT&T Bell Laboratories, Murray Hill NJ

Lines: 333



Reproduced below is the long essay I sent as an official comment to X3J11. It is in two parts; the first points out some problems in the current definition of `const,' and the second is a diatribe about `noalias'.

By way of introduction, the important thing about `const' is that the current wording says, in section 3.3.4, that a pointer to a const-qualified object may be cast to a pointer to the plain object, but "If an attempt is made to modify the pointed-to object by means of the converted pointer, the behavior is undefined." Because function prototypes tend to convert your pointers to const-qualified pointers, difficulties arise.

In discussion with various X3J11 members, I learned that this section is now regarded as an inadvertent error, and no one thinks that it will last in its current form. Nevertheless, it seemed wisest to keep my comments in their original strong form. The intentions of the committee are irrelevant; only their document matters.

The second part of the essay is about noalias as such. It seems likely that even the intentions of the committee on this subject are confused.

Here's the jeremiad.

Dennis Ritchie

research!dmr

dmr@research.att.com



This is an essay on why I do not like X3J11 type qualifiers. It is my own opinion; I am not speaking for AT&T.

Let me begin by saying that I'm not convinced that even the pre-December qualifiers (`const' and `volatile') carry their weight; I suspect that what they add to the cost of learning and using the language is not repaid in greater expressiveness.

`Volatile', in particular, is a frill for esoteric applications, and much better expressed by other means. Its chief virtue is that nearly everyone can forget about it. `Const' is simultaneously more useful and more obtrusive; you can't avoid learning about it, because of its presence in the library interface. Nevertheless, I don't argue for the extirpation of qualifiers, if only because it is too late.

The fundamental problem is that it is not possible to write real programs using the X3J11 definition of C. The committee has created an unreal language that no one can or will actually use. While the problems of `const' may owe to careless drafting of the specification, `noalias' is an altogether mistaken notion, and must not survive.

1. The qualifiers create an inconsistent language

A substantial fraction of the library cannot be expressed in the proposed language.

One of the simplest routines,

char *strchr(const noalias char *s, int c);

Unfortunately, there is no way in X3J11's language for strchr to return the value it promises to, because of the semantics of return (3.6.6.4) and casts (3.3.4). Whether the stripping of the const and noalias qualifiers is done by cast inside strchr, or implicitly by its return statement, strchr returns a pointer that (because of `const') cannot be stored through, and (because of `noalias') cannot even be dereferenced; by the rules, it is useless. (Incidentally, I think this observation was made by Tom Plum several years ago; it's disconcerting that the inconsistency remains.)

Although the plain words of the Standard deny it, plastering the appropriate `non-const' cast on an expression to silence a compiler is sometimes safe, because the most probable implementation of `const' objects will allow them to be read through any access path, and will diagnose attempts to change them by generating an access violation fault at run time. That is, in common implementations, adding or taking away the `const' qualifier of a pointer can never create any bugs not implicit in the rule `do not modify a genuine const object through any access path.'

Nevertheless, I must emphasize that this is not the rule that X3J11 has written, and that its library is inconsistent with its language. Someone writing an interpreter using X3J11/88-001 is perfectly at liberty to (indeed, is advised to) carry with each pointer a `modifiable' bit, that (following 3.3.4) remains off when a pointer to a const-qualified object is cast into a plain pointer. This implementation will prevent many of the real uses of strchr, for example. I'm thinking of things like

if (p = strchr(q, '/')) *p = ' ';
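The idiom above can be spelled out as a small, compilable sketch. It assumes the declaration of strchr as it was eventually standardized (a `const char *' parameter, no `noalias', and a plain `char *' result); the helper name blank_first_slash and the self-check function are hypothetical and appear nowhere in the essay.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the essay): blank out the first '/'
   in a writable string. Returns 1 if a '/' was found, 0 otherwise. */
static int blank_first_slash(char *q)
{
    char *p = strchr(q, '/');   /* parameter is const char *, result is plain char * */

    if (p != NULL) {
        *p = ' ';               /* store through the pointer strchr returned */
        return 1;
    }
    return 0;
}

/* Small self-check using a writable array, not a string literal. */
static void check_blank_first_slash(void)
{
    char path[] = "usr/bin";

    assert(blank_first_slash(path) == 1);
    assert(strcmp(path, "usr bin") == 0);
}
```

The store is only sensible because the caller owns a writable buffer; the `modifiable'-bit interpreter described above would reject it all the same, which is the complaint.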

A related observation is that string literals are not of type `array of const char.' Indeed, the Rationale (88-004 version) says, `However, string literals do not have [this type], in order to avoid the problems of pointer type checking, particularly with library functions....' Should this bald statement be considered anything other than an admission that X3J11's rules are screwy? It is ludicrous that the committee introduces the `const' qualifier, and also makes strings unwritable, yet is unable to connect the two conceptions.

2. Noalias is an abomination

`Noalias' is much more dangerous; the committee is planting timebombs that are sure to explode in people's faces. Assigning an ordinary pointer to a pointer to a `noalias' object is a license for the compiler to undertake aggressive optimizations that are completely legal by the committee's rules, but make hash of apparently safe programs. Again, the problem is most visible in the library; parameters declared `noalias type *' are especially problematical.

In order to write such a library routine using the new parameter declarations, it is in practice necessary to violate 3.3.4: `A pointer to a noalias-qualified type ... may be converted to ... the non-noalias-qualified type. If the pointed to object is referred to by means of the converted pointer, the behavior is undefined.' Thus, the problem that occurs with `const' is now much worse; there are no interesting and legal uses of strchr.

How do you code a routine whose prototype specifies a noalias pointer? If you fail to violate 3.3.4, but instead try to rewrite the declarations of temporary variables to make them agree in type with parameters, it becomes hard to be sure that the routine works. Consider the specification of strtok:

char *strtok(noalias char *s1, noalias const char *s2);

qsort

The `noalias' rules have the assignment and cast restrictions backwards. Assigning a plain pointer to a const-qualified pointer (pc = p) is well-defined by the rules and is safe, in that it restricts what you can do with pc. The other way around (p = pc) is forbidden, presumably because it creates a writable access path to an unwritable object. With `noalias,' the rules are the same (pna = p is OK, p = pna is forbidden), but the realistic safety requirements are completely different. Both of these assignments are equally suspicious, in that both create two access paths to an object, one manifestation of which might be virtual.
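The `const' half of this comparison can be illustrated with a short sketch. The variable names p and pc follow the text; the function two_paths and its buffer are hypothetical. It shows only the permitted direction (pc = p) and makes the point that the assignment leaves two access paths to one object: pc cannot be stored through, yet it observes stores made through p.

```c
#include <assert.h>

/* Hypothetical illustration: after pc = p there are two access paths
   to the same array. Returns 1 if a store through p is visible via pc. */
static int two_paths(void)
{
    char buf[] = "abc";
    char *p = buf;
    const char *pc = p;     /* pc = p: no cast needed; only restricts pc */

    p[0] = 'x';             /* legal: the object itself is not const */
    /* pc[0] = 'y'; would be rejected: pc points to const char */

    return pc[0] == 'x';    /* pc sees the store made through p */
}

/* Self-check for the sketch above. */
static void check_two_paths(void)
{
    assert(two_paths() == 1);
}
```

With `const' the second path is harmless, since it can only read. The same pair of paths under `noalias' is exactly what the qualifier claims cannot exist.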

Here is another way of observing the asymmetry: the presence of `const type *' in a parameter list is a useful piece of interface information, but `noalias type *' most assuredly is not. Given the declaration

memcpy(noalias void *s1, const noalias void *s2, size_t n);

More generally, suppose I see a prototype

char *magicfunction(noalias char *, noalias char *);

Within the function itself, things are equally bad. A `const type *' parameter, though it presents problems for strchr and other routines, does usefully constrain the function: it's not allowed to store through the pointer. However, within a function with a `noalias type *' parameter, nothing is gained except bizarre restrictions: it can't cast the parameter to a plain pointer, and it can't assign the parameter to another noalias pointer without creating unwanted handles and potential virtual objects. The interface must say noalias, or at any rate does say noalias, so the author of the routine has all the grotesque inventions of 3.5.3 (handles, virtual objects) rubbed in his face, like it or not.

The utter wrongness of `noalias' is that the information it seeks to convey is not a property of an object at all. `Const,' for all its technical faults, is at least a genuine property of objects; `noalias' is not, and the committee's confused attempt to improve optimization by pinning a new qualifier on objects spoils the language. `Noalias' is a bogus invention that is not necessary, and not in any case sufficient for its apparent purpose.

Earlier languages flirted with gizmos intended to help optimization, and generally abandoned them. The original Fortran, for example, had a FREQUENCY statement that didn't help much, confused people, and was dropped. PL/1 had `normal/abnormal' and `uses/sets' attributes that suffered a similar fate. Today, these are generally looked on as adolescent experiments.

On the other hand, the insufficiency of `noalias' is suggested by Cray's Fortran compiler, which has 20 separate keywords that control various details of optimization. They are specified by an equivalent of #pragma, and thus, despite their oddness, can be ignored when trying to understand the meaning of a program.

Perhaps there is some reason to provide a mechanism for asserting, in a particular patch of code, that the compiler is free to make optimistic assumptions about the kinds of aliasing that can occur. I don't know any acceptable way of changing the language specification to express the possibility of this kind of optimization, and I don't know how much performance improvement is likely to result. I would encourage compiler-writers to experiment with extensions, by #pragma or otherwise, to see what ideas and improvements they can come up with, but I am certain that nothing resembling the noalias proposal should be in the Standard.
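For readers unfamiliar with why aliasing matters to an optimizer, here is a minimal sketch of the kind of code at issue. The function add_to_all is hypothetical and is not taken from the essay or the draft. Because src may point into dst, a compiler that obeys the language rules must reload *src on every iteration; an optimistic no-aliasing assumption would let it load the value once, which changes the result in the aliased call.

```c
#include <assert.h>

/* Hypothetical example: add *src to each of the n elements of dst.
   If src points into dst, *src can change while the loop runs. */
static void add_to_all(int *dst, const int *src, int n)
{
    int i;

    for (i = 0; i < n; i++)
        dst[i] += *src;     /* *src is re-read on every iteration */
}

/* Self-check: the aliased and unaliased calls give different results. */
static void check_add_to_all(void)
{
    int a[3] = {1, 2, 3};
    int b[3] = {1, 2, 3};
    int one = 1;

    add_to_all(a, &a[0], 3);    /* aliased: a[0] doubles first, so a becomes {2, 4, 5} */
    add_to_all(b, &one, 3);     /* not aliased: b becomes {2, 3, 4} */

    assert(a[0] == 2 && a[1] == 4 && a[2] == 5);
    assert(b[0] == 2 && b[1] == 3 && b[2] == 4);
}
```

Both calls are well-defined C. The question raised above is whether, and how, a programmer should be able to promise that only the second kind of call ever occurs.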

3. The cost of inconsistency

K&R C has one important internal contradiction (variadic functions are forbidden, yet printf exists) and one important divergence between rule and reality (common vs. ref/def external data definitions). These contradictions have been an embarrassment to me throughout the years, and resolving them was high on X3J11's agenda. X3J11 did manage to come up with an adequate, if awkward, solution to the first problem. Their solution to the second was the same as mine (make a rule, then issue a blanket license to violate it).

I'm aware that there are distinctions to be made between `conforming' and `strictly conforming' programs. Although the X3J11 rules for qualifiers are inconsistent, and therefore most nominally X3J11 compilers will ignore, or only warn about, casts and assignments that X3J11 says are undefined, people will somehow survive. C has, after all, survived the vararg and the extern problems.

Nevertheless, I advise strongly against sanctifying a language specification that no one can possibly embody in a useful compiler. This advice is based on bitter experience.

4. What to do?

Noalias must go. This is non-negotiable.

It must not be reworded, reformulated or reinvented. The draft's description is badly flawed, but that is not the problem. The concept is wrong from start to finish. It negates every brave promise X3J11 ever made about codifying existing practices, preserving the existing body of code, and keeping (dare I say it?) `the spirit of C.'

Const has two virtues: putting things in read-only memory, and expressing interface restrictions. For example, saying

char *strchr(const char *s, int c);

1) Reword page 47, lines 3-5 of 3.3.4 (Cast operators), to remove the undefinedness of modifying pointed-to objects, or remove these lines altogether (since casting non-qualified to qualified isn't discussed explicitly either).

2) Rewrite the constraint on page 54, lines 14-15, to say that pointers may be assigned without taking qualifiers into account.

3) Preserve all current constraints against modifying non-modifiable lvalues, that is, things of manifestly const-qualified type.

4) String literals have type `const char []'.

5) Add a constraint (or discussion or example) to assignment that makes clear the illegality of assigning to an object whose actual type is const-qualified, no matter what access path is used. There is a manifest constraint that is easy to check (left side is not const-qualified), but also a non-checkable constraint (left side is not secretly const-qualified). The effect should be that converting between pointers to const-qualified and plain objects is legal and well-defined; avoiding assignment through pointers that derive ultimately from `const' objects is the programmer's responsibility.
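A sketch of what these rules would permit, written from the implementer's side: a strchr-like routine whose parameter is `const char *' and whose result is a plain `char *' obtained by a cast. The name find_char is hypothetical, and the code assumes the semantics proposed here (which is also how later ISO C treats the conversion), not the 88-001 draft wording criticized above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strchr-like routine. The parameter promises that
   find_char itself does not store through s; the cast on return hands
   the caller a plain pointer into the caller's own object. */
static char *find_char(const char *s, int c)
{
    for (;; s++) {
        if (*s == (char)c)
            return (char *)s;   /* conversion is legal; no store happens here */
        if (*s == '\0')
            return NULL;        /* reached the terminator without a match */
    }
}

/* Self-check: the caller owns a writable buffer, so storing through
   the result is the caller's right and the caller's responsibility. */
static void check_find_char(void)
{
    char buf[] = "a/b";
    char *p = find_char(buf, '/');

    assert(p == buf + 1);
    *p = ' ';
    assert(buf[1] == ' ');
    assert(find_char(buf, 'z') == NULL);
}
```

Had the caller passed an object actually defined as const, the store through p would fall under the non-checkable constraint: nothing in the types reveals it, and avoiding it is left to the programmer.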

These rules give up a certain amount of checking, but they save the consistency of the language.