Iverson’s Convention is, roughly speaking, the idea that a boolean result can be encoded using the first two natural numbers: 0 for false, 1 for true. More precisely, for a relation R, aRb is 1 when a stands in the relation R to b, and 0 otherwise. It’s 1 when (a, b) ∈ R, if you like to think of R as a subset of the cartesian product.

Knuth promotes Iverson’s convention in this paper. Really Knuth is promoting a specific notation [aRb] meaning a function that is 1 when aRb is true and 0 otherwise. In the field of computer science I’m not really so concerned about the notation, more about the name. The name Iverson’s Convention doesn’t seem to be particularly well known, but I think it gives a useful name to distinguish different computer languages.
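To make the bracket notation concrete, here is a small sketch in LaTeX. The first line restates the definition given above; the second shows a typical use, where the bracket replaces the limits of a summation. The function f and the bound n are placeholders chosen for illustration, not taken from the article.

```latex
\[
  [a \mathrel{R} b] =
  \begin{cases}
    1 & \text{if } a \mathrel{R} b \text{ holds,} \\
    0 & \text{otherwise.}
  \end{cases}
\]
\[
  \sum_{k} f(k)\,[1 \le k \le n] \;=\; \sum_{k=1}^{n} f(k)
\]
```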

Computer programming languages roughly divide into those that use Iverson’s Convention (IC languages) and those that do not. C famously does, whereas Java does not. In C the result of «x == y» is either 0 or 1 and is of type int; in Java the result of «x == y» is either true or false and is of type boolean.

Surveying a random smattering of languages (by consulting my bookcase) with regard to what a typical relational expression, like «x < y», can evaluate to, I find 3 rough categories: IC languages; those that have some sort of more or less well defined boolean type; and languages that do something else.

Iverson’s Convention languages: C, awk, J, Python, Sinclair BASIC, Inform 6, perl

Boolean type languages: Java, JavaScript, PostScript, Logo, Dylan

Something else: Lisp, FORTH, Icon, BCPL, /bin/sh.

[In the first version of this article, I placed Python in the non-IC camp. That was wrong and I have learned.]

[On 2007-08-02 I added JavaScript to the discussion.]

BCPL surprised me by not being an IC language. BCPL has only one type: every value is an integer of some fixed bit-length. Iverson’s Convention would be a natural match for the language (and was adopted for one of BCPL’s descendants, C), but in BCPL true and false are implementation-defined values (see [BCPL1980] page 13). FORTH is semi-IC; according to my [FORTH1983] false is 0 and true is non-zero. I expect there was a fair amount of variation across different FORTH implementations. Sinclair BASIC (on which I cut my teeth) is IC, but I remember there being considerable variation amongst different BASIC dialects.

Icon takes a completely different tack. In Icon all the usual relational operators are control flow constructs. So «x < y» evaluates to y when the relationship holds, but when the relationship does not hold the expression is said to fail and produces no result. Constructs like if are sensitive to whether the control expression succeeded (with any value) or failed.
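Icon's success-or-failure model can be loosely imitated in Python with generators: a comparison that succeeds produces one result, and one that fails produces none. This is only a sketch of the idea, not how Icon is implemented; the names less_than, succeeded and between are made up for illustration.

```python
def less_than(x, y):
    """Model Icon's `x < y`: produce y on success, nothing on failure."""
    if x < y:
        yield y


def succeeded(results):
    """Model `if`: the condition holds when any result at all is produced."""
    for _ in results:
        return True
    return False


def between(low, x, high):
    """Model a chained `low < x < high`: the result of the first
    comparison (x) becomes the left operand of the second."""
    for middle in less_than(low, x):
        yield from less_than(middle, high)


print(list(less_than(1, 2)))         # one result: the right operand
print(list(less_than(2, 1)))         # failure: no result at all
print(succeeded(less_than(-1, 0)))   # succeeds even though the result is 0
print(succeeded(between(1, 5, 10)))
print(succeeded(between(1, 50, 10)))
```

Note that the third call succeeds even though the value produced is 0: success depends on whether a result exists, not on what the result is.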

I have to say I’m surprised that there aren’t more languages in the IC camp along with C, J, and Python. Inform is esoteric but I included it because I happened to have [DM4] to hand. There are probably lots of similarly esoteric languages that I don’t know about, lazily copying the IC feature of C. Perhaps C++ is like that.

Python has an interesting take on the issue (which I got wrong in the first version of the article, but discussion below, particularly from Paul and Gareth, has set me straight). In Python 2.2 expressions like «x < y» evaluate to 0 or 1 (integers). In Python 2.3 such expressions evaluate to a bool, True or False, but bool is a numeric type containing the integers 0 (spelt False) and 1 (spelt True). False and True are distinct objects from 0 and 1 but have the values 0 and 1 when used as numbers. When converted to strings they convert as ‘False’ and ‘True’. It’s as if Python is trying to gradually and carefully rid itself of Iverson’s Convention.
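The behaviour described above can be checked in a current Python interpreter. The following is a minimal sketch; the variable names and sample values are arbitrary.

```python
# A comparison produces a bool, and bool is a subclass of int.
x, y = 3, 5
result = x < y

is_bool = type(result) is bool
is_int = isinstance(result, int)
as_number = result + 0   # True behaves as 1 in arithmetic
as_text = str(result)    # but converts to the string 'True'

# Iverson-style counting: each true comparison contributes 1 to the sum.
values = [4, 8, 15, 16, 23, 42]
count_above_ten = sum(v > 10 for v in values)

print(is_bool, is_int, as_number, as_text, count_above_ten)
```

The last line is the convention earning its keep: summing comparisons counts how many of them hold.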

PostScript can be thought of as a modernised FORTH, so perhaps it is an example of a language evolving from IC to non-IC (well, from nearly IC in the case of FORTH). Some languages designed since the rise of C—Java, most notably—have adopted a boolean type rather than IC. Perhaps there is a modern trend away from IC.

/bin/sh is amusing because it implements a sort of inverted Iverson’s Convention, IIC. True is 0 and false is non-zero (typically 1), in the sense of the return codes used by programs likely to be used in if and while and friends (programs such as grep and test).
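The inverted convention is easy to observe from outside the shell by looking at a child process's exit status. The sketch below avoids depending on any particular Unix tool by using a child Python process that exits with a chosen status; exit_status and shell_true are hypothetical helper names.

```python
import subprocess
import sys


def exit_status(code):
    """Run a child Python process that exits with the given status."""
    child = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({code})"]
    )
    return child.returncode


def shell_true(status):
    """Interpret an exit status the way sh's `if` does: 0 means success."""
    return status == 0


for code in (0, 1, 2):
    print(code, shell_true(exit_status(code)))
```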

Common Lisp does have a boolean type (with values T and NIL), but generally the functions that return a boolean result return a generalized boolean, where NIL is returned for false and any other value for true. That’s why it’s in the “other” category. Another thing that sets Common Lisp slightly apart from the “boolean type” languages is that Lisp’s false value, NIL, also has another rôle as the empty list.

Naturally, J is an IC language.

What does if accept?

That’s a snapshot of Iverson’s Convention from a viewpoint of value production. So what about the behaviour of if when its condition is some value other than the canonical true/false values (whether that be 0 or 1 in the case of IC languages, or true, false for most of the others)? Mostly the languages are quite tolerant, following the principle of being conservative in what you produce and liberal in what you accept.

All the IC languages accept any integer in a condition and treat 0 as false and non-zero as true (generalised IC). C accepts any scalar and treats the condition «x» as being the same as «x != 0». Thus in C, the float 0.0f and the null pointer also act like false, and other values act like true. awk extended conditions to strings. In awk the empty string is treated like false and any other string is treated like true (see Appendix A for an awk obscurity). perl extends false privileges to the string ‘0’, the empty list, and the undefined value.

The boolean type languages are variously liberal. They have particular canonical values for true and false. Sometimes that’s encoded in a particular type (boolean in Java, Lua), sometimes not. Java follows a strict model: only expressions that are statically of type boolean are permitted as conditions, and its boolean type contains only true and false. This is true to Java’s static heart but a spurning of its C heritage that still surprises some C to Java converts. The other languages are more dynamic and more liberal. Generally there’s a finite set of values that are treated like false (False, NIL, nil, 0) and the rest are treated like true.

Lua used to not have a boolean type; the boolean type was introduced in version 5.0; prior to that, Lua used nil (note, not the empty list) and any value different from nil as its false and true. So for historical reasons nil still acts like false when used in a condition. The caretakers of Lua think that they “should have introduced booleans from the start”, see [LUA2007] section 7.

Python follows the scripting language tradition (exemplified by awk and perl) of accepting the empty string for false and treating other strings as true; it also follows the Lisp tradition of treating the empty list (and the 0-tuple) as false.
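A quick way to see which values Python accepts as false is to pass a few candidates through bool(). The lists below are a small illustrative sample, not an exhaustive catalogue.

```python
# Values that Python treats as false in a condition.
falsy = [False, 0, 0.0, "", [], (), None]

# Values that Python treats as true, including some that look "empty-ish".
# Note that the string "0" is true in Python, unlike in perl.
truthy = [True, 1, -1, "0", " ", [0], (0,)]

falsy_results = [bool(v) for v in falsy]
truthy_results = [bool(v) for v in truthy]

print(falsy_results)
print(truthy_results)
```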

JavaScript also follows the scripting language tradition of accepting the empty string as false, as well as null and undefined. Zero-length arrays are not special (unlike in Python and perl); they are true just like any other object. JavaScript is one of the few languages to incorporate IEEE 754 floating point numbers as standard; the NaNs in the floating point type are also accepted as false (as are both zeroes: -0, +0).

So in terms of what is acceptable as a boolean there are more conventions than for production. There’s generalised IC (C, awk, Python, perl, FORTH), boolean type (Lisp, Python, Lua, Java, JavaScript), empty string (awk, Python, perl, JavaScript), empty list (Lisp, Python, perl). Python and Lua both have the convention of using a value not otherwise related to other values as being equivalent to false: None in Python, and nil in Lua (this is also similar to undefined and null in JavaScript, I’m just not sure it’s similar enough). perl has a convention not shared by any other language (that I looked at): it accepts the string ‘0’ for false.

shaurz on reddit points out that in Python objects can pass themselves off as False by implementing __nonzero__ (renamed __bool__ in Python 3). Cool. It seems to be the only common language that makes true/false an extensible protocol.
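Here is a minimal sketch of that protocol. Basket is a hypothetical class invented for the example. In Python 3 the hook is spelt __bool__, and the alias line keeps the Python 2 name __nonzero__ working as well.

```python
class Basket:
    """A hypothetical container that counts as false when it is empty."""

    def __init__(self, items=()):
        self.items = list(items)

    def __bool__(self):
        # Called by `if`, `while`, `not` and bool() in Python 3.
        return len(self.items) > 0

    # Python 2 looked for this name instead.
    __nonzero__ = __bool__


empty = Basket()
full = Basket(["apple"])

print(bool(empty), bool(full))
if not empty:
    print("the empty basket acts like False")
```

When a class defines neither hook, Python falls back to __len__ if it exists, and otherwise treats every instance as true.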

Java is the only language that accepts exactly the same values as it produces.

There are many major languages left out, particularly given the ridiculous obscurity of some of the ones I have included; perhaps I will add more at some point (I did: on 2007-08-02 I added JavaScript). Scheme and ML deserve a showing because they Do The Right Thing; Prolog deserves a mention because, like Icon, true/false is a matter of control flow, not computing values.

Appendix A – when a non-empty string in awk is true

awk is a little bit tricky because a value might have both a string part and a numeric part; when used in a condition the numeric part has priority. Where this gets hairy is for strings that would evaluate to 0 if used in a numeric context.

Consider the following two programs and their output:

    $ awk 'BEGIN{a[1]="0";print a[1];if(a[1])print"hello"}'
    0
    hello
    $ awk 'BEGIN{split("0",a);print a[1];if(a[1])print"hello"}'
    0

The first program prints a 0 followed by “hello”: the string a[1] is not empty, so “hello” is printed. The second program prints just a 0. a[1] still has the same string value but this time it also has a numeric value; the numeric value of a[1] is tested by the if, found to be false, so “hello” is not printed.
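The difference between the two programs can be modelled roughly in Python. This is a simplified sketch, not a faithful reimplementation of awk: AwkValue and its from_input flag are invented for the example, and float() stands in for awk's own rules about what looks numeric (the two will disagree on some inputs).

```python
class AwkValue:
    """A value with a string part and, sometimes, a numeric part."""

    def __init__(self, text, from_input=False):
        self.text = text
        self.number = None
        # Assumption: only values that came from input-like sources
        # (such as split) get a numeric part, and only if they look numeric.
        if from_input:
            try:
                self.number = float(text)
            except ValueError:
                pass

    def __bool__(self):
        # The numeric part, when present, has priority in a condition.
        if self.number is not None:
            return self.number != 0
        return self.text != ""


assigned = AwkValue("0")                       # like a[1]="0"
split_field = AwkValue("0", from_input=True)   # like split("0",a)

print(bool(assigned), bool(split_field))
```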

Bibliography

[BCPL1980] “BCPL the language and its compiler”; Martin Richards, Colin Whitby-Strevens; CUP; 1980.

[DM4] “The Inform Designer’s Manual”; Graham Nelson; 2001.

[FORTH1983] “FORTH on the BBC Microcomputer”; Richard de Grandis–Harrison; Acornsoft; 1983.

[LUA2007] “The Evolution of Lua”; Roberto Ierusalimschy, Luiz Henrique de Figueiredo, Waldemar Celes; HOPL III; 2007.
