This arises from very useful discussions with Gareth and Paul in my article on Iverson’s convention.

Python code, like this: «if x is 1:», that uses is to test equality with numbers, is subtly wrong and should not be tolerated. Python’s is operator tests for object identity (whether two expressions evaluate to the same object). Numbers in Python are objects just like everything else (and very sensible this is too).

The problem is that there may be more than one object with the same value. And since you as a programmer are almost certainly only interested in the value of a number it’s probably a mistake to use is for numbers. Two numbers with the same value aren’t necessarily the same object. Example using float:

>>> x = 1.0
>>> x is 1.0
False
>>> x == 1.0
True

Each floating point number is actually a new object, so the «x is 1.0» test fails; the literal 1.0 creates a new float object that is not the same as the one that was created for the assignment «x = 1.0».
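Here is a minimal sketch of the same point that avoids literals altogether. It assumes Python 3 (the sessions in this article are Python 2), and the helper name compare is made up for illustration. Building the floats with float() at run time means the compiler has no say in which objects get used.

```python
def compare(a, b):
    """Return (equal_value, same_object) for two objects."""
    return (a == b, a is b)


# Two floats built separately at run time, both with the value 1.0
x = float("1.0")
y = float("1.0")

equal_value, same_object = compare(x, y)
print(equal_value)   # True: the values match
print(same_object)   # implementation-dependent; do not rely on either answer
```

The first result is guaranteed by the language; the second is an accident of whatever the implementation happens to do.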

The same is true of long:

>>> x = 1L
>>> x is 1L
False
>>> x == 1L
True

Compiler memoisation

As Gareth points out there is an additional subtlety. The Python compiler feels at liberty to merge literal number objects having the same value into one number object; at least within a compilation unit:

>>> x = 1L
>>> y = 1L; print x is 1L, y is 1L
False True

«y is 1L» is True in this case because Python uses the same 1L object for all the 1L literals that appear on the same line passed to the interpreter.
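One way to look at a compilation unit directly is to build one by hand with compile(). The sketch below assumes Python 3 and uses a float literal, since Python 3 has no separate long type. The source string and the "<demo>" filename are just illustrative. Whether the two names end up bound to the same object is an implementation detail, so the code prints it instead of promising an answer.

```python
# Both assignments live in one compilation unit
source = "x = 2007.5\ny = 2007.5\n"
code = compile(source, "<demo>", "exec")

namespace = {}
exec(code, namespace)

# The constants the compiler stored for this unit; equal literals
# may have been merged into a single entry
print(code.co_consts)

# Identity is up to the implementation; equality is not
print(namespace["x"] is namespace["y"])
print(namespace["x"] == namespace["y"])   # True
```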

On int

For int (Plain integer) the situation is a bit more misleading. Certainly an int is an object and you can have two int objects with the same value, creating the same issues as for float and long:

>>> x = 2007
>>> x is 2007
False

The issue is that using is seems to work for smaller numbers:

>>> x = 160
>>> x is 160
True

What is happening is that Python maintains a global store of pre-made int objects for some of the more commonly found values. Whenever it needs an int object it checks to see if it is one of its favourites that it made earlier and if so then reuses that object instead of creating a new one. Compared to allocating a new object every time an int is required, this is a big win.

There’s no mention of this in the Python language reference, and it’s just the sort of thing you might stumble upon by accident. You might observe by experiment that «is» is safe to use on numbers, because you only played with small ones, and then get bitten by some sort of horrible «x is 2007» bug.

The Python implementation (that is, CPython, the C implementation that everyone uses) doesn’t define what range of int objects gets memoised in this way. This is deliberate. Apparently it used to be 100 numbers, and might now be 262 numbers. Maybe other implementations don’t use this technique at all. If you think «x is 1» is safe then what range are you going to rely on?
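If you are curious what your own interpreter does, you can probe it. This is a rough sketch, assuming Python 3; the function names identity_reused and probe are invented here, and the range probed is arbitrary. It builds each int twice at run time via int(str(n)) so that compiler merging of literals can't interfere, then reports the values for which the same object came back both times.

```python
def identity_reused(n):
    """True if two separately built ints of value n are the same object."""
    a = int(str(n))
    b = int(str(n))
    return a is b


def probe(candidates):
    """Return the values for which the interpreter handed back a shared object."""
    return [n for n in candidates if identity_reused(n)]


reused = probe(range(-10, 300))
if reused:
    # Smallest value, largest value, and how many values were shared
    print(min(reused), max(reused), len(reused))
else:
    print("no reuse observed")
```

Whatever this prints is a fact about one build of one implementation, which is rather the point: it is something to observe, not something to code against.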

The message is clear: It is not sensible to rely on using is for numbers. Not even 1.

Bonus section: bool

Though bool is an integer type (False == 0, True == 1), code like «x is False» is fine. That’s because the language specification says that there are only two bool objects, True and False. You can’t create any more, so expressions that evaluate to True or False, like «x > 7», always evaluate to one of the canonical bool objects. You can’t even use bool.__new__ to reach behind the curtain and generate new bool objects:

>>> x = bool.__new__(bool, 0)
>>> x
False
>>> x is False
True

(despite what the documentation for bool.__new__ says)
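The same claim can be checked a little more broadly. This sketch assumes Python 3; the sample values and the helper name is_canonical are arbitrary choices for illustration. It converts a handful of objects with bool(), evaluates some comparisons, and confirms that every result is one of the two canonical objects.

```python
samples = [0, 1, 7, "", "text", [], [0], None, 0.0]

# bool() and comparisons always hand back one of the two canonical objects
converted = [bool(value) for value in samples]
compared = [n > 7 for n in range(5, 10)]


def is_canonical(b):
    """True if b is the True object or the False object itself."""
    return b is True or b is False


print(all(is_canonical(b) for b in converted + compared))   # True

# Even going through bool.__new__ does not produce a third bool
made = bool.__new__(bool, 0)
print(made is False)   # True
```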

Obligatory Lisp Comparison

Python’s is is Common Lisp’s EQ . Like Python, Common Lisp has all the same issues about whether «(EQ 3 3)» is true or not (it’s not guaranteed, either way). But Common Lisp has EQL which implements conceptual sameness. Python doesn’t have anything similar, and I think that’s a mistake.
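To make the wish concrete, here is a rough sketch of what an EQL-like test might look like if you wrote it yourself. It assumes Python 3, the name eql is hypothetical (nothing like it is in the standard library), and it only approximates the Common Lisp rules: same object, or numbers of the same type with the same value. It handles int and float only, and treats 0.0 and -0.0 as different.

```python
import math


def eql(a, b):
    """Rough, hypothetical analogue of Common Lisp's EQL."""
    if a is b:
        return True
    # Exact type match: keeps 1 and 1.0 apart, and True and 1 apart
    if type(a) is not type(b):
        return False
    if type(a) is int:
        return a == b
    if type(a) is float:
        # Same value and same sign, so 0.0 and -0.0 are distinguished
        return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)
    # Anything else is only "the same" if it is the same object
    return False


print(eql(int("2007"), int("2007")))   # True
print(eql(1, 1.0))                     # False
print(eql([], []))                     # False
```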
