From: andrew cooke <andrew@...>

Date: Tue, 4 Jun 2013 08:50:28 -0400

Just under a month ago I wrote a disappointed rant[1] about the new Enum
for Python 3.  Since then I've spent time extending the existing code[2]
(to produce bnum[3]), before changing direction and writing my own,
alternative Enum from scratch[4].  I've also swapped a few emails with
various kind people.

While all that doesn't make me an expert, I do now have a better
understanding of the design and some of the choices it embodies.  So I
thought I'd revisit the points I made earlier and try to explain why
things work that way, where I can, and what space for change still
remains.


Implicit Values
---------------

In the original article I said that my first example -

  class Colour(Enum):
      red
      green

- was illegal syntax.  I was wrong.

When Python evaluates the class body it looks up names in the dict of the
class that it is building.  Python 3's metaclass protocol allows us to
provide that dict, so we can use a subclass that assigns default values to
unknown names (and so provides values for 'red' and 'green' instead of
triggering an error).

I use this trick in simple-enum[4], and the above is a valid declaration
when using that package.

But it's not a risk-free solution: there's the possibility of very
confusing errors / bugs.  The problem is that elsewhere in the class body
we may refer to a name in the global scope.  The class dict will,
incorrectly, provide a default value for that name too.  And while you can
reduce the effect[5], I am not (yet) convinced that you can remove the
risk entirely.


Implicit Values 2 (Strange Syntaxes)
------------------------------------

But all is not lost.  The PEP 0435 test code includes support for:

  class Colour(Enum):
      red = ...
      green = ...

which, although uglier, avoids the issues with "magic values".

(This is not in the default implementation, but it can be added by
providing an alternate metaclass).


Duplicate Values
----------------

Python 3's metaclass support includes keyword arguments.
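As a minimal sketch of that mechanism (this is neither the PEP 0435 code
nor simple-enum; DupCheckMeta and its allow_duplicates keyword are
hypothetical names made up for illustration), a metaclass can accept an
extra keyword from the class statement and use it to decide whether
repeated values are an error:

```python
class DupCheckMeta(type):
    """Hypothetical metaclass: rejects repeated values unless told otherwise."""

    def __new__(mcs, name, bases, namespace, allow_duplicates=False):
        if not allow_duplicates:
            seen = {}  # value -> first name that used it
            for key, value in namespace.items():
                if key.startswith('_'):
                    continue  # skip __module__, __qualname__ and friends
                if value in seen:
                    raise ValueError(
                        '%s duplicates %s (value %r)' % (key, seen[value], value))
                seen[value] = key
        # The keyword is consumed here and not passed on to type.__new__.
        return super().__new__(mcs, name, bases, namespace)

    def __init__(cls, name, bases, namespace, allow_duplicates=False):
        # Swallow the keyword so that type.__init__ never sees it.
        super().__init__(name, bases, namespace)


class Aliased(metaclass=DupCheckMeta, allow_duplicates=True):
    one = 1
    another_one = 1


try:
    class Strict(metaclass=DupCheckMeta):
        one = 1
        another_one = 1
except ValueError as error:
    duplicate_error = str(error)
else:
    duplicate_error = None
```

The keyword is written after the base classes, in the same position as
'metaclass=', and Python hands it to the metaclass's __new__ and __init__.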
Here's an example from simple-enum:

  class WithAliases(Enum, implicit=False, allow_aliases=True):
      one = 1
      another_one = 1

Please ignore the 'implicit=False' (it does what you'd expect; it's needed
because of the shadowing issue I discussed earlier) - I am showing this
example because the 'allow_aliases' flag seems like a good solution to
whether or not duplicate values should be flagged: duplicates are errors
by default, but can be enabled if required.

Unfortunately I don't see anything like this in the PEP 0435 code.
There's no such flag (no metaclass keyword args are used at all) and no
related tests.  But it is fairly easy to add - I included it in bnum[3].


Inheritance
-----------

The PEP Enum code is complicated by the need to support inheritance -
something that is used mainly (in the tests) to allow Enums to "be"
integers.

In case the above is not clear, this is valid with PEP 0435:

  # PEP 0435
  class Number(int, Enum):
      one = 1
      two = 2

  > Number.one + Number.two
  3
  > isinstance(Number.one, Number)
  True
  > isinstance(Number.one, int)
  True

In contrast, the simple-enum implementation is, well, simpler, treating
all enumerations as named tuples (with name and value components), so
equivalent code would read:

  # simple-enum
  > Number.one.value + Number.two.value
  3

Now, after that long introduction, the question is: why does PEP 0435 have
this emphasis?  I chose the simple (but more verbose) solution because it
seemed easier to understand, simplified the implementation, and allowed me
to present enumerations in terms of both dicts and tuples (something I
haven't explained here, since I'm focussing on PEP 0435, but see [4] if
you're curious).

Nick Coghlan explained this to me, and it shone a completely new light on
the design: the PEP is designed to be backwards compatible with existing
Python library code.

For example, the socket module is full of constants like TIPC_ADDR_NAME
and AF_BLUETOOTH.  Existing code *expects* them to evaluate to whatever
their current value is.
Changing all that existing code to TIPC_ADDR_NAME.value is impossible.  So
the PEP Enum *must* "be" an int (or a str, or whatever the original design
chose) if these constants are going to be replaced by Enums.  Hence the
inheritance.


Alternate Implicit Values
-------------------------

I mentioned this in my rant, and support it in simple-enum with metaclass
keyword arguments (eg. 'values=from_zero'), but implicit values currently
play a very small part in the PEP 0435 Enum (they are available only
through the "functional API").


Language Changes
----------------

While the "names in a class" syntax is possible, it's something of a hack
(see above).  Since support for such a syntax might also help named tuples
it's interesting to ask whether the language could be extended in some way
to support this.  I don't have a concrete suggestion, but it seems like an
interesting avenue to explore.

A simpler change, which I stumbled across while considering alternatives,
is to extend matching to infinite sequences.  This would allow a syntax
like:

  class SequenceBased(Enum):
      one, two, three, ... = count()


Summary
-------

Much of the design of the PEP 0435 Enum is driven by the requirement that
it be applied retroactively to existing classes (without changing existing
code).  Honestly, to me, that seems a bit "too clever", but I understand
the temptation.

The "clean" syntax (names in a class) is possible in Python, but concerns
about opaque errors / bugs from name shadowing have excluded it from the
PEP.  Alternative syntaxes (like 'name = ...') exist in the tests, but are
not enabled by default.

Duplicate values could be enabled by a named argument to the metaclass.  I
don't understand why this issue is missing from the code (even in the
tests, which otherwise do a good job of exploring alternative ideas).

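To make the "names in a class" trick and its shadowing risk concrete, here
is a minimal sketch.  It is not the code from simple-enum[4]:
AutoNamespace, AutoMeta and LIMIT are hypothetical names, and values are
simply counted from 1.  The metaclass's __prepare__ hook supplies the dict
used while the class body runs, and that dict invents a value for any name
it has not seen.

```python
class AutoNamespace(dict):
    """Hypothetical class-body namespace: unknown names get the next integer."""

    def __init__(self):
        super().__init__()
        self.next_value = 1

    def __missing__(self, key):
        # Let dunder lookups (e.g. __name__) fall through to globals/builtins.
        if key.startswith('__') and key.endswith('__'):
            raise KeyError(key)
        value = self[key] = self.next_value
        self.next_value += 1
        return value


class AutoMeta(type):
    """Hypothetical metaclass that plugs AutoNamespace into class creation."""

    @classmethod
    def __prepare__(mcs, name, bases):
        return AutoNamespace()

    def __new__(mcs, name, bases, namespace):
        # Hand type.__new__ a plain dict copy of the collected names.
        return super().__new__(mcs, name, bases, dict(namespace))


class Colour(metaclass=AutoMeta):
    red
    green


LIMIT = 10  # a global the next class body *intends* to use


class Surprising(metaclass=AutoMeta):
    red
    limit = LIMIT  # looked up in the class namespace first, so it is invented


print(Colour.red, Colour.green)  # 1 2
print(Surprising.limit)          # 2, not 10
```

The second class shows the problem: the lookup of LIMIT never reaches the
global scope, because the class namespace answers first and quietly
creates a new member called LIMIT.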
[1] rant - http://www.acooke.org/cute/Pythonssad0.html
[2] PEP 0435 code - https://bitbucket.org/stoneleaf/ref435
[3] mod of [2] - https://github.com/andrewcooke/bnum
[4] new start - https://github.com/andrewcooke/simple-enum
[5] magic values - https://github.com/andrewcooke/simple-enum#the-danger-of-magic