[Python-Dev] if-syntax for regular for-loops

Greg Ewing wrote:
> Vitor Bosshard wrote:
>> The exact same argument could be used for list comprehensions themselves.
> No, an LC saves more than newlines -- it saves the code
> to set up and append to a list. This is a substantial
> improvement when this code would otherwise swamp the
> essentials of what's being done.
>
> This doesn't apply to a plain for-loop that's not
> building a list.

Not only do LCs make it obvious to the reader that "all this loop does is build a list", but the speed increases from doing the iteration in native code rather than pure Python are also non-trivial - every pass through the main eval loop that can be safely avoided leads to a fairly substantial time saving.

Generally speaking, syntactic sugar (or new builtins) need to take a construct in idiomatic Python that is fairly obvious to an experienced Python user and make it obvious to even new users, or else take an idiom that is easy to get wrong when writing (or miss when reading) and make it trivial to use correctly.

Providing significant performance improvements (usually in the form of reduced memory usage or increased speed) also counts heavily in favour of new constructs.

I strongly suggest browsing through past PEPs (both accepted and rejected ones) before proposing syntax changes, but here are some examples of syntactic sugar proposals that were accepted.

List/set/dict comprehensions
============================
(and the reduction builtins any(), all(), min(), max(), sum())

    target = [op(x) for x in source]

instead of:

    target = []
    for x in source:
        target.append(op(x))

The transformation ("op(x)") is far more prominent in the comprehension version, as is the fact that all the loop does is produce a new list. I include the various reduction builtins here, since they serve exactly the same purpose of taking an idiomatic looping construct and turning it into a single expression.
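
As a minimal sketch of the equivalence described above (the op() function and the source data are hypothetical stand-ins, not anything from the original proposal), the loop and comprehension forms can be compared directly, along with one reduction builtin:

```python
# Hypothetical transformation and input, chosen only for illustration.
def op(x):
    return x * 2

source = range(5)

# Explicit loop: the setup and append code surrounds the transformation.
target_loop = []
for x in source:
    target_loop.append(op(x))

# Comprehension: the transformation is the first thing the reader sees.
target_lc = [op(x) for x in source]

# A reduction written as a loop...
any_loop = False
for x in source:
    if op(x) > 6:
        any_loop = True
        break

# ...and the same reduction as a single expression using any().
any_expr = any(op(x) > 6 for x in source)
```

Both pairs should produce the same results; only the amount of scaffolding around the interesting part differs.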

Generator expressions
=====================

    total = sum(x*x for x in source)

instead of:

    def _g(seq):
        for x in seq:
            yield x*x
    total = sum(_g(source))

or:

    total = sum([x*x for x in source])

Here, the GE version has obvious readability gains over the generator function version (as with comprehensions, it brings the operation being applied to each element front and centre instead of burying it in the middle of the code, as well as allowing reduction operations like sum() to retain their prominence), but doesn't actually improve readability significantly over the second LC-based version. The gain over the latter, of course, is that the GE-based version needs a lot less *memory* than the LC version, and, as it consumes the source data incrementally, can work on source iterators of arbitrary (even infinite) length, and can also cope with source iterators with large time gaps between items (e.g. reading from a socket) as each item will be returned as it becomes available.

With statements
===============

    with lock:
        # perform synchronised operations

instead of:

    lock.acquire()
    try:
        # perform synchronised operations
    finally:
        lock.release()

This change was a gain for both readability and writability - there were plenty of ways to get this kind of code wrong (e.g. leave out the try-finally altogether, acquire the resource inside the try block instead of before it, call the wrong method or spell the variable name wrong when attempting to release the resource in the finally block), and it wasn't easy to audit because the lock acquisition and release could be separated by an arbitrary number of lines of code. By combining all of that into a single line of code at the beginning of the block, the with statement eliminated a lot of those issues, making the code much easier to write correctly in the first place, and also easier to audit for correctness later (just make sure the code is using the correct context manager for the task at hand).
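
A minimal runnable sketch of the two locking styles, assuming a threading.Lock guarding a shared list (the function names and the simulated failure are illustrative only):

```python
import threading

lock = threading.Lock()
shared = []

def update_old_style(value):
    # Acquire before the try block, release in finally.
    lock.acquire()
    try:
        shared.append(value)
    finally:
        lock.release()

def update_with(value):
    # The with statement performs the same acquire/release pairing.
    with lock:
        shared.append(value)

def failing_update():
    # Even if the body raises, the lock is released on the way out.
    with lock:
        raise ValueError("simulated failure")

update_old_style(1)
update_with(2)
try:
    failing_update()
except ValueError:
    pass

# If the with statement released the lock correctly, it is free again here.
lock_released = not lock.locked()
```

The with version leaves no separate release call to forget or misspell, which is the auditing benefit described above.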

Function decorators
===================

    @classmethod
    def f(cls):
        # Method body

instead of:

    def f(cls):
        # Method body
    f = classmethod(f)

Easier to write (function name only written once instead of three times), and easier to read (decorator names up top with the function signature instead of buried after the function body). Some folks still dislike the use of the @ symbol, but compared to the drawbacks of the old approach, the dedicated function decorator syntax is a huge improvement.

Conditional expressions
=======================

    x = A if C else B

instead of:

    x = C and A or B

The addition of conditional expressions arguably wasn't a particularly big win for readability, but it *was* a big win for correctness. The and/or based workaround for the lack of a true conditional expression was not only hard to read if you weren't already familiar with the construct, but using it was also potentially buggy if A could ever evaluate as false while C was true (in such a case, B would be returned from the expression instead of A).

Except clause
=============

    except Exception as ex:

instead of:

    except Exception, ex:

Another example of changing the syntax to eliminate potential bugs (in this case, except clauses like "except TypeError, AttributeError:", which would never actually catch AttributeError, and would instead rebind the local name AttributeError to the caught exception if a TypeError was raised).

Cheers,
Nick.

P.S. There's a fractionally better argument to be used in favour of allowing an if condition on the for loop header line: it doesn't just save a newline or improve consistency with comprehensions and generator expressions, it saves an *indentation level*. And that gain is exactly the rationale that was used to begin allowing:

    try:
        ...
    except:
        ...
    else:
        ...
    finally:
        ...

instead of requiring the extra indentation level:

    try:
        try:
            ...
        except:
            ...
        else:
            ...
    finally:
        ...
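
As a rough check that the flat and nested forms really are interchangeable, here is a small sketch (the helper names and the logging approach are made up for illustration) that records which clauses run in each form:

```python
def flat(action, log):
    # Unified try/except/else/finally form.
    try:
        action()
    except ValueError:
        log.append("except")
    else:
        log.append("else")
    finally:
        log.append("finally")

def nested(action, log):
    # Older form: try/except/else wrapped inside a separate try/finally.
    try:
        try:
            action()
        except ValueError:
            log.append("except")
        else:
            log.append("else")
    finally:
        log.append("finally")

def ok():
    pass

def fail():
    raise ValueError("simulated failure")

# Run both forms against a succeeding and a failing action.
results = {}
for name, action in (("ok", ok), ("fail", fail)):
    flat_log, nested_log = [], []
    flat(action, flat_log)
    nested(action, nested_log)
    results[name] = (flat_log, nested_log)
```

In both cases the two logs should match, so the only thing the unified form removes is the redundant try keyword and its indentation level.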

However, even that argument is greatly weakened in the for/if case by the fact that the indentation level is being saved by moving the if condition up and to the right after the for loop details, whereas in the try-statement case there were absolutely no downsides (the redundant try keyword was simply dropped entirely).

So I'm personally still -1 when it comes to incorporating an if clause directly into the for loop syntax - it's only necessary in the GE/LC case due to the fact that those don't support statement-based nesting.

(Tangent: the above two try/except examples are perfectly legal Py3k code. Do we really need the "pass" statement anymore?)

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://www.boredomandlaziness.org