[Python-ideas] Optional static typing -- the crossroads

I have read pretty much the entire thread up and down, and I don't think I can keep up with responding to every individual piece of feedback. (Also, a lot of responses cancel each other out. :-)

I think there are three broad categories of questions to think about next.

(A) Do we even need this?

(B) What syntax to use?

(C) Does/should it support <feature X>?

Taking these in turn:

(A) Do we even need a standard for optional static typing?

Many people have shown either support for the idea, or pointed to some other system that addresses the same issue. On the other hand, several people have claimed that they don't need it, or that they worry it will make Python less useful for them. (However, many of the detractors seem to have their own alternative proposal. :-)

In the end I don't think we can ever know for sure -- but my intuition tells me that as long as we keep it optional, there is a real demand. In any case, if we don't start building something we'll never know whether it'll be useful, so I am going to take a leap of faith and continue to promote this idea.

I am going to make one additional assumption: the main use cases will be linting, IDEs, and doc generation. These all have one thing in common: it should be possible to run a program even though it fails to type check. Also, adding types to a program should not hinder its performance (nor will it help :-).

(B) What syntax should a standard system for optional static typing use?

There are many interesting questions here, but at the highest level there are a few choices that constrain the rest of the discussion, and I'd like to start with these. I see three or four "families" of approaches, and I think the first order is to pick a family.

(1) The mypy family. (http://mypy-lang.org/) This is characterized by its use of PEP 3107 function annotations and the constraint that its syntax must be valid (current) Python syntax that can be evaluated without errors at function definition time.
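A minimal sketch of what PEP 3107 annotations look like in practice, assuming a hypothetical function `greet` that does not appear in the original message. The annotations are ordinary Python expressions stored on the function object, and the interpreter does not check them when the function is called, which is what lets a program run even though it fails to type check.

```python
# Hypothetical example; `greet` is an illustration, not code from the thread.
def greet(name: str, times: int = 1) -> str:
    # str() keeps the function working even for a mis-typed argument.
    return ", ".join(["hello " + str(name)] * times)

# The annotations are plain Python objects kept on the function.
print(greet.__annotations__)

# The interpreter does not enforce them: this call contradicts the
# `name: str` annotation but still runs.
print(greet(42))
```

A checker in the mypy family would be expected to flag the last call, while the interpreter runs it regardless.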
However, mypy also supports collecting annotations in separate "stub" files; this is how it handles annotations for the stdlib and C extensions. When mypy annotations occur inline (not in a stub file) they are used to type check the body of the annotated function as well as input for type checking its callers.

(2) The pytypedecl family. (https://github.com/google/pytypedecl) This is a custom syntax that can only be used in separate stub files. Because it is not constrained by Python's current syntax, its syntax is slightly more elegant than mypy.

(3) The PyCharm family. (http://www.jetbrains.com/pycharm/webhelp/using-docstrings-to-specify-types.html) This is a custom syntax that lives entirely in docstrings. There is also a way to use stub files with this. (In fact, every viable approach has to support some form of stub files, if only to describe signatures for C extensions.)

(I suppose we could add a 4th family that puts everything in comments, but I don't think anyone is seriously working on such a thing, and I don't see any benefits.)

There's also a variant of (1) that Łukasz Langa would like to see -- use the syntactic position of function annotations but using a custom syntax (e.g. one similar to the pytypedecl syntax) that isn't evaluated at function-definition time. This would have to use "from __future__ import <something>" for backward compatibility. I'm skeptical about this though; it is only slightly more elegant than mypy, and it would open the floodgates of unconstrained language design.

So how to choose? I've read passionate attacks and defenses of each approach. I've got a feeling that the three projects aren't all that different in maturity (all are well beyond the toy stage, none are quite ready for prime time). In terms of specific type system features (e.g. forward references, generic types, duck typing) I expect they are all acceptable, and all probably need some work (and there's no reason to assume that work can't be done).
All support stubs so you can specify signatures for code you can't edit (whether C extension, stdlib or just opaque 3rd party code).

To me there is no doubt that (1) is the most Pythonic approach. When we discussed PEP 3107 (function annotations) it was always my goal that these would eventually be used for type annotations. There was no consensus at the time on what the rules for type checking should be, but their syntactic position was never in doubt. So we decided to introduce "annotations" in Python 3 in the hope that 3rd party experiments would eventually produce something satisfactory. Mypy is one such experiment. One of the important lessons I draw from mypy is that type annotations are most useful to linters, and should (normally) not be used to enforce types at run time. They are also not useful for code generation. None of that was obvious when we were discussing PEP 3107!

I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics. It promises compatibility, and I take that very seriously, but I think it is reasonable to eventually deprecate other uses of annotations -- there aren't enough significant other uses for them to warrant crippling type annotations forever. In the meantime, we won't be breaking existing use of annotations -- but they may confuse a type checker, whether a stand-alone linter like mypy or built into an IDE like PyCharm, and that may serve as an encouragement to look for a different solution.

Most of the thornier issues brought up against mypy wouldn't go away if we adopted another approach: whether to use concrete or abstract types, the use of type variables, how to define type equivalence, the relationship between a list of ints and a list of objects, how to spell "something that implements the buffer interface", what to do about JSON, binary vs. text I/O and the signature of open(), how to check code that uses isinstance(), how to shut up the type checker when you know better...
The list goes on. There will be methods whose type signature can't be spelled (yet). There will be code distributed with too narrowly defined types. Some programmers will uglify their code to please the type checker.

There are questions about what to do for older versions of Python. I find mypy's story here actually pretty good -- the mypy codec may be a hack, but so is any other approach. Only the __future__ approach really loses out here, because you can't add a new __future__ import to an old version.

So there you have it. I am picking the mypy family and I hope we can start focusing on specific improvements to mypy. I also hope that somebody will write converters from pytypedecl and PyCharm stubs into mypy stubs, so that we can reuse the work already put into stub definitions for those two systems. And of course I hope that PyCharm and pytypedecl will adopt mypy's syntax (initially in addition to their native syntax, eventually as their sole syntax).

PS. I realize I didn't discuss question (C) much. That's intentional -- we can now start discussing specific mypy features in separate threads (or in this one :-).

--
--Guido van Rossum (python.org/~guido)