Let's look at a bunch of issues. For some, I have strawman answers. For others, I'm still groping in the dark. (And I need a re-education on how some of these things are solved in other languages!)

But... : taking that to the extreme would not flag gcd("x", "y") as a bug because the operations used (bool-testing, binary %) are supported. Also, it doesn't help for the return type. What if I had a UnicodeStringIO class whose getvalue() returned a Unicode string?

Possible solution? : the compiler derives an inherited interface (let's call this a duck type) for the argument from the fact that the only two methods used are write() and getvalue(); the latter returning a str. It makes sure that these methods are in fact defined by the StringIO class, and then accepts other class instances that implement the duck type.

What if a cStringIO instance were passed? There are technical reasons why cStringIO can't or shouldn't inherit from StringIO, but more importantly, requiring inheritance defeats Python's reliance on duck typing .

It would be a shame if declaring gcd() as taking int arguments would mean that gcd(123456789012345, 23456789012355) would no longer work (those are longs, not ints). This particular case can be solved by using inheritance (let int derive from long, or vice versa), but there are others that aren't solved so easily. For example:

Container types offer lots of interesting problems. As an introduction (not yet using containers), consider a function that compute the smallest of two values:

def min(a, b): if a < b: return a else: return b

We would like to be able to add type annotations indicating that a and b should have the same type and that the result also has that type. Strawman syntax:

def min(a: T, b: T) -> T: if a < b: return a else: return b

As a (rather unpythonic) strawman, I'm using T and T0, T1, T2, etc. as type variables. You can think of these as the free variables in an equation of types; we're saying that the types of a, b and the (unnamed) return value must all be the same. So min(1, 2) is valid and returns an int; min("a", "b") is valid and returns a string.

What about min(1, 2.5)? That ought to be valid and return a float. I guess this means that there should be some kind of typing hierarchy that explains how the various numeric types are embedded in each other. The VM already knows about these coercion rules, but the trick is to build them into the type system. I think I would like to do this using a mechanism separate from inheritance, since I really don't think that it is a good idea to require that int is a subclass of float and float a subclass of complex. But the mechanism should also be open to user-defined types; there shouldn't be mechanisms in Python that the user cannot extend (not many, anyway).

Now on to containers. Let's look at a different function that computes the smallest element of a sequence:

def min(a): it = iter(a) x = it.next() # This raises StopIteration if a is empty for y in it: if y < x: x = y return x

How would we write the signature? Clearly a is any iterable; the function return type is the iterable's element type. Strawman:

def min(a: iterable(T)) -> T: it = iter(a) x = it.next() # This raises StopIteration if a is empty for y in it: if y < x: x = y return x

I guess iterable would have to be a new built-in to represent the concept "anything over which you can iterate". This includes lists and tuples and strings, but also dictionaries, files, and in general anything that defines __getitem__ or __iter__.

The Boost folks have a habit of Capitalizing these abstract types, so it would be Iterable. That would perhaps also be a way out of the dilemma of int vs. long: we could use Integer for idealized integers. There are other numeric abstractions that we might want to define, like Exact and Inexact, Rational, Real and Complex (and Quaternion?), but I don't want to digress into number systems. Type systems are hard enough without them.

Perhaps we could extend this convention to saying that whenever a class name starts with a Capital letter it acts as a duck type, and when it starts with a lower case letter it acts as a concrete type (so only instances of the type and its subclasses are allowed).

I don't particularly like enforcing naming conventions based on case; it's one thing to have an agreement amongst a group of Python developers that Foo is an abstract class and foo is a concrete class, but it's quit a different thing to have the compiler look at this too for its type checking.

It would be good to have a set of notations both for the common (abstract and concrete) container types, just so we can write more examples. Here's a strawman.

list(T): a list whose elements are all T's set(T): a set whose elements are all T's (set is a builtin in Python 2.4) tuple(T0, T1, T2): a tuple of length three whose elements have the specified types (if you want to use tuples of arbitrary length as sequences, use Seqence) dict(T1, T2): a dict whose key type is T1 and whose value type is T2 Iterable(T): an iterable whose elements are all T's Sequence(T): a sequence whose elements are all T's Mapping(T1, T2): a mapping whose key type is T1 and whose value type is T2 union(T1, T2): either T1 or T2 (union is not really a type but an operator, but I think it could just be a built-in)

Do we need to distinguish between mutable and immutable sets and sequences?

We need a better name for type(None) than types.NoneType; perhaps void will do?

A common pattern for optional values is to have union(T, void). Perhaps we could add the notation optional(T) to mean just this?

I'm handwaving a bit about how the compiler knows all these built-in names. I think it should assume that names of built-ins that aren't redefined as globals (or imported, etc.) stand for their built-in meaning and there's no surreptitious run-time redefinition going on. So then it can know about int, and about len(x: sequence(T))->int, and so on.