Some time ago, Greg Parker asked the Twitternets what we’d like to see in a purely hypothetical Objective-C-without-the-C language. Someone — I believe it was Landon Fuller — pointed at an article about the Strongtalk type system for Smalltalk. I quite like the idea of Objective-C-without-the-C (i.e., a language that is native to the Objective-C object system and runtime without the baggage of C), but after reading that article I found myself asking why we couldn’t do something similar in Objective-C.

I don’t think my random musings have much influence on the design of the language, but if I don’t write it down nobody’s going to know how nuts I am, so here’s a semi-concrete proposal for contextual types and generics for Objective-C. Since anyone even mentioning generics in the vicinity of Objective-C will inevitably be flamed for trying to turn it into C++, this is followed by an aside entitled Why This is Not the Baby-eating Spawn of Bjarne Stroustrup. (Nothing personal, Bjarne.)

Types in Objective-C

A fundamental characteristic of Objective-C is that it has two separate type systems: the dynamic type system, which applies to objects, and the static type system, which applies to variables (which may or may not refer to objects).

As far as objects are concerned, the static type system is optional: you can refer to any object with the type id, except when calling a method whose type may be ambiguous. The static type system is also advisory: it suggests to programmers, the compiler and other tools such as the IDE and analyzer what the class of an object may be at runtime, but doesn’t constrain the object. A variable of type NSString * may actually refer to an NSArray at runtime, and method calls will dynamically go to NSArray’s implementation.

My proposal deals only with the static type system. The idea is to provide information that helps the compiler and analyzer check your logic, and the IDE to provide better suggestions. This is done by replacing id with more specific static types in most of the situations it’s used in. The generated code is not affected in any way. The proposal does not introduce bondage and discipline on the language; the new types can always be cast away.

Contextual Types

By far the most common use of id is as the return type for methods that may return an instance of “this” class, or of the subclass it’s called on. The obvious examples are +alloc and -init.

The IDE and, I believe, the analyzer already use heuristics to determine the return type of +alloc and -init, but I propose formalizing this in code. It would look something like this:

    @interface NSObject <NSObject>
    {
        Class isa;
    }

    + (void)load;
    + (void) initialize;

    - ([:Self]) init;
    + ([:Self]) new;
    + ([:Self]) allocWithZone:(NSZone *)zone;
    + ([:Self]) alloc;

    // ...

    + ([:Superclass]) superclass;
    + ([:Class]) class;
    - ([:Superclass]) superclass;
    - ([:Class]) class;

    @end

When calling a class method, [:Class] resolves to the receiver (or, type-equivalently, a pointer to an instance of the receiver’s metaclass), and [:Self] resolves to a pointer to an instance of the receiver. [:Superclass] resolves to the superclass of the receiver. For instances, they resolve as for class methods on the class of the instance.

I’m sure some people will object to the conceptual purity of this design, and possibly the names and syntax. The colon is a bit odd — it’s there to avoid ambiguity with generics (see below). All of these are minor quibbles; the syntax would need to be reviewed if actually implementing it.

So what’s the point? Consider the following code:

    NSString *s = [[NSArray alloc] init];

As it stands, this is perfectly valid and doesn’t generate a compiler diagnostic. With the addition of contextual types it would, because:

The type of +[NSArray alloc] (inherited from NSObject) is [:Self], which resolves to NSArray *.

The receiver of the -init is thus known to be an NSArray.

The type of -[NSArray init] (inherited from NSObject) is [:Self], which again resolves to NSArray *.

Therefore, the right hand side of the assignment is of type NSArray *, and the assignment is invalid.

If, for some reason, you really wanted to do that, you could use an explicit cast to get rid of the diagnostic.
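
The resolution steps above can be approximated in a language that exists today. What follows is a minimal sketch in TypeScript, used here only because its static types are likewise advisory and erased before the code runs; it is an analogy, not part of the proposal. The `this` return type stands in for [:Self], and `ToyObject`, `ToyString` and `ToyArray` are hypothetical stand-ins for NSObject, NSString and NSArray.

```typescript
// Hypothetical stand-ins for NSObject, NSString and NSArray.
class ToyObject {
  // Class method: the `this` parameter captures the receiving class, so the
  // result type follows the receiver, as [:Self] would for +alloc.
  static alloc<T extends ToyObject>(this: new () => T): T {
    return new this();
  }

  // Instance method: the polymorphic `this` type plays the role of [:Self].
  init(): this {
    return this;
  }
}

class ToyString extends ToyObject {
  uppercased(): string {
    return "";
  }
}

class ToyArray extends ToyObject {
  count(): number {
    return 0;
  }
}

// Inferred as ToyArray: alloc resolves against the receiver,
// then init resolves against the result of alloc.
const toyArray = ToyArray.alloc().init();

// const wrong: ToyString = ToyArray.alloc().init();
// ^ rejected by the checker: a ToyArray has no `uppercased` method.

// An explicit cast removes the diagnostic; the object is still a ToyArray.
const forced = ToyArray.alloc().init() as unknown as ToyString;
```

Neither method is redeclared in the subclasses, yet the checker tracks the receiver through both calls, which is the behaviour the contextual types are meant to provide.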

Generics

The other major use of id is for polymorphic collections. True polymorphic collections are great! But some of the time, you only want to put one kind of object in your collection, and would appreciate the computer doing the drudge work of checking that you didn’t put the wrong stuff in the wrong place.

From the perspective of the previous section, generics are a simple extension to contextual types. Instead of restricting you to [:Self] and the highly specialized [:Class] and [:Superclass], you can provide one or more class names as a parameter to a type declaration. Example time again:

    @interface MyThingHolder[ThingType = id] : NSObject
    {
        [ThingType] thing;
    }

    - ([:Self]) initWithThing:([ThingType])thing;
    - (void) setThing:([ThingType])thing;
    - ([ThingType]) thing;

    // Or, for modernists:
    @property (readonly, nonatomic) [ThingType] thing;

    @end

    // ...

    MyThingHolder[NSString] *holder = [[MyThingHolder[NSString] alloc] initWithThing:@"foo"];
    holder.thing = [NSNumber numberWithBool:MAYBE];      // Warning: type mismatch
    holder.thing = (id)[NSNumber numberWithBool:MAYBE];  // OK. (an issue here would be inconsistent with general Objective-C behaviour.)

    // Or, equivalently:
    typedef MyThingHolder[NSString] MyStringHolder;
    MyStringHolder *holder = [[MyStringHolder alloc] initWithThing:@"foo"];

Some notes: the type parameter can only be a class, since specialized code is not generated for each type. (See Why This is Not the Baby-eating Spawn of Bjarne Stroustrup below.) Since the type parameter is always a class, I have made the * implicit. This is cleaner, but could well lead to confusion and is quite likely a bad idea. (If only I had a time machine…)
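
For comparison, here is a rough TypeScript analogue of the holder class. TypeScript is not part of the proposal; it is used because its type parameters are also erased at compile time. The class name `ThingHolder` is hypothetical, and `any` plays approximately the role that id plays in Objective-C.

```typescript
// Hypothetical analogue of MyThingHolder[ThingType = id].
class ThingHolder<Thing = any> {
  constructor(public thing: Thing) {}
}

// Parameterized use: the checker knows `thing` is a string.
const stringHolder = new ThingHolder<string>("foo");
// stringHolder.thing = 42;      // rejected by the checker: type mismatch
stringHolder.thing = 42 as any;  // accepted: the cast opts out, much like (id)

// "Vanilla" use: the default parameter leaves the class untyped.
const vanillaHolder: ThingHolder = new ThingHolder("foo");
vanillaHolder.thing = 42;        // accepted: Thing defaults to any
```

Only one `ThingHolder` class exists at runtime whichever parameter is written, so the annotation changes what the checker reports and nothing else.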

The type parameter has a default value, previously unheard of in Objective-C, so that you can ignore generics and create a “vanilla” MyThingHolder that works just like in traditional Objective-C. This provides an upgrade path for existing classes:

    @interface NSArray[Item <NSObject> = id <NSObject>] : NSObject <NSCopying, NSMutableCopying, NSCoding, NSFastEnumeration>

    - (NSUInteger) count;
    - ([Item]) objectAtIndex:(NSUInteger)index;

    @end

    @interface NSArray[Item] (NSExtendedArray)

    - (NSArray[Item] *) arrayByAddingObject:([Item])anObject;
    - (NSArray[Item] *) arrayByAddingObjectsFromArray:(NSArray[Item] *)otherArray;
    // ...
    - (BOOL)containsObject:([Item])anObject;
    // ...
    + ([:Self[Item]]) arrayWithObject:([Item])anObject;
    // ...

    @interface NSDictionary[Key <NSCopying, NSObject> = id <NSCopying, NSObject>, Value <NSObject> = id <NSObject>] : NSObject <NSCopying, NSMutableCopying, NSCoding, NSFastEnumeration>

    - (NSUInteger) count;
    - ([Value]) objectForKey:([Key])aKey;
    - (NSEnumerator[Key] *)keyEnumerator;

    @end

    @interface NSDictionary[Key, Value] (NSExtendedDictionary)

    - (NSArray[Key] *) allKeys;
    - (NSArray[Key] *) allKeysForObject:([Value])anObject;
    - (NSArray[Value] *) allValues;
    // ...
    - (BOOL) isEqualToDictionary:(NSDictionary[Key, Value] *)otherDictionary;
    - (NSEnumerator[Value] *) objectEnumerator;
    - (NSArray[Value] *) objectsForKeys:(NSArray[Key] *)keys notFoundMarker:([Value])marker;
    // ...
    - (void) getObjects:([Value] *)objects andKeys:([Key] *)keys;
    // ...

These examples introduce an additional concept: restrictions on type parameters, in this case protocol requirements. The other obvious type of restriction would be a superclass requirement, such as [Type: NSString = NSString].

Using these genericized versions in the same manner as the existing versions would require no code changes – as long as you’re using them with objects that fulfil the protocol requirements, which already exist but aren’t explicit in code.
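
Restricted type parameters with defaults can be sketched the same way. The TypeScript below is an analogy under stated assumptions, not the proposed syntax: `Copyable` is a hypothetical stand-in for the NSCopying requirement on dictionary keys, and `Label` and `Table` are invented for the example.

```typescript
// Hypothetical stand-in for the NSCopying protocol requirement.
interface Copyable {
  copy(): Copyable;
}

class Label implements Copyable {
  constructor(readonly text: string) {}

  copy(): Label {
    return new Label(this.text);
  }
}

// Key must satisfy Copyable. Both parameters have defaults,
// so a plain `Table` with no arguments is still a valid type.
class Table<Key extends Copyable = Copyable, Value = unknown> {
  private keys: Key[] = [];
  private values: Value[] = [];

  set(key: Key, value: Value): void {
    this.keys.push(key);
    this.values.push(value);
  }

  allKeys(): Key[] {
    return this.keys.slice();
  }

  allValues(): Value[] {
    return this.values.slice();
  }
}

const ages = new Table<Label, number>();
ages.set(new Label("alice"), 30);
// ages.set("bob", 31);          // rejected: a string has no copy() method
// new Table<number, number>();  // rejected: number does not satisfy Copyable

const labels: Label[] = ages.allKeys();  // typed as Label[], no cast needed
```

The restriction is checked where the type is written down; the stored keys are ordinary objects either way.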

Type Conformance

A fundamental question for static type systems with inheritance is when implicit casts are allowed. There are subtleties here, some of which I probably haven’t considered, and I have a feeling I’ve considered some and then forgotten about them. The most obvious case is when the base type and each type parameter could be validly cast:

    NSMutableArray[NSMutableString] *a = whatever;
    NSMutableArray *b = a;            // OK, b is NSMutableArray[id <NSObject>]
    NSMutableArray[NSString] *c = a;  // OK
    a = c;                            // Not OK, implicit cast from NSMutableArray[NSString] to NSMutableArray[NSMutableString]
    NSArray[NSMutableString] *d = a;  // OK
    NSArray[NSString] *e = a;         // OK
    id f = a;                         // OK

As indicated above, allowing an id to be passed where a parameter type is expected is necessary for consistency with Objective-C in general:

    NSMutableArray[NSString] *a = whatever;
    [a addObject:[NSNumber numberWithInt:42]];      // Type mismatch, assuming +numberWithInt: is declared to return [:Self]
    [a addObject:(id)[NSNumber numberWithInt:42]];  // OK

How about the case where a method returns an unadorned type?

    @interface LegacyThing: NSObject
    - (NSArray *) legacyListOfStrings;
    @end

    LegacyThing *l = whatever;
    NSArray[NSString] *list = [l legacyListOfStrings];

In order to minimize the burden of adopting generic syntax, I’d suggest explicitly permitting this, with an optional warning. (The non-generic equivalent, assigning an id <NSObject> to an NSString *, generates the somewhat unexpected warning “type ‘id <NSObject>’ does not conform to the ‘NSCopying’ protocol” in GCC and nothing in Clang. Assigning an NSObject * to an NSString * generates warnings in both. My proposal is that assigning a T to a T[P] should work without warning by default, even if the default type parameter is a class rather than id with or without protocols.)
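
The conformance rules in this section resemble how TypeScript treats arrays, so a short sketch in that language may help make them concrete. It is an analogy only: TypeScript's rules are its own, and `PlainText`, `EditableText` and `legacyListOfTexts` are hypothetical names standing in for NSString, NSMutableString and the legacy method.

```typescript
// Hypothetical stand-ins for NSString and NSMutableString.
class PlainText {
  constructor(public value: string) {}
}

class EditableText extends PlainText {
  append(suffix: string): void {
    this.value += suffix;
  }
}

const editable: EditableText[] = [new EditableText("a")];

// Widening the element type is accepted implicitly.
const widened: PlainText[] = editable;
// So is widening both the container and the element type.
const readOnly: ReadonlyArray<PlainText> = editable;
// const narrowed: EditableText[] = widened;  // rejected: PlainText lacks append()

// A legacy API that returns an unparameterized collection.
function legacyListOfTexts(): any[] {
  return [new PlainText("legacy")];
}

// Accepted without a cast, like assigning a T to a T[P].
const adopted: PlainText[] = legacyListOfTexts();
```

All three names `editable`, `widened` and `readOnly` refer to the same array object; the assignments change only what the checker believes about it.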

Why This is Not the Baby-eating Spawn of Bjarne Stroustrup

Many Objective-C programmers are refugees from the blasted wasteland of C++, and will reflexively cringe at the similarity with templates:

    std::vector<Duck> ducks;
    Chicken chicken;
    ducks.push_back(chicken);            // Error
    ducks.push_back(*(Duck *)&chicken);  // Horrible crash here or some time in the future, maybe.

    NSMutableArray[Duck] *ducks = [NSMutableArray new];
    Chicken *chicken = [Chicken new];
    [ducks addObject:chicken];           // Warning: type mismatch
    [ducks addObject:(Duck *)chicken];   // No problem, unless you call -quack on it.

While the difference should hopefully be clear by now, I’ll spell it out: in C++, std::vector<Duck> creates an entirely new class (with bits of Duck inlined into it). The Objective-C-with-generics version only provides hints to the compiler, so it can catch mistakes. It doesn’t stop you from putting chickens among your ducks, or make the duck array reject chickens at runtime, or generate a new array-of-chickens class.
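
The hints-only behaviour can be observed in a language whose types are erased in the same way. This TypeScript sketch is an analogy rather than the proposal itself; `Duck` and `Chicken` are given one method each purely for illustration.

```typescript
class Duck {
  quack(): string {
    return "Quack";
  }
}

class Chicken {
  cluck(): string {
    return "Cluck";
  }
}

const ducks: Duck[] = [];
// ducks.push(new Chicken());                  // rejected by the checker: type mismatch
ducks.push(new Chicken() as unknown as Duck);  // accepted; nothing inspects the element at runtime

// The array is an ordinary array, and the chicken goes in untouched.
const intruderIsChicken = (ducks[0] as unknown) instanceof Chicken;

// Trouble only appears if someone asks the chicken to quack.
let quackFailed = false;
try {
  ducks.forEach((duck) => duck.quack());
} catch (_error) {
  quackFailed = true;
}
```

No array-of-ducks class is generated and the push is never refused; the failure is an ordinary dynamic one at the point where the missing method is called.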

In earlier discussions of generics, it has been stated that this type of mistake is rare in practice and should be caught with unit tests. If you feel that way, you’re welcome to stick to your current approach, and the introduction of generics, as described, will not affect you.

To avoid ballooning side effects of generics, I’ve deliberately avoided suggesting generic functions and methods (i.e., ones whose return type is dependent on one or more argument types, independent of their class in the case of methods).

One Last Thing

An effect of the above is that I want the language to contain types like NSArray[NSMutableDictionary[NSString, NSSet[NSMutableArray[NSNumber]]]] *. That doesn’t mean I want to spend my time typing NSArray[NSMutableDictionary[NSString, NSSet[NSMutableArray[NSNumber]]]] *, even with autocomplete. Fortunately, there exists a well-known solution to this problem: type inference.

If I access a member of an array of the aforementioned sesquipedalian type, I get an NSMutableDictionary[NSString, NSSet[NSMutableArray[NSNumber]]] *. What’s more, the compiler knows this. The benefit of static typing lies primarily in checking what I do with my dictionary, rather than checking that the item I retrieved is what I thought it was, so I should be able to ask it to type a variable appropriately.

I quite like C++1x’s solution of recycling the auto keyword for this use, but that would conflict with Objective-C’s goal of being a strict superset of C. There are various other choices, such as var or any, or maybe ego. In any case, type inference would lead to code like:

    typedef NSArray[NSMutableDictionary[NSString, NSSet[NSMutableArray[NSNumber]]]] MyHorribleNestedType; // TODO: replace with sensible model classes.

    MyHorribleNestedType *array = whatever;
    var element = [array objectAtIndex:0];              // Type is inferred as NSMutableDictionary[NSString, NSSet[NSMutableArray[NSNumber]]]*
    var subEntry = [NSArray arrayWithObject:@"bloop"];  // Type is inferred as NSArray[NSString]
    [element setObject:subEntry forKey:@"moop"];        // Type mismatch: expected NSSet[NSMutableArray[NSNumber]], got NSArray[NSString]

A subtlety here: the return type of [NSArray arrayWithObject:@"bloop"] (declared previously) is [:Self[Item]], which is inferred from the receiver (the NSArray class object) and argument to resolve to NSArray[NSString]. Instead of typedef-ing MyHorribleNestedType, we could have constructed the type implicitly in the same way. Like everything else in the proposal, this is optional; if you used id or explicit, unparameterized types in this example, the type mismatch would not be detected, but the generated code would be identical.
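
Inference over nested parameterized types is routine in languages that already have it. The TypeScript sketch below is an analogy, with `const` declarations playing the part of the suggested var keyword; `ScoreTable` and `loadScores` are hypothetical names, and plain objects and arrays stand in for the Foundation collections.

```typescript
// Hypothetical nested type, in the spirit of MyHorribleNestedType.
type ScoreTable = Array<{ [name: string]: number[][] }>;

function loadScores(): ScoreTable {
  return [{ alice: [[1, 2], [3]] }];
}

// No annotations below: each type is inferred from the right-hand side.
const table = loadScores();  // inferred as ScoreTable
const subEntry = ["bloop"];  // inferred as string[]
const replacement = [[42]];  // inferred as number[][]

for (const row of table) {   // row is inferred as { [name: string]: number[][] }
  // row["moop"] = subEntry;  // rejected: expected number[][], got string[]
  row["moop"] = replacement;  // accepted
}
```

The nested type is written once, at the function boundary, and every later use is checked against it without being spelled out again.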

Summary