> draft

Capturing Wildcards

Zhong Yu, 2015-09-01

Wildcards in Java Generics are, for the most part, simple and intuitive. In some cases, however, they can be confusing and counterintuitive. This article explains the mechanics of wildcards, so that readers will be able to analyze how a wildcard works in any situation.

In this article, we use the term wild type to refer to a parameterized type with wildcard arguments, e.g. Map<?,?>, List<? extends Number>. Other types are referred to as concrete types, e.g. String, List<Number>.

To understand wild types, there are only two basic points. Take List<? extends Number> for example:

Supertype -- List<? extends Number> is the supertype of every concrete List<X> where X is a subtype of Number. Wild types are primarily used as variable types.

Capture Conversion -- If the type of a value is List<? extends Number>, the compiler converts the type to a concrete List<X>, where X stands for an unknown subtype of Number.

That's it. The rest of the article expands on these two points.

A generic class or interface can be viewed as a code template -- we can substitute its type-parameters with actual types to generate a concrete class or interface.

    interface List<T>
    {
        int size();
        T get(int);
        void add(T);
        ...

T → Number

    interface List<Number>
    {
        int size();
        Number get(int);
        void add(Number);
        ...

The concrete List<Number> is easy to understand and use, just like any non-generic types.

Such concrete types are mutually exclusive, meaning List<A> and List<B> (A≠B) share no objects. For example, an object cannot be an instance of both List<Number> and List<Integer>.

Every object is an instance of some concrete class type (e.g. ArrayList<Number> ), which has concrete super-classes and concrete super-interfaces (e.g. List<Number> ).

In this article we do not consider raw types (e.g. List); nor are we concerned with type erasure.

Generic method

A generic method can also be viewed as a code template --

    <T> List<T> newList()
    {
        return new ArrayList<T>();
    }

T → Number

    <Number> List<Number> newList()
    {
        return new ArrayList<Number>();
    }

To invoke a generic method, its type-parameters must be substituted with actual types, either explicitly or by inference --

    List<Number> list = this.<Number>newList();

    List<Number> list = this.newList(); // <Number> is inferred
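The two invocation styles can be tried out with a small sketch. The class name NewListDemo is hypothetical, and the method is made static here only so that it can be called from main.

```java
import java.util.ArrayList;
import java.util.List;

final class NewListDemo {
    // Generic method: T is substituted with an actual type at each call site.
    static <T> List<T> newList() {
        return new ArrayList<T>();
    }

    public static void main(String[] args) {
        // T → Number, stated explicitly.
        List<Number> explicit = NewListDemo.<Number>newList();
        // T → Number, inferred from the type of the variable being assigned.
        List<Number> inferred = newList();

        explicit.add(1);
        inferred.add(2.5);
        System.out.println(explicit + " " + inferred);
    }
}
```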

List<Number> is not a supertype of List<Integer> ; they are two independent types, best considered as siblings.

If we want to design a method that adds up a list of numbers, the following method signature won't work very well --

double sum(List<Number> numberList) // add all numbers as `double`

This method can only accept List<Number> arguments, but not List<Integer>, List<Double> etc. To fix that problem, we want to declare numberList with a type that is a supertype of all of List<Number>, List<Integer>, List<Double>, ...

That is exactly what wild types are designed for --

List<? extends Number> is the supertype of all List<X> where X <: Number.

(The symbol "<:" means "is a subtype of". Note that every type is a subtype of itself, e.g. Number <: Number .)

By declaring parameter numberList in the wild supertype, the method now can accept various lists --

double sum(List<? extends Number> numberList)

For example, we can call this method with a List<Integer> argument, precisely because the argument type is a subtype of the parameter type, i.e. List<Integer> <: List<? extends Number> .
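A minimal runnable sketch of such a method follows. The article gives only the signature; the method body and the class name SumDemo are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class SumDemo {
    // Accepts any List<X> where X <: Number.
    static double sum(List<? extends Number> numberList) {
        double total = 0;
        for (Number number : numberList) {
            total += number.doubleValue(); // add all numbers as double
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Double> doubles = Arrays.asList(0.5, 1.5);
        List<Number> numbers = Arrays.<Number>asList(1, 2.5);

        // All three calls compile, because each argument type
        // is a subtype of List<? extends Number>.
        System.out.println(sum(ints));
        System.out.println(sum(doubles));
        System.out.println(sum(numbers));
    }
}
```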

Wild types can be viewed as "horizontal" supertypes, as opposed to "vertical" supertypes through inheritance.

Wild types are primarily used for declaring types of variables, so that a variable may store objects of various concrete types.

Wild types are quite different from concrete types. We cannot instantiate a wild type with new, as in new ArrayList<?>(). We cannot inherit from a wild type, as in interface MyList extends List<?>. We must use concrete types in these places.

Equivalent ways of defining the wild type --

List<X> <: List<? extends Number> iff X <: Number.

an object is a List<? extends Number> iff the object is a List<X> where X <: Number.

Notice how we focus on "wild type" instead of wildcard. A wildcard is only a syntactic component in denoting a wild type; it has no meaning on its own. In particular, a wildcard is not a type; it cannot be used in substitutions -- it makes no sense to call this.<?>newList(), and it makes no sense to generate a "concrete" List<?> from template List<T>.

If an object's static type is List<? extends Number> , what are the instance methods that we can invoke on the object, and what are the signatures of these methods? We cannot answer that by template substitution with wildcard --

    interface List<?>
    {
        ? get(int);     // nonsense
        void add(?);    // nonsense
        ...

However, we do know that the object must be in a concrete type List<X> where X <: Number . At compile time, the exact type of X is unknown, so we use a type-variable X to represent it. The concrete type List<X> looks like

    interface List<X> // X <: Number
    {
        int size();
        X get(int);
        void add(X);
        ...

It would be easier to handle the object as the concrete List<X> -- and that is exactly what the compiler does, in a process called capture conversion -- wherever an object's type is a wild type, it is converted to a concrete type, by replacing each wildcard with a type-variable. By doing that, the compiler only needs to handle objects in concrete types.

For example, given an object numberList in type List<? extends Number> , the compiler converts its type to List<X> where X <: Number . Now we know the object has a method "X get(int)" , therefore we can do

Number number = numberList.get(i); // assign X to Number

The object also has a method add(X) which can be invoked with an X argument. A trivial case is add(null) , since null is assignable to any type. A better example would be an X argument retrieved from get(i) ; we'll discuss that later. Note that we cannot call add(number) with a Number argument, since it's not true that Number <: X .

In this article we use "X" to represent type-variables introduced by capture conversions. Compilers use names like "CAP#1", "capture#1" for those type-variables; we may see these names in compiler messages, for example,

numberList.add(number); // compile error: add(capture#1) is not applicable to Number
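The following sketch puts the legal and illegal calls side by side. The class and method names are hypothetical; the rejected call is left as a comment so that the example compiles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CaptureReadDemo {
    static Number firstThenPad(List<? extends Number> numberList) {
        // get() returns X, and X <: Number, so the assignment is legal.
        Number number = numberList.get(0);

        // add(X) accepts null, because null is assignable to any type.
        numberList.add(null);

        // numberList.add(number); // would not compile: Number is not a subtype of X

        return number;
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(Arrays.asList(7, 8));
        System.out.println(firstThenPad(ints));
        System.out.println(ints.size());
    }
}
```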

The compiler applies capture conversion on every expression that yields a value in wild type.

This is very important to understand, so let's exercise with some mind-numbing examples. In the following code, capture conversion is applied on every numberList expression that is read as a value (on line #3 that is the right-hand side only).

    List<? extends Number> foo(List<? extends Number> numberList)
    {
        for(Number number : numberList)            // #1
            assert numberList.contains(number);    // #2
        numberList = numberList;                   // #3
        return numberList;                         // #4
    }

Line#1 -- The type of numberList is capture converted to List<X1>. It is a subtype of Iterable<X1>, therefore it can be used in the for statement. Variable number can be declared as Number, because X1 <: Number.

Line#2 -- numberList is converted to List<X2>. The compiler searches for the contains() method in List<X2>.

Line#3 -- The right-hand numberList is converted to List<X3> first. Then, the compiler checks whether List<X3> is assignable to the left-hand type. The left-hand numberList denotes a variable, not a value; it's not subject to capture conversion. The assignment is legal because List<X3> <: List<? extends Number>.

Line#4 -- similar to #3, except List<X4> is checked against the return type.

    void bar(List<? extends Number> numberList)
    {
        numberList.stream().map( n->n.intValue() );    // #a
        foo( foo( numberList ) ).stream();             // #b
        numberList.add( numberList.get(0) );           // #c  compile error
    }

Line#a -- numberList is converted to List<Xa>; the stream() method returns Stream<Xa>. In map(n->...), the type of n is inferred as Xa, which inherits the method intValue() from Number.

Line#b involves 3 capture conversions:

• numberList is converted to List<Xb1> first; then, the compiler checks it against the method parameter type.
• foo(numberList)'s type is List<? extends Number>, which is then converted to List<Xb2>.
• The type of foo(foo(numberList)) is converted to List<Xb3>; stream() returns Stream<Xb3>.

Line#c -- There are two numberList expressions; the compiler applies capture conversion on them individually, resulting in List<Xc1> and List<Xc2>. The get() method returns Xc2, and the add() method accepts Xc1. The code fails to compile, because Xc2 <: Xc1 is not true.
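Lines #a and #b can be combined into a small runnable sketch. The class name CaptureStreamDemo and the method intValues are hypothetical; foo is reduced to returning its argument. Line #c is left out because it does not compile.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class CaptureStreamDemo {
    // Same signature as foo in the text; the body just returns the argument.
    static List<? extends Number> foo(List<? extends Number> numberList) {
        return numberList;
    }

    static List<Integer> intValues(List<? extends Number> numberList) {
        // Each wild-typed expression below is captured on its own:
        // numberList, foo(numberList) and foo(foo(numberList)).
        return foo(foo(numberList))
                .stream()                  // Stream<X> for the last captured X
                .map(n -> n.intValue())    // n has type X, which inherits intValue()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Double> doubles = Arrays.asList(1.9, 2.1);
        System.out.println(intValues(doubles));
    }
}
```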

Obviously, we don't do this kind of analysis in everyday coding; it's tedious and pointless. But it is important to understand that this is how it works under the hood, so that we can analyze more complex cases, for example, this case study.

Capture conversion changes types of expressions, which can be surprising; see this case study.

Type-variables introduced by capture conversions are "undenotable" -- their names are given arbitrarily, whether by the compiler (like "CAP#1") or by our mind (like "X1"). They don't have proper names that we can reference in source code.

If we could reference them in source code, it would be very useful in some cases. Imagine if we could do

    void bar(List<? extends Number> numberList)
    {
        // numberList.add( numberList.get(0) ); // compile error

        List<X> list = numberList;  // *imaginary* code
        X number = list.get(0);     // get() returns X
        list.add(number);           // add() accepts X
    }

Fortunately, there is a way to approach that. We can introduce a generic method with named type-variables

    <T extends Number> void bar2(List<T> list)
    {
        T number = list.get(0);
        list.add(number);
    }

then we can call the method as

bar2( numberList );

What's happening is, first, numberList is capture converted to List<X> ; then, the compiler infers T → X for bar2() .

Essentially, the capture helper method bar2 assigns a name to the type-variable introduced by the capture conversion. This technique is useful whenever you find it impossible or unpleasant to work with wild types.
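Here is one way the capture helper pattern might look as a complete program. The names CaptureHelperDemo, duplicateFirst and duplicateFirstHelper are hypothetical; the helper has the same shape as bar2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CaptureHelperDemo {
    // The public-facing signature keeps the wild type.
    static void duplicateFirst(List<? extends Number> numberList) {
        // numberList is captured as List<X>; the compiler infers T → X.
        duplicateFirstHelper(numberList);
    }

    // Capture helper: T gives the captured type-variable a usable name.
    private static <T extends Number> void duplicateFirstHelper(List<T> list) {
        T number = list.get(0); // get() returns T
        list.add(number);       // add() accepts T
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(Arrays.asList(5, 6));
        duplicateFirst(ints);
        System.out.println(ints);
    }
}
```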

See case study Capture Helper.

Capture conversion on wild types depends on bounds of wildcards and bounds of type-parameters.

Bounds of Type-parameters

Every type-parameter has an upper bound, limiting the types it can be substituted with. The default upper bound is Object .

    public interface List<T> // i.e. List<T extends Object>

    public class Fooo<T extends Appendable> implements List<T>
    {

Fooo<StringBuilder> is legal, because StringBuilder <: Appendable . But Fooo<String> would be illegal.
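As a rough illustration of an upper-bounded type-parameter, consider the sketch below. AppendBox is a hypothetical stand-in for Fooo; it does not implement List, which keeps the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class AppendBox<T extends Appendable> {
    private final T target;

    AppendBox(T target) {
        this.target = target;
    }

    T write(String text) {
        try {
            target.append(text); // available because T <: Appendable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        // Legal: StringBuilder <: Appendable.
        AppendBox<StringBuilder> box = new AppendBox<>(new StringBuilder());
        System.out.println(box.write("abc"));

        // AppendBox<String> bad; // would not compile: String is not a subtype of Appendable
    }
}
```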

In Java, type-parameters have no lower bounds.

Bounds of Wildcards

A wildcard is either upper-bounded or lower-bounded, but not both. By default, a wildcard is upper-bounded by Object .

upper-bounded wildcard

Examples

    Fooo<? extends CharSequence>

    List<?>    // i.e. List<? extends Object>

During capture conversion, an upper-bounded wildcard is replaced by a new type-variable, which takes the upper bound of the wildcard, and the upper bound of the type-parameter --

    Fooo<? extends CharSequence>  ⇒c  Fooo<X> where X <: CharSequence & Appendable    // example: X = StringBuilder

    List<?>  ⇒c  List<X> where X <: Object & Object    // i.e. X <: Object, i.e. any X

(We use the symbol "⇒c" to indicate capture conversion.)

lower-bounded wildcard

Examples

    Fooo<? super StringBuilder>

    List<? super String>

During capture conversion, a lower-bounded wildcard is replaced by a new type-variable, which takes the lower bound of the wildcard, and the upper bound of the type-parameter --

    Fooo<? super FileWriter>  ⇒c  Fooo<X> where FileWriter <: X <: Appendable

    List<? super String>  ⇒c  List<X> where String <: X <: Object

Capture conversion explains how we can operate on values of such wild types --

    void test(Fooo<? super FileWriter> fooo, List<? super String> list)
    {
        Appendable a = fooo.get(0);    // returns X1, X1 <: Appendable
        list.add( "abc" );             // add(X2), String <: X2
    }
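The List<? super String> half of that example can be run on its own. The class and method names in this sketch are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LowerBoundDemo {
    static Object addAndPeek(List<? super String> list) {
        // add(X) accepts a String, because String <: X.
        list.add("abc");

        // get() returns X; all we know about it is X <: Object.
        return list.get(list.size() - 1);
    }

    public static void main(String[] args) {
        List<Object> objects = new ArrayList<>();
        List<CharSequence> sequences = new ArrayList<>();

        // Both lists are subtypes of List<? super String>.
        System.out.println(addAndPeek(objects));
        System.out.println(addAndPeek(sequences));
    }
}
```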

Capture Conversion

A wild type may contain one or more wildcard arguments; each wildcard is captured separately --

    Map<String, ? super Integer>  ⇒c  Map<String, X> where Integer <: X

    Function<? super Integer, ? extends CharSequence>  ⇒c  Function<X1,X2> where Integer <: X1 and X2 <: CharSequence

See more examples in case study.
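To make the two-wildcard Function case concrete, here is a small sketch. The names TwoWildcardsDemo and describedLength are hypothetical.

```java
import java.util.function.Function;

final class TwoWildcardsDemo {
    static int describedLength(Function<? super Integer, ? extends CharSequence> f, Integer input) {
        // After capture the type is Function<X1,X2>:
        // apply(X1) accepts an Integer because Integer <: X1,
        // and the result X2 can be assigned to CharSequence because X2 <: CharSequence.
        CharSequence text = f.apply(input);
        return text.length();
    }

    public static void main(String[] args) {
        // A concrete Function<Number, String> fits both wildcard bounds.
        Function<Number, String> label = n -> "n=" + n;
        System.out.println(describedLength(label, 42));
    }
}
```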

When we say "the capture conversion of a wild type", we mean the resulting concrete type. For example, the capture conversion of List<?> is List<X> . The capture conversion of a wild type represents every concrete type in the wild type.

Often we need to test whether one type is a subtype of another type; either type could be a wild type.

concrete <: concrete

A concrete type is a subtype of another concrete type, if there's an inheritance relationship, or if they are the same type.

    List<A> <: List<B> iff A=B

    ArrayList<A> <: List<A> <: Iterable<A>    due to inheritance

concrete <: wild

A concrete type is a subtype of a wild type, if it satisfies the capture conversion. From the previous example,

    Fooo<? super FileWriter>  ⇒c  Fooo<X> where FileWriter <: X <: Appendable

Fooo<Writer> is a subtype of this wild type, because FileWriter <: Writer <: Appendable .

Actually, there is a simpler way: we only need to test the bound of the wildcard --

• G<B> <: G<? extends A> iff B <: A
• G<A> <: G<? super B> iff B <: A

We don't need to test the bound of the type-parameter, because it must be satisfied by the concrete type already.
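The two rules can be checked against the compiler with a sketch like the following, taking G = List, A = Number and B = Integer. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ConcreteToWildDemo {
    // G<B> <: G<? extends A>, because Integer <: Number.
    static List<? extends Number> asProducer(List<Integer> ints) {
        return ints;
    }

    // G<A> <: G<? super B>, because Integer <: Number.
    static List<? super Integer> asConsumer(List<Number> numbers) {
        return numbers;
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>();
        List<Number> numbers = new ArrayList<>();

        System.out.println(asProducer(ints) == ints);
        System.out.println(asConsumer(numbers) == numbers);

        // List<? extends Integer> bad = numbers; // would not compile: Number is not a subtype of Integer
    }
}
```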

wild <: concrete

A wild type is a subtype of a concrete type, if the capture conversion is. That may sound odd, but consider

    ArrayList<?>  ⇒c  ArrayList<X> where X <: Object

ArrayList<?> <: Cloneable only because ArrayList<X> <: Cloneable .

Or consider this example

    public class MyList<T, V> extends ArrayList<T>{}

    MyList<Integer, X> <: Iterable<Integer>
    => MyList<Integer, ?> <: Iterable<Integer>

The capture conversion represents every concrete type in the wild type; if every concrete type is a subtype of some type Z , the wild type must be a subtype of Z . This reasoning applies to the next section too.

wild <: wild

Wild type Wa is a subtype of Wb , if the capture conversion of Wa is a subtype of Wb . See the reasoning in the last section.

For example, List<? extends Exception> <: Iterable<? extends Throwable> , because

    List<? extends Exception>  ⇒c  List<X> where X <: Exception

    List<X> <: Iterable<X> <: Iterable<? extends Throwable>

If both wild types are of the same generic class/interface, we can simply check the bounds of the wildcards

• G<? extends B> <: G<? extends A> if B <: A
• G<? super A> <: G<? super B> if B <: A
• G<? super B> <: G<?> for any B

Note: if, not iff. The converse may not be true; see this case study.

As an example, it's quite obvious that

Function<? super Number, ? extends Exception> <: Function<? super Double, ? extends Throwable>

In more complex cases we'll need to resort to capture conversion to analyze subtyping; see case study.
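The Function example above can be confirmed by letting the compiler perform the conversion in a return statement. WildToWildDemo and widen are hypothetical names.

```java
import java.util.function.Function;

final class WildToWildDemo {
    // Compiles only because the parameter type is a subtype of the return type.
    static Function<? super Double, ? extends Throwable> widen(
            Function<? super Number, ? extends Exception> f) {
        return f;
    }

    public static void main(String[] args) {
        Function<Number, Exception> toError =
                n -> new IllegalStateException(String.valueOf(n));

        Function<? super Double, ? extends Throwable> widened = widen(toError);
        Throwable t = widened.apply(Double.valueOf(1.5));
        System.out.println(t.getMessage());
    }
}
```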

A wild type is a type, therefore it can be used in substitutions. For example, substitute T→Set<?> on List<T>

    interface List<Set<?>>
    {
        int size();
        Set<?> get(int);
        void add(Set<?>);
        ...

List<Set<?>> is a concrete type with well defined methods; it's not a wild type that requires capture conversion.

In particular, List<Set<?>> is not the supertype of all List<Set<X>> ; such supertype does not exist in Java.

Capture conversion applies only on top level wildcards, not on nested wildcards.

If we do want to express, vaguely, a list of sets of numbers, we can use multiple levels of wildcards, e.g.

    List<? extends Set<? extends Number>>  ⇒c  List<X> where X <: Set<? extends Number>

It contains subtypes like ArrayList<HashSet<Integer>> , per subtyping rules.
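A short sketch of a method that takes such a two-level wild type follows. The names NestedWildcardDemo and totalSize are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class NestedWildcardDemo {
    static int totalSize(List<? extends Set<? extends Number>> sets) {
        int total = 0;
        // Only the top-level wildcard is captured; each element
        // is still seen through the wild type Set<? extends Number>.
        for (Set<? extends Number> set : sets) {
            total += set.size();
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<HashSet<Integer>> data = new ArrayList<>();
        data.add(new HashSet<>(Arrays.asList(1, 2)));
        data.add(new HashSet<>(Arrays.asList(3)));

        System.out.println(totalSize(data));

        // List<Set<?>> wrong = data; // would not compile: List<Set<?>> is a concrete type
    }
}
```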

We say that upper-bounded wildcards are co-variant, and lower-bounded wildcards are contra-variant, in the sense that

• B <: A ⇒ G<? extends B> <: G<? extends A>
• B <: A ⇒ G<? super A> <: G<? super B>

Variance is the main reason why wildcards are used profusely in public APIs, especially since Java 8.

Co-variance

Intuitively, a Supplier of Integer is kind of a Supplier of Number, because when it supplies an Integer, it supplies a Number too. We say that Supplier is "intuitively co-variant". Unfortunately in Java, it is not the case that

B <: A ⇒ Supplier<B> <: Supplier<A> // nope!

Therefore we cannot use a Supplier<Integer> where a Supplier<Number> is expected. That is very counter-intuitive.

The workaround is to use upper-bounded wildcard for its co-variant nature

B <: A ⇒ Supplier<? extends B> <: Supplier<? extends A>

Intuitively co-variant types are almost always used with upper-bounded wildcards, particularly in public APIs. If you see a concrete type Supplier<Something> in an API, it is very likely a mistake.
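A minimal sketch of an API that accepts a supplier in the co-variant way is shown below. The names CovariantSupplierDemo and fetch are hypothetical.

```java
import java.util.function.Supplier;

final class CovariantSupplierDemo {
    // Accepts Supplier<Number>, Supplier<Integer>, Supplier<Double>, ...
    static Number fetch(Supplier<? extends Number> supplier) {
        return supplier.get(); // get() returns X, and X <: Number
    }

    // static Number fetchStrict(Supplier<Number> supplier) would reject a Supplier<Integer>.

    public static void main(String[] args) {
        Supplier<Integer> answer = () -> 42;
        System.out.println(fetch(answer));
    }
}
```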

Contra-variance

Intuitively, a Consumer of Number is kind of a Consumer of Integer , because if it can consume any Number , it can consume any Integer too. We say that Consumer is "intuitively contra-variant".

Intuitively contra-variant types are almost always used with lower-bounded wildcards, particularly in public APIs.

B <: A ⇒ Consumer<? super A> <: Consumer<? super B>
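The mirror image for consumers might look like this. The names ContravariantConsumerDemo and feedInts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ContravariantConsumerDemo {
    // Accepts Consumer<Integer>, Consumer<Number>, Consumer<Object>, ...
    static void feedInts(Consumer<? super Integer> consumer) {
        consumer.accept(1); // accept(X) takes an Integer, because Integer <: X
        consumer.accept(2);
    }

    public static void main(String[] args) {
        List<Number> seen = new ArrayList<>();
        Consumer<Number> collector = seen::add; // a consumer of any Number

        feedInts(collector);
        System.out.println(seen);
    }
}
```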

Variance on type-parameters

More precisely speaking, "intuitively variant" is a property on type-parameters -- Supplier<R> is intuitively co-variant on R , Consumer<T> is intuitively contra-variant on T .

Function<T,R> consumes T and supplies R, therefore it is intuitively contra-variant on T and co-variant on R. The Function type is almost always used with two wildcards correspondingly, particularly in public APIs --

Function<? super Foo, ? extends Bar>

Use-site variance

List<T> both consumes T and supplies T ; it is neither intuitively co-variant nor intuitively contra-variant.

But, we can use List<? extends Foo> to use the type in a co-variant sense, i.e. to see the list only as a supplier of Foo .

Or, we can use List<? super Foo> to use the type in a contra-variant sense, i.e. to see the list only as a consumer of Foo .

Therefore in Java, it is the use-site that chooses whether to use a type in a co-variant or contra-variant sense.
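Both choices can appear in one signature. In the sketch below, which uses the hypothetical names UseSiteVarianceDemo and copy, the source list is used only as a supplier and the destination list only as a consumer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UseSiteVarianceDemo {
    static <T> void copy(List<? extends T> source, List<? super T> destination) {
        for (T item : source) {       // source supplies values of type T
            destination.add(item);    // destination consumes values of type T
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2);
        List<Object> objects = new ArrayList<>();

        // The same List interface is used co-variantly for ints
        // and contra-variantly for objects.
        copy(ints, objects);
        System.out.println(objects);
    }
}
```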

Of course, if a type is intuitively co-variant or contra-variant, the use-site generally shouldn't make an opposite choice. It rarely makes sense to write types like Supplier<? super Foo> or Consumer<? extends Foo> .

Java Generics was designed with a heavy bias towards the Collections framework, leading to some controversial decisions, one of which is use-site variance through wildcards. The decision did make a lot of sense back then -- use-site variance is pretty neat for collection interfaces; wildcard usages were moderate and manageable.

Since Java 8, functional interfaces ( Consumer , Function , etc.) are used in a lot of APIs; most of these interfaces are intuitively variant, therefore they are almost always used with wildcards. Consequently, wildcard usages exploded.

The syntax of wildcard is quite ugly and distracting, especially for types that are slightly more complex, for example,

Function<? super A, ? extends Function<? super B, ? extends C>>

What's worse, these wildcards are frivolous -- they are required, yet they convey no real meanings. Use-site variance is of very little value for intuitively variant types; it just becomes a nuisance.

Hopefully, you are not too bothered by the problem. After all, wildcard is simple, despite the appearance.

Maybe we could do something about the appearance; for example, we could render wildcards on javadoc pages like Function<?A, ? Function<?B, ? C>> which is less distracting.

Lastly, contrary to everything we've learned so far, maybe we can get rid of most wildcards from our code, and things will still work out just fine. See this case study and judge for yourself.

Contact: https://groups.google.com/forum/#!forum/java-lang-fans