I want to [finish my discussion of associated items][pp] by taking a look at how they are handled in Haskell, and what that might mean in Rust.

These proposals have the same descriptive power as what I described before, but they are backwards compatible. This is nice.

Object-oriented style name resolution

In the object-oriented, C++-like version of associated items that I introduced before, the names of associated items and methods were resolved relative to a type. To see what I mean by this, consider a (slightly expanded) variant the graph example I introduced before:

trait Graph { type Node; // associated type static K: uint; // associated constant } mod graph { fn depth_first_search<G: Graph>( graph: &G) -> ~[G::Node] { let k = G::K; ... } }

Consider a path like graph::depth_first_search , which names an item within a module. This kind of path is based solely on the module hierarchy and can be resolved without knowing anything types whatsoever.

Now consider the paths G::Node and G::K that appear in depth_first_search . These paths look similar to in form to graph::depth_first_search , but they are resolved quite differently. Because G is a type and not a module, if we want to figure out what names Node and K refer to, we don’t examine the module hierarchy but rather the properties of the type G . In this case, G is a type parameter that implements the Graph trait, and the Graph trait defines an associated type Node and an associated constant K .

Note that the name lookup process here is exactly analogous to what happens on a method call. With an expression like a.b() , the meaning of the name b is resolved by first examining the type of the expression a and then checking to see what methods that type offers. The module hierarchy is not consulted.

The object-oriented style of naming specification is not fully explicit. In particular, a path like G::Node does not specify the trait in which Node was defined, the compiler must figure it out. It is also possible that the type G implements multiple traits that have an associated item Node , so the syntax could be ambiguous. To make things fully explicit, I proposed in my previous post that the full syntax would be Type::(Trait::Item) . So the fully explicit form of G::Node would be G::(Graph::Node) , since the type Node is defined in the trait Graph .

Functional-style name resolution (take 1)

In Haskell, all name resolution is done based on lexical scoping and the module hierarchy. This is no accident. It means that the Haskell compiler can figure out what each name in a program refers to without knowing anything about the types involved, which is helpful when performing aggressive type inference.

What this means is that we can’t use a path like G::Node to mean “the type Node relative to the type G ”, because interpreting this path would require examining the definition of the type G (as we saw before). Instead, if we were to use a syntax analogous to what Haskell uses, one would write something like Graph::Node<G> . Note that all the names here ( Graph::Node , G ) can be resolved using only the module hierarchy.

So the example I gave before would look as follows:

trait Graph { type Node; // associated type static K: uint; // associated constant } mod graph { fn depth_first_search<G: Graph>( graph: &G) -> ~[Graph::Node<G>] { let k = Graph::K::<G>; ... } }

Note that where before we wrote G::K to refer to the constant K associated with the type G , we would now write Graph::K::<G> . As is typical in Rust, the extra :: that appears before the type parameter <G> is necessary to avoid parsing ambiguities when the path appears as part of an expression.

Let’s look a bit more closely at what’s going on here. Effectively what is happening is that, for each associated item within a trait, we are adding a synthetic type parameter. For any reference to an associated item, this type parameter tells the compiler which type is implementing the trait. The path Graph::Node by itself is not complete; Graph::Node<G> means “the type Node defined for the type G ”.

Let’s dig into it this Haskell-style convention a bit to see some of the implications.

Return type inference

One benefit of the Haskell style convention is that the values for the type parameters can often be deduced by inference. For example, let’s return to the trait FromStr from my [previous post][pp]. The trait FromStr is used to parse a string and produce a value of some other type:

trait FromStr { fn parse(input: &str) -> Self; }

We might implement FromStr for unsigned integers as follows:

impl FromStr for uint { fn parse(input: &str) -> uint { uint::parse(input, 10) // 10 is the radix } }

Now we could write a function that invokes parse() like so:

fn parse_strings(v: &[&str]) -> ~[uint] { v.map(|s| FromStr::parse(*s)) }

Note that when we called FromStr::parse(*s) , we did not say what type it should parse to. The compiler was able to infer that we wanted to parse a string into a uint based on the return type of parse_strings() as a whole. A fully explicit version of parse_strings would look like:

fn parse_strings(v: &[&str]) -> ~[uint] { v.map(|s| FromStr::parse::<uint>(*s)) // ^~~~~~ specify return type }

Generic traits

Imagine that we have a generic trait, like this Add trait:

trait Add<Rhs> { type Sum; fn add(&self, r: &Rhs) -> Sum<Self, Rhs>; }

This trait is very similar to the trait used in Rust to implement operator overloading, except it has been adapted to use an associated type for the Sum (which is probably how the Rust type should be defined as well, since the type of the sum ought to be determined by the types of the things being added).

Previously, with the associated type Node , we said that any reference to node had to include a single type parameter to indicate the type that was implementing Graph . But with a generic trait like Add a simple type parameter is not enough. To fully specify all the types involved, we need to include both the Self type and any type parameters. This is why the return type of the method add() is Sum<Self, Rhs> —a mere reference to Sum or Sum<Self> would be incomplete.

Comparison to object-oriented form. Interestingly, this case is something that the object-oriented style of naming cannot handle very well. This is because the object-oriented convention is strongly oriented towards specifying the Self type but does not easily expand to accommodate generic traits. Using the fully explicit syntax that I suggested in my [previous post][pp], I think the result would look like Self::(Add<Rhs>::Sum) . (It’s plausible that, within the trait definition itself, one could simply write Sum , but from outside the trait definition I think it would be necessary to specify the full type parameters).

Generic items

It is also possible to have an associated item which itself has type parameters. For example, we might want to have a graph where kind of node can carry its own userdata:

trait Graph { type Node<B>; fn get_node_userdata<B>(n: &Node<Self, B>) -> B; }

Here we see that when we refer to the Node type in get_node_userdata , we specify both the Self type parameter and the type parameters defined on Node itself. I think this is a bit surprising.

Comparison to object-oriented form. The object-oriented naming scheme handles this case very naturally. For example, get_node_userdata() would be declared as follows:

fn get_node_userdata<B>(n: &Self::Node<B>) -> B;

Functional-style name resolution (take 2)

In the previous section we added implicit type parameters to each associated item. Particularly in the cases of generic traits or generic items, this can be a bit confusing. You wind up mixing type parameters that were declared on the trait together with type parameters declared on the item.

An alternative that would be a bit more explicit is to (1) designate the implicit parameter for the trait’s Self type using a special keyword, such as for or self (I prefer for since it echoes the impl Trait for Type form) and (2) push the trait type parameters into the path itself. So instead of writing Graph::Node<G> you would write Graph::Node<for G> , and instead of Add::Sum<Lhs, Rhs> you would write Add<Rhs>::Sum<for Lhs> . You’ll see more examples of how this looks in the next section.

Conclusion: Comparing the conventions

I think none of these conventions is perfect. Each has cases where it is a bit counterintuitive or ugly. To try and make the comparison easier, I’m going to create a table summarizing the object-oriented, functional 1, and functional 2 styles, and show how each syntax looks for each of the use cases I identified in this post. For each use case, I’ll provide both the shortest possible form and the fully explicit variant.