[rust-dev] RFC: Overloadable dereference operator

Hi everyone,

I've recently started thinking that a number of use cases that we've wanted to solve at some point could be solved if the dereference operator could be overloaded much like the other operators. Most importantly, this addresses a missing part of the puzzle for custom smart pointers, but it also fixes issues relating to autoderef on newtypes and "common fields".

# Mechanics

We introduce a new lang item trait:

    #[lang="deref"]
    pub trait Deref<Result> {
        fn deref(&'self self) -> &'self Result;
    }

This `deref` method is invoked by the compiler in two cases:

1. When the unary `*` operator is used on a value. In this case, the resulting pointer is automatically dereferenced and becomes an lvalue (albeit an immutable one).

2. When method lookup or field projection fails. In this case, the method lookup or field projection is tried again with the `Result` type.

It would be nice if `Result` were a functional dependency of `Self` above (e.g. simultaneous `impl Deref<int> for Foo` and `impl Deref<float> for Foo` would be forbidden). Unfortunately we don't have the trait machinery to enforce this yet, as this is what associated types would provide. We could just enforce this in an ad hoc way, or we could not enforce it. I don't care too much either way.

# Use cases

There are several use cases that this enables:

## Custom smart pointers

For custom smart pointers it is highly desirable to support autoderef and to have the `*` operator enable access to fields. For example, suppose `@T` becomes `Gc<T>`. We would like to avoid something like:

    let p: Gc<int> = ...;
    do p.read |x| {
        printfln!("Your lucky number is %d", *x)
    }

With overloadable deref it would look like:

    let p: Gc<int> = ...;
    printfln!("Your lucky number is %d", *p)

I *believe* that this does not cause liveness issues for GC and RC because the lifetime of the resulting reference is tied to the lifetime of the GC/RC box itself, so the reference piggybacks off the pointer's reference count and everything is OK.
However, I could be mistaken here, and I'd like others to check my reasoning. In particular I'm also interested in legitimate use cases that this might forbid.

## Controllable newtype autoderef

Currently, newtype structs automatically dereference to the value they contain; for example:

    struct MyInt(int);

    fn main() {
        let x = MyInt(3);
        printfln!("1 + 2 = %s", x.to_str()); // prints "1 + 2 = 3"
    }

This behavior is sometimes undesirable, as Brian often points out. Haskell allows behavior similar to this to be controlled on an opt-in basis with `GeneralizedNewtypeDeriving`. We could support something similar by turning off autoderef for newtype structs and leaning on overloadable dereferencing when it is desirable. In this new world, to get the behavior above one would write:

    struct MyInt(int);

    impl Deref<int> for MyInt {
        fn deref(&'self self) -> &'self int {
            let MyInt(ref inner) = *self;
            inner
        }
    }

We could imagine something like this to make it simpler:

    #[deriving(Deref)]
    struct MyInt(int);

## Anonymous fields

In Go (and in C with Plan 9 extensions) it is possible to place one struct inside another struct and inherit its fields:

    type Foo struct {
        X int
        Y int
    }

    type Bar struct {
        Foo
        Z int
    }

    x := Bar{
        Foo: Foo{X: 1, Y: 2},
        Z:   3,
    }
    fmt.Printf("%d\n", x.Y) // prints 2

This is almost multiple inheritance, except that the type of the `this` pointer will be different when invoking `Foo` methods on a `Bar` instance.

With overloadable deref this would be possible in Rust as well:

    struct Bar {
        base: Foo,
        z: int,
    }

    impl Deref<Foo> for Bar {
        fn deref(&'self self) -> &'self Foo {
            &self.base
        }
    }

One could imagine macro sugar for this use case, for example:

    #[deriving(Deref(base))]
    struct Bar {
        base: Foo,
        z: int,
    }

## Common fields

It is a common pattern, for example in Servo, to simulate inheritance in Rust with something like:

    struct Bar {
        base: FooCommon,
        ...
    }

    struct Baz {
        base: FooCommon,
        ...
    }

    struct Boo {
        base: FooCommon,
        ...
    }

    enum Foo {
        BarClass(~Bar),
        BazClass(~Baz),
        BooClass(~Boo),
    }

The problem here is that if you have a `Foo` instance there is no convenient way to access the common fields short of a `match`. Again, overloadable deref comes to the rescue here. We could imagine an overloaded `Deref` as follows:

    impl Deref<FooCommon> for Foo {
        fn deref(&'self self) -> &'self FooCommon {
            match *self {
                BarClass(ref bar) => &bar.base,
                BazClass(ref baz) => &baz.base,
                BooClass(ref boo) => &boo.base,
            }
        }
    }

And once again we could come up with some sort of syntactic sugar for this.

# Conclusion

This one small feature seems to encompass a lot of use cases which we had previously thought we might have to solve using multiple disparate features. That cleanliness is attractive to me, assuming that this scheme works. I'd be interested to hear everyone's thoughts.
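
For anyone trying the "common fields" idea on a present-day compiler, here is a rough, self-contained sketch. It is not the proposal as written: it assumes the `std::ops::Deref` trait from today's standard library, which takes an associated type `Target` instead of a type parameter (so the functional dependency discussed under Mechanics comes for free), and it uses `Box<T>` in place of `~T`. The `id` field and the `common_id` helper are made up for illustration.

```rust
use std::ops::Deref;

// Hypothetical shared state, standing in for `FooCommon`.
struct FooCommon {
    id: u32, // illustrative field, not from the proposal
}

struct Bar {
    base: FooCommon,
}

struct Baz {
    base: FooCommon,
}

enum Foo {
    BarClass(Box<Bar>),
    BazClass(Box<Baz>),
}

// One `Target` per implementing type, so `Foo` cannot deref to two types.
impl Deref for Foo {
    type Target = FooCommon;

    fn deref(&self) -> &FooCommon {
        match self {
            Foo::BarClass(bar) => &bar.base,
            Foo::BazClass(baz) => &baz.base,
        }
    }
}

// Hypothetical helper: `Foo` has no `id` field, so field projection
// is retried on the `Deref` target, `FooCommon`.
fn common_id(foo: &Foo) -> u32 {
    foo.id
}

fn main() {
    let bar = Foo::BarClass(Box::new(Bar { base: FooCommon { id: 7 } }));
    let baz = Foo::BazClass(Box::new(Baz { base: FooCommon { id: 9 } }));
    println!("{}", common_id(&bar));
    // The explicit unary `*` goes through the same `deref` method.
    println!("{}", (*baz).id);
}
```

Both ways of reaching `id` end up in the `match` inside `deref`, which is the convenience the "Common fields" section is after.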