Valhalla -- finding the primitives

I think it's worth reflecting on how far we've come in Valhalla, both for the specific designs in the VM and language, and the clarity of the basic concepts.

In the early model (Q World), the idea was that we would declare a class as either a value class or a "regular" class, and we would derive various properties based on that:

 - regular classes have identity, value classes do not
 - regular classes are nullable, value classes are not
 - regular classes are reference types, value classes are not

This model was derived from the current relationship between `int` and `Integer`. But, to interoperate with dynamically typed code (such as reflection) and erased generics, we needed reference types, so each value class got a "box" type (denotable as LV at the VM level) which was a reference type. Any interfaces declared on the value class were superinterfaces of the box. There was one runtime class, and `getClass()` returned that, regardless of whether it was invoked on a value or a boxed reference.

This worked, but there were many aspects which were either confusing or unsatisfying. Value classes were neither reference types nor primitives, so we had gone from a type system split cleanly in two (which many people dislike) to one split uncleanly in three (for example, values had the Object methods, but didn't derive from Object, complicating code that was supposed to be generic across values and references alike). Some chafed at the notion that value types could never be nullable; others didn't like that the all-zero value was always a member of the value set, whether or not it had semantic meaning. Some reference types had significant identity, and others (boxes) didn't. And we didn't have a clean story for migration.

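For readers who want the `int` / `Integer` starting point in concrete form, here is a minimal sketch in today's Java (no Valhalla features). The class and method names are illustrative only and do not come from the discussion above; the behavior shown is that of the current language.

```java
// Today's Java only: how the primitive int and its box Integer differ.
// Class and method names are hypothetical, chosen for illustration.
class IntVsInteger {
    // Element of a freshly allocated int[]: the all-zero value.
    static int defaultPrimitive() {
        int[] a = new int[1];
        return a[0];
    }

    // Element of a freshly allocated Integer[]: null, because Integer is a reference type.
    static Integer defaultBox() {
        Integer[] a = new Integer[1];
        return a[0];
    }

    // Runtime class observed once a primitive has been boxed into a reference.
    static Class<?> boxedClass(int i) {
        Object o = i; // autoboxing produces an Integer reference
        return o.getClass();
    }

    public static void main(String[] args) {
        System.out.println(defaultPrimitive());
        System.out.println(defaultBox());
        System.out.println(boxedClass(42));
    }
}
```

The primitive has no null and always includes zero in its value set, while the box is nullable; those are the two properties that Q World tied to the class declaration itself.
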
In the second iteration (L World), we addressed (at the VM level) the need to box in order to access reference-related functionality, by making `QV` a subtype of `LObject`, rationalizing the subtyping relationships between arrays, and replacing the box with a null-adjunction type (LV). This reduced the pressure on migration substantially, but we still hadn't addressed most of the user model issues, including the tripartite nature of the type system, and we created quite a few problems as a result (such as the relationship between the two class mirrors for QV/LV).

For example, to address initialization safety (where the zero value is outside the domain), we explored the notion of zero-default vs null-default inline classes, which involved treating the all-zero value as a null for some value classes but as a zero for others. But we kept finding that we had too many "flavors" of everything, because, in hindsight, the various aspects were not yet cleanly factored down to their primitives.

In the end, it turned out we were conflating a number of distinctions, and kept trying to use one as a proxy for another:

 - nullable vs non-nullable
 - pass-by-reference vs pass-by-value / flattened
 - reference type vs value type
 - identity-ful vs identity-free

For example, we wanted to call classes like String "reference types" and classes like Point "value types", but when we got to types like Object and interfaces, they had one foot in each camp. It turns out that, in the "find the primitive" game, "reference type" wasn't the primitive.

Classes.

The user declares _classes_ ("public class Foo { }"); we derive _types_ from class declarations (Foo, Foo[], etc.). The primitive that Valhalla introduces into class declaration is whether the instances of the class _have identity or not_. Traditional classes are now revealed to be "identity classes"; the new kind (identity-free) are called "inline classes". (This might not be the final word on the subject.)

Types and values.

In the type system we have now, some types contain primitive values, and other types contain _references to objects_. What messed us up for a while is that certain types -- Object and interfaces -- can contain both. A big AHA of the recent iterations is that it makes sense to talk about both _values of_ inline classes and _references to_ those values. Being a reference type has (almost) nothing to do with inline vs identity -- it has to do with whether the value set of the type contains values, or references.

For an identity class C, we derive one type: C, which consists of references to instances of C. For an inline class V, we derive two types: `V.ref`, which is a reference type (and therefore nullable), and contains references to the instances of V, and `V.val`, which is not a reference type, and whose values are "raw" instances of V.

With this understanding, the nullity problem becomes a simpler one: nullity is a property of _reference types_. So `V.ref` is nullable, and `V.val` is not; we don't need a way to say "nullable value" or different ways to interpret the default value. We derive flattening and calling conventions in the same way; for reference types, we always store / pass as-if-by reference, but for "val" types, we store / pass as-if-by value.

It is this refined understanding that has brought me back to the ref/val notation _for the types_. "Inline" is a way of saying "identity free" when declaring classes, but it doesn't say anything (yet) about the semantics of how we represent variables on the heap or pass them on the stack. For this, we need an additional property of the type, and ref vs val seems to ideally describe what we mean -- that the value set of the type consists of either references or values, and the representation/calling conventions behave as if we are storing/passing references or values.

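The derivation rules above can be written down as a small toy model in today's Java. This is only a sketch of the factoring, not Valhalla syntax or any real API: `DerivedTypes`, `ClassKind`, `TypeInfo`, and `derive` are all hypothetical names introduced here. The point it tries to show is that nullability and the storage/calling convention are computed from whether a type is a reference type, never directly from whether the class has identity.

```java
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical names, not Valhalla API) of how types derive from classes.
class DerivedTypes {
    // The one property declared on the class: does it have identity?
    enum ClassKind { IDENTITY, INLINE }

    // A derived type: its name and whether its value set holds references.
    static final class TypeInfo {
        final String name;
        final boolean isReference;

        TypeInfo(String name, boolean isReference) {
            this.name = name;
            this.isReference = isReference;
        }

        // Nullity is a property of reference types, not of the class kind.
        boolean nullable() {
            return isReference;
        }

        // Storage and calling convention follow the same property.
        String convention() {
            return isReference ? "as-if-by-reference" : "as-if-by-value";
        }
    }

    // Identity class C yields one type, C; inline class V yields V.ref and V.val.
    static List<TypeInfo> derive(String className, ClassKind kind) {
        if (kind == ClassKind.IDENTITY) {
            return Arrays.asList(new TypeInfo(className, true));
        }
        return Arrays.asList(
                new TypeInfo(className + ".ref", true),
                new TypeInfo(className + ".val", false));
    }

    public static void main(String[] args) {
        for (TypeInfo t : derive("String", ClassKind.IDENTITY)) {
            System.out.println(t.name + " nullable=" + t.nullable() + " " + t.convention());
        }
        for (TypeInfo t : derive("Point", ClassKind.INLINE)) {
            System.out.println(t.name + " nullable=" + t.nullable() + " " + t.convention());
        }
    }
}
```

In this model there is no separate "nullable value" flavor: asking for a nullable `Point` simply means picking the `.ref` type.
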
(Having come to this clarity about the types, we are free to pick a word other than "inline" if we think there is a better way to say "identity-free", though I don't think going back to "value" is necessarily right.)

With this distinction in place, some previously nasty problems (such as nullity) become trivial (if you want "nullable values", use references), and some previously impossible problems (such as unifying primitives with values) become tractable.