Switch expressions -- gathering the threads

There's been some active discussion on "Is this the switch expression construct we're looking for" over on amber-dev. It's a good time to take stock of where we are, and identify any loose ends.

## Approach

Our approach is driven not merely by the desire to have an expression form of switch, but to make switch more generally useful as a multi-way conditional construct. The biggest driver here, of course, is making it work well with pattern matching. Pattern matching is a driver for better handling of nulls and primitives (though these are also useful on their own); additionally, the more useful we make switch, the more obvious the cumbersomeness of its statement-orientation becomes. Pattern matching also pushes hard on the somewhat unfortunate scoping behavior; a straightforward interpretation of existing scoping of locals in switch would not be very good for pattern bindings.

At first, given all the constraints of existing switches, we thought it unlikely that we'd be able to get away with teaching switch some new tricks, and would have to create a new construct (say, "match"). Bit by bit, though, we were able to chip away at the accidental complexity of the { constants, patterns } x { statement, expression } space, to the point where it seemed practical to unify the construct.

Having a single construct has pros and cons. On the one hand, entities should not be multiplied without necessity; on the other, a one-size-fits-all construct might exhibit schizoid behavior. And the switch statement probably has more unusual (some would say objectionable) behaviors than any other Java construct, putting us in tension between compatibility and perceived complexity.

## Current proposal

The current proposal starts with existing statement switch, extending `break` to support a value, and requiring that the value-ness of the break match the value-ness of the switch (just as return must with methods or lambdas).
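
As a rough illustration of a colon-labeled switch whose arms each complete with a value, here is a minimal sketch. One loud assumption: the proposal's `break e` spelling does not compile in any released Java, so the sketch uses `yield`, the spelling that Java 14 later adopted for the same role, purely so the code runs. The class `BreakValueSketch` and the method `daysIn` are hypothetical names invented for this example.

```java
public class BreakValueSketch {
    // Hypothetical example method; not taken from the proposal itself.
    static int daysIn(String month, boolean leapYear) {
        // The switch is used as an expression, so every arm must produce a value.
        return switch (month) {
            case "FEB":
                // Under the proposal this would read: break leapYear ? 29 : 28;
                yield leapYear ? 29 : 28;
            case "APR":
            case "JUN":
            case "SEP":
            case "NOV":
                // Stacked labels share one value-bearing exit.
                yield 30;
            default:
                yield 31;
        };
    }

    public static void main(String[] args) {
        System.out.println(daysIn("FEB", true));
        System.out.println(daysIn("JUN", false));
    }
}
```

Every exit from the switch carries a value, which is the "value-ness must match" rule in miniature; a bare valueless exit in any arm would be rejected by the compiler.
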
We also slightly adjust the rules regarding nonlocal control flow _through_ a switch.

Because expression switches are expressions, they must be total. For expression switches over enums and sealed types, we have the option to infer a throwing default when all sealed members are provided.

We then offer a shorthand form for case labels in expression switches, that:

    case P -> e;

is shorthand for

    case P: break e;

This leaves the following differences between expression switches and statement switches:

- Expression switches are required to be exhaustive; statement switches cannot be required to be exhaustive.
- Expression switches permit the `->` shorthand form.
- Expression switches may restrict fallthrough in some way, or may not, TBD.
- You can `return` and `continue` out of a statement switch, but not out of an expression switch (like lambdas.)
- You cannot `break` or `continue` _through_ an expression switch (like lambdas and conditionals.)

And leaves some open issues for discussion:

- We have some options as to whether to restrict fallthrough in expression switches, and also whether to restrict fallthrough into patterns.
- We have the option to try and give the `->` form some meaning in statement switches.

## Commentary

The concerns raised so far mostly revolve around potential confusion. Because the two forms are mostly alike, but have subtle differences, the fear is that this will lead to confusion. Various schemes have been suggested to make them look more different, or to make them behave more differently, to make it clearer where the lines are.
For example, the following have been cited:

- Saying `break expression` is ugly, or confusable for a labeled break;
- Concerns that fallthrough-by-default is an even worse default for expression switches than for statements (and, if we restrict fallthrough in switch expression, the gap between the forms grows);
- The asymmetry of the implicit throwing default in apparently-exhaustive enum switches will be a sharp edge;
- That a user might not be able to tell, by looking at the middle of a large switch, whether it's an expression or statement switch;
- The possibility people will write code with mixed label forms (colon and arrow) seems to scare the heck out of people;
- The arrows might confuse people with similarity to lambdas.

My reaction to most of these is "meh". I think the arrow-form is going to be so preferable that the risk of fallthrough will be low (because there are few statements in the first place), and can be lowered further with restrictions; similarly, I think unrestricted mixing of arrow and colon forms will be quite rare (except for the case where there is one catch-all case, often a default, which will take statement form, which seems mostly harmless), and strongly discouraged. And that means that the confusion between expression and statement will be nonexistent -- because the expression ones will have arrows and the statement ones will not.

There are also a number of calls for "If X is rare, just disallow X", where X could be a statement-plus-expression form in expression switches or mixed label forms in one switch. The problem is that they are usually not rare _enough_ that their lack would not cause a different kind of backlash.

#### Some alternatives that have been suggested

**Separate keyword.** Having a separate keyword ("choose") for expression switch seems like it should dispel all the "but people will be confused" issues, but I'm not sure it actually will.
Because the two constructs will still be so similar, the differences will likely still be surprises to people. It is also not a magic wand; we still have to figure out how to deal with statement+expression compounds, and it doesn't automatically rule out the "mixed colons and arrows" problem.

**Block expression.** For the "mixed colons and arrows" problem, several have suggested some sort of ad-hoc, switch-specific block expression, but from a language evolution perspective, I think this is a cure that is worse than the disease. Having an ad-hoc form just for switch is terrible, and adding a general block expression form to the language is not where we want to go -- and doing it to avoid the perception of rampant mixed colons-and-arrows would be killing a dust mite with a napalm blast.

**No colons in expression switch.** Without a block expression, this is a non-starter; there are way too many legitimate uses for compound expressions in expression switches.

**No mixed colons and arrows.** This will be intensely irritating to users; if you add one compound expression in a 50-way switch, you have to change 49 others from the nice form to the nasty one.

## Open issues

The main issue we need to address is whether we want to restrict fallthrough in expression switches (or in the extreme case, prohibit it entirely.)

One argument why fallthrough might be desirable is that some existing statement switches that make use of fallthrough (such as string or packet parsers) could become expression switches; these frequently have a "main result" they want to return (such as the index of the next character), while at the same time recording some side state about the context. Refactoring these to expression switches could be beneficial just as it is for many other statement switches. On the other hand, it would also be reasonable to say we should leave these cases in statement-world where they are now.

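
To make the parser case concrete, here is a small sketch of a fallthrough-based statement switch next to a possible expression-switch refactoring. Everything in it is an assumption made for illustration: the class `FallthroughSketch`, the field `negative` (the "side state"), and the two methods are hypothetical, and the expression form is written with `yield`, the spelling Java 14 adopted in place of the proposal's value-bearing `break`.

```java
public class FallthroughSketch {
    // Hypothetical side state recorded while scanning.
    static boolean negative;

    // Statement form: returns the index of the first character after an optional sign.
    static int skipSignStatement(String s, int i) {
        int next;
        negative = false;
        switch (s.charAt(i)) {
            case '-':
                negative = true;
                // deliberate fallthrough: '-' and '+' both consume one character
            case '+':
                next = i + 1;
                break;
            default:
                next = i;
                break;
        }
        return next;
    }

    // Expression form of the same logic; the "main result" is the switch's value.
    static int skipSignExpression(String s, int i) {
        negative = false;
        return switch (s.charAt(i)) {
            case '-':
                negative = true;
                // deliberate fallthrough into the '+' arm
            case '+':
                yield i + 1;
            default:
                yield i;
        };
    }

    public static void main(String[] args) {
        System.out.println(skipSignStatement("-42", 0) + " " + negative);
        System.out.println(skipSignExpression("+7", 0) + " " + negative);
    }
}
```

The expression form drops the temporary `next` and the explicit breaks, but it only works if fallthrough between colon-labeled arms stays legal in expression switches, which is exactly the open question.
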
A form of fallthrough that I think may be more common in expression switches is when something wants to fall _into_ the default:

    int x = switch (y) {
        case "Foo" -> 1;
        case "Bar" -> 2;
        case null:
        default:
            // handle exceptional case here
    };

Because `default` is not a pattern, we can't say:

    case null, default:

here. (Well, we could make it one.) We could also carve out an exception for such "trivial" fallthrough.

I think a reasonable restriction that might preserve flexibility while avoiding most accidental uses is to make it illegal to fall _into_ an arrow-labeled case; if you want fallthrough, stay in colon-world. (It's impossible to fall _out of_ an arrow case.) Given that most users would rather live in arrow-world, this means that for practical purposes, there's no fallthrough in expression switches at all, but advanced users have a fallback that works just like the switch and fallthrough they've always known.

While it is not specific to expression vs statement switch, we should also ask whether we want to restrict fallthrough into certain kinds of pattern labels (i.e., those without binding variables), even in statement switch. (I don't really see the point, though; I don't see a path to getting rid of the breaks, which would be the real payoff.) Further, because of the intersection rules about OR patterns, it's more likely that an accidental fallthrough from one pattern label to another would result in a compile error anyway.

#### -> in statement switch

Finally, people have asked about whether we should consider allowing `->` for statement switches too (perhaps on the theory that they're kind of like void-valued expression switches.) I see the attraction here -- when the majority of actions are single-line, this would be a winner, and you could drop the breaks.
However, because the distribution of statement count in switch arms is all over the map, this would dramatically increase the prevalence of mixed colon-and-arrow switches, and probably further expose people to the risk of accidental fallthrough, as now break is needed sometimes and not others _in the same statement switch_.
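
For a sense of what arrow labels in a statement switch look like when arms vary in statement count, here is a minimal sketch. It is written in the syntax that Java 14 later shipped, which is an assumption relative to this proposal: there, a multi-statement arm takes a braced block after the arrow rather than switching to a colon label. The class `ArrowStatementSketch` and the method `describe` are hypothetical.

```java
public class ArrowStatementSketch {
    // Hypothetical example: a void-valued switch whose arms are mostly one-liners.
    static String describe(int code) {
        StringBuilder out = new StringBuilder();
        switch (code) {
            // Single-statement arms: no break needed, and no fallthrough possible.
            case 1 -> out.append("one");
            case 2 -> out.append("two");
            // A multi-statement arm needs a block in the shipped syntax.
            default -> {
                out.append("other:");
                out.append(code);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(1));
        System.out.println(describe(7));
    }
}
```

The one-line arms show the attraction; the `default` arm shows where the varying statement count forces a second form into the same switch.
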