Puzzler #1: Colons, Arrows, Braces, Break, Yield
Which of the following lines can occur in a switch? (Not all necessarily in the same one.) And in which kind of switch?
case 5: if (Math.random() < 0.5) break;
Sure. This is legal in a classic switch statement. Half the time, execution falls through to the next case. Don’t code like this at home.
case 5 -> log("TGIF"); yield "Friday";
No. This is a switch expression without fall through. The -> must be followed by an expression, a throw statement, or a block. It would be ok if you enclosed the code following the -> in braces:
case 5 -> { log("TGIF"); yield "Friday"; }
case 5: log("TGIF"); yield "Friday";
Yes. This is a switch expression with fall through. This branch doesn’t fall through, actually, but yields the expression’s value. No braces needed because, colon.
case 5 -> { if (Math.random() < 0.5) break; log("TGIF"); }
All good. This is a switch statement without fall through. Half the time, the call to log is skipped. Don’t code like this at home. Note that the braces are necessary.
Did you get all four right? Congratulations! You earned a partridge in a pear tree. Skip the next section and move on to puzzler #2.
Principle #1: Two Axes
The classic switch of the C language had a simple purpose: to be compiled into a “jump table” that holds the memory addresses of the code for each case. The value of the “selector”, the expression inside switch (...), is used as a table index, either as an offset or with a binary search. That is more efficient than a linear if/else if/else if/else branch sequence, particularly if the number of cases is large. In a high-level language, there is no way to code a jump table directly. Hence the switch statement.
Many programmers, when learning switch, were warned of the weirdness of “fall through”. By default, execution flows from one case to the next. Of course it does. That’s how jump tables work. They only care about the efficient jump. If you don’t want to fall through, just add a break, which is compiled into a jump to the end.
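Here is a minimal sketch of fall through in a classic switch statement. The class and method names are made up for illustration; the method records which case bodies actually run for a given selector.

```java
import java.util.ArrayList;
import java.util.List;

final class FallThroughDemo {
    // Collects a label for every case body that executes.
    static List<String> visited(int n) {
        List<String> log = new ArrayList<>();
        switch (n) {
            case 0:
                log.add("zero");
                // no break: execution falls through into case 1
            case 1:
                log.add("one");
                break; // jump to the end of the switch
            case 2:
                log.add("two");
                break;
            default:
                log.add("other");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(visited(0)); // [zero, one]
        System.out.println(visited(1)); // [one]
        System.out.println(visited(7)); // [other]
    }
}
```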
Thirty years later, many modern programming languages support pattern matching. In its simplest form, using Java syntax:
String seasonName = switch (seasonCode) {
    case 0 -> "Spring";
    case 1 -> "Summer";
    case 2 -> "Fall";
    case 3 -> "Winter";
    default -> "???";
};
There are two crucial differences:
- Each case yields a value
- The branches are disjoint; there is no fall through
Are these differences crucial enough to come up with a different syntax for pattern matching? The Java designers didn’t think so. This is what they wrote in JEP 361:
“By teasing the desired benefits (expression-ness, better control flow, saner scoping) into orthogonal features, switch expressions and switch statements could have more in common. The greater the divergence between switch expressions and switch statements, the more complex the language is to learn, and the more sharp edges there are for developers to cut themselves on.”
So, now we have four forms of switch:
- The classic switch statement, unchanged from Java 1.0. With fall through.
- Expression switch with no fall through: the crisp, clean form that you just saw, with -> value after each case.
- A modern switch statement with no fall through.
- For completeness, expression switch with fall through. Why would you ever want that??? You probably don’t. Except if one of the cases has a side effect, such as the logging call above. Then turn all arrows into colons, and add yield in each case. Hopefully your IDE can help you with that rewrite.
For a switch to be an expression, it must be in expression position: assigned to a variable or passed as a method argument. Also, if you see break, you know it must be a statement. And if you see yield, it must be an expression.
The colon : denotes classic fall-through. The -> indicates no fall-through. Mercifully, you can’t mix them in the same switch.
After a colon, you can have any number of statements. As always. With a switch expression, there must be one or more yield statements.
Conversely, after an arrow, there can only be an expression, or throw, or a block. In a switch expression, that block must have a yield.
Caution: Some programmers think that -> signals an expression switch because it looks like a lambda expression. And because it must be followed by an expression or block. That is not so. A no-fall-through switch statement uses case ... -> { ... }.
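The two axes (statement or expression, colon or arrow) can be laid out side by side. The following is only a sketch with invented method names, assuming Java 14 or later; all four methods compute the same string so that the syntax is the only difference.

```java
final class FourSwitches {
    // 1. Classic statement: colon, falls through unless you break.
    static String classicStatement(int code) {
        String name;
        switch (code) {
            case 0: name = "Spring"; break;
            case 1: name = "Summer"; break;
            default: name = "???";
        }
        return name;
    }

    // 2. Expression without fall through: arrow, each case yields a value.
    static String arrowExpression(int code) {
        return switch (code) {
            case 0 -> "Spring";
            case 1 -> "Summer";
            default -> "???";
        };
    }

    // 3. Statement without fall through: arrow, no value, no break needed.
    static String arrowStatement(int code) {
        StringBuilder name = new StringBuilder();
        switch (code) {
            case 0 -> name.append("Spring");
            case 1 -> name.append("Summer");
            default -> name.append("???");
        }
        return name.toString();
    }

    // 4. Expression with colons: every path must end in yield (or throw).
    static String colonExpression(int code) {
        return switch (code) {
            case 0: yield "Spring";
            case 1: yield "Summer";
            default: yield "???";
        };
    }

    public static void main(String[] args) {
        for (int code = 0; code <= 2; code++) {
            System.out.println(classicStatement(code) + " " + arrowExpression(code)
                + " " + arrowStatement(code) + " " + colonExpression(code));
        }
    }
}
```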
Puzzler #2
Is this legal?
Object x = ...;
String result = switch (x) {
    case "" -> "empty";
    case 0 -> "zero";
    default -> "something else";
};
No. Constant labels of type java.lang.String and of type int are not compatible with the switch selector type Object.
What about
enum Size { SMALL, MEDIUM, LARGE, EXTRA_LARGE }
Object x = ...;
String result = switch (x) {
    case Size.EXTRA_LARGE -> "extra large";
    default -> "something else";
};
Perfectly legal.
Why isn’t it like the preceding code snippet? The constant label has type Size, and the switch selector type is Object.
The rules are different for enum case constants. Their value must merely be assignment compatible to the selector type.
Principle #2: Selector types
The selector types of switch have expanded over time:
- Java 1.0: int, short, byte, char
- Java 5: Integer, Short, Byte, Character
- Java 5: enum
- Java 7: String
- Java 21 (in preview since Java 17): any reference type, pattern cases
- Still to come: float, double, long, boolean
With pre-pattern matching switches, a constant case label must be a compile-time constant, and it must be assignment-compatible to the selector expression type. For example, you can have case 5 when the selector type is Integer.
With pattern-matching switches, the rules are different and complex. When the selector type is Object or some other supertype of String, Integer, Short, Byte, or Character, you can’t have constant labels. For example,
case 0 -> "zero";
won’t work when the selector type is something other than int, short, byte, char, Integer, Short, Byte, Character.
The remedy is:
case Integer i when i == 0 -> "zero";
But for enum, the rules have evolved differently. First off, the rules have changed for the case constants. Previously, you wrote
case EXTRA_LARGE -> "extra large";
The enum type was inferred from the selector type. Since the selector type can now be a supertype, you can qualify enum names:
case Size.EXTRA_LARGE -> "extra large";
You can use qualified names even if you don’t have to, with an enum selector type.
More importantly, you are allowed to use enum constants in case labels alongside patterns. This is useful for pattern matching in a sealed hierarchy where some of the implementing classes are enumerations, such as in this (incomplete) JSON primitive type hierarchy:
sealed interface JSONPrimitive permits JSONNumber, JSONString, JSONBoolean {}
final record JSONNumber(double value) implements JSONPrimitive {}
final record JSONString(String value) implements JSONPrimitive {}
enum JSONBoolean implements JSONPrimitive { FALSE, TRUE; }

JSONPrimitive p = ...;
result = switch (p) {
    case JSONNumber(var v) when v == 0 -> "zero";
    case JSONString(var s) when s.isEmpty() -> "empty";
    case JSONBoolean.FALSE -> "false";
    default -> "something else";
};
Finally, note that constants are not allowed inside record patterns. For example, you cannot use
case JSONNumber(0) -> "zero";
You can use a when clause, as in the preceding example. Nicer syntax may come in the future.
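The selector rules above can be tried out with a small self-contained sketch. It assumes Java 21 or later; the wrapper class and the method names describeJson and describeObject are invented for this example.

```java
final class SelectorDemo {
    sealed interface JSONPrimitive permits JSONNumber, JSONString, JSONBoolean {}
    record JSONNumber(double value) implements JSONPrimitive {}
    record JSONString(String value) implements JSONPrimitive {}
    enum JSONBoolean implements JSONPrimitive { FALSE, TRUE }

    // Record patterns with guards, plus a qualified enum constant as a case label.
    static String describeJson(JSONPrimitive p) {
        return switch (p) {
            case JSONNumber(var v) when v == 0 -> "zero";
            case JSONString(var s) when s.isEmpty() -> "empty";
            case JSONBoolean.FALSE -> "false";
            default -> "something else";
        };
    }

    // With an Object selector, numeric and string constants are replaced by guarded type patterns.
    static String describeObject(Object x) {
        return switch (x) {
            case Integer i when i == 0 -> "zero";
            case String s when s.isEmpty() -> "empty";
            default -> "something else";
        };
    }

    public static void main(String[] args) {
        System.out.println(describeJson(new JSONNumber(0)));   // zero
        System.out.println(describeJson(JSONBoolean.TRUE));    // something else
        System.out.println(describeObject(""));                // empty
    }
}
```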
Puzzler #3
Looking again at this (incomplete) JSON primitive type hierarchy:
sealed interface JSONPrimitive permits JSONNumber, JSONString, JSONBoolean {}
final record JSONNumber(double value) implements JSONPrimitive {}
final record JSONString(String value) implements JSONPrimitive {}
enum JSONBoolean implements JSONPrimitive { FALSE, TRUE; }
compare
if (j instanceof JSONNumber(var v)) d = "" + v;
else if (j instanceof JSONString(var s)) d = s;
else if (j instanceof JSONBoolean b) d = b.name();
and
switch (j) {
    case JSONNumber(var v): d = "" + v; break;
    case JSONString(var s): d = s; break;
    case JSONBoolean b: d = b.name(); break;
}
Do they do exactly the same thing? If no, for which value of j do they differ?
By design, pattern matching behaves the same for instanceof and for switch, including the binding of the match to a variable (v, s, or b in the example).
But there is one crucial difference. For historical reasons, instanceof is null-friendly. The expression null instanceof ... is simply false. But switch is null-hostile: switch (null) { ... } throws a NullPointerException.
So, the answer is: the two statements have the same effect except when j is null.
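A quick way to see the difference is to run both forms with a null argument. This sketch assumes Java 21 or later and trims the hierarchy to two records to stay short; the class and method names, and the "unset" placeholder, are invented for the example.

```java
final class NullSelectorDemo {
    sealed interface JSONPrimitive permits JSONNumber, JSONString {}
    record JSONNumber(double value) implements JSONPrimitive {}
    record JSONString(String value) implements JSONPrimitive {}

    // instanceof is null-friendly: no branch matches, d keeps its initial value.
    static String viaInstanceof(JSONPrimitive j) {
        String d = "unset";
        if (j instanceof JSONNumber(var v)) d = "" + v;
        else if (j instanceof JSONString(var s)) d = s;
        return d;
    }

    // switch is null-hostile: a null selector throws a NullPointerException.
    static String viaSwitch(JSONPrimitive j) {
        String d = "unset";
        switch (j) {
            case JSONNumber(var v): d = "" + v; break;
            case JSONString(var s): d = s; break;
        }
        return d;
    }

    public static void main(String[] args) {
        System.out.println(viaInstanceof(null)); // unset
        try {
            viaSwitch(null);
        } catch (NullPointerException e) {
            System.out.println("switch threw a NullPointerException");
        }
    }
}
```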
Knowing this, let’s move on to record patterns:
record Box<T>(T contents) { }
Box<String> boxed = null;
String unboxed = switch (boxed) {
    case Box(String s) -> s;
};
What happens?
A NullPointerException. No surprise.
What about
Box<String> boxed = new Box<>(null);
String unboxed = switch (boxed) {
    case Box(String s) -> s;
};
No problem. s is bound to null, and unboxed becomes null.
What about
Box<Box<String>> doubleBoxed = new Box<>(null);
String unboxed = switch (doubleBoxed) {
    case Box(Box(String s)) -> s;
};
An implicit mechanism tries to match Box(null) with Box(b), where b must in turn match Box(String s), and then set s = b.contents(). Since b is null and a record pattern never matches null, the match is deemed to fail, and there are no further matching cases. Therefore, a MatchException is thrown. Not a NullPointerException.
Principle #3: Null
To nobody’s surprise, null is always a cause of grief. In Java 1.0, switch was only defined for primitive types, so null wasn’t an issue. When wrappers were added, it made sense to say that null was exceptional. When enum was added in Java 5, that still made sense. Why would an enum value ever be null? And with switching on strings in Java 7, there was no reason to rock the boat either. A switch with a null selector simply throws a NullPointerException.
But with pattern matching, it was decided that it would be ugly to surround switch with checks against null, and a case null was allowed. For example:
String unboxed = switch (boxed) {
    case Box(String s) -> s;
    case null -> "empty";
};
Note that when boxed is null, the first case is not a match: a record pattern never matches null. That explains the doubleBoxed puzzler.
You can combine case null with default, but not with any other case:
case null, default -> "something else"; // Ok
case null, 0 -> "nullish"; // ERROR
Adding case null to any switch makes the switch null-friendly, but it also turns it into a “modern” switch, which has more stringent requirements than its classic cousin. See the following sections.
Puzzler #4
Compare the following two uses of switch. Which one is incorrect, and why?
int x = ...;
String d = switch (x) {
    case 0 -> "zero";
    case 1, 2, 3 -> "small";
};
switch (x) {
    case 0: d = "zero"; break;
    case 1, 2, 3: d = "small"; break;
}
The first switch, an expression, won’t compile. It is not exhaustive. If x is something other than 0, 1, 2, 3, it can’t produce a value.
The second switch, a classic statement, doesn’t have to be exhaustive. If x is something other than 0, 1, 2, 3, nothing happens.
Ok, now what about
Integer x = ...;
String d = "";
switch (x) {
    case 0: d = "zero"; break;
    case 1, 2, 3: d = "small"; break;
    case null: d = "null"; break;
}
This switch statement doesn’t compile. It is not exhaustive.
Wait…since when do switch statements have to be exhaustive? If you are surprised, read on.
Principle #4: Exhaustiveness
All switch expressions must be exhaustive. For any selector value, there must be a matching case. This is necessary since the expression must always yield a value.
Classic switch statements need not be exhaustive. But “modern” switch statements have to be. If you mean to do nothing when none of the cases match, add a default: break; or default -> {}
A switch is modern if it has a type pattern, record pattern, or case null.
Note that cases with when clauses are ignored for exhaustiveness checking (unless the when clause is a compile-time constant). This switch is not exhaustive:
Integer x = ...;
String d = switch (x) {
    case 0 -> "zero";
    case Integer n when n > 0 -> "positive";
    case Integer n when n < 0 -> "negative";
};
The compiler isn’t a mathematician. It doesn’t try to reason that every integer must be zero, positive, or negative.
Remedy: case Integer _ or default in the last clause.
Exhaustiveness is particularly useful with sealed hierarchies:
String d = switch (j) {
    case JSONNumber(var v) -> "" + v;
    case JSONString(var v) -> v;
    case JSONBoolean.FALSE -> "false";
}; // oops--what about JSONBoolean.TRUE?
Finally, note that null is never used in exhaustiveness checking. A switch can be exhaustive without case null. It is just null-hostile and throws an NPE with a null selector. Or a MatchException when there is a nested null in a record.
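For comparison, here are exhaustive versions of the two switches in this section, as a sketch that assumes Java 21 or later. The class and method names are invented. The first method ends with an unguarded type pattern, the second lists both enum constants, so neither needs a default.

```java
final class ExhaustiveDemo {
    sealed interface JSONPrimitive permits JSONNumber, JSONString, JSONBoolean {}
    record JSONNumber(double value) implements JSONPrimitive {}
    record JSONString(String value) implements JSONPrimitive {}
    enum JSONBoolean implements JSONPrimitive { FALSE, TRUE }

    // The guarded case does not count toward exhaustiveness; the last, unguarded case does.
    static String sign(Integer x) {
        return switch (x) {
            case 0 -> "zero";
            case Integer n when n > 0 -> "positive";
            case Integer n -> "negative";
        };
    }

    // Every permitted subtype is covered, including both enum constants.
    static String show(JSONPrimitive j) {
        return switch (j) {
            case JSONNumber(var v) -> "" + v;
            case JSONString(var s) -> s;
            case JSONBoolean.FALSE -> "false";
            case JSONBoolean.TRUE -> "true";
        };
    }

    public static void main(String[] args) {
        System.out.println(sign(-5));                  // negative
        System.out.println(show(JSONBoolean.TRUE));    // true
    }
}
```

If a new class is later added to the permits clause, show stops compiling until a matching case is added, which is the point of leaving out default.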
Puzzler #5
What is wrong with this switch?
String d = switch (obj) {
    case Number n -> "a number";
    case Integer i -> "an integer";
    default -> "something else";
};
With type and record patterns, order matters. The first case dominates the second. That is a compile-time error.
What about
Integer x = 0;
String d = switch (x) {
    case Integer i when i > 0 -> "positive";
    default -> "negative";
    case 0 -> "zero";
};
It’s perfectly fine. For historical reasons, default has inconsistent dominance rules. Read on for the details.
Principle #5: Dominance
Type and record patterns are processed top to bottom. The compiler generates an error if one case dominates the other. For example:
case Number n
dominates
case Integer i
and
case Number n when n.intValue() == 0
The record pattern
case Box(var b)
dominates
case Box(JSONString(var s))
As with exhaustiveness checking, the contents of when clauses are not analyzed (unless they are compile-time constants). The compiler can’t tell that
case Number n when n.intValue() >= 0
dominates
case Number n when n.intValue() == 0
The default clause must come after any patterns. But for historical reasons, it can come before constant cases.
With classic switch statements, the order of the cases doesn’t matter, except when there is fall through. Because you can fall through from the default clause, it can be anywhere:
switch (n) {
    case 0: log("zero"); break;
    default: log("ignore the next log entry"); // FALL THROUGH
    case 1: log("one"); break;
}
I couldn’t think of a realistic example where this behavior would be useful. Just put default last.
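Both ordering rules can be checked with a short sketch, assuming Java 21 or later. The names are invented, and a list of strings stands in for the log calls. The first method puts the more specific type pattern first, which is the order the compiler accepts; the second reproduces the classic switch with default in the middle.

```java
import java.util.ArrayList;
import java.util.List;

final class DominanceDemo {
    // Integer before Number: swapping these two cases is a compile-time error.
    static String classify(Object obj) {
        return switch (obj) {
            case Integer i -> "an integer";
            case Number n -> "a number";
            default -> "something else";
        };
    }

    // Classic statement: default may sit anywhere and can fall through.
    static List<String> logFor(int n) {
        List<String> log = new ArrayList<>();
        switch (n) {
            case 0: log.add("zero"); break;
            default: log.add("ignore the next log entry"); // FALL THROUGH
            case 1: log.add("one"); break;
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(classify(42));   // an integer
        System.out.println(classify(4.2));  // a number
        System.out.println(logFor(5));      // [ignore the next log entry, one]
    }
}
```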
Puzzler #6
Can you declare variables with the same name in different cases?
switch (n) {
    case 0, 1: String d = "binary"; log(d); break;
    default: String d = "not binary"; log(d); break;
}
Since Java 1.0, it has been legal to declare a variable inside a switch. The scope extends from the point of declaration until the end of the switch.
Therefore, the switch above doesn’t compile. The variable d is declared twice. Remedy: Use braces to confine d to a block.
What about variables introduced in patterns?
JSONPrimitive j = ...;
String d;
switch (j) {
    case JSONNumber(var v): d = "" + v; break;
    case JSONString(var v): d = v; break;
    case JSONBoolean v: d = v.name(); break;
}
This switch compiles. The scope of each pattern variable v extends to the end of the statements in the case.
Principle #6: Variable Scopes
There are three ways of declaring variables inside a switch:
- Inside a block: { var a = ...; ... }. These are unsurprising. The scope ends with the block.
- Inside a pattern: case JSONNumber(var v). The scope starts with the declaration, so you can use it in guards: case JSONNumber(var v) when v >= 0. The scope is confined to the case.
- In a statement following a colon of a case. This is a weird historical artifact. More below.
Ever since the switch statement in the C programming language, it has been legal to declare a variable anywhere in the switch. Its scope extends to the end of the statement. After all, the case labels are just jump targets. This is perfectly legal:
int n = ...;
switch (n) {
    case 0, 1: String d = "binary"; log(d); // FALL THROUGH
    default: d = "default"; log(d);
}
Note that the default branch must assign something to d before using it. Otherwise, the compiler reports an error about a possibly uninitialized variable.
Because of the tracking of uninitialized variables, such switch-scoped variables are never useful. I have only seen them in certification exam questions. Just stick to block-scoped and pattern variables.
The alert reader may wonder what happens with fall-through into a pattern:
case Integer n: log(n); // FALL THROUGH
case String s: log(s.length()); break; // ERROR
This is an error. When falling through from case Integer n, it is impossible to bind the selector value to s. But you can fall into a type pattern that does not bind the match to a variable:
case Integer n: log(n); // FALL THROUGH
case String _: log("string"); break; // Ok
Don’t code like that at home!
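The three kinds of scope can be compared in one place. This is a sketch with invented names, assuming Java 21 or later; each method returns a string instead of logging so that the result is easy to inspect.

```java
final class ScopeDemo {
    // Block scope: each d lives only inside its own braces.
    static String blockScoped(int n) {
        String result;
        switch (n) {
            case 0, 1: { String d = "binary"; result = d; break; }
            default: { String d = "not binary"; result = d; break; }
        }
        return result;
    }

    // Pattern scope: each v is confined to its case, guard included.
    static String patternScoped(Object o) {
        return switch (o) {
            case Integer v when v >= 0 -> "non-negative int " + v;
            case Integer v -> "negative int " + v;
            case String v -> "string of length " + v.length();
            default -> "other";
        };
    }

    // Switch scope: d is declared in the first case but still in scope in default.
    static String switchScoped(int n) {
        String result = "";
        switch (n) {
            case 0, 1:
                String d = "binary";
                result += d;
                // FALL THROUGH
            default:
                d = "default"; // must assign: d is not definitely assigned when jumping here
                result += d;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(blockScoped(1));     // binary
        System.out.println(patternScoped(-3));  // negative int -3
        System.out.println(switchScoped(0));    // binarydefault
    }
}
```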
Conclusion
Pattern matching has the potential to make code easier to read, particularly when working with sealed type hierarchies that are designed with pattern matching in mind. This is common practice in functional programming languages. I imagine it will become much more common in Java when we have efficient value objects.
Java has chosen to incorporate pattern matching into the classic switch and instanceof syntax. That leverages programmer experience in straightforward cases. But it can create confusion in edge cases, as you can probably confirm from your performance on those puzzlers. (If you got them all correct, award yourself five gold rings.)
To keep out of trouble, I send you these six geese a-laying rules of thumb:
- Don’t use fall through. It saves you a lot of grief and complexity!
- Use switch expressions, not statements.
- Have a case null unless you really want a null selector to throw an NPE.
- Put default at the end.
- Sort your cases so that the most specific ones come first (in particular, constants).
- Don’t use switch-scoped variables.