Java 11 String API Updates

The upcoming LTS release, JDK 11, brings a few interesting String API updates to the table.

Let’s have a look at them and the interesting facts surrounding them.

String#repeat

One of the coolest additions to the String API is the repeat() method, which concatenates a String with itself a given number of times:

    var string = "foo bar ";
    var result = string.repeat(2); // foo bar foo bar

What I was most excited about here, though, were the corner cases. If you repeat a String 0 times, you always get an empty String:

    @Test
    void shouldRepeatZeroTimes() {
        var string = "foo";

        var result = string.repeat(0);

        assertThat(result).isEqualTo("");
    }

The same applies to repeating an empty String:

    @Test
    void shouldRepeatEmpty() {
        var string = "";

        var result = string.repeat(Integer.MAX_VALUE);

        assertThat(result).isEqualTo("");
    }
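One more corner case worth trying is a negative count, which the method's Javadoc says is rejected with an IllegalArgumentException. The snippet below is a minimal sketch in plain Java (no test framework); the class name RepeatCornerCases and the helper rejects are made up for illustration.

```java
public class RepeatCornerCases {

    // Hypothetical helper: returns true when repeat(count) rejects the count
    // by throwing an IllegalArgumentException.
    static boolean rejects(String value, int count) {
        try {
            value.repeat(count);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects("foo", -1)); // true
        System.out.println(rejects("foo", 0));  // false
        System.out.println("foo".repeat(1));    // foo
    }
}
```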

It might be tempting to think that it just relies on a StringBuilder underneath, but that’s not the case. The actual implementation is much more resource-efficient:

    public String repeat(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        if (count == 1) {
            return this;
        }
        final int len = value.length;
        if (len == 0 || count == 0) {
            return "";
        }
        if (len == 1) {
            final byte[] single = new byte[count];
            Arrays.fill(single, value[0]);
            return new String(single, coder);
        }
        if (Integer.MAX_VALUE / count < len) {
            throw new OutOfMemoryError("Repeating " + len + " bytes String " + count +
                    " times will produce a String exceeding maximum size.");
        }
        final int limit = len * count;
        final byte[] multiple = new byte[limit];
        System.arraycopy(value, 0, multiple, 0, len);
        int copied = len;
        for (; copied < limit - copied; copied <<= 1) {
            System.arraycopy(multiple, 0, multiple, copied, copied);
        }
        System.arraycopy(multiple, 0, multiple, copied, limit - copied);
        return new String(multiple, coder);
    }
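The interesting part of that method is the doubling loop: each pass copies everything written so far, so the number of arraycopy calls grows logarithmically with the count. Here is a minimal standalone sketch of the same idea on a plain byte array. The names RepeatSketch and repeatBytes are hypothetical, and the overflow check throws an IllegalArgumentException here purely as a simplification.

```java
import java.nio.charset.StandardCharsets;

public class RepeatSketch {

    // Hypothetical helper: repeats `value` `count` times using the same
    // doubling strategy as the JDK method quoted above.
    static byte[] repeatBytes(byte[] value, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        int len = value.length;
        if (len == 0 || count == 0) {
            return new byte[0];
        }
        if (Integer.MAX_VALUE / count < len) {
            // Simplification: the JDK throws OutOfMemoryError at this point.
            throw new IllegalArgumentException("result would be too large");
        }
        int limit = len * count;
        byte[] multiple = new byte[limit];
        // Seed the output with one copy of the input.
        System.arraycopy(value, 0, multiple, 0, len);
        int copied = len;
        // Double the filled prefix while a full doubling still fits.
        for (; copied < limit - copied; copied <<= 1) {
            System.arraycopy(multiple, 0, multiple, copied, copied);
        }
        // Fill whatever remains with the start of the prefix.
        System.arraycopy(multiple, 0, multiple, copied, limit - copied);
        return multiple;
    }

    public static void main(String[] args) {
        byte[] input = "ab".getBytes(StandardCharsets.US_ASCII);
        byte[] result = repeatBytes(input, 5);
        System.out.println(new String(result, StandardCharsets.US_ASCII)); // ababababab
    }
}
```

For "ab" repeated 5 times, the filled prefix grows 2, 4, 8 bytes, and the final copy adds the last 2.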

From the Compressed Strings point of view, the following fragment might look suspicious at first sight (a non-Latin single-character String occupies two bytes), but it’s important to remember that value.length is the size of the internal byte array and not the length of the String itself. A single non-Latin character is stored as two bytes, so len == 1 holds only for a single Latin-1 character:

    final int len = value.length;
    // ...
    if (len == 1) {
        final byte[] single = new byte[count];
        Arrays.fill(single, value[0]);
        return new String(single, coder);
    }
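The internal array is not accessible from user code, so the effect can only be checked indirectly. The sketch below uses the UTF-16LE encoded size as a stand-in for the internal array length (an assumption made for illustration) and confirms that repeating a single non-Latin character still behaves correctly. CompressedRepeat and utf16Bytes are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class CompressedRepeat {

    // Stand-in for the internal array size: the number of bytes the text
    // needs in UTF-16 without a byte order mark. This is an assumption that
    // mirrors the non-compressed layout, not a view into the String itself.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        String cyrillic = "\u0436"; // a single character outside Latin-1
        System.out.println(utf16Bytes(cyrillic));        // 2
        System.out.println(cyrillic.repeat(3).length()); // 3
        System.out.println(cyrillic.repeat(3).equals(cyrillic + cyrillic + cyrillic)); // true
    }
}
```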

String#isBlank

That one is super straightforward: we can now check whether a String instance is empty or contains only whitespace (as defined by Character#isWhitespace(int)):

var result = " ".isBlank(); // true
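To make the whitespace definition concrete, here is a small sketch that compares isBlank() with a hand-rolled check built directly on Character.isWhitespace(int). The class IsBlankSketch and the helper blank are hypothetical, and the helper is meant as an illustration of the definition rather than a copy of the JDK implementation.

```java
public class IsBlankSketch {

    // Hypothetical equivalent of isBlank(): true when every code point
    // is whitespace according to Character.isWhitespace(int).
    static boolean blank(String s) {
        return s.codePoints().allMatch(Character::isWhitespace);
    }

    public static void main(String[] args) {
        // Empty, ASCII whitespace, EM SPACE (U+2003), NO-BREAK SPACE (U+00A0), text.
        String[] samples = {"", " \t\n", "\u2003", "\u00A0", " a "};
        for (String s : samples) {
            System.out.println(s.isBlank() + " " + blank(s));
        }
    }
}
```

Both checks agree on every sample: the first three are blank, while the no-break space and " a " are not, because Character.isWhitespace excludes non-breaking spaces.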

String#strip

We can now easily get rid of all leading and trailing whitespace in a String:

assertThat(" f oo ".strip()).isEqualTo("f oo");

This one will come in handy to avoid excessive whitespace once Raw Strings arrive in Java.

Additionally, we can restrict the operation to leading or trailing whitespace only:

    assertThat(" f oo ".stripLeading()).isEqualTo("f oo ");
    assertThat(" f oo ".stripTrailing()).isEqualTo(" f oo");

However, you might be asking yourself how this one differs from String#trim.

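It turns out that String#strip is a modern Unicode-aware alternative that relies on the same definition of whitespace as String#isBlank, whereas String#trim simply removes every leading and trailing character whose code point is less than or equal to U+0020.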

More details about it can be found straight at the source.
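The difference only shows up outside plain ASCII spaces. The sketch below uses two inputs chosen for illustration: an EM SPACE (U+2003), which is Unicode whitespace but lies above U+0020, and a control character (U+0001), which lies below U+0020 but is not whitespace. StripVsTrim and agree are hypothetical names.

```java
public class StripVsTrim {

    // True when strip() and trim() produce the same result for the input.
    static boolean agree(String s) {
        return s.strip().equals(s.trim());
    }

    public static void main(String[] args) {
        String emSpace = "\u2003foo\u2003"; // EM SPACE on both sides
        String control = "\u0001foo";       // leading START OF HEADING control character

        System.out.println(agree("  foo  "));         // true
        System.out.println(emSpace.strip().length()); // 3
        System.out.println(emSpace.trim().length());  // 5
        System.out.println(control.strip().length()); // 4
        System.out.println(control.trim().length());  // 3
    }
}
```

So strip() removes the EM SPACE that trim() leaves alone, while trim() removes the control character that strip() keeps.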

String#lines

Using this new method, we can easily split a String instance into a Stream<String> of separate lines:

    "foo\nbar".lines().forEach(System.out::println);
    // foo
    // bar
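The method recognizes all three common line terminators (\n, \r, and \r\n) and does not produce a trailing empty line. A minimal sketch of that behavior, where LinesSketch and toLines are hypothetical names used only for this example:

```java
import java.util.List;
import java.util.stream.Collectors;

public class LinesSketch {

    // Hypothetical helper: collects the lines of the text into a List.
    static List<String> toLines(String text) {
        return text.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(toLines("a\r\nb\rc\n")); // [a, b, c]
        System.out.println(toLines("a\n\nb"));      // [a, , b]
        System.out.println(toLines(""));            // []
    }
}
```

Note that an empty line in the middle of the text is preserved as an empty String, while an empty input yields an empty Stream.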

What’s really cool is that instead of splitting a String and converting the result into a Stream, specialized Spliterators were implemented (one for Latin-1 and one for UTF-16 Strings) that make it possible to stay lazy:

    private final static class LinesSpliterator implements Spliterator<String> {
        private byte[] value;
        private int index;        // current index, modified on advance/split
        private final int fence;  // one past last index

        LinesSpliterator(byte[] value) {
            this(value, 0, value.length);
        }

        LinesSpliterator(byte[] value, int start, int length) {
            this.value = value;
            this.index = start;
            this.fence = start + length;
        }

        private int indexOfLineSeparator(int start) {
            for (int current = start; current < fence; current++) {
                byte ch = value[current];
                if (ch == '\n' || ch == '\r') {
                    return current;
                }
            }
            return fence;
        }

        private int skipLineSeparator(int start) {
            if (start < fence) {
                if (value[start] == '\r') {
                    int next = start + 1;
                    if (next < fence && value[next] == '\n') {
                        return next + 1;
                    }
                }
                return start + 1;
            }
            return fence;
        }

        private String next() {
            int start = index;
            int end = indexOfLineSeparator(start);
            index = skipLineSeparator(end);
            return newString(value, start, end - start);
        }

        @Override
        public boolean tryAdvance(Consumer<? super String> action) {
            if (action == null) {
                throw new NullPointerException("tryAdvance action missing");
            }
            if (index != fence) {
                action.accept(next());
                return true;
            }
            return false;
        }

        @Override
        public void forEachRemaining(Consumer<? super String> action) {
            if (action == null) {
                throw new NullPointerException("forEachRemaining action missing");
            }
            while (index != fence) {
                action.accept(next());
            }
        }

        @Override
        public Spliterator<String> trySplit() {
            int half = (fence + index) >>> 1;
            int mid = skipLineSeparator(indexOfLineSeparator(half));
            if (mid < fence) {
                int start = index;
                index = mid;
                return new LinesSpliterator(value, start, mid - start);
            }
            return null;
        }

        @Override
        public long estimateSize() {
            return fence - index + 1;
        }

        @Override
        public int characteristics() {
            return Spliterator.ORDERED | Spliterator.IMMUTABLE | Spliterator.NONNULL;
        }
    }
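The laziness can be observed from the outside. The sketch below counts how many lines pass through the pipeline before a short-circuiting findFirst() returns on a sequential stream. LazyLines and linesVisited are hypothetical names, and the counter via peek is just one way to make the effect visible.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyLines {

    // Counts how many lines flow through the pipeline before findFirst() returns.
    static int linesVisited(String text) {
        AtomicInteger visited = new AtomicInteger();
        Optional<String> first = text.lines()
                .peek(line -> visited.incrementAndGet())
                .findFirst();
        return first.isPresent() ? visited.get() : 0;
    }

    public static void main(String[] args) {
        System.out.println(linesVisited("first\nsecond\nthird")); // 1
    }
}
```

Only the first line is extracted; the rest of the String is never scanned for separators.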

Sources

Code snippets backing this article can be found on GitHub.



