Java 11 String API Updates
The imminent JDK 11 is bringing with it some new features to the String API! Let's take a dive and see what some of those features are and how they work.
It turns out that the upcoming LTS JDK 11 release is bringing a few interesting String API updates to the table.
Let's have a look at them and the interesting facts surrounding them.
String#repeat
One of the coolest additions to the String API is the repeat() method. This method allows concatenating a String with itself a given number of times:
var string = "foo bar ";
var result = string.repeat(2); // "foo bar foo bar " (the trailing space is repeated too)
But I was most excited about trying out the corner cases. If you try to repeat a String 0 times, you will always get an empty String:
@Test
void shouldRepeatZeroTimes() {
    var string = "foo";
    var result = string.repeat(0);
    assertThat(result).isEqualTo("");
}
The same applies to repeating an empty String:
@Test
void shouldRepeatEmpty() {
    var string = "";
    var result = string.repeat(Integer.MAX_VALUE);
    assertThat(result).isEqualTo("");
}
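Two more corner cases are worth a quick check. The sketch below uses plain Java rather than the assertion library from the tests above, and the class and helper names are made up for illustration. It assumes JDK 11 or later and leans on what the implementation listed further down does: a negative count is rejected with an IllegalArgumentException, and a count of 1 hands back the very same instance (an implementation detail, not something to build logic on).

```java
public class RepeatCornerCases {

    // Hypothetical helper: reports whether repeat() rejects the given count.
    static boolean rejects(String s, int count) {
        try {
            s.repeat(count);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Hypothetical helper: true when repeat(1) returns the same object.
    static boolean repeatOnceReturnsSameInstance(String s) {
        return s.repeat(1) == s;
    }

    public static void main(String[] args) {
        System.out.println(rejects("foo", -1));                   // true
        System.out.println(rejects("foo", 0));                    // false
        System.out.println(repeatOnceReturnsSameInstance("foo")); // true
    }
}
```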
It might be tempting to think that it's just relying on a StringBuilder underneath, but that's not the case. The actual implementation is much more resource-efficient:
public String repeat(int count) {
    if (count < 0) {
        throw new IllegalArgumentException("count is negative: " + count);
    }
    if (count == 1) {
        return this;
    }
    final int len = value.length;
    if (len == 0 || count == 0) {
        return "";
    }
    if (len == 1) {
        final byte[] single = new byte[count];
        Arrays.fill(single, value[0]);
        return new String(single, coder);
    }
    if (Integer.MAX_VALUE / count < len) {
        throw new OutOfMemoryError("Repeating " + len + " bytes String " + count +
                " times will produce a String exceeding maximum size.");
    }
    final int limit = len * count;
    final byte[] multiple = new byte[limit];
    System.arraycopy(value, 0, multiple, 0, len);
    int copied = len;
    for (; copied < limit - copied; copied <<= 1) {
        System.arraycopy(multiple, 0, multiple, copied, copied);
    }
    System.arraycopy(multiple, 0, multiple, copied, limit - copied);
    return new String(multiple, coder);
}
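The interesting part is the loop: instead of appending the original bytes count times, it keeps copying the already-filled prefix onto the rest of the array, so the filled region doubles on every pass. Here is a minimal standalone sketch of that doubling idea on a plain byte array. RepeatSketch and repeatBytes are hypothetical names, not JDK API, and I swapped the overflow check for Math.multiplyExact to keep it short.

```java
import java.nio.charset.StandardCharsets;

public class RepeatSketch {

    // Hypothetical helper, not the JDK method: repeats a byte array by doubling.
    static byte[] repeatBytes(byte[] value, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        final int len = value.length;
        if (len == 0 || count == 0) {
            return new byte[0];
        }
        // Assumption: ArithmeticException on overflow is good enough for a sketch.
        final int limit = Math.multiplyExact(len, count);
        final byte[] multiple = new byte[limit];
        // Seed the target with one copy of the input.
        System.arraycopy(value, 0, multiple, 0, len);
        int copied = len;
        // Double the filled prefix while a whole extra copy of it still fits.
        for (; copied < limit - copied; copied <<= 1) {
            System.arraycopy(multiple, 0, multiple, copied, copied);
        }
        // Fill whatever is left with the start of the prefix.
        System.arraycopy(multiple, 0, multiple, copied, limit - copied);
        return multiple;
    }

    public static void main(String[] args) {
        byte[] input = "ab".getBytes(StandardCharsets.ISO_8859_1);
        byte[] result = repeatBytes(input, 5);
        System.out.println(new String(result, StandardCharsets.ISO_8859_1)); // ababababab
    }
}
```

For "ab" repeated 5 times, the filled prefix grows 2, 4, 8 bytes and the last two bytes are topped up at the end, so the number of arraycopy calls grows with the logarithm of count rather than with count itself.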
From the Compressed Strings point of view, the following fragment might look suspicious at first sight (a single-character non-Latin-1 String occupies two bytes), but it's important to remember that value.length is the size of the internal byte array and not the length of the String itself. For such a String, value.length is 2, so the single-byte shortcut is simply not taken:
final int len = value.length;
// ...
if (len == 1) {
    final byte[] single = new byte[count];
    Arrays.fill(single, value[0]);
    return new String(single, coder);
}
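A quick way to convince yourself is to repeat a character from outside Latin-1. The snippet below is only a sketch: NonLatinRepeat and repeatSingle are names I made up, and encoding to UTF-16 merely approximates what the internal array would hold, since the real field is not accessible.

```java
import java.nio.charset.StandardCharsets;

public class NonLatinRepeat {

    // Hypothetical helper: repeats a single-character String.
    static String repeatSingle(char c, int count) {
        return String.valueOf(c).repeat(count);
    }

    public static void main(String[] args) {
        // GREEK CAPITAL LETTER OMEGA, which does not fit into Latin-1
        String omega = String.valueOf('\u03A9');
        // Two bytes per character once encoded as UTF-16
        System.out.println(omega.getBytes(StandardCharsets.UTF_16BE).length); // 2
        // Still three characters after repeating, nothing gets mangled
        System.out.println(repeatSingle('\u03A9', 3).length());               // 3
    }
}
```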
String#isBlank
That one is super straightforward. Now, we can check if a String instance is empty or contains whitespace (defined by Character#isWhitespace(int)) exclusively:
var result = " ".isBlank(); // true
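To make the difference from isEmpty() concrete, here is a small sketch with a few inputs. The class and the describe helper are hypothetical. One detail worth knowing: Character#isWhitespace does not treat the non-breaking space (U+00A0) as whitespace, so a String consisting only of it is not blank.

```java
public class IsBlankExamples {

    // Hypothetical helper: classifies a String using isEmpty() and isBlank().
    static String describe(String s) {
        if (s.isEmpty()) {
            return "empty";
        }
        return s.isBlank() ? "blank" : "has content";
    }

    public static void main(String[] args) {
        System.out.println(describe(""));                       // empty
        System.out.println(describe(" \t\n"));                  // blank
        System.out.println(describe(String.valueOf('\u2000'))); // blank (EN QUAD)
        System.out.println(describe(String.valueOf('\u00A0'))); // has content (non-breaking space)
        System.out.println(describe(" foo "));                  // has content
    }
}
```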
String#strip
We can now easily get rid of the leading and trailing whitespace of a String:
assertThat(" ".strip()).isEmpty();
This one will come in handy to avoid excessive whitespace once Raw Strings arrive in Java.
Additionally, we can narrow the operation to only the leading or only the trailing whitespace:
assertThat(" foo ".stripLeading()).isEqualTo("foo ");
assertThat(" foo ".stripTrailing()).isEqualTo(" foo");
However, you might be asking yourself how this one differs from String#trim.
It turns out that String#strip is a modern Unicode-aware alternative that relies on the same definition of whitespace as String#isBlank, whereas String#trim simply removes every leading and trailing character whose code point is U+0020 or lower.
More details about it can be found straight at the source.
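The two definitions disagree in both directions, which the following sketch tries to show. StripVsTrim and its constants are names I picked for illustration. EN QUAD (U+2000) is Unicode whitespace above U+0020, while START OF HEADING (U+0001) is a control character that Character#isWhitespace does not count as whitespace.

```java
public class StripVsTrim {

    static final String EN_QUAD = String.valueOf('\u2000');          // whitespace, above U+0020
    static final String START_OF_HEADING = String.valueOf('\u0001'); // control char, not whitespace

    public static void main(String[] args) {
        String unicodePadded = EN_QUAD + "foo" + EN_QUAD;
        System.out.println(unicodePadded.trim().length());  // 5, trim() leaves U+2000 alone
        System.out.println(unicodePadded.strip().length()); // 3, strip() removes it

        String controlPadded = START_OF_HEADING + "foo";
        System.out.println(controlPadded.trim().length());  // 3, trim() drops anything up to U+0020
        System.out.println(controlPadded.strip().length()); // 4, strip() keeps the control char
    }
}
```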
String#lines
Using this new method, we can easily split a String instance into a Stream<String> of separate lines:
"foo\nbar".lines().forEach(System.out::println);
// foo
// bar
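It is also worth seeing how lines() treats the different line terminators. In the sketch below (LinesExamples and toLines are hypothetical names) "\n", "\r" and "\r\n" all end a line, a trailing terminator does not produce an extra empty line, and an empty String yields no lines at all.

```java
import java.util.List;
import java.util.stream.Collectors;

public class LinesExamples {

    // Hypothetical helper: collects the lines of a text into a List.
    static List<String> toLines(String text) {
        return text.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(toLines("a\r\nb\rc\n")); // [a, b, c]
        System.out.println(toLines("a\n\nb"));      // [a, , b]
        System.out.println(toLines(""));            // []
    }
}
```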
What's really cool is that, instead of splitting a String and converting the result into a Stream, specialized Spliterators were implemented (one for Latin-1 and one for UTF-16 Strings) that make it possible to stay lazy. This is the Latin-1 one:
private final static class LinesSpliterator implements Spliterator<String> {
    private byte[] value;
    private int index;        // current index, modified on advance/split
    private final int fence;  // one past last index

    LinesSpliterator(byte[] value) {
        this(value, 0, value.length);
    }

    LinesSpliterator(byte[] value, int start, int length) {
        this.value = value;
        this.index = start;
        this.fence = start + length;
    }

    private int indexOfLineSeparator(int start) {
        for (int current = start; current < fence; current++) {
            byte ch = value[current];
            if (ch == '\n' || ch == '\r') {
                return current;
            }
        }
        return fence;
    }

    private int skipLineSeparator(int start) {
        if (start < fence) {
            if (value[start] == '\r') {
                int next = start + 1;
                if (next < fence && value[next] == '\n') {
                    return next + 1;
                }
            }
            return start + 1;
        }
        return fence;
    }

    private String next() {
        int start = index;
        int end = indexOfLineSeparator(start);
        index = skipLineSeparator(end);
        return newString(value, start, end - start);
    }

    @Override
    public boolean tryAdvance(Consumer<? super String> action) {
        if (action == null) {
            throw new NullPointerException("tryAdvance action missing");
        }
        if (index != fence) {
            action.accept(next());
            return true;
        }
        return false;
    }

    @Override
    public void forEachRemaining(Consumer<? super String> action) {
        if (action == null) {
            throw new NullPointerException("forEachRemaining action missing");
        }
        while (index != fence) {
            action.accept(next());
        }
    }

    @Override
    public Spliterator<String> trySplit() {
        int half = (fence + index) >>> 1;
        int mid = skipLineSeparator(indexOfLineSeparator(half));
        if (mid < fence) {
            int start = index;
            index = mid;
            return new LinesSpliterator(value, start, mid - start);
        }
        return null;
    }

    @Override
    public long estimateSize() {
        return fence - index + 1;
    }

    @Override
    public int characteristics() {
        return Spliterator.ORDERED | Spliterator.IMMUTABLE | Spliterator.NONNULL;
    }
}
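The laziness is easy to observe from the outside. In this sketch (LazyLines and linesVisitedByFindFirst are hypothetical names), a peek() stage counts how many lines actually flow through a sequential stream before findFirst() is satisfied. Since tryAdvance only cuts out one line per call, I would expect the rest of the text never to be scanned.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyLines {

    // Hypothetical helper: counts the lines produced before findFirst() returns.
    static int linesVisitedByFindFirst(String text) {
        AtomicInteger visited = new AtomicInteger();
        text.lines()
                .peek(line -> visited.incrementAndGet()) // runs once per produced line
                .findFirst();                            // short-circuits after the first line
        return visited.get();
    }

    public static void main(String[] args) {
        System.out.println(linesVisitedByFindFirst("one\ntwo\nthree")); // 1
        System.out.println(linesVisitedByFindFirst(""));                // 0
    }
}
```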
Sources
Code snippets from this article can be found on GitHub.
Published at DZone with permission of Grzegorz Piwowarek, DZone MVB. See the original article here.