
When a Blank Isn't Blank


In a world where everything is already implemented by someone, there surely is no need for new Util classes, is there? Well, there are always uncovered edge cases.


I had a funny moment this week when I needed to generate an error message whenever users entered a string of blank spaces. I've previously blogged about utility classes in "Why do we still create Util classes?", so my first port of call was StringUtils.isBlank. But it didn't catch my blank, so I had a case of a blank not being a blank.

Of course, the answer was simple: the whitespace was a No-Break Space (U+00A0). It was generated by a transformation between my instance of CKEditor in the browser and the validation layer.
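An invisible character like this is easiest to spot by printing the code points of the offending input. The following is a minimal sketch of that idea; the class name CodePointDump and the method describe are hypothetical names chosen for illustration, not part of any library.

```java
import java.util.stream.Collectors;

public class CodePointDump {

    // Hypothetical helper: renders every code point of the input as U+XXXX,
    // so that invisible characters become visible in logs or test output.
    public static String describe(final String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // An ordinary space followed by a no-break space.
        System.out.println(describe(" \u00A0")); // prints "U+0020 U+00A0"
    }
}
```

Logging the output of such a helper next to the failing validation makes it clear which kind of space actually arrived.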

Now if we look at the StringUtils.isBlank method:

public static boolean isBlank(final CharSequence cs) {
    int strLen;
    if (cs == null || (strLen = cs.length()) == 0) {
        return true;
    }
    for (int i = 0; i < strLen; i++) {
        if (!Character.isWhitespace(cs.charAt(i))) {
            return false;
        }
    }
    return true;
}


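We see that it ultimately delegates to Character.isWhitespace, and the documentation clearly states that it does not treat the non-breaking spaces ('\u00A0', '\u2007', '\u202F') as whitespace. According to the Javadoc, a character is a Java whitespace character if and only if it satisfies one of the following criteria: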

  • It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
  • It is '\t', U+0009 HORIZONTAL TABULATION.
  • It is '\n', U+000A LINE FEED.
  • It is '\u000B', U+000B VERTICAL TABULATION.
  • It is '\f', U+000C FORM FEED.
  • It is '\r', U+000D CARRIAGE RETURN.
  • It is '\u001C', U+001C FILE SEPARATOR.
  • It is '\u001D', U+001D GROUP SEPARATOR.
  • It is '\u001E', U+001E RECORD SEPARATOR.
  • It is '\u001F', U+001F UNIT SEPARATOR.
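The criteria above can be checked directly. The sketch below contrasts Character.isWhitespace with Character.isSpaceChar, which accepts any Unicode space separator, including the non-breaking ones. The helper isBlankIncludingNbsp is a hypothetical name and one possible workaround under that assumption; it is not part of Commons Lang and not necessarily the fix used in my project.

```java
public class BlankCheck {

    // Hypothetical helper (not part of Commons Lang): a character counts as
    // blank if either Character.isWhitespace or Character.isSpaceChar accepts it.
    public static boolean isBlankIncludingNbsp(final CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return true;
        }
        for (int i = 0; i < cs.length(); i++) {
            final char c = cs.charAt(i);
            if (!Character.isWhitespace(c) && !Character.isSpaceChar(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        final char nbsp = '\u00A0';
        System.out.println(Character.isWhitespace(nbsp));          // false
        System.out.println(Character.isSpaceChar(nbsp));           // true
        System.out.println(isBlankIncludingNbsp("\u00A0 \u00A0")); // true
        System.out.println(isBlankIncludingNbsp("\u00A0a"));       // false
    }
}
```

Checking both predicates keeps the control characters that only isWhitespace recognises (tab, line feed and so on) while also catching the non-breaking spaces that it deliberately excludes.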

So for me, the moral is to keep using Util classes — but occasionally RTFM.

Incidentally, it was caught in the unit testing — so all was well!


