DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Coding
  3. Java
  4. New Methods on Java Strings With JDK 11

New Methods on Java Strings With JDK 11

JDK 11 will probably be bringing a few new methods to the String class! Most of them focus on redefining white space for a more consistent approach.

Dustin Marx user avatar by
Dustin Marx
·
May. 03, 18 · News
Like (9)
Save
Tweet
Share
27.83K Views

Join the DZone community and get the full member experience.

Join For Free

It appears likely that Java's String class will be gaining some new methods with JDK 11, expected to be released in September 2018.

BUG # BUG TITLE NEW String METHOD DESCRIPTION
JDK-8200425 String::lines lines() "String instance method that uses a specialized Spliterator to lazily provide lines from the source string."
JDK-8200378 String::strip, String::stripLeading, String::stripTrailing strip() "Unicode-aware" evolution of trim()
stripLeading() "removal of Unicode white space from the beginning"
stripTrailing() "removal of Unicode white space from the ... end"
JDK-8200437 String::isBlank isBlank() "instance method that returns true if the string is empty or contains only white space"


Evidence of the progress that has been made related to these methods can be found in messages requesting "compatibility and specification reviews" (CSR) on the core-libs-dev mailing list:

  • Please review CSR: JDK-8200425 String#lines (25 April 2018)
  • Please review CSR: JDK-8200378 String#strip, String#stripLeading, String#stripTrailing (25 April 2018)
  • Please review CSR: JDK-8200425 String#lines (25 April 2018)

A common characteristic of four of these five new methods is that they use a different (newer) definition of "white space" than did old methods such as String.trim(). Bug JDK-8200373 ["String::trim JavaDoc should clarify meaning of space"] even addresses this for the String.trim() method (mailing list review request):

The current JavaDoc for String::trim does not make it clear which definition of "space" is being used in the code. With additional trimming methods coming in the near future that use a different definition of space, clarification is imperative. String::trim uses the definition of space as any codepoint that is less than or equal to the space character codepoint (\u0040.) Newer trimming methods will use the definition of (white) space as any codepoint that returns true when passed to the Character::isWhitespace predicate.

The method isWhitespace(char) was added to Character with JDK 1.1, but the method isWhitespace(int) was not introduced to the Character class until JDK 1.5. The latter method (the one accepting a parameter of type int) was added to support supplementary characters. The Javadoc comments for the Character class define supplementary characters (typically modeled with int-based "code point") versus BMP characters (typically modeled with single character):

The set of characters from U+0000 to U+FFFF is sometimes referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater than U+FFFF are called supplementary characters. The Java platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values ... A char value, therefore, represents Basic Multilingual Plane (BMP) code points, including the surrogate code points, or code units of the UTF-16 encoding. An int value represents all Unicode code points, including supplementary code points. ... The methods that only accept a char value cannot support supplementary characters. ... The methods that accept an int value support all Unicode characters, including supplementary characters.

I added the emphasis in the above quote to emphasize the significance of a "code point," which is defined for the Java context as "a value that can be used in a coded character set". Four of the five proposed new methods for String in JDK 11 rely heavily on the concept embodied in Character.isWhitespace(int) to determine how to "trim" a given string or when determining if a given string is "blank."

Speaking of Unicode, JEP 327 ["Unicode 10"] has been proposed to be added to JDK 11 as well. As that JEP states, its intent is to "upgrade existing platform APIs to support version 10.0 of the Unicode Standard." This will be especially exciting news for anyone wishing to work with the "56 new emoji characters" supported by this new version.

Conclusion

The new methods on String currently proposed for JDK 11 provide a more consistent approach to handling white space in strings that can better handle internationalization, provide methods for trimming white space only at the beginning of the string or at the end of the string, and provide a method especially intended for coming raw string literals.

Additional References

  • Java's String.trim has a strange idea of whitespace
  • Java Tutorial: Unicode
  • Supplementary Characters in the Java Platform
  • Unicode Chart
  • STR01-J. Do not assume that a Java char fully represents a Unicode code point
  • StackOverflow.com: Removing whitespace from strings in Java
  • Checking for Null or Empty or White Space Only String in Java
  • JEP 327: Unicode 10
Java (programming language) Java Development Kit Strings

Published at DZone with permission of Dustin Marx, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Why It Is Important To Have an Ownership as a DevOps Engineer
  • How to Secure Your CI/CD Pipeline
  • How To Use Terraform to Provision an AWS EC2 Instance
  • How Do the Docker Client and Docker Servers Work?

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: