Undocumented Java 16 Feature: The End-of-File Comment

A Unicode escape (\u001a) in Java 16 acts like an end-of-file comment, silently cutting off compilation—an undocumented quirk with real implications.

By Pravin Jain · Jul. 23, 25 · Analysis

While working on some code where I wanted to obscure parts of it using Unicode escapes instead of the actual source, I accidentally stumbled upon an undocumented feature that’s been around since Java 16: what I call the end-of-file comment.

In Java, we typically have three types of comments:

  • Single-line comment: starts with // and runs to the end of the line.
  • Block comment: starts with /* and ends with */. It can span multiple lines.
  • Documentation comment: starts with /** and ends with */. This is a special kind of block comment used by the javadoc tool to generate documentation for classes, methods, fields, constructors, and so on. It must appear right before the element it describes.

End-of-File Comment

The end-of-file comment makes use of the end-of-file character, which causes everything after it in the file to be ignored by the compiler.

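The character in question is U+001A, the ASCII substitute control character (SUB, Ctrl-Z), which some operating systems have historically used as an end-of-file marker. Its Unicode escape is \u001a.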
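To see what kind of character this is, here is a minimal sketch that queries java.lang.Character about U+001A. The class name SubCharInfo and its helper methods are my own illustrative names; the code point is written as a number so the escape sequence itself never appears in the file.

```java
// Inspects U+001A with java.lang.Character and prints what the running JDK reports.
final class SubCharInfo {

    // The code point under discussion, written numerically on purpose.
    static final int SUB = 0x1A;

    // True when the general category is Cc (control).
    static boolean isControl() {
        return Character.getType(SUB) == Character.CONTROL;
    }

    // True when Java counts the character as whitespace.
    static boolean isWhitespace() {
        return Character.isWhitespace(SUB);
    }

    // True when the character may be ignored inside a Java identifier.
    static boolean isIgnorable() {
        return Character.isIdentifierIgnorable(SUB);
    }

    public static void main(String[] args) {
        System.out.println("control character:    " + isControl());
        System.out.println("whitespace:           " + isWhitespace());
        System.out.println("identifier-ignorable: " + isIgnorable());
    }
}
```

The last check matters later in this article: the same character is also one of the identifier-ignorable characters.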

Consider the following sample Hello.java file:

Java
 
class HelloWorld {
    public static void main(String[] args) {
        System.out.printf("Hello world\n");
    }
}
interface HelloInterface {
    public static void main(String[] args) {
        System.out.println("Hello from interface");
    }
}
enum HelloEnum {
HELLO,
HI,
;
public static void main(String[] args) {
        System.out.println("Hello from enum");
    }
}
record HelloRecord() {
    public static void main(String[] args) {
        System.out.println("Hello from record");
    }
}
@interface HelloAnnotation {
}


In the above Hello.java file, if we want to comment out the last two definitions (i.e., HelloRecord and HelloAnnotation), then we can use the end-of-file character before the HelloRecord definition, like so:

Java
 
class HelloWorld {
    public static void main(String[] args) {
        System.out.printf("Hello world\n");
    }
}
interface HelloInterface {
    public static void main(String[] args) {
        System.out.println("Hello from interface");
    }
}
enum HelloEnum {
HELLO,
HI,
;
public static void main(String[] args) {
        System.out.println("Hello from enum");
    }
}
\u001a  The rest of the file content gets commented out

record HelloRecord() {
    public static void main(String[] args) {
        System.out.println("Hello from record");
    }
}
@interface HelloAnnotation {
}


This end-of-file character works as the start of a comment up to the end of the file, starting from Java 16.

Prior to Java 16, this character could only be used as the last character in a Java file, and nothing was accepted beyond it.

In other words, with any Java compiler prior to Java 16, the above code fails with a compilation error at the text that follows the end-of-file character.
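If you want to check this on your own JDK, the following is a rough sketch of a probe, not a definitive test. It writes a small source file to a temporary directory and compiles it with the standard javax.tools API, once with the escape in front of some invalid trailing text and once without it. The names EofCommentProbe, buildSource, and compile are hypothetical, and the escape is assembled from two string pieces so that the probe's own source never contains it.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class EofCommentProbe {

    // Backslash + "u001a", built in two pieces so this file never contains the escape.
    static final String ESCAPE = "\\" + "u001a";

    // A valid class followed by text that is not valid Java,
    // optionally preceded by the escape under test.
    static String buildSource(boolean withEscape) {
        String valid = "class Probe {}\n";
        String trailing = " this trailing text is not valid Java\n";
        return withEscape ? valid + ESCAPE + trailing : valid + trailing;
    }

    // Returns the compiler's exit code (0 means success),
    // or -1 when no system compiler is available (for example, on a JRE).
    static int compile(String source) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            return -1;
        }
        Path dir = Files.createTempDirectory("eof-probe");
        Path file = dir.resolve("Probe.java");
        Files.writeString(file, source);
        // Diagnostics are captured and discarded; only the exit code is reported.
        ByteArrayOutputStream diagnostics = new ByteArrayOutputStream();
        return compiler.run(null, null, diagnostics, "-d", dir.toString(), file.toString());
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Java version:        " + System.getProperty("java.version"));
        System.out.println("with the escape:     exit code " + compile(buildSource(true)));
        System.out.println("without the escape:  exit code " + compile(buildSource(false)));
    }
}
```

The second compilation is the control case: the same trailing text without the escape in front of it. Comparing the two exit codes on different JDK releases shows whether the escape changes the outcome.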

About Keywords and Identifiers

A few interesting observations about Java keywords:

  • Prior to Java 9, all Java keywords were restricted identifiers (i.e., they follow the rules of identifiers but are restricted from being used as identifiers).
  • From Java 9, with the introduction of module definitions in module-info.java, Java restricted usage of some additional identifiers in specific contexts—e.g., module in a module definition. Java preferred to call these restricted keywords.
  • Up to Java 15, more identifiers were restricted in specific contexts. Java referred to these as restricted identifiers, though technically all keywords are also restricted identifiers (they’re restricted in all contexts).

Then an interesting thing happened in Java 16.

non-sealed: The Non-Identifier Keyword in Java 16

In Java 16, keywords were categorized into reserved keywords and contextual keywords.

  • Reserved keywords are restricted from being used as identifiers everywhere.
  • Contextual keywords are restricted in only certain contexts.

One notable addition in Java 16 was the contextual keyword non-sealed. Unlike every other keyword, it isn't a restricted identifier: the hyphen means it cannot be an identifier at all, and lexically it looks more like the expression non - sealed.
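The distinction can be poked at with javax.lang.model.SourceVersion, which ships with the JDK. This is only a sketch: the class KeywordKinds and its describe helper are hypothetical names, and it relies on my reading of the documentation that isKeyword covers reserved keywords and literals, while isName accepts contextual keywords.

```java
import javax.lang.model.SourceVersion;

final class KeywordKinds {

    // Reports, for the latest source version, whether the token is treated as a
    // keyword and whether it is usable as a name.
    static String describe(String token) {
        return token
                + ": isKeyword=" + SourceVersion.isKeyword(token)
                + ", isName=" + SourceVersion.isName(token);
    }

    public static void main(String[] args) {
        // A reserved keyword, two contextual keywords, and the hyphenated one.
        for (String token : new String[] {"class", "sealed", "record", "non-sealed"}) {
            System.out.println(describe(token));
        }
    }
}
```

Whatever a given JDK reports for isKeyword on non-sealed, isName rejects it, because a hyphen can never be part of an identifier.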

Processing Identifier-Ignorable Characters

Another thing to note: when processing identifier-ignorable characters, the compiler treats all keywords as identifiers, and non-sealed is treated as two identifiers. In other words, identifier-ignorable characters are allowed inside keywords.

For example:

Java
 
instance\u00adof


Here \u00ad is the Unicode escape for the soft hyphen, one of the identifier-ignorable characters, so the line above is equivalent to writing instanceof.
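The equivalence can be mimicked in ordinary code. The sketch below is my own illustration and not the compiler's implementation: a hypothetical strip helper drops every code point that Character.isIdentifierIgnorable accepts, which turns the soft-hyphen spelling back into the plain keyword text.

```java
final class IgnorableStrip {

    // Removes every identifier-ignorable code point from the text.
    static String strip(String text) {
        return text.codePoints()
                .filter(cp -> !Character.isIdentifierIgnorable(cp))
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        // The string literal contains a soft hyphen between "instance" and "of".
        String spelled = "instance\u00adof";
        System.out.println("length before: " + spelled.length());
        System.out.println("length after:  " + strip(spelled).length());
        System.out.println("equals keyword text: " + strip(spelled).equals("instanceof"));
    }
}
```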

The identifier-ignorable characters are discussed and listed in the article Charsets and Unicode Identifiers in Java.

All these characters are valid as a Java identifier part (Character.isJavaIdentifierPart), but not as a Java identifier start (Character.isJavaIdentifierStart).

This can be checked with the following code:

Java
 
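import java.util.stream.IntStream;

// Every identifier-ignorable code point is a valid identifier part.
boolean allAreParts = IntStream.rangeClosed(0, Character.MAX_CODE_POINT)
        .filter(Character::isIdentifierIgnorable)
        .allMatch(Character::isJavaIdentifierPart);   // true

Java
 
// No identifier-ignorable code point is a valid identifier start.
boolean anyIsStart = IntStream.rangeClosed(0, Character.MAX_CODE_POINT)
        .filter(Character::isIdentifierIgnorable)
        .anyMatch(Character::isJavaIdentifierStart);  // false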


So, in the case of non-sealed, an identifier-ignorable character is not allowed in two places:

  1. At the beginning
  2. At the beginning of sealed in non-sealed

Acceptable:

Java
 
no\u00adn-sealed


Not acceptable:

Java
 
\u00adnon-sealed
non-\u00adsealed
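These three cases follow from the start/part rule once non-sealed is read as the two identifiers non and sealed. Here is a small sketch of that reasoning, using hypothetical helpers isIdentifierShaped and bothHalvesValid; it models the rule described above and is not how javac itself is written.

```java
final class IdentifierShape {

    // True when the text starts with an identifier-start code point and
    // continues only with identifier-part code points.
    static boolean isIdentifierShaped(String text) {
        if (text.isEmpty()) {
            return false;
        }
        if (!Character.isJavaIdentifierStart(text.codePointAt(0))) {
            return false;
        }
        return text.codePoints().skip(1).allMatch(Character::isJavaIdentifierPart);
    }

    // Splits the spelling at the first hyphen and checks each half separately.
    static boolean bothHalvesValid(String spelling) {
        String[] halves = spelling.split("-", 2);
        return halves.length == 2
                && isIdentifierShaped(halves[0])
                && isIdentifierShaped(halves[1]);
    }

    public static void main(String[] args) {
        // Soft hyphen inside the first half.
        System.out.println(bothHalvesValid("no\u00adn-sealed"));
        // Soft hyphen at the very beginning.
        System.out.println(bothHalvesValid("\u00adnon-sealed"));
        // Soft hyphen at the beginning of the second half.
        System.out.println(bothHalvesValid("non-\u00adsealed"));
    }
}
```

A soft hyphen is accepted anywhere except as the first character of either half, which matches the acceptable and not acceptable spellings listed above.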


It seems this undocumented behavior, the end-of-file comment, may have been unintentionally introduced while Java was addressing the contextual keyword non-sealed, which is the only keyword that isn’t a restricted identifier.

Conclusion

Starting with Java 16, the Unicode end-of-file character (\u001a) began behaving in a way that causes the compiler to ignore the rest of the file after its occurrence. This isn't technically a comment, but it acts similarly by truncating further compilation. The change appears to be a side effect of internal updates to Java’s parsing and keyword handling, particularly related to contextual keywords like non-sealed. While useful in niche cases, this behavior remains undocumented and isn't compatible with earlier Java versions.

Because it allows developers to bypass compilation for any content following the character, it can affect code clarity, debugging, and various developer tools. Java maintainers may eventually need to either formalize this behavior by updating the Java Language Specification (JLS) or clarify its unintended status and consider restricting or deprecating it in future versions to avoid confusion or accidental misuse.
