Java Virtual Machine Internals, Part 2: Class File Format

Learn more about JVM internals and the class file format.

In this series of articles, I’ll be discussing how the Java Virtual Machine works. In Part 1, we looked at the ClassLoader subsystem of the Java Virtual Machine. In this article, we are going to talk about the class file format.

As most of us already know, all source code written in Java is first compiled into bytecodes (instructions for the Java Virtual Machine) by the javac compiler provided in the Java Development Kit. Bytecodes are saved in a binary file format called the class file format. These class files are then loaded into memory dynamically (only when required) by the class loader component of the Java Virtual Machine. In simple interpreter mode, the Java execution engine executes these bytecodes one by one on the host machine's CPU.

Every file with a .java extension is compiled into at least one .class file. There is one .class file for each class, interface, and module defined in the source code. This applies to nested classes and interfaces too.

Note: For simplicity, files with the .class extension are called class files here.

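Let’s write a simple program:

public class ClassOne {
    public static void main(String[] args) {
        System.out.println("Hello world");
    }
    // A static nested class gets its own class file.
    static class StaticNestedClass {}
}
// Non-public top-level types in the same source file.
class ClassTwo {}
interface InterfaceOne {}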

Running javac on this file produces the following files:

ClassOne.class
ClassOne$StaticNestedClass.class
ClassTwo.class
InterfaceOne.class

As you can see, one class file is created for each class and interface defined in the source file.
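The same naming rule can be observed from inside a running program. The sketch below is only an illustration: the class NameDemo, its nested class, and the helper classFileName are hypothetical names introduced here. It relies on Class.getName(), which returns the binary name that also determines the class file name.

```java
// Hypothetical demo class; the names here are illustrative only.
final class NameDemo {

    // A static nested class, compiled into its own class file.
    static class StaticNestedClass {}

    // Derives the class file name from a class's binary name by
    // dropping the package prefix and appending ".class".
    static String classFileName(Class<?> type) {
        String binaryName = type.getName();
        return binaryName.substring(binaryName.lastIndexOf('.') + 1) + ".class";
    }

    public static void main(String[] args) {
        System.out.println(classFileName(NameDemo.class));
        System.out.println(classFileName(StaticNestedClass.class));
        System.out.println(classFileName(java.util.Map.Entry.class));
    }
}
```

Nested types show up with a $ between the outer and inner names, which is why a nested class never collides with a top-level class of the same simple name.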

What’s Inside the Class File?

The class file is a binary file format. Information is written to the class file with no space or padding between consecutive pieces of information; everything is aligned on byte boundaries. All 16-bit and 32-bit quantities are constructed by reading two and four consecutive 8-bit bytes, respectively, in big-endian order (most significant byte first).
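To make the byte-level layout concrete, here is a minimal sketch of how a 16-bit and a 32-bit quantity can be assembled from consecutive bytes, most significant byte first. The class BigEndianReader and its methods are hypothetical helpers written for this article, and the sample bytes are chosen purely for illustration.

```java
// Hypothetical helper; class and method names are illustrative only.
final class BigEndianReader {

    // Builds an unsigned 16-bit value from two consecutive bytes,
    // most significant byte first.
    static int readU2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    // Builds an unsigned 32-bit value from four consecutive bytes.
    // The result is held in a long so the top bit is not read as a sign bit.
    static long readU4(byte[] data, int offset) {
        return ((long) readU2(data, offset) << 16) | readU2(data, offset + 2);
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x37};
        System.out.println(Long.toHexString(readU4(sample, 0))); // cafebabe
        System.out.println(readU2(sample, 4));                    // 55
    }
}
```

The masking with 0xFF matters because Java bytes are signed; without it, a byte such as 0xCA would be sign-extended and corrupt the result.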

A class file contains the following information:

Magic number: The first four bytes of every class file are always 0xCAFEBABE. These four bytes distinguish the class file format from other file formats.

Major and minor version: The next four bytes of the class file hold the minor version number followed by the major version number, two bytes each. Together, a major and a minor version number determine the version of the class file format. If a class file has major version number M and minor version number m, we denote the version of its class file format as M.m.

Every JVM has a maximum version it can load, and it rejects class files with later versions. For example, Java 11 supports major versions 45 to 55, and Java 12 supports major versions 45 to 56.

Constant Pool: This is a heterogeneous table of structures that represent string constants, class and interface names, field names, method names, and other constants referred to within the ClassFile structure and its substructures. Each entry of the constant pool starts with a one-byte tag specifying the type of constant at that position in the table. Depending on the type of constant, the bytes that follow hold either the constant value or a reference to another entry in the pool.

Access flags: A bit mask of flags that tells you, among other things, whether this class or interface is public or package-private and whether the class is final or can be extended. Flags such as ACC_PUBLIC, ACC_FINAL, ACC_INTERFACE, and ACC_ENUM are defined in the JVM Specification.

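This class: An index into the constant pool. The entry at that index names the class or interface defined by this class file.

Super class: An index into the constant pool. The entry at that index names the direct superclass. The index is zero for java.lang.Object, which has no superclass.

Interfaces: A count of the interfaces directly implemented by this class (or extended by this interface), followed by one constant pool index for each of them.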

Field Count: Counts the number of fields in this class or interface.

Fields: Following the field count is a table of variable-length structures, one for each field, describing the field's name and type (each a reference to a constant pool entry).

Method Count: Counts the number of methods in the class or interface. This count includes only the methods declared by this class itself, including constructors, and not any methods that may be inherited from superclasses.

Methods: Following the method count are the methods themselves. The structure for each method contains several pieces of information about the method, including the method descriptor (which encodes its return type and argument list), the number of words required for the method’s local variables, the maximum number of stack words required for the method’s operand stack, a table of exceptions caught by the method, the bytecode sequence, and a line number table.

Attribute Count: Counts the number of attributes in this class, interface, or module.

Attributes: Following the attribute count is a table of variable-length structures, one describing each attribute. For example, one attribute is the SourceFile attribute; it reveals the name of the source file from which this class file was compiled.
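The first part of this layout can be walked with a few dozen lines of code. The following is a rough sketch rather than a complete parser: the class ClassFileSummary and its members are hypothetical names, it stops after the interface count, and it assumes names are plain ASCII so that ordinary UTF-8 decoding is good enough. The constant pool tag numbers and entry sizes are taken from the JVM Specification.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: reads a class file only up to the interface count.
// Fields, methods, and attributes are not parsed.
final class ClassFileSummary {
    final int minor;
    final int major;
    final int constantPoolCount;
    final int accessFlags;
    final String thisClass;
    final String superClass; // null when the super_class index is 0
    final int interfaceCount;

    private ClassFileSummary(int minor, int major, int constantPoolCount, int accessFlags,
                             String thisClass, String superClass, int interfaceCount) {
        this.minor = minor;
        this.major = major;
        this.constantPoolCount = constantPoolCount;
        this.accessFlags = accessFlags;
        this.thisClass = thisClass;
        this.superClass = superClass;
        this.interfaceCount = interfaceCount;
    }

    // Reads an unsigned 16-bit value; ByteBuffer is big-endian by default.
    private static int u2(ByteBuffer in) {
        return in.getShort() & 0xFFFF;
    }

    static ClassFileSummary parse(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("missing 0xCAFEBABE magic number");
        }
        int minor = u2(in);
        int major = u2(in);
        int count = u2(in);
        String[] utf8 = new String[count];     // text of Utf8 entries
        int[] classNameIndex = new int[count]; // name index of Class entries
        for (int i = 1; i < count; i++) {      // index 0 is never used
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1: { // Utf8: two-byte length, then that many bytes
                    byte[] text = new byte[u2(in)];
                    in.get(text);
                    // Assumption: plain UTF-8 decoding is enough for ASCII names.
                    utf8[i] = new String(text, StandardCharsets.UTF_8);
                    break;
                }
                case 7: // Class: index of the Utf8 entry holding its name
                    classNameIndex[i] = u2(in);
                    break;
                case 8: case 16: case 19: case 20: // String, MethodType, Module, Package
                    in.position(in.position() + 2);
                    break;
                case 15: // MethodHandle
                    in.position(in.position() + 3);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.position(in.position() + 4); // Integer, Float, refs, NameAndType, dynamic
                    break;
                case 5: case 6: // Long and Double occupy two pool slots
                    in.position(in.position() + 8);
                    i++;
                    break;
                default:
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
        }
        int accessFlags = u2(in);
        String thisClass = utf8[classNameIndex[u2(in)]];
        int superIndex = u2(in);
        String superClass = superIndex == 0 ? null : utf8[classNameIndex[superIndex]];
        int interfaceCount = u2(in);
        return new ClassFileSummary(minor, major, count, accessFlags,
                thisClass, superClass, interfaceCount);
    }

    // Names the class-level flags that are set in accessFlags.
    List<String> flagNames() {
        String[] names = {"ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
                "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM", "ACC_MODULE"};
        int[] masks = {0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000, 0x8000};
        List<String> result = new ArrayList<>();
        for (int i = 0; i < masks.length; i++) {
            if ((accessFlags & masks[i]) != 0) {
                result.add(names[i]);
            }
        }
        return result;
    }
}
```

One way to try it is to pass in the bytes of any compiled class file, for example those returned by java.nio.file.Files.readAllBytes. Class entries are resolved only after the whole pool has been read because a Class entry may refer to a Utf8 entry that appears later in the table.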

Although the class file format is not human-readable, the JDK provides a tool called javap, which disassembles a class file and prints its contents in a readable format.

Let’s write a simple Java program:

package bytecode;
import java.io.Serializable;

public class HelloWorld implements Serializable, Cloneable {

    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}

Let’s compile this program using the javac command, which will produce a HelloWorld.class file. Then, we will use the javap tool to disassemble the HelloWorld.class file. Using javap with -v (verbose) on HelloWorld.class produces the following output:

Classfile /Users/apersiankite/Documents/code_practice/java_practice/target/classes/bytecode/HelloWorld.class
  Last modified 02-Jul-2019; size 606 bytes
  MD5 checksum 6442d93b955c2e249619a1bade6d5b98
  Compiled from "HelloWorld.java"
public class bytecode.HelloWorld implements java.io.Serializable,java.lang.Cloneable
  minor version: 0
  major version: 55
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #5                          // bytecode/HelloWorld
  super_class: #6                         // java/lang/Object
  interfaces: 2, fields: 0, methods: 2, attributes: 1
Constant pool:
   #1 = Methodref          #6.#22         // java/lang/Object."<init>":()V
   #2 = Fieldref           #23.#24        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = String             #25            // Hello World
   #4 = Methodref          #26.#27        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class              #28            // bytecode/HelloWorld
   #6 = Class              #29            // java/lang/Object
   #7 = Class              #30            // java/io/Serializable
   #8 = Class              #31            // java/lang/Cloneable
   #9 = Utf8               <init>
  #10 = Utf8               ()V
  #11 = Utf8               Code
  #12 = Utf8               LineNumberTable
  #13 = Utf8               LocalVariableTable
  #14 = Utf8               this
  #15 = Utf8               Lbytecode/HelloWorld;
  #16 = Utf8               main
  #17 = Utf8               ([Ljava/lang/String;)V
  #18 = Utf8               args
  #19 = Utf8               [Ljava/lang/String;
  #20 = Utf8               SourceFile
  #21 = Utf8               HelloWorld.java
  #22 = NameAndType        #9:#10         // "<init>":()V
  #23 = Class              #32            // java/lang/System
  #24 = NameAndType        #33:#34        // out:Ljava/io/PrintStream;
  #25 = Utf8               Hello World
  #26 = Class              #35            // java/io/PrintStream
  #27 = NameAndType        #36:#37        // println:(Ljava/lang/String;)V
  #28 = Utf8               bytecode/HelloWorld
  #29 = Utf8               java/lang/Object
  #30 = Utf8               java/io/Serializable
  #31 = Utf8               java/lang/Cloneable
  #32 = Utf8               java/lang/System
  #33 = Utf8               out
  #34 = Utf8               Ljava/io/PrintStream;
  #35 = Utf8               java/io/PrintStream
  #36 = Utf8               println
  #37 = Utf8               (Ljava/lang/String;)V
{
  public bytecode.HelloWorld();
    descriptor: ()V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 4: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lbytecode/HelloWorld;

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: ldc           #3                  // String Hello World
         5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
         8: return
      LineNumberTable:
        line 7: 0
        line 8: 8
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       9     0  args   [Ljava/lang/String;
}
SourceFile: "HelloWorld.java"

Here, you can see that this class is publicly accessible, its constant pool has 37 entries, and it has one attribute (SourceFile, at the bottom). It implements two interfaces (Serializable and Cloneable) and has zero fields and two methods.
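The descriptor strings in the output above, such as ([Ljava/lang/String;)V for main and ()V for the constructor, are a compact encoding of parameter and return types. As a small illustration, the JDK's own java.lang.invoke.MethodType can decode them. The class DescriptorDemo and its decode method are hypothetical names used only for this sketch.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class; only MethodType is a real JDK API here.
final class DescriptorDemo {

    // Decodes a JVM method descriptor into a MethodType, which exposes
    // the parameter types and the return type as Class objects.
    static MethodType decode(String descriptor) {
        // A null loader means type names are resolved by the system class loader.
        return MethodType.fromMethodDescriptorString(descriptor, null);
    }

    public static void main(String[] args) {
        // Descriptors copied from the javap output above.
        System.out.println(decode("([Ljava/lang/String;)V"));
        System.out.println(decode("()V"));
    }
}
```

In a descriptor, [ marks an array dimension, L...; wraps a class name written with slashes, and V stands for void, so main takes one String array and returns nothing.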

You might be wondering why the class file reports two methods when the source code contains only one, the static main method. The second one is the default constructor: a no-arg constructor that the javac compiler adds when a class declares no constructor of its own, and whose bytecodes are also visible in the output. In the class file, constructors are stored as methods named <init>.
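The compiler-added constructor can also be observed through reflection. This is a minimal sketch with hypothetical names (DefaultConstructorDemo and its nested class); it only shows that a class written without any constructor still ends up with exactly one, taking no arguments.

```java
import java.lang.reflect.Constructor;

// Hypothetical demo class; the names here are illustrative only.
final class DefaultConstructorDemo {

    // No constructor is written here, so javac supplies a no-arg one.
    static class NoExplicitConstructor {}

    // Returns every constructor declared in the class file of the given type.
    static Constructor<?>[] constructorsOf(Class<?> type) {
        return type.getDeclaredConstructors();
    }

    public static void main(String[] args) {
        for (Constructor<?> c : constructorsOf(NoExplicitConstructor.class)) {
            System.out.println(c + " takes " + c.getParameterCount() + " parameter(s)");
        }
    }
}
```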

Another tip: you can use javap to see how lambdas differ from anonymous inner classes. You can read more about the javap tool in the JDK documentation.
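As a starting point for that comparison, here is a small sketch with hypothetical names. Compiling it with javac yields a separate class file for the anonymous inner class, while the lambda is compiled into the enclosing class (javac emits an invokedynamic instruction and a synthetic method for it), which you can confirm by running javap -v on the result.

```java
// Hypothetical demo class; the names here are illustrative only.
final class LambdaVsAnonymous {

    // An anonymous inner class: javac emits a separate class file for it.
    static Runnable anonymous() {
        return new Runnable() {
            @Override
            public void run() {}
        };
    }

    // A lambda: javac emits no class file of its own for it; the
    // implementing class is created by the JVM at run time.
    static Runnable lambda() {
        return () -> {};
    }

    public static void main(String[] args) {
        System.out.println(anonymous().getClass().getName());
        System.out.println(lambda().getClass().getName());
    }
}
```

The exact run-time name printed for the lambda's class is an implementation detail of the JVM and can differ between versions.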

In the next part of this series, I will talk about the memory layout when running a JVM instance. Stay tuned!

Published at DZone with permission of Prateek Saini. See the original article here.
