String Creation - Looking Under the Hood
Mastering Java starts with mastering the String class, and that means exploring how it works under the hood.
Recently, I came across the following question on a forum:
How many String objects are created here? One or Two?
String langName = new String("Java");
I was curious how others answered it, knowing that this is a tricky question, particularly if one is not aware of how the String class works in Java. So, I dived into the comment section.
To my surprise, some people chose 'One' as the correct answer, but many more chose 'Two'. For a microsecond, I started to doubt my knowledge of Strings.
The correct answer is 'it depends'. The question is not explicit enough and leaves room for debate. I would reformulate it in the following way:
How many “Java” String objects are created in memory by executing this statement?
The answer is One.
How many “Java” String objects will there be in memory after executing this statement?
The answer is Two.
Fortunately, the uncertainty vanished as soon as I checked the memory dump of a program containing a statement like the one in the example above.
Fig. 1 Amount of "Java" String objects in the heap memory
The memory dump of this program reveals the existence of two String objects with the same content. This proves that invoking the String constructor with a string literal as its argument results in two objects in heap memory: one in normal (non-pool) memory, and another in the String Constant Pool (SCP), a memory area which is also part of the heap. The tricky part is when each of them is added.
The invocation of the constructor always results in a new object placed in the non-pool area. But the argument of the constructor, which is a string literal, is an object as well. It is created and stored in the SCP during class loading, provided that the string pool doesn’t already contain a string with the same content.
According to the Java Language Specification, “string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.”
Since “Java” is a string literal, and therefore the value of a constant expression, it is pooled.
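The interning rule quoted above can be observed with reference comparisons. The following is only a minimal sketch: the class name InternDemo and its method names are made up for illustration and do not come from the original program.

```java
public class InternDemo {

    // Two occurrences of the same literal refer to one pooled instance.
    static boolean literalsShareInstance() {
        String a = "Java";
        String b = "Java";
        return a == b;
    }

    // "Ja" + "va" is a compile-time constant expression, so its value is pooled too.
    static boolean constantExpressionIsPooled() {
        String c = "Ja" + "va";
        return c == "Java";
    }

    // Concatenation that involves a non-constant variable is evaluated at run time
    // and produces a newly created object that is not the pooled one.
    static boolean runtimeConcatIsPooled() {
        String prefix = "Ja"; // not final, so not a constant variable
        String d = prefix + "va";
        return d == "Java";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());      // true
        System.out.println(constantExpressionIsPooled()); // true
        System.out.println(runtimeConcatIsPooled());      // false
    }
}
```

The == operator is used here on purpose: it compares references, which is exactly what reveals whether two variables point to the same pooled object. For comparing content, equals remains the right choice.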
To make it even more obvious, let’s rewrite the statement presented at the beginning as follows:
String java = "Java";
String langName = new String(java);
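To see that the constructor call yields an object distinct from the pooled literal, the two references can be compared directly. This is a small sketch under the assumption that the statements are wrapped in a hypothetical class named ConstructorDemo; the method names are illustrative as well.

```java
public class ConstructorDemo {

    // The constructor always creates a new object, so the references differ.
    static boolean isSameObject() {
        String java = "Java";
        String langName = new String(java);
        return langName == java;
    }

    // The new object carries the same character content as the literal.
    static boolean hasSameContent() {
        String java = "Java";
        String langName = new String(java);
        return langName.equals(java);
    }

    // intern() returns the pooled instance, which is the one the literal refers to.
    static boolean internYieldsPooledInstance() {
        String java = "Java";
        String langName = new String(java);
        return langName.intern() == java;
    }

    public static void main(String[] args) {
        System.out.println(isSameObject());               // false
        System.out.println(hasSameContent());             // true
        System.out.println(internYieldsPooledInstance()); // true
    }
}
```

The third check ties the two objects together: the non-pool copy and the pooled literal are different objects, yet interning the copy leads back to the pooled one.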
Now, let’s go back to the main question. Will the following statement create one or two String objects?
String langName = new String("Java");
To answer that and eliminate any suspicion, let’s look at the bytecode of the main method:
public static main([Ljava/lang/String;)V
 L0
  LINENUMBER 11 L0
  NEW java/lang/String
  DUP
  LDC "Java"
  INVOKESPECIAL java/lang/String.<init> (Ljava/lang/String;)V
  ASTORE 1
 L1
  LINENUMBER 14 L1
 FRAME APPEND [java/lang/String]
  GOTO L1
 L2
  LOCALVARIABLE args [Ljava/lang/String; L0 L2 0
  LOCALVARIABLE langName Ljava/lang/String; L1 L2 1
  MAXSTACK = 3
  MAXLOCALS = 2
On line 6, you can see the LDC (Load Constant) instruction. It pushes an item from the class's run-time constant pool onto the operand stack; for a string literal, that item is a reference to the String object in the String Constant Pool. This means that at the time the constructor was called, the “Java” literal, which is also an object, had already been added to the pool. This happened during class loading.
So, the invocation of the String class constructor with a string literal creates only one object and places it in the non-pool memory area. This analysis was performed using Oracle JDK 8u271 and OpenJDK 11.0.8.