Monday, July 4, 2022

Why String is Immutable or final in Java - 5 Reasons

There is hardly a Java interview where no questions are asked about String, and "Why is String immutable in Java?" is, I think, the most popular Java String question. It is also asked as "Why is the String class made final in Java?" or simply "Why is String final?" To answer these questions, a Java programmer must have a solid understanding of how String works, what its special features are, and how it is structured and implemented internally. String is a special class in Java: it has features that are not available to other classes. For example, String literals are stored in the string pool, and you can concatenate strings using the + operator.

Given its importance in Java programming, Java's designers made it final, which means you cannot extend the java.lang.String class. This also helps keep String objects immutable.


Now, coming to the question: why is String immutable in Java? Of course, the answer should be related to its benefits. So let's think about which advantages or features drove this decision. I don't know of any official document from Oracle, or from Sun before it, that throws light on this decision.

I do remember reading somewhere that James Gosling, the creator of Java, was once asked about making the String class final, and he said something along the lines of security. It has been argued that making a class final seriously limits its ability to evolve or be extended, and James commented that classes which are key to Java's security commitment are made final so that no one can change their behavior and game the Java platform.




5 Reasons Why String is final or Immutable in Java

Though the true reasons why the String class was made final are best known to Java's designers, apart from that hint on security by James Gosling, I think the following reasons also suggest why String is final and immutable in Java.

1. String Pool

Java's designers knew that String was going to be the most used data type in all kinds of Java applications, and that's why they wanted to optimize it from the start. One key step in that direction was the idea of storing String literals in the String pool.

The goal was to reduce the number of temporary String objects by sharing them, and in order to be shared safely, they have to be instances of an immutable class. You cannot share a mutable object between two parties that are unknown to each other. Let's take a hypothetical example, where two reference variables point to the same String object:

String s1 = "Java";
String s2 = "Java";

Now, if String were mutable and s1 changed the object's content from "Java" to "C++", then s2 would also see the value "C++" without ever knowing about the change. Making String immutable is what makes this sharing of String literals safe. In short, the key idea of the String pool cannot be implemented without making String immutable in Java.
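Here is a small sketch of what sharing from the pool looks like in practice. The class and method names are just made up for this illustration. It shows that two identical literals refer to one pooled object, and that "changing" s1 really means pointing it at a different object, so s2 is not affected.

```java
class StringPoolDemo {
    // Both literals resolve to the same pooled object, so == (reference comparison) is true.
    static boolean literalsShareOneObject() {
        String s1 = "Java";
        String s2 = "Java";
        return s1 == s2;
    }

    // Reassigning s1 points it at another object; the "Java" object itself never changes.
    static String valueOfS2AfterReassigningS1() {
        String s1 = "Java";
        String s2 = "Java";
        s1 = "C++";
        return s2;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());       // true
        System.out.println(valueOfS2AfterReassigningS1());  // Java
    }
}
```

If String had a method that modified its characters in place, the second method would return "C++" instead, which is exactly the surprise the pool has to rule out.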






2. Security

Java has a clear goal of providing a secure environment at every level of service, and String is critical to that. Strings are widely used as parameters for many Java classes: for opening a network connection you pass the host and port as Strings, for reading files you pass the file or directory path as a String, and for opening a database connection you pass the database URL as a String.

If String were not immutable, a user who had been granted access to a particular file could, after the access check, change the path to something else, which could cause serious security issues.

Similarly, while connecting to a database or any other machine on the network, mutating the String value could pose a security threat. Mutable strings could also cause security problems with reflection, because class, method, and field names are passed as strings.
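To make the "check first, change later" problem concrete, here is a minimal sketch that uses StringBuilder as a stand-in for a mutable string. The isAllowed() rule and the paths are hypothetical and only serve the illustration; this is not how any real Java API performs its checks.

```java
class MutablePathDemo {
    // Hypothetical access rule: only paths under /public/ are allowed.
    static boolean isAllowed(CharSequence path) {
        return path.toString().startsWith("/public/");
    }

    // With a mutable sequence, the object that passed the check can be rewritten afterwards.
    static String checkThenUseMutable() {
        StringBuilder path = new StringBuilder("/public/readme.txt");
        if (!isAllowed(path)) {
            throw new SecurityException("access denied");
        }
        path.replace(0, path.length(), "/etc/passwd"); // same object, new content
        return path.toString(); // what would actually be opened
    }

    // With String, "modifying" creates a new object; the checked one stays as it was.
    static String checkThenUseImmutable() {
        String path = "/public/readme.txt";
        if (!isAllowed(path)) {
            throw new SecurityException("access denied");
        }
        String rewritten = path.replace("/public/readme.txt", "/etc/passwd"); // a different object
        return path; // still the value that was checked
    }

    public static void main(String[] args) {
        System.out.println(checkThenUseMutable());   // /etc/passwd
        System.out.println(checkThenUseImmutable()); // /public/readme.txt
    }
}
```

Note that immutability protects the object that was checked, not the variable. Code that reassigns its own reference after the check is a different problem.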


3. Use of String in Class Loading Mechanism

Another reason for making String final and immutable is that it is heavily used in the class loading mechanism. If String were not immutable, an attacker could take advantage of this, and a request to load a standard Java class like java.io.Reader could be changed to load a malicious class such as com.unknown.DataStolenReader. By keeping String final and immutable, we can at least be sure that the JVM is loading the class that was actually requested.




4. Multithreading Benefits

Since concurrency and multithreading were key offerings of Java, it made a lot of sense to think about the thread safety of String objects. Since String was expected to be used widely, making it immutable means no external synchronization is needed, which makes for much cleaner code when a String is shared between multiple threads.

This single feature makes already complicated, confusing, and error-prone concurrency code much easier to write. Because String is immutable, we can simply share it between threads, which results in more readable code.
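As a rough sketch of what this looks like, the example below shares one String between several threads without any locks. The class name, the SHARED constant, and the thread count are all just assumptions for the demo.

```java
class SharedStringDemo {
    static final String SHARED = "immutable";

    // Each thread reads SHARED and derives a new String from it; no synchronization on SHARED is needed.
    static int[] readFromThreads(int threadCount) {
        int[] lengths = new int[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int slot = i;
            threads[i] = new Thread(() -> {
                String derived = SHARED.concat("!"); // creates a new String, SHARED is untouched
                lengths[slot] = derived.length();
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for the thread so its result is visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return lengths;
    }

    public static void main(String[] args) {
        for (int length : readFromThreads(4)) {
            System.out.println(length); // 10 for every thread
        }
        System.out.println(SHARED); // immutable
    }
}
```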


5. Optimization and Performance

When you make a class immutable, you know in advance that its instances are not going to change once created. This guarantee opens the path for many performance optimizations, such as caching. A String knows it is never going to change, so it caches its hash code. It calculates the hash code lazily and, once calculated, simply caches it.

In simple words, when you first call the hashCode() method of a String object, it calculates the hash code, and all subsequent calls to hashCode() return the already calculated, cached value.

This results in a good performance gain, given that String is heavily used as a key in hash-based maps like Hashtable and HashMap. Caching the hash code would not be safe without making String immutable and final, because the hash code depends on the content of the String itself.
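The idea of lazy hash code caching can be sketched with a small immutable class of our own. CachedHashText below is a hypothetical class written for this post, not the JDK's String source. It uses the hash formula documented for String.hashCode(), and the computations counter exists only so the demo can show the calculation happens once.

```java
import java.util.Arrays;

final class CachedHashText {
    private final char[] chars; // never modified after construction
    private int hash;           // 0 means "not calculated yet"
    private int computations;   // demo-only counter, not part of the technique

    CachedHashText(String text) {
        this.chars = text.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && chars.length > 0) {
            for (char c : chars) {
                h = 31 * h + c; // formula documented for String.hashCode()
            }
            hash = h;           // cache it for later calls
            computations++;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedHashText && Arrays.equals(chars, ((CachedHashText) o).chars);
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        CachedHashText text = new CachedHashText("Java");
        System.out.println(text.hashCode() == "Java".hashCode()); // true
        text.hashCode();
        System.out.println(text.computations()); // 1
    }
}
```

If the characters could change after the first call, the cached value would go stale, which is why this trick only works for immutable content.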




Pros and Cons of String being Immutable or Final in Java

Apart from the above benefits, there is one more advantage that you can count on due to String being immutable in Java: it is one of the most popular objects to use as a key in hash-based collections like HashMap and Hashtable.

Though immutability is not an absolute requirement for HashMap keys, it's much safer to use an immutable object as a key than a mutable one. If the state of a mutable object changes while it is stored inside a HashMap, it may become impossible to retrieve it, given that its equals() and hashCode() methods depend upon the changed attribute.

If a class is immutable, there is no risk of its state changing while it is stored inside a hash-based collection. Another significant benefit, which I have already highlighted, is thread safety. Since String is immutable, you can safely share it between threads without worrying about external synchronization. It makes concurrent code more readable and less error-prone.
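Here is a short sketch of the lost-key problem. MutableKey is a hypothetical class invented for this example; its equals() and hashCode() depend on a field that we change after the key is already in the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MutableKeyDemo {
    // Hypothetical key whose equals() and hashCode() depend on a mutable field.
    static final class MutableKey {
        String name;

        MutableKey(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Mutating the key after put() changes its hash code, so the lookup misses the stored entry.
    static boolean lookupSucceedsAfterMutation() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey("Java");
        map.put(key, "value");
        key.name = "C++";
        return map.containsKey(key);
    }

    // A String key cannot change while it is in the map, so the lookup keeps working.
    static boolean stringKeyLookupSucceeds() {
        Map<String, String> map = new HashMap<>();
        map.put("Java", "value");
        return map.containsKey("Java");
    }

    public static void main(String[] args) {
        System.out.println(lookupSucceedsAfterMutation()); // false
        System.out.println(stringKeyLookupSucceeds());     // true
    }
}
```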

Despite all these advantages, immutability does not come without cost. Since String is immutable, every operation that modifies text creates a new object, which generates lots of temporary, use-and-throw objects and puts pressure on the garbage collector. Java's designers thought about this, and storing String literals in the pool is their solution for reducing String garbage.



It does help, but you have to be careful to create Strings without using the constructor, because new String("...") always creates a new object instead of picking one from the String pool. Also, the average Java application generates a lot of garbage anyway. Storing Strings in the pool used to have a hidden risk as well. Before Java 7, the String pool was located in the PermGen space, which is very limited compared to the main Java heap.

Having too many String literals could quickly fill this space, resulting in java.lang.OutOfMemoryError: PermGen space. Thankfully, Java's designers realized this problem, and from Java 7 onwards the String pool lives in the normal heap space, which is much larger than the PermGen space.
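The constructor caveat mentioned above is easy to check with a few lines. This is only a sketch with made-up class and method names. It shows that new String("Java") gives you a separate object with equal content, and that intern() hands back the pooled one.

```java
class NewStringDemo {
    // The constructor creates a distinct object even though the content is equal.
    static boolean constructorBypassesPool() {
        String literal = "Java";
        String constructed = new String("Java");
        return literal != constructed && literal.equals(constructed);
    }

    // intern() returns the pooled object for this content, which is the one the literal refers to.
    static boolean internReturnsPooledObject() {
        String literal = "Java";
        String constructed = new String("Java");
        return constructed.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(constructorBypassesPool());   // true
        System.out.println(internReturnsPooledObject()); // true
    }
}
```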

There is another disadvantage of making String final: it limits extensibility. You simply cannot extend String to provide more functionality. In general that is hardly needed, but it is still a limitation for those who want to extend the java.lang.String class.


That's all about why String is final and immutable in Java. While we don't know the exact reason, as I am not aware of anything Oracle or Sun published about the decision to make the String class final, these 5 practical reasons (string pooling, security, class loading, thread safety, and performance) definitely give a hint as to why the String class has been made final and immutable in Java.

Of course, it was the decision of Java's designers, but it looks like the points above contributed to it. For similar reasons, wrapper classes like Integer, Long, Double, and Float are also immutable and final.


18 comments:

  1. #3 looks bogus to me, if you have the ability to replace String with an evil twin you might as well do it directly for the IO classes to get your DataStolenReader into the jvm

  2. #2 - If that's true, StringBuffer, StringBuilder etc. should also be made immutable. We use these classes to construct a text and probably convert it to String and use it.

    Replies
    1. please refer this
      http://stackoverflow.com/questions/47605/string-concatenation-concat-vs-operator
      operator + is treated differently from the .concat method.

  3. One advantage of making String immutable is for saving memory. When your program grows the number of String instances it creates also grows and if you don't cache String constants you end up with lots and lots of String in your heap space. By caching and sharing String constants JVM reduces lots of memory for real world Java applications.

  4. Can you please elaborate 1st Reason?

  5. I like your simple language to complex matters!

  6. Can you please elaborate 1st Reason?

  7. The first point, second paragraph is contradictory with the actual implementation,

    Now if s1 changes the object from "Java" to "C++", reference variable also got value s2="C++", which it doesn't even know about it. By making String immutable, this sharing of String literal was possible. In short, key idea of String pool can not be implemented without making String final or Immutable in Java.

    S2 will not have value of C++. the reference mapping of S1 is removed from "Java" and mapped to new location with value of C++.

    Replies
    1. No, it is not contradictory.
      What he meant is, if the Strings are mutable, by changing Java to C++, both S1 and S2 will affect.
      And with immutability in place, the point you said, "S2 will not have value of C++. the reference mapping of S1 is removed from "Java" and mapped to new location with value of C++" holds good.

  8. Let's take an hypothetical example, where two reference variable is pointing to same String object:

    String s1 = "Java";
    String s2 = "Java";

    Now if s1 changes the object from "Java" to "C++", reference variable also got value s2="C++", which it doesn't even know about it.

    Talking about this example, is that really possible? Please provide the code.

    Replies
    1. It's not possible now because String is Immutable but if it was Mutable and then String was shared from pool then it was possible.

  9. Can someone please more elaborate on point 5

  10. Could someone please explain this properly example with a program.

  11. I am trying to understand point #2. Since String ref. variable will be used across the code. String objects are not mutable, but we can assign new malicious value to String reference Variable and hence that new value will be used across the code. e.g.
    public class MyApi {
    final String myUrl;
    public MyApi(String urlString) {
    // Verify that urlString points to an approved server
    if (!checkApprovedUrl(urlString)) throw new IllegalArgumentException();
    myUrl = urlString;
    }
    }
    In the above code suppose some new value is assigned to urlString say after the if condition and which will be used across the code, thus compromising security. Please explain.

    Thanks.

    Replies
    1. I am not sure if my example is correct. But consider there are two methods in the library to connect to db - 1. Configure database details configure(). 2. Connect to db connect(). Suppose Strings were immutable, then a hacker can possibly change the url, username, password directly after the configure call and then call connect method to maybe connect to his database. The use case might be to modify an existing software/application and share it as the original application, so that the user's data might be compromised. Let me know guys if I'm right.

    2. Yes, your explanation is correct, but I think you mean "Suppose Strings were Mutable, then a hacker can possibly change the url, username, password directly" . You wrote Immutable there. If String is Immutable then it cannot be changed.


Feel free to comment, ask questions if you have any doubt.