Saturday, February 11, 2023

Difference between HashSet, TreeSet, and LinkedHashSet in Java

Hello guys, in the last article, we have learned about difference between HashMap, TreeMap, and LinkedHashMap and in this article, we are going to learn about difference between HashSet, TreeSet, and LinkedHashSet in Java, another popular Java interview questions for 1 to 2 years experienced developers. If you are doing Java development work then you know that LinkedHashSet, TreeSet, and HashSet are three of the most popular implementations of the Set interface in the Java Collection Framework. Since they implement a Set interface, they follow its contracts for not allowing duplicates but there is a key difference between them in terms of ordering of elements. For example, HashSet doesn't maintain any order, while TreeSet keep elements in the sorted order specified by external Comparator or comparison logic defined in the objects'  natural order. Similarly, LinkedHashSet keep the elements into the order they are inserted.  

All these implementation except, TreeSet uses equals() method to check for duplicates, on the other hand TreeSet use compareTo() or compare() method for comparing objects and can break Set interface contract of unique element, if equals method is not consistent with compareTo() or compare() method

In this Java Collection tutorial, we will see the difference between LinkedHashSet vs TreeSet vs HashSet on different points e.g. speed, performance, ordering, synchronization, etc. Based upon these differences we can also decide when to use LinkedHashSet vs TreeSet vs HashSet in Java.

TL;DR, Use HashSet for all general purpose usage i.e. where you need to store only unique elements without any ordering requirement. If you need to maintain order in which elements are added into Set then use LinkedHashSet, it provides ordering with little impact on performance.

You an also use TreeSet when you absolutely need to keep elements in specifically sorted order like keeping employees in increasing order of their age or salary.

Remember, TreeSet is significantly slower than LinkedHashSet and HashSet because of this sorting overhead. BTW, if you are preparing for Java interviews, I suggest taking a look at Java Programming Interview Exposed by Markham. You will find a lot of such questions with a very detailed answer.




Difference between HashSet vs TreeSet vs LinkedHashSet in Java

Before comparing them, let's see some similarities between them. All of them implement Set and Collection interface, so they can be passed to a method that accepts Set as an argument. Though they all are set, they differ in their implementation like LinkedHashSet is backed by linked list and HashMap, TreeSet is implemented as TreeMap till Java 5 and now using NavigableMap from Java 6 onward, and HashSet is also backed by HashMap in Java. 

Now let's see some comparison between them :


1. Synchronization

All three i.e. HashSet, TreeSet, and LinkedHashSet are not synchronized. They can not be shared between multiple threads until specifically synchronized. It's easy to create a synchronized Set, though, all you need to do is use java.util.Collections utility class as shown below :

Synchronizing HashSet in Java

Set s = Collections.synchronizedSet(new HashSet(...));


Synchronizing LinkedHashSet in Java

Set s = Collections.synchronizedSet(new LinkedHashSet(...));


Synchronizing TreeSet in Java

Set s = Collections.synchronizedSet(new TreeSet(...));


2. Ordering

HashSet doesn't guarantee any order, while TreeSet sorts all object-based upon there natural ordering by using the compareTo() method, or custom order by using compare() method of  Comparator passed to them. 

On the other hand, LinkedHashSet also provides ordering support to keep elements in the order they are added into the Collection. 

Actually, this property is also derived from the fact that they are backed by respective Map implementation. You can further see the difference between HashSet and HashMap for few more details on these two classes.




3. Null Element

This property can be deduced form HashMap, LinkedHashMap, and TreeMap since HashSet internally uses HashMap, LinkedHashSet internally uses LinkedHashMap and TreeSet internally uses TreeMap

Both HashMap and LinkedHashMap allow one null key and so are these two Set implementations.

On the other hand, since TreeMap doesn't allow null keys, TreeSet doesn't allow null elements and throws java.lang.NullPointerException when you try to add a null object. The main reason of this is the use of compareTo() and compare() method, which throws NullPointerException if one element is null, but it truly depends on implementation.


4. Implementation

HashSet internally uses a HashMap with dummy value object, while LinkedHashSet uses a LinkedHashMap to guarantee insertion order. When you iterate through HashSet order is unpredictable but when you iterate through LinkedHashSet, you iterate elements in the order they are added. TreeSet is backed by a navigable Map, as shown below :

Source code of HashSet

 public HashSet(int initialCapacity) {
   map = new HashMap<E,Object>(initialCapacity);
 }

Source code of LinkedHashSet

/**
 * Constructs a new, empty linked hash set.  (This package private constructor
 * is only used by LinkedHashSet.) The backing HashMap instance is a 
 * LinkedHashMap with the specified initial capacity and the specified load 
 * factor
 */
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<E,Object>(initialCapacity, loadFactor);
}

Source code of  TreeSet

public TreeSet() {
        this(new TreeMap<E,Object>());
}

Earlier TreeSet is internally implemented TreeMap but with introduction of NavigableMap in Java 1.6, its implementation has changed.




5. Iterator

Iterator returned by all three Set implementations is fail-fast, which means if you modify collection once iteration begins i.e. add or delete elements without using Iterator's remove method, it will throw ConcurrentModificationException

Also, the Iterator of HashSet doesn't guarantee any order, while the Iterator of LinkedHashSet lets you iterate in the order elements are added. You can also see this article to learn more about different types of Iterator in Java.


6. Performance

Out of HashSet, LinkedHashSet, and TreeSet,  HashSet is the fastest for common operations like add, search and remove, LinkedHashSet is a close second, as it suffers a little drop in performance due to the overhead of maintaining a doubly linked list when an element is inserted or deleted. 

TreeSet is much slower than these two because it needs to perform sorting every time there is a change in TreeSet. It means by default you should use HashSet, until and unless you really need LinkedHashSet or TreeSet

You can also check this article for more differences between TreeSet and TreeMap in Java.


Summary

Here is a nice table to compare the different implementations of the Set interface e.g. EnumSet, TreeSet, CopyOnWriteArraySet, HashSet, LinkedHashSet, and ConcurrentSkipListSet on different parameters e.g. data structure, sorting, iterator and how they handle null input.

Difference between LinkedHashSet, TreeSet and HashSet in Java


That's all about the difference between LinkedHashSet vs TreeSet vs HashSet in Java. You should use HashSet for general purpose Set requirement, where you need to store only unique elements without any sorting or order requirement, if you need to maintain the order on which elements are added into Set, apart from guarantee of unique elements, use LinkedHashSet

If you need to keep your elements in a particular sorting order use TreeSet, you can either keep elements in their natural order or you can customize their sorting order by providing a Comparator e.g. keeping notes in increasing order of their value.


Thanks for reading this article so far. If you like this Java Set tutorial and difference between HashSet, TreeSet, and LinkedHashSet in Java then please share with your friends and colleagues. If you have any questions or feedback then please drop a note. 

P. S. - If you are new to Java and want to master Java collection framework then you can also checkout this list of best Java Collection and Stream courses to learn Java Collection Framework in depth. It include best Java collections course from Udemy and Pluralsight. 

1 comment:

Feel free to comment, ask questions if you have any doubt.