Introduction
A HashSet in Java is a part of the java.util package and is used to store unique elements. It is backed by a HashMap, which helps in achieving constant-time complexity (O(1)) for add, remove, and contains operations.
Internal Working of HashSet
1. Data Structure Used
HashSet internally uses a HashMap<K, V> with a dummy value. Each element in the HashSet acts as a key in the HashMap, and the value is a constant placeholder (PRESENT).
Implementation in Java:
private static final Object PRESENT = new Object(); private transient HashMap<E,Object> map;
When you create a HashSet, it initializes an empty HashMap:
public HashSet() {
map = new HashMap<>();
}
2. How Elements are Stored
When an element is added to a HashSet, it calls the put() method of HashMap with the element as the key and a dummy value (PRESENT).
public boolean add(E e) {
return map.put(e, PRESENT) == null;
}
3. What Happens When a Duplicate Key is Added?
- When a duplicate element is added, the
put()method ofHashMapchecks if the key already exists. - If the key exists, the previous value (
PRESENT) is returned, and the new entry is not added. - The
add()method then returnsfalse, indicating that the element was already present.
Example:
HashSet<String> set = new HashSet<>();
System.out.println(set.add("Apple")); // Output: true
System.out.println(set.add("Apple")); // Output: false (already exists)
4. How hashCode() and equals() Help in Uniqueness
- When adding an element,
hashCode()determines the bucket where the element should be placed. - If multiple elements have the same
hashCode(), theequals()method is used to check if they are truly identical. - If
equals()returnstrue, the element is treated as a duplicate and not added.
Example with Custom Objects:
import java.util.HashSet;
import java.util.Objects;
class Person {
String name;
int age;
public Person(String name, int age) {
this.name = name;
this.age = age;
}
@Override
public boolean equals(Object obj) {
if (this == obj) return true;
if (obj == null || getClass() != obj.getClass()) return false;
Person person = (Person) obj;
return age == person.age && Objects.equals(name, person.name);
}
@Override
public int hashCode() {
return Objects.hash(name, age);
}
}
If hashCode() is not overridden properly, two identical objects may be treated as different, leading to duplicates.
Key Takeaways
✔ HashSet provides fast lookup, insertion, and deletion (O(1)).
✔ Internally uses a HashMap where elements are keys, and values are dummy (PRESENT).
✔ Uses hashCode() and equals() to determine uniqueness.
✔ When a duplicate element is added, HashMap.put() prevents duplication by checking keys.
✔ Uses separate chaining (linked list or Red-Black Tree) to handle collisions.
✔ Load factor controls resizing to maintain performance.
✔ Ensure hashCode() and equals() are properly overridden for custom objects.
By understanding the internal working of HashSet, developers can make better choices for performance optimization in Java applications.