Posted on: February 8, 2025 Posted by: rahulgite Comments: 0

Introduction

A HashSet in Java is a part of the java.util package and is used to store unique elements. It is backed by a HashMap, which helps in achieving constant-time complexity (O(1)) for add, remove, and contains operations.


Internal Working of HashSet

1. Data Structure Used

HashSet internally uses a HashMap<K, V> with a dummy value. Each element in the HashSet acts as a key in the HashMap, and the value is a constant placeholder (PRESENT).

Implementation in Java:

private static final Object PRESENT = new Object();
private transient HashMap<E,Object> map;

When you create a HashSet, it initializes an empty HashMap:

public HashSet() {
    map = new HashMap<>();
}

2. How Elements are Stored

When an element is added to a HashSet, it calls the put() method of HashMap with the element as the key and a dummy value (PRESENT).

public boolean add(E e) {
    return map.put(e, PRESENT) == null;
}

3. What Happens When a Duplicate Key is Added?

  • When a duplicate element is added, the put() method of HashMap checks if the key already exists.
  • If the key exists, the previous value (PRESENT) is returned, and the new entry is not added.
  • The add() method then returns false, indicating that the element was already present.

Example:

HashSet<String> set = new HashSet<>();
System.out.println(set.add("Apple")); // Output: true
System.out.println(set.add("Apple")); // Output: false (already exists)

4. How hashCode() and equals() Help in Uniqueness

  • When adding an element, hashCode() determines the bucket where the element should be placed.
  • If multiple elements have the same hashCode(), the equals() method is used to check if they are truly identical.
  • If equals() returns true, the element is treated as a duplicate and not added.

Example with Custom Objects:

import java.util.HashSet;
import java.util.Objects;

class Person {
    String name;
    int age;

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        Person person = (Person) obj;
        return age == person.age && Objects.equals(name, person.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }
}

If hashCode() is not overridden properly, two identical objects may be treated as different, leading to duplicates.


Key Takeaways

HashSet provides fast lookup, insertion, and deletion (O(1)).

✔ Internally uses a HashMap where elements are keys, and values are dummy (PRESENT).

✔ Uses hashCode() and equals() to determine uniqueness.

✔ When a duplicate element is added, HashMap.put() prevents duplication by checking keys.

✔ Uses separate chaining (linked list or Red-Black Tree) to handle collisions.

✔ Load factor controls resizing to maintain performance.

✔ Ensure hashCode() and equals() are properly overridden for custom objects.

By understanding the internal working of HashSet, developers can make better choices for performance optimization in Java applications.

Leave a Comment