Hash Sets

Say you have a long list of email addresses. You want to know if one email is already in it. With a plain list you have to walk through every item and compare them one by one. That gets slow when the list grows. A hash set fixes this. A hash set is a collection that stores only unique values. It lets you check “is this value here?” almost instantly.

🎯 What Is a Hash Set?

A hash set is a container that holds a group of values. Every value appears only once. You cannot have two copies of the same value inside it.

The “hash” part means it uses a hashing trick inside. It runs your value through a small function. That function turns the value into a number. The number points to the slot where the value lives. So when you ask “is this value here?”, the set jumps straight to the right slot. It does not check everything.

That is why the main jobs of a set are fast on average:

  • add a value
  • contains check if a value is present
  • remove a value

All three take O(1) time on average. O(1) means the time stays the same no matter how big the set gets.

Note

Think of a hash set like the guest list at a wedding gate. The guard does not read the whole list top to bottom. They jump to your name and say yes or no. And your name is on the list only once, never twice.

🤔 Hash Set vs Hash Map

People mix these up, so let us clear it now. Both use the same hashing idea inside. The difference is what they store.

  • A hash map stores key and value pairs. Like a name to a phone number. You look up a key and get back its value.
  • A hash set stores keys only. There is no value attached. You only ask one thing. Is this key in here or not?

So a set is like a map where you threw away the values and kept only the keys.

Feature Hash Set Hash Map
Stores Keys only Key and value pairs
Question it answers Is this value present? What value belongs to this key?
Duplicates Not allowed Keys must be unique
Common use Membership check, remove duplicates Lookup data by a key

🛠️ Where We Use a Hash Set

A hash set shines in a few everyday jobs. Here is when you reach for one:

  • Remove duplicates. You have a list with repeated items. Drop them all into a set and the repeats vanish on their own. A set refuses copies, so this just works.
  • Fast membership check. You need to ask “have I seen this before?” again and again. A set answers each time in O(1). A plain list would have to scan everything.

You will see both of these in interview problems all the time. Things like “find the first repeated number” or “does this array have any duplicates” become easy with a set.

📝 Steps to Use a Hash Set

Every language gives you a built-in set. So you do not build one from scratch. The pattern is the same everywhere.

  1. Create an empty set.
  2. Add values into it with an add or insert call. Adding a value that is already there does nothing, so no duplicates form.
  3. Ask if a value is present with a contains or membership check. You get back true or false.
  4. Remove a value when you no longer need it.
  5. To remove duplicates from a list, put every list item into a set, then read the set back out.

Now let us see this work in real code.

💻 Hash Set in Code

The demo below does the same thing in every language. It adds a few fruit names. It checks membership. Then it takes a list with repeats and squeezes it down to unique values using a set.

Python has a set built right into the language. You write it with curly braces or the set() call. Use add to insert, the in keyword to check membership, and discard to remove safely.

hash_set_demo.py
def main():
fruits = set()
# Add values. Adding apple twice keeps just one.
fruits.add("apple")
fruits.add("banana")
fruits.add("apple")
# Membership check with the in keyword.
print("Has apple?", "apple" in fruits)
print("Has mango?", "mango" in fruits)
# Remove duplicates from a list by wrapping it in a set.
numbers = [3, 5, 3, 7, 5, 9, 7]
unique = set(numbers)
print("Unique count:", len(unique))
main()

The output of the above code will be:

Output
Has apple? True
Has mango? False
Unique count: 4

Tip

See the pattern? Drop a whole list into a set and the duplicates disappear by themselves. The list [3, 5, 3, 7, 5, 9, 7] has seven items but only four unique ones, so the set keeps just 3, 5, 7, 9.

⏱️ Time Complexity

The built-in sets in C++, Java, Python, and JavaScript all use real hashing inside. So the main jobs are fast on average. They run in O(1) time. The C version above is the odd one out, since it scans an array each time.

Note

add, contains, and remove on a real hash set are O(1) on average. In the rare worst case, when many values land in the same slot, they can slip to O(n). For everyday inputs you treat them as O(1).

Operation Average Worst Case
Add a value O(1) O(n)
Contains check O(1) O(n)
Remove a value O(1) O(n)
Remove duplicates from n items O(n) O(n²)

⚠️ Common Mistakes

A few things trip people up with sets. Watch out for these:

  • Expecting an order. A hash set does not keep your values in the order you added them. If you print it, the order can look random. When you need order, use a list or an ordered set type.
  • Trusting a set to hold duplicates. It never will. If you actually need to count how many times each value appears, use a hash map, not a set.
  • Removing a value that is not there. In some languages this throws an error. Python’s discard and JavaScript’s delete are safe, but be careful with methods that assume the value exists.
  • Putting changeable items inside. In Python a set can only hold things that do not change, like numbers and strings. A list cannot go inside a set because it can change.

🧩 What You’ve Learned

  • ✅ A hash set stores only unique values, with no duplicates allowed.
  • ✅ Its three main jobs, add, contains, and remove, are O(1) on average thanks to hashing.
  • ✅ A set stores keys only, while a hash map stores key and value pairs.
  • ✅ Sets are perfect for removing duplicates and fast membership checks.
  • ✅ Each language ships a built-in: C++ unordered_set, Java HashSet, Python set, JavaScript Set, while C makes you build your own.
  • ✅ A set does not keep insertion order, so never rely on it.

Check Your Knowledge

Test what you learned. Pick an answer for each question, then click Check.

  1. 1

    What is the main difference between a hash set and a hash map?

    Why: A hash set keeps only keys to answer 'is this present?', while a hash map pairs each key with a value.

  2. 2

    What happens when you add a value that is already in a hash set?

    Why: A set refuses duplicates, so adding an existing value leaves the set unchanged.

  3. 3

    What is the average time to check if a value is in a hash set?

    Why: Hashing lets the set jump straight to the right slot, so a contains check is O(1) on average.

  4. 4

    Which is a good job for a hash set?

    Why: Dropping a list into a set strips its duplicates automatically, since a set holds each value only once.

🚀 What’s Next?

Share & Connect