Group Anagrams

This one looks tricky the first time you see it. But it is really a question about one thing. Can you find a smart way to tell which words belong together? The interviewer wants to see if you can spot a common signature that every anagram in a group shares. Once you find that signature, the whole problem becomes easy. So let us slow down and figure out what that signature is.

🧩 The Problem

You are given a list of words. You have to group together the words that are anagrams of each other. An anagram is a word made by rearranging the letters of another word. So "eat" and "tea" are anagrams. They use the same letters, just in a different order.

Here is what the input and output look like:

Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [
["eat", "tea", "ate"],
["tan", "nat"],
["bat"]
]

See how "eat", "tea", and "ate" all sit in one group? They are all built from the letters e, a, and t. The order of the groups does not matter. The order inside a group does not matter either. We just need the right words bunched together.

🐒 The Brute Force Idea

The first idea that comes to mind is to compare every word with every other word. Take one word. Then walk through all the rest. For each one, check β€œare you an anagram of me?”. If yes, drop it into the same group.

But how do you even check if two words are anagrams? You would sort both words and see if they match. Or count their letters. Either way, you do this check for every pair of words. That is a lot of comparing.

If you have n words, comparing every word with every other word is already O(nΒ²). And each comparison costs extra time on top. So this gets slow fast when the list is big. The interviewer will nod, but then ask the real question. β€œCan you do better?”

⚑ The Smart Idea: A Sorted Key

Now here is the trick that makes this click. Think about what is the same for every word in a group. The letters are the same. Only the order is different. So what if we remove the order?

If you take "eat" and sort its letters, you get "aet". If you take "tea" and sort its letters, you also get "aet". And "ate" sorted is "aet" too. See the pattern? Every anagram, once sorted, turns into the exact same string. That sorted string is our signature. It is the secret label that all members of a group share.

So now we use a hash map. A hash map is a structure that stores key and value pairs. The key will be the sorted word. The value will be the list of all words that sorted to that key. We go through each word once. We sort it to get the key. Then we push the original word into the bucket for that key. At the end, every bucket is one group of anagrams.

"eat" -> sort -> "aet" -> bucket["aet"] = ["eat"]
"tea" -> sort -> "aet" -> bucket["aet"] = ["eat", "tea"]
"tan" -> sort -> "ant" -> bucket["ant"] = ["tan"]
"ate" -> sort -> "aet" -> bucket["aet"] = ["eat", "tea", "ate"]
"nat" -> sort -> "ant" -> bucket["ant"] = ["tan", "nat"]
"bat" -> sort -> "abt" -> bucket["abt"] = ["bat"]

πŸ“ Steps to Group Anagrams

  1. Create an empty hash map. The key is a sorted word. The value is a list of words.
  2. Go through each word in the input list, one by one.
  3. Sort the letters of the word to make the key.
  4. Look up that key in the map. If it is missing, start a new empty list for it.
  5. Add the original word to the list for that key.
  6. After the loop, collect all the values from the map. Each value is one group.

πŸ’» The Code

Here is the optimal solution in all five languages. Each one builds the map, then prints every group.

group_anagrams.py
def group_anagrams(words):
groups = {} # key: sorted word, value: list of words
for word in words:
key = "".join(sorted(word)) # sort letters to build the key
groups.setdefault(key, []).append(word) # add word to its bucket
return list(groups.values())
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
for group in group_anagrams(words):
print(group)

The output of the above code will be:

['eat', 'tea', 'ate']
['tan', 'nat']
['bat']

Note

The group order in the output can change between languages. That is fine. A hash map does not promise any order. The interviewer only cares that the right words are grouped together, not which group prints first.

⏱️ Time and Space Complexity

Let n be the number of words and k be the length of the longest word.

Approach Time Space
Brute force (compare every pair) O(nΒ² Β· k log k) O(n Β· k)
Sorted key + hash map O(n Β· k log k) O(n Β· k)

For each word we sort its letters, and sorting one word of length k costs O(k log k). We do that for all n words. So the total time is O(n Β· k log k). The space is O(n Β· k) because in the worst case the map holds every word.

Tip

If the words use only lowercase English letters, you can skip sorting. Instead, count how many times each of the 26 letters appears and use that count as the key. That brings each word down to O(k) and the whole thing to O(n Β· k). Mentioning this in an interview shows you can push past the first good answer.

🧩 Key Takeaways

  • βœ… All anagrams share the same letters, so sorting any of them gives the same string. That sorted string is the perfect group key.
  • βœ… A hash map lets you bucket words by their sorted key in one pass through the list.
  • βœ… The sorted-key approach runs in O(n Β· k log k), much faster than comparing every pair.
  • βœ… For lowercase-only words, a 26-letter count key drops the cost to O(n Β· k).
  • βœ… Group order in the output does not matter. Only the grouping itself matters.

Check Your Knowledge

Test what you learned. Pick an answer for each question, then click Check.

  1. 1

    Why does sorting a word make a good hash map key for grouping anagrams?

    Why: Anagrams use the exact same letters, so sorting any of them produces the identical string, which becomes their shared key.

  2. 2

    What is the time complexity of the sorted-key approach for n words of max length k?

    Why: We sort each of the n words, and sorting one word of length k costs O(k log k), giving O(n Β· k log k) overall.

  3. 3

    In the hash map, what does the value store?

    Why: Each key maps to a list of all the original words that sorted to that same key, which is one group of anagrams.

  4. 4

    If the words contain only lowercase English letters, how can you avoid sorting?

    Why: Counting how often each of the 26 letters appears gives the same key for anagrams in O(k) time, without sorting.

πŸš€ What’s Next?

Share & Connect