Longest Consecutive Sequence

This one looks easy. Then the interviewer adds one word. Do it in linear time. Now you cannot just sort. So the real test here is whether you can spot that a hash set turns a sorting problem into a single pass. Let me show you how to talk through it.

πŸ”’ The Problem

You get an array of integers. They are not sorted. They can be in any order. You have to find the length of the longest run of numbers that come one after another, like 1, 2, 3, 4. The numbers do not need to sit next to each other in the array. They just need to exist somewhere in it.

Input: [100, 4, 200, 1, 3, 2]
Output: 4
Why? The run 1, 2, 3, 4 is the longest set of consecutive numbers.
Their positions in the array do not matter, only that all four exist.

🐒 The Simple Idea First (Sorting)

The first thing most people say is β€œsort it”. And that is fine to say out loud. Once the array is sorted, consecutive numbers sit right next to each other. So you walk through once and count how long each run keeps growing.

The catch is the cost. Sorting takes O(n log n) time. The interviewer who asks this usually wants O(n). So sorting is the answer you give first and then improve. You also have to handle duplicates so they do not break your count.

⚑ The Optimal Idea (Hash Set)

Here is the trick. Put every number into a hash set first. A hash set is a collection that tells you in roughly O(1) time whether a value is present. Now you do not need things sorted, because the set can answer β€œis num + 1 here?” instantly.

But if you start counting from every number, you do the same run many times. So we add one smart rule. Only start counting a run from a number that is the start of its run. A number is a start only when num - 1 is not in the set. See, if num - 1 existed, then num is in the middle of some run, not the beginning. So we skip it. Each run gets counted exactly once this way, and that is what keeps the whole thing O(n).

Steps to do it:

  1. Put all numbers into a hash set. This also drops duplicates for free.
  2. Go through each number in the set.
  3. If num - 1 is in the set, skip it. This number is not a start.
  4. If num - 1 is missing, this is a start. Keep checking num + 1, num + 2, and so on while they exist, counting the length.
  5. Track the longest length you ever saw and return it.

One small note before the code. C does not have a hash set in its standard library. So the C tab falls back to the sorting approach instead. The other four languages all use the O(n) hash set version.

longest_consecutive.py
def longest_consecutive(nums):
num_set = set(nums) # all numbers, no duplicates
longest = 0
for num in num_set:
# Only start counting from the start of a run.
if num - 1 not in num_set:
current = num
length = 1
while current + 1 in num_set: # walk the run forward
current += 1
length += 1
longest = max(longest, length)
return longest
nums = [100, 4, 200, 1, 3, 2]
print("Longest consecutive sequence length:", longest_consecutive(nums))

The output of the above code will be:

Longest consecutive sequence length: 4

Note

That num - 1 not in num_set check is the whole interview. It looks like the inner while loop makes this slow, but it does not. A number is only ever walked over as part of one run. So across the entire run the inner loop does total work proportional to n. That is why the hash set version stays O(n).

⏱️ Time and Space Complexity

Approach Time Space
Sorting then scanning O(n log n) O(1) extra, or O(n) if sort is not in place
Hash set (optimal) O(n) O(n)

Tip

If the interviewer says memory is tight and O(n log n) time is fine, the sorting answer is actually the better trade. Say both out loud. Showing you know the trade-off between time and space scores more points than just reciting the fastest one.

🧩 Key Takeaways

  • βœ… Sorting solves it in O(n log n). It is a fine first answer, but the interviewer usually wants faster.
  • βœ… A hash set gives O(1) lookups, so you can ask β€œis num + 1 present?” without sorting.
  • βœ… The key trick is to only start a run from a number whose num - 1 is missing. That counts each run once.
  • βœ… The inner loop does not make it O(nΒ²). Each number is visited once across all runs, so it stays O(n).
  • βœ… Putting numbers in a set first also removes duplicates for free.

Check Your Knowledge

Test what you learned. Pick an answer for each question, then click Check.

  1. 1

    Why do we only start counting a run when num - 1 is NOT in the set?

    Why: If num - 1 exists, num is in the middle of a run, so starting there would recount the same run.

  2. 2

    What is the time complexity of the hash set approach?

    Why: Each number is visited only once across all runs, so the total work grows linearly with n.

  3. 3

    Why does putting the numbers into a set help besides fast lookups?

    Why: A set stores each value only once, so duplicates disappear and cannot inflate a run's count.

  4. 4

    When might the sorting solution actually be preferable?

    Why: An in-place sort uses little extra memory, which can beat the hash set when space matters more than speed.

πŸš€ What’s Next?

Share & Connect