Find the Missing Number
Table of Contents + β
This one looks simple. You get a list of numbers and one is missing. Find it. But the interviewer is not really asking βcan you find itβ. They want to see if you can find it fast and without using extra memory. So the trick here is to notice the hidden pattern in the numbers.
π’ The Problem
You get an array of n distinct numbers. The numbers come from the range 0 to n. That range has n + 1 values in total. But the array only holds n of them. So exactly one number is missing. Your job is to find that one.
Input: [3, 0, 1]Output: 2
Here n = 3. The full range is 0, 1, 2, 3.The array has 3, 0, 1. So 2 is missing.A number being distinct just means no value repeats. Every number in the array shows up only once. That detail matters a lot. It is what lets us use the clever tricks below.
π’ The Brute-Force Way
The first idea most people get is the simplest one. For every number in the range 0 to n, check if it is present in the array. The moment you find one that is not there, that is your answer.
To check βis this number presentβ, you would scan the whole array. So you have one loop for the range and another loop inside it for the search. That is a loop inside a loop. It works, but it is slow. For a big array it does far too much repeated scanning.
A slightly better version is to put all the array numbers into a set. A set is a collection that lets you check βis this value here?β very quickly. Then you walk the range 0 to n and the first value missing from the set is the answer. This is faster, but the set itself takes extra memory. The interviewer will push you to drop that extra memory.
β The Optimal Way: The Sum Formula
Here is the key insight. We already know what the numbers 0 to n should add up to. There is a famous formula for that.
expected sum = n * (n + 1) / 2So we do two things:
- Work out the expected sum using the formula.
- Add up all the numbers actually in the array.
The missing number is just the difference. Subtract the actual sum from the expected sum. Whatever is left over is the value that was taken out.
Let us walk through the example [3, 0, 1]. Here n = 3.
expected sum = 3 * 4 / 2 = 6actual sum = 3 + 0 + 1 = 4missing = 6 - 4 = 2See? We never sorted anything. We never used a set. We just looped once to add the numbers. That is O(n) time and O(1) space. That is exactly what the interviewer wants to hear.
Steps to find the missing number
- Read
nas the length of the array. The full range is0ton. - Compute the expected sum with
n * (n + 1) / 2. - Loop through the array once and add up every value to get the actual sum.
- Subtract the actual sum from the expected sum.
- The result is the missing number.
def find_missing(arr): n = len(arr) expected = n * (n + 1) // 2 # sum of 0..n actual = sum(arr) # add up what is present return expected - actual # the gap is the answer
arr = [3, 0, 1]print("Missing number:", find_missing(arr))The output of the above code will be:
Missing number: 2Caution
The int version above is perfect for small inputs like our example. But for very large arrays the expected sum can grow past what an int can hold. If you need it for huge inputs, widen the accumulator to a bigger type (long long in C and C++, long in Java) or reach for the XOR method below.
π The XOR Way (No Overflow Worry)
There is one small problem with the sum method. If the array is huge, the expected sum can get very large. So large that it may not fit in a normal integer. That is called overflow, when a number grows past what the variable can hold and the value wraps around to something wrong.
The XOR method avoids that. XOR is a bitwise operation written as ^. It has a neat property: a number XORed with itself becomes 0, and any number XORed with 0 stays the same. So if you XOR every index 0 to n together with every value in the array, all the matching pairs cancel out. The only thing left standing is the missing number.
You still loop just once. Same O(n) time and O(1) space, but no risk of the number getting too big.
def find_missing(arr): n = len(arr) result = n # start with the last index n for i in range(n): result ^= i # XOR every index 0..n-1 result ^= arr[i] # XOR every value in the array return result # matching pairs cancel; the gap remains
arr = [3, 0, 1]print("Missing number:", find_missing(arr))The output of the above code will be:
Missing number: 2β±οΈ Time and Space Complexity
| Approach | Time | Space |
|---|---|---|
| Brute force (search for each value) | O(nΒ²) | O(1) |
| Using a set | O(n) | O(n) |
| Sum formula | O(n) | O(1) |
| XOR | O(n) | O(1) |
Tip
In the interview, start with the sum formula because it is the easiest to explain. Then mention that XOR does the same job without any overflow risk. Saying that one line shows you really understand the trade-off, and interviewers love that.
π§© Key Takeaways
- β
The numbers are distinct and come from
0ton, so exactly one value is missing. - β
The sum formula
n * (n + 1) / 2gives the expected total. Subtract the actual sum to get the answer. - β
Both the sum method and the XOR method run in
O(n)time andO(1)space. - β XOR is the safer choice for very large arrays because it never overflows.
- β
A set works too, but it costs
O(n)extra memory, so the interviewer will ask you to do better.
Check Your Knowledge
Test what you learned. Pick an answer for each question, then click Check.
- 1
For an array of n distinct numbers from 0 to n, what is the expected sum of the full range?
Why: The sum of all integers from 0 to n is n * (n + 1) / 2, the classic Gauss formula.
- 2
Why might an interviewer prefer the XOR method over the sum method?
Why: The sum can grow past what an integer holds, but XOR never overflows since it only flips bits.
- 3
What is the space complexity of the sum-formula approach?
Why: It only keeps a couple of running totals, so it uses constant extra space.
- 4
For the input [0, 1, 3], what is the missing number?
Why: n = 3, expected sum = 6, actual sum = 4, so 6 - 4 = 2 is missing.