Segment Tree

Say you have a big list of numbers. Someone keeps asking you β€œwhat is the sum of the numbers from position 2 to position 7?” Then they change a number. Then they ask again for a different range. A plain array can do this. But it gets slow. A segment tree is the trick that makes both the asking and the changing fast.

🎯 The Problem First

Imagine you keep daily sales for a shop in an array. Your manager keeps asking things like β€œtotal sales from Monday to Friday?” and β€œtotal from Wednesday to Sunday?”. And sometimes a day’s sales get corrected, so you update one value.

With a plain array, answering β€œsum from index l to index r” means you walk through every element in that range and add them up. That is fine once. But if the manager asks a thousand times, you do that walk a thousand times.

Here is the pain in plain terms.

  • One range-sum on a plain array takes O(n) time. You touch every element in the range.
  • Many queries means you repeat that slow walk again and again.
  • A prefix sum array makes queries fast. But the moment you update one value you have to rebuild a big chunk of it. So updates become slow.

So we want something where both the query and the update are fast. That is exactly what a segment tree gives you. Both operations run in O(log n).

🌳 What Is a Segment Tree?

A segment tree is a binary tree where each node stores the answer for one segment (one range) of the array.

The idea is simple when you say it out loud.

  • The root node covers the whole array.
  • Each node splits its range into two halves. It gives one half to its left child and the other to its right child.
  • A leaf node covers just a single element.
  • A parent’s value is made by combining its two children. For a sum tree, the parent value is just left child value + right child value.

So the root holds the sum of everything. Its two children hold the sum of the left half and the sum of the right half. And so on down to single elements.

Think of it like a company. The CEO knows the total revenue of the whole company. Below the CEO are two heads, each knowing the total for their half. Below them are smaller managers, each knowing their own small team. When you want the total for some group of teams, you do not ask every single person. You ask a few managers whose ranges line up with what you need.

πŸ“ A Small Example

Let us take a tiny array so we can see the whole tree.

index: 0 1 2 3
value: 1 3 5 7

We build a tree on top of it. Each node is labeled with the range it covers and the sum of that range.

[0..3] sum=16

[0..1] sum=4

[2..3] sum=12

[0..0] sum=1

[1..1] sum=3

[2..2] sum=5

[3..3] sum=7

See how it works.

  • The leaves at the bottom are the original values: 1, 3, 5, 7.
  • The node [0..1] is 1 + 3 = 4. The node [2..3] is 5 + 7 = 12.
  • The root [0..3] is 4 + 12 = 16, which is the sum of the whole array.

Now suppose you want the sum from index 1 to index 3. You do not visit every leaf. You grab the node [1..1] (value 3) and the node [2..3] (value 12). Add them: 3 + 12 = 15. Done in a couple of steps instead of walking the whole range.

Note

A segment tree built on n elements has about 2n nodes in total. A common, safe trick is to make the storage array size 4 * n so every node has room, even when n is not a power of two.

βš™οΈ How the Three Operations Work

A segment tree has three jobs. Here is each one in plain words.

Build. Start at the root and keep splitting the range in half until you reach single elements. Fill the leaves with the array values. Then, on the way back up, set each parent to the sum of its two children. This happens once and takes O(n) time.

Query (range sum). You ask for the sum from l to r. Starting at the root, at each node there are three cases.

  • The node’s range is fully outside [l, r] β†’ it contributes nothing, return 0.
  • The node’s range is fully inside [l, r] β†’ return that node’s stored sum directly.
  • The node’s range partly overlaps β†’ go into both children and add their answers.

Because the tree is only about log n levels deep, a query touches only a few nodes. That is the O(log n) part.

Update (point update). You change one element at some index. This is called a point update. Walk down to that leaf, change it, then on the way back up fix every parent’s sum. Only the nodes along one path change, and that path is log n long.

πŸ“ Steps to Build and Use a Sum Segment Tree

Before we read code, here is the plan in order.

  1. Create a storage array tree of size 4 * n, all zeros.
  2. Build: recursively split the range. At a leaf, copy the array value in. At an internal node, set its value to the sum of its two children.
  3. Query(l, r): recurse from the root. Use the three cases above (outside β†’ 0, inside β†’ stored sum, partial β†’ combine both children).
  4. Update(i, val): recurse down to the leaf for index i, set the new value, then re-sum the parents on the way back up.

We use a common indexing rule: the node at position node has its left child at 2 * node + 1 and its right child at 2 * node + 2.

πŸ’» Sum Segment Tree in Code

The code below builds a tree on the array [1, 3, 5, 7, 9, 11]. It asks for the sum of range [1, 3], then updates index 1 to 10, then asks the same range again. Watch how the answer changes after the update.

segment_tree.py
arr = [1, 3, 5, 7, 9, 11]
n = len(arr)
tree = [0] * (4 * n) # size 4*n is safe
# Build the tree for range [start..end] at position node
def build(node, start, end):
if start == end:
tree[node] = arr[start] # leaf holds one array value
return
mid = (start + end) // 2
build(2 * node + 1, start, mid) # left half
build(2 * node + 2, mid + 1, end) # right half
tree[node] = tree[2 * node + 1] + tree[2 * node + 2] # sum of children
# Sum of range [l..r]
def query(node, start, end, l, r):
if r < start or end < l:
return 0 # fully outside
if l <= start and end <= r:
return tree[node] # fully inside
mid = (start + end) // 2
left = query(2 * node + 1, start, mid, l, r)
right = query(2 * node + 2, mid + 1, end, l, r)
return left + right # partial overlap
# Set arr[idx] = val and fix the tree
def update(node, start, end, idx, val):
if start == end:
tree[node] = val # change the leaf
return
mid = (start + end) // 2
if idx <= mid:
update(2 * node + 1, start, mid, idx, val)
else:
update(2 * node + 2, mid + 1, end, idx, val)
tree[node] = tree[2 * node + 1] + tree[2 * node + 2] # re-sum parent
build(0, 0, n - 1)
print("Sum of range [1, 3]:", query(0, 0, n - 1, 1, 3))
update(0, 0, n - 1, 1, 10) # change index 1 from 3 to 10
print("After update, sum of range [1, 3]:", query(0, 0, n - 1, 1, 3))

The output of the above code will be:

Sum of range [1, 3]: 15
After update, sum of range [1, 3]: 22

The first query reads arr[1] + arr[2] + arr[3] = 3 + 5 + 7 = 15. Then we set index 1 to 10, so the same range becomes 10 + 5 + 7 = 22. Notice we never rebuilt the tree. The update only fixed the few nodes on one path.

Tip

Building the tree happens one time and costs O(n). After that, every query and every update costs only O(log n), because each one walks a small number of nodes down a tree that is about log n levels tall.

⏱️ Time and Space Complexity

Here is the cost of each operation, side by side with a plain array. You can see why the segment tree wins when there are many queries and updates. The big win is that range query drops to O(log n).

Operation Plain Array Segment Tree
Build O(n) O(n)
Range query (sum) O(n) O(log n)
Point update O(1) O(log n)
Space O(n) O(n)

A plain array updates in O(1) but queries in O(n). A segment tree gives up a little on update to make queries fast too. When you have lots of both, the segment tree is the clear winner.

Caution

A segment tree is not just for sums. Swap the combine step and you can answer range minimum, range maximum, or GCD over a range. The shape stays the same. Only the line that joins two children changes.

❌ Common Mistakes

A few things go wrong for people the first time. Watch out for these.

  • Storage too small. Sizing tree as exactly 2 * n can overflow when n is not a power of two. Use 4 * n and you are safe.
  • Forgetting the fully-outside case. If you do not return 0 when a node’s range is completely outside [l, r], your sum comes out wrong.
  • Mixing up child indices. Left child is 2 * node + 1 and right child is 2 * node + 2 with 0-based root. If you use them inconsistently, the tree reads garbage.
  • Updating the array but not the tree. Changing arr[i] by hand does nothing. You must call update, so every parent on the path gets re-summed.
  • Off-by-one in the range. Be clear whether your query range is inclusive on both ends. The code here treats [l, r] as including both l and r.

🧩 What You’ve Learned

βœ… A plain array answers a range-sum in O(n), which is slow when there are many queries.

βœ… A segment tree stores the answer for a range in each node, and a parent combines its two children.

βœ… Build is O(n) and runs once; query and update are both O(log n).

βœ… The query uses three cases: fully outside returns 0, fully inside returns the stored value, partial overlap combines both children.

βœ… Size the storage as 4 * n, and change only the combine step to handle min, max, or GCD instead of sum.

Check Your Knowledge

Test what you learned. Pick an answer for each question, then click Check.

  1. 1

    Why is a plain array slow for many range-sum queries?

    Why: A plain array must add up every element in the range, so each range-sum query takes O(n) time.

  2. 2

    What does a single node in a sum segment tree store?

    Why: Each node holds the combined answer (here, the sum) for the one range of the array that it covers.

  3. 3

    What is the time complexity of a range query and a point update in a segment tree?

    Why: Both operations walk a small number of nodes down a tree about log n levels tall, so each is O(log n).

  4. 4

    During a query, what should a node do when its range is fully outside [l, r]?

    Why: A fully-outside node contributes nothing to the answer, so it returns 0 (the neutral value for sum).

πŸš€ What’s Next?

Share & Connect