Clone a Graph
Table of Contents + β
A graph can point back to itself. Node 1 connects to node 2, and node 2 connects right back to node 1. So if you try to copy a graph by just following neighbors and copying as you go, you never stop. You keep looping over the same nodes forever. The way out is simple. You remember what you have already copied. That memory is the real solution, and it is what the interviewer wants to see.
π The Problem
You are given a node from a connected undirected graph. Every node has a value and a list of its neighbors. Your job is to make a deep copy of the whole graph. A deep copy means brand new nodes in memory, not the same nodes reused. The copy should have the same shape: same values, same connections between nodes.
Here is a small graph and what the clone should look like.
Input graph (each node lists its neighbors): 1 -- 2 | | 4 -- 3
Node 1 neighbors: [2, 4]Node 2 neighbors: [1, 3]Node 3 neighbors: [2, 4]Node 4 neighbors: [1, 3]
Output: a new graph with the exact same connections,but every node is a fresh object in memory.See the trouble? Node 1 points to node 2, and node 2 points right back to node 1. If you copy node 1, then copy node 2, then from node 2 try to copy node 1 again, you are stuck in a loop. So you need a way to say βI already made a copy of this one.β
π’ The Naive Idea (and why it breaks)
The simple thought is this. Start at the given node, make a copy of it, then walk to each neighbor and copy that too. Keep going.
This breaks because of cycles. When you reach a neighbor you have already visited, you copy it again. Then its neighbors send you back. You never stop. Even if it did stop somehow, you would end up with many copies of the same node instead of one. The connections would be wrong.
So the naive walk has no memory. That is the real flaw.
β The Optimal Idea: a visited map
The fix is small but powerful. Keep a hash map that links each original node to its copy. A hash map is just a lookup table from one thing to another.
Before you copy any node, you check the map first. If the node is already in the map, you do not copy it again. You just return the copy you made earlier. If it is not in the map, you make the copy once, put it in the map right away, and then go copy its neighbors.
Putting the copy in the map before visiting neighbors is the key step. That is what stops the infinite loop. When a neighbor leads back to this node, the map already has it, so you stop and reuse it.
You can walk the graph two ways. DFS (depth-first search) goes deep down one path before backing up. BFS (breadth-first search) visits nearby nodes first using a queue. Both work fine here. The visited map is what matters, not the walk order.
π Steps to clone the graph (DFS)
- If the input node is empty, return empty. Nothing to copy.
- Make a hash map called
visitedthat maps an original node to its copy. - Write a helper that takes a node. If that node is already in
visited, return its copy. - Otherwise create a new node with the same value. Store it in
visitedimmediately. - For each neighbor of the original node, call the helper and add the returned copy to the new nodeβs neighbor list.
- Return the copy of the starting node.
# A node holds a value and a list of neighbors.class Node: def __init__(self, val): self.val = val self.neighbors = []
# original node -> its copyvisited = {}
# DFS clone. Store the copy before visiting neighbors.def clone_dfs(node): if node is None: return None if node in visited: return visited[node] # already copied
copy = Node(node.val) visited[node] = copy # store first, then recurse
for nb in node.neighbors: copy.neighbors.append(clone_dfs(nb)) return copy
# Build the graph: 1-2, 2-3, 3-4, 4-1n1, n2, n3, n4 = Node(1), Node(2), Node(3), Node(4)n1.neighbors = [n2, n4]n2.neighbors = [n1, n3]n3.neighbors = [n2, n4]n4.neighbors = [n1, n3]
clone = clone_dfs(n1)
# Print the cloned graph.for c in (visited[n1], visited[n2], visited[n3], visited[n4]): neighbor_vals = " ".join(str(nb.val) for nb in c.neighbors) print(f"Node {c.val} neighbors: {neighbor_vals}")print(f"Original n1 == clone? {'yes' if n1 is clone else 'no'}")The output of the above code will be:
Node 1 neighbors: 2 4Node 2 neighbors: 1 3Node 3 neighbors: 2 4Node 4 neighbors: 1 3Original n1 == clone? noβ±οΈ Time and Space Complexity
Here V is the number of nodes and E is the number of edges.
| Approach | Time | Space | Note |
|---|---|---|---|
| Naive walk (no memory) | Never ends | Never ends | Loops forever on cycles |
| DFS with visited map | O(V + E) | O(V) | Each node and edge touched once |
| BFS with visited map | O(V + E) | O(V) | Same cost, uses a queue instead of recursion |
We visit each node one time and look at each edge one time. So the time is O(V + E). The map holds one entry per node, so the space is O(V).
Tip
The line that puts the copy into the map happens before you loop over the neighbors. If you ever swap that order and copy neighbors first, you fall back into the infinite loop. So say this out loud in the interview: βI store the copy in the map first, then recurse.β It shows you understand why the map works.
π§© Key Takeaways
- β A deep copy means fresh nodes in memory with the same shape, not the same nodes reused.
- β The visited map links each original node to its copy. It is the heart of the solution.
- β Store the copy in the map before visiting neighbors. That is what stops the infinite loop on cycles.
- β DFS and BFS both work. The walk order does not matter; the visited map does.
- β The cost is O(V + E) time and O(V) space because each node and edge is touched once.
Check Your Knowledge
Test what you learned. Pick an answer for each question, then click Check.
- 1
Why does the naive copy without a visited map run forever?
Why: A graph can have cycles, so following neighbors with no memory keeps returning to the same nodes endlessly.
- 2
What does the visited hash map store?
Why: It maps each original node to its newly created copy so each node is cloned exactly once.
- 3
Why must you store the copy in the map BEFORE visiting its neighbors?
Why: Storing it first means a cycle that returns to this node sees it in the map and reuses it instead of copying again.
- 4
What is the time complexity of cloning the graph with a visited map?
Why: Each node and each edge is touched exactly once, giving O(V + E) time.