Redundant Connection: Union-Find Cycle Detection

May 21, 20267 min read

leetcodeproblemmediumgraphunion-find

You are given a graph that started as a tree with n nodes (labeled 1 to n), plus one extra edge. The extra edge creates exactly one cycle. Find and return that redundant edge.

This is LeetCode 684: Redundant Connection, and it is one of the best problems for learning how Union-Find detects cycles. A tree with n nodes has exactly n - 1 edges. Add one more edge and you get exactly one cycle. Your job is to find which edge completes that cycle.

Nodes 1 through 5 with edges. The dashed orange edge [2, 3] is redundant because nodes 2 and 3 are already connected through node 1.

Why this problem matters

Redundant Connection is the bridge between "I understand Union-Find" and "I can actually use it." In Number of Provinces, you use Union-Find to count connected components. Here, you use the same data structure to detect when adding an edge would create a cycle. That ability, detecting cycles as you process edges, is the foundation for Kruskal's minimum spanning tree algorithm and many dynamic connectivity problems.

The problem also teaches you to think about trees as a special case of graphs. A tree is a connected graph with no cycles. The moment you add one edge to a tree, you create exactly one cycle. The redundant edge is the one that "closes" that cycle.

The key insight

Process the edges one by one. For each edge [u, v], check whether u and v are already in the same connected component. If they are, adding this edge would create a cycle, so this is the redundant edge. If they are not, merge their components.

Union-Find gives you exactly the two operations you need:

Find(x): determine which component x belongs to (its root)
Union(x, y): merge the components of x and y

When find(u) == find(v) before calling union(u, v), both nodes are already connected. That edge is the answer.

The solution

def findRedundantConnection(edges):
    n = len(edges)
    parent = list(range(n + 1))
    rank = [0] * (n + 1)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    for u, v in edges:
        if not union(u, v):
            return [u, v]

Let's break this down.

Initialization: parent[i] = i for nodes 1 through n. Every node starts as its own root. We use n + 1 entries because the nodes are 1-indexed.

Find with path compression: The find function follows parent pointers up to the root. Along the way, it applies path halving (parent[x] = parent[parent[x]]), which keeps the trees flat and makes future lookups nearly O(1).

Union by rank: When merging two components, the shorter tree gets attached under the taller one. This prevents the tree from degenerating into a linked list. If the ranks are equal, we pick one as the new root and increment its rank.

The critical check: union returns False when both nodes already share the same root. That means they are already connected, and adding this edge would create a cycle. The first edge that fails the union is our answer.

The problem guarantees that the answer is the last edge in the input that could be removed to leave a tree. Since we process edges in order and return the first one that creates a cycle, this naturally gives us the correct answer because the redundant edge is guaranteed to be the one that completes the cycle.

Practice Union-Find cycle detection with spaced repetition

Step-by-step walkthrough

Let's trace the algorithm on edges = [[1,2], [1,3], [2,3], [3,4], [4,5]]. The edge [2,3] is redundant because nodes 2 and 3 are already connected through node 1.

Initial state

Initialize parent = [_, 1, 2, 3, 4, 5], rank = [_, 0, 0, 0, 0, 0]

Every node is its own root. Index 0 is unused since nodes are numbered 1 to 5.

Edge [1, 2]: union(1, 2)

find(1)=1, find(2)=2. Different roots. Merge 2 under 1.

Nodes 1 and 2 are now in the same set.

Edge [1, 3]: union(1, 3)

find(1)=1, find(3)=3. Different roots. Merge 3 under 1.

Nodes 1, 2, and 3 are all in the same set now.

Edge [2, 3]: union(2, 3)Cycle!

find(2)=1, find(3)=1. Same root! Cycle detected.

Both nodes already share root 1. Adding this edge would create a cycle. Return [2, 3].

Edge [3, 4]: union(3, 4)

find(3)=1, find(4)=4. Different roots. Merge 4 under 1.

If we continued past the answer, node 4 joins the set containing 1, 2, 3.

Edge [4, 5]: union(4, 5)

find(4)=1, find(5)=5. Different roots. Merge 5 under 1.

All 5 nodes are now in one set. But we already found our answer at edge [2, 3].

The algorithm stops at edge [2,3]. At that point, find(2) and find(3) both return 1, meaning both nodes are already in the same component. Adding edge [2,3] would close the cycle 1 - 2 - 3 - 1. That is our answer.

Notice that the steps after [2,3] are shown for completeness, but in practice the algorithm returns immediately when it detects the cycle.

Complexity analysis

Metric	Value
Time	O(n * alpha(n)), effectively O(n)
Space	O(n)

Time: We process n edges, and each edge requires two find operations and at most one union. With path compression and union by rank, each operation runs in O(alpha(n)) amortized time, where alpha is the inverse Ackermann function. For all practical purposes, alpha(n) is a constant (it never exceeds 4 for any input that fits in memory). So the total time is effectively O(n).

Space: The parent and rank arrays each use O(n) space. No other significant data structures are needed.

The building blocks

This problem uses one core pattern with two important optimizations.

1. Union-Find with cycle detection

The key twist compared to basic Union-Find is using the return value of union to detect cycles:

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return False
    parent[ry] = rx
    return True

When union returns False, both nodes are already in the same set. In a graph context, that means adding edge (x, y) would create a cycle. This pattern appears in Kruskal's MST algorithm (skip edges that would create cycles) and in problems that ask whether a graph is a valid tree.

2. Path compression

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

Path halving (parent[x] = parent[parent[x]]) flattens the tree as you traverse it. Without path compression, find could take O(n) in the worst case. With it, each call is nearly O(1) amortized. This is what makes Union-Find practical for large inputs.

3. Union by rank

if rank[rx] < rank[ry]:
    rx, ry = ry, rx
parent[ry] = rx
if rank[rx] == rank[ry]:
    rank[rx] += 1

Always attach the shorter tree under the taller one. This keeps the maximum tree height logarithmic, which bounds the worst-case find depth. Combined with path compression, this gives the near-constant amortized guarantee.

You can use union by size instead of union by rank. Instead of tracking tree height, track the number of nodes in each tree and attach the smaller tree under the larger one. Both approaches achieve the same asymptotic performance. Rank is slightly simpler because you do not need to maintain exact counts.

Drill path compression and union by rank

Edge cases

Before submitting, verify your solution handles:

Minimum input: 3 nodes with edges [[1,2],[1,3],[2,3]]. The last edge [2,3] is redundant.
Linear chain plus one edge: edges like [[1,2],[2,3],[3,4],[1,4]]. The cycle is 1-2-3-4-1, and [1,4] is the answer.
Redundant edge is the first edge processed: this cannot happen. The problem guarantees the graph is a tree plus one edge, so the first edge always connects two unconnected nodes.
Large cycles: the cycle might involve all n nodes. The algorithm still works because it detects the cycle at the moment the last edge of the cycle is processed.
Multiple valid answers: the problem states to return the edge that occurs last in the input. Since we process edges in order and return the first failure, this is automatically correct.

Reading about it is not enough

You have read the solution and traced through the steps. Union-Find cycle detection makes sense right now. But will you remember the path compression line (parent[x] = parent[parent[x]]) three weeks from now? Will you remember that union returns False when it detects a cycle?

These are small details, but they are exactly the kind of thing that slips away under interview pressure. The gap between understanding and fluency is real. You can close it with spaced repetition. Practice writing the Union-Find template from scratch at increasing intervals. After a few rounds, the find and union functions come out automatically. You stop thinking about the mechanics and start thinking about the problem.

Redundant Connection is a great problem to drill because it combines the Union-Find template with a specific application (cycle detection). Once you can write the solution from memory, you are ready for harder variants like Redundant Connection II (directed graphs) and Kruskal's minimum spanning tree.

Number of Provinces - Uses the same Union-Find template to count connected components instead of detecting cycles
Number of Islands - Connected components on a grid, solved with DFS flood fill. A different approach to a related problem
Course Schedule - Cycle detection in directed graphs using DFS three-color marking, the directed counterpart to what Union-Find does here

CodeBricks breaks Redundant Connection into its Union-Find building blocks and drills them individually with spaced repetition. You practice find with path compression, union by rank, and the cycle detection check from scratch until they are automatic. When a graph connectivity problem shows up in your interview, you do not hesitate. You just write it.

Drill Union-Find with spaced repetition