Checking Existence of Edge Length Limited Paths: Offline Union-Find
Checking Existence of Edge Length Limited Paths is LeetCode problem 1697. You are given an undirected weighted graph and a list of queries. Each query asks whether there is a path between two nodes using only edges whose weight is strictly less than a given limit. You need to answer all queries and return the results as a boolean array.
The problem
You have n nodes labeled 0 to n - 1, and an edge list where each edge is [u, v, w] meaning there is an undirected edge between u and v with weight w. You also have a list of queries where each query is [p, q, limit]. For each query, determine whether there exists a path from p to q such that every edge on the path has weight strictly less than limit.
Input:
n = 5
edgeList = [[0,1,2],[1,2,4],[2,3,6],[3,4,3],[0,3,5],[1,3,8]]
queries = [[0,3,6],[1,4,5],[0,4,10]]
Output: [true, false, true]
For query [0,3,6], the path 0 -> 1 -> 2 uses edges with weights 2 and 4, both less than 6, but node 2 is not node 3. The path 0 -> 3 uses the edge with weight 5, which is less than 6. So the answer is true. For query [1,4,5], the only way to reach node 4 from node 1 requires crossing the edge [0,3,5] or [2,3,6], but we also need edges connecting to node 4 through node 3, and the available edges with weight less than 5 leave nodes {0,1,2} and {3,4} in separate components. So the answer is false.
Each edge has a weight. A query [p, q, limit] asks: is there a path from p to q using only edges with weight strictly less than limit?
Why this problem matters
This problem combines two important algorithmic ideas: offline query processing and union-find. Instead of answering each query independently (which would require repeated graph traversals), you sort both the edges and the queries, then process them together in a single sweep. This "offline" approach is a powerful technique that shows up in many competitive programming and interview problems.
The pattern also appears in real systems. Think of network reliability analysis, where you need to determine which pairs of servers can communicate under various bandwidth constraints. Or route planning, where you need to find paths that avoid roads above a certain toll threshold. Sorting and sweeping is often the key to turning an expensive per-query operation into an efficient batch computation.
The key insight
If you sort both the edges by weight and the queries by limit, you can process queries from smallest limit to largest. For each query, you only need to add edges that have weight less than the current limit. Since the queries are sorted, edges you added for a previous query are still valid for the current one. You never need to remove an edge.
This is a perfect fit for union-find. Each time you add an edge, you merge the two endpoints into the same connected component. To answer a query, you just check whether the two query nodes are in the same component. Because you process queries in order of increasing limit and edges in order of increasing weight, every edge gets added at most once across all queries.
The one wrinkle is that queries arrive in a specific order, and the answer array must match that original order. You handle this by attaching the original index to each query before sorting, then placing each answer at the correct position in the result array.
The solution
def distanceLimitedPathsExist(n, edgeList, queries):
parent = list(range(n))
rank = [0] * n
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(x, y):
rx, ry = find(x), find(y)
if rx == ry:
return
if rank[rx] < rank[ry]:
rx, ry = ry, rx
parent[ry] = rx
if rank[rx] == rank[ry]:
rank[rx] += 1
edgeList.sort(key=lambda e: e[2])
indexed_queries = sorted(enumerate(queries), key=lambda x: x[1][2])
answer = [False] * len(queries)
edge_idx = 0
for orig_idx, (p, q, limit) in indexed_queries:
while edge_idx < len(edgeList) and edgeList[edge_idx][2] < limit:
u, v, w = edgeList[edge_idx]
union(u, v)
edge_idx += 1
answer[orig_idx] = find(p) == find(q)
return answer
Let's break this down.
Union-Find setup: parent[i] = i for all nodes. Every node starts as its own root. The find function uses path halving (parent[x] = parent[parent[x]]) to keep the trees flat. The union function uses union by rank to attach the shorter tree under the taller one.
Sort edges by weight: This lets us add edges in order of increasing weight. When we reach a query with limit L, we know all edges with weight less than L have already been added.
Sort queries by limit with original indices: We pair each query with its index using enumerate, then sort by the limit value. This lets us process queries from smallest to largest limit while still knowing where to put each answer.
The main loop: For each query, advance edge_idx to add all edges with weight strictly less than the current limit. Then check whether the two query nodes share the same root. If they do, there is a path between them using only the edges added so far (all with weight less than the limit). Store the boolean result at the original query index.
Why edge_idx never resets: Because queries are sorted by limit, the limit only increases. Every edge added for a previous query is still valid for the current one. The edge_idx pointer moves forward monotonically through the sorted edge list, never backwards.
The condition is strictly less than (edgeList[edge_idx][2] < limit), not less than or equal. This matches the problem statement, which says "strictly less than." Double-check this when implementing, because using <= instead of < will give wrong answers.
Visual walkthrough
Let's trace the algorithm on the example above. Watch how edges get added in weight order as we process each query, and how the union-find parent array evolves.
Step 0: Sort edges and queries
Sort edges by weight. Sort queries by limit, keeping original indices.
Sorted edges: [0,1]:2, [3,4]:3, [1,2]:4, [0,3]:5, [2,3]:6, [1,3]:8. Sorted queries: [1,4]:5, [0,3]:6, [0,4]:10.
Step 1: Query [1, 4] limit 5
Add all edges with weight < 5: union(0,1), union(3,4), union(1,2). Then check find(1)=0, find(4)=3.
Components: {0,1,2} and {3,4}. Nodes 1 and 4 are in different components. Answer[1] = false.
Step 2: Query [0, 3] limit 6
Add edges with weight < 6 not yet added: union(0,3). Then check find(0)=0, find(3)=0.
Edge [0,3]:5 bridges the two components. All five nodes now connected. Answer[0] = true.
Step 3: Query [0, 4] limit 10
Add edges with weight < 10 not yet added: [2,3]:6 and [1,3]:8 both skip (same root). Check find(0)=0, find(4)=0.
Already fully connected. Answer[2] = true.
Result: Reorder answers
Place each answer at its original query index: answer[0]=true, answer[1]=false, answer[2]=true.
Final result: [true, false, true].
The critical moment is Step 2. When we process query [0,3] with limit 6, we add the edge [0,3] with weight 5 (since 5 is less than 6). This bridges the two previously disconnected components {0,1,2} and {3,4} into one. From that point on, every node is reachable from every other node using only edges with weight less than 6.
Notice that in Step 3, the edges [2,3]:6 and [1,3]:8 both have their endpoints already in the same component, so the unions are no-ops. The union-find detects this instantly because find returns the same root for both endpoints.
Complexity analysis
| Metric | Value | Reasoning |
|---|---|---|
| Time | O(E log E + Q log Q + (E + Q) * alpha(N)) | Sorting edges takes O(E log E). Sorting queries takes O(Q log Q). Each edge is added at most once with a union operation, and each query requires two find operations. With path compression and union by rank, each operation is O(alpha(N)), effectively constant. |
| Space | O(N + Q) | The parent and rank arrays use O(N). The indexed queries array and answer array each use O(Q). |
Here E is the number of edges, Q is the number of queries, and N is the number of nodes. The inverse Ackermann function alpha(N) grows so slowly that it never exceeds 4 for any practical input. Sorting dominates the runtime.
The building blocks
Offline query processing
The "offline" technique means you do not answer queries one at a time as they arrive. Instead, you collect all queries, sort them, and process them in a convenient order. This works whenever the answer to a query depends on a threshold (here, the weight limit), and you can build up state incrementally as the threshold increases.
indexed_queries = sorted(enumerate(queries), key=lambda x: x[1][2])
answer = [False] * len(queries)
for orig_idx, (p, q, limit) in indexed_queries:
answer[orig_idx] = solve_query(p, q, limit)
The enumerate + sorted pattern preserves original indices so you can place answers back in order. You will see this same technique in problems involving range queries, threshold filtering, and sweep line algorithms.
Union-Find with path compression
The union-find data structure tracks connected components as you add edges. The two key optimizations, path compression and union by rank, keep the tree nearly flat:
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
Path halving makes every other node on the path point to its grandparent. This is simpler than full path compression (which requires recursion) and achieves the same amortized complexity. Combined with union by rank, every find and union operation runs in O(alpha(N)) amortized time.
Edge cases
Before submitting, verify your solution handles:
- Disconnected graph: Some nodes may have no edges at all. Queries involving those nodes should return false unless
p == q. - Self-loops: The problem says the graph has no self-loops, but if
p == qin a query, the answer is always true (a node can reach itself with zero edges, satisfying any limit). - Duplicate edges: Multiple edges between the same pair of nodes with different weights. The algorithm handles this naturally because it processes all edges.
- Limit of 1: The smallest possible edge weight is 1. A limit of 1 means no edge can be used (since all weights are at least 1 and must be strictly less than the limit). Only
p == qqueries return true. - All queries have the same limit: The sorting still works correctly. All edges below the limit get added before any query is checked.
- Large inputs: The problem allows up to 10^5 nodes and 10^5 edges with 10^5 queries. The O(E log E + Q log Q) complexity handles this comfortably.
From understanding to recall
You have seen how offline query processing and union-find combine to solve a problem that would be expensive to answer one query at a time. The sorting step is what makes it efficient: by processing queries in order of increasing limit, you turn a multi-pass algorithm into a single sweep.
But the details matter under pressure. Remembering to sort both edges and queries, attaching original indices before sorting, using strict less-than for the edge weight comparison, and placing answers back at the right positions. These are the things that trip people up in interviews.
Spaced repetition drills these details until they become automatic. You practice writing the union-find template, the offline sorting pattern, and the two-pointer sweep from memory. After a few rounds, you produce the full solution without hesitation. The mechanics disappear and you focus on recognizing which problems fit this pattern.
Related posts
- Redundant Connection - Union-Find for cycle detection in graphs, LeetCode 684
- Accounts Merge - Union-Find to group connected components by shared keys, LeetCode 721
- Number of Provinces - Basic union-find for counting connected components, LeetCode 547
Want to master graph algorithms? CodeBricks breaks problems like this into their core building blocks and drills them with spaced repetition. You practice union-find, offline processing, and sorting patterns individually until they are second nature. When a graph connectivity problem shows up in your interview, you do not hesitate.