Sort Items by Groups Respecting Dependencies: Double Topological Sort

May 28, 20267 min read

leetcodeproblemhardgraphsorting

You are given n items, each belonging to zero or one group. There are m groups total. Some items have no group at all (marked as -1). You are also given a list of dependency pairs: beforeItems[i] tells you which items must come before item i in the final ordering. Return a sorted list of all n items such that (1) items in the same group appear next to each other, and (2) all dependency constraints are satisfied. If no valid ordering exists, return an empty list.

This is LeetCode 1203: Sort Items by Groups Respecting Dependencies, and it is one of the hardest topological sort problems you will encounter. The twist is that you need topological sort at two different levels simultaneously.

Items colored by group. Solid arrows are within-group dependencies. Dashed red arrows cross group boundaries.

Why this problem matters

Real-world systems are full of grouped dependencies. Think of a build system where source files belong to different packages, and packages depend on each other. Within each package, files have their own compilation order. You need to sort files within packages and packages relative to each other, all at once.

This is exactly the pattern LeetCode 1203 tests. If you can solve this problem, you understand how to layer topological sorts. That skill transfers directly to task scheduling systems, CI/CD pipeline orchestration, and dependency resolution in package managers.

The key insight

You need topological sort at two levels:

Within each group: sort the items inside a group so that all intra-group dependencies are satisfied.
Between groups: sort the groups themselves so that all cross-group dependencies are satisfied.

The trick that makes everything work: assign every ungrouped item its own unique "virtual" group. An item with group[i] = -1 gets a brand-new group ID that no other item shares. This way, every item belongs to exactly one group, and the two-level topological sort handles everything uniformly.

Once you have both levels sorted, you concatenate each group's sorted items in the order determined by the group-level sort. That is your answer.

The solution

from collections import defaultdict, deque

def sort_items(n, m, group, beforeItems):
    for i in range(n):
        if group[i] == -1:
            group[i] = m
            m += 1

    group_items = defaultdict(list)
    for i in range(n):
        group_items[group[i]].append(i)

    item_graph = defaultdict(list)
    item_indegree = [0] * n
    group_graph = defaultdict(set)
    group_indegree = defaultdict(int)

    for i in range(n):
        for dep in beforeItems[i]:
            item_graph[dep].append(i)
            item_indegree[i] += 1
            if group[dep] != group[i]:
                if i not in group_graph[group[dep]]:
                    group_graph[group[dep]].add(group[i])

    group_adj = defaultdict(list)
    real_group_indegree = defaultdict(int)
    all_groups = set(group)
    for g in all_groups:
        real_group_indegree[g] = 0

    for g in all_groups:
        for ng in group_graph[g]:
            group_adj[g].append(ng)
            real_group_indegree[ng] += 1

    def topo_sort_items(items):
        local_indegree = {i: 0 for i in items}
        local_graph = defaultdict(list)
        item_set = set(items)
        for i in items:
            for dep in beforeItems[i]:
                if dep in item_set:
                    local_graph[dep].append(i)
                    local_indegree[i] += 1
        q = deque(i for i in items if local_indegree[i] == 0)
        result = []
        while q:
            node = q.popleft()
            result.append(node)
            for nei in local_graph[node]:
                local_indegree[nei] -= 1
                if local_indegree[nei] == 0:
                    q.append(nei)
        return result if len(result) == len(items) else []

    def topo_sort_groups():
        q = deque(g for g in all_groups if real_group_indegree[g] == 0)
        result = []
        while q:
            g = q.popleft()
            result.append(g)
            for ng in group_adj[g]:
                real_group_indegree[ng] -= 1
                if real_group_indegree[ng] == 0:
                    q.append(ng)
        return result if len(result) == len(all_groups) else []

    group_order = topo_sort_groups()
    if not group_order:
        return []

    result = []
    for g in group_order:
        sorted_items = topo_sort_items(group_items[g])
        if len(sorted_items) != len(group_items[g]):
            return []
        result.extend(sorted_items)

    return result

Let's walk through what this code does.

Assign virtual groups. The first loop gives every ungrouped item (group[i] == -1) its own unique group ID starting from m. After this, every item belongs to exactly one group.

Build two graphs. We construct an item-level dependency graph (who must come before whom) and a group-level dependency graph (which groups must come before which other groups). An edge from item dep to item i becomes a group-level edge from group[dep] to group[i] whenever those two groups differ.

Topological sort groups. We run Kahn's algorithm (BFS with in-degree counting) on the group-level graph. If a cycle exists among groups, no valid ordering is possible.

Topological sort within each group. For each group in the group-level order, we topologically sort the items inside that group. If any group has a cycle in its internal dependencies, we return an empty list.

Concatenate. We emit each group's sorted items in the order determined by the group-level sort. The result satisfies all constraints: items within the same group are adjacent, and all dependency edges point forward.

The virtual group trick is the key to a clean implementation. Without it, you would need special-case logic for ungrouped items. By giving each ungrouped item its own group, you treat everything uniformly. A group with one item trivially passes topological sort.

Practice topological sort patterns with spaced repetition

Visual walkthrough

Step 1: Identify groups and dependencies.

deps: [1->0, 2->1, 4->3, 5->2, 4->5]

Group 0 has items {0, 1, 2}. Group 1 has {3, 4}. Group 2 has {5}. Cross-group edges create group-level dependencies.

Step 2: Topological sort within each group.

grp 0 order: [0, 1, 2]

grp 1 order: [3, 4]

grp 2 order: [5]

Group 0: [0, 1, 2]. Group 1: [3, 4]. Group 2: [5]. Each group's items are in valid dependency order.

Step 3: Topological sort of groups themselves.

group order: [Grp 0, Grp 2, Grp 1]

Item 5 (grp 2) depends on item 2 (grp 0), so grp 0 comes before grp 2. Item 4 (grp 1) depends on item 5 (grp 2), so grp 2 comes before grp 1.

Step 4: Combine. Emit each group's sorted items in group order.

result: [0, 1, 2, 5, 3, 4]

Final answer: [0, 1, 2, 5, 3, 4]. All within-group and cross-group dependencies are satisfied.

The four-step process captures the entire algorithm. First, identify which items belong to which groups and what the dependencies look like. Second, sort items within each group independently. Third, determine the order of groups based on cross-group dependencies. Finally, stitch it all together.

Complexity analysis

Approach	Time	Space
Double topological sort	O(n + e)	O(n + e)

Here n is the number of items and e is the total number of dependency edges (the sum of lengths of all beforeItems[i] lists).

Time: Building both graphs takes O(n + e). Each topological sort visits every node and edge exactly once. The group-level sort processes all groups and inter-group edges. The item-level sorts collectively process all items and intra-group edges. Everything adds up to O(n + e).

Space: The adjacency lists and in-degree arrays for both graphs use O(n + e) total. The queues and result arrays use O(n). Total is O(n + e).

The building blocks

1. Topological sort with cycle detection

The core building block is Kahn's algorithm. You compute in-degrees, seed a queue with zero-in-degree nodes, and peel them off one by one. If the result has fewer nodes than expected, a cycle exists.

def topo_sort(nodes, adj, indeg):
    q = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while q:
        node = q.popleft()
        order.append(node)
        for nei in adj[node]:
            indeg[nei] -= 1
            if indeg[nei] == 0:
                q.append(nei)
    return order if len(order) == len(nodes) else []

This pattern is reusable across dozens of problems. The only thing that changes is how you build the adjacency list and what the "nodes" represent (items, groups, courses, characters in an alien alphabet).

2. Group-level dependency graph

The second building block is extracting group-level edges from item-level edges. Whenever item a depends on item b and they belong to different groups, you add an edge from group[b] to group[a] in the group graph.

for i in range(n):
    for dep in beforeItems[i]:
        if group[dep] != group[i]:
            group_graph[group[dep]].add(group[i])

Using a set for group_graph values prevents duplicate edges. Without deduplication, you would inflate the in-degree counts and break the topological sort.

Drill topological sort building blocks with spaced repetition

Edge cases

No dependencies: beforeItems is all empty lists. Every ordering that groups items together is valid. The algorithm returns items grouped together in any valid order.
All items ungrouped: Every item gets its own virtual group. The problem reduces to a single topological sort since every "group" has exactly one item.
Cycle within a group: Items 0 and 1 are in the same group, 0 depends on 1, and 1 depends on 0. The intra-group topological sort detects this and returns an empty list.
Cycle between groups: Group A depends on Group B and Group B depends on Group A. The group-level topological sort detects this.
Single item, single group: Trivially valid. Return [0].
All items in one group: The problem reduces to a standard single topological sort.

From understanding to recall

You now understand the double topological sort approach. The structure makes sense: virtual groups for ungrouped items, item-level topo sort within groups, group-level topo sort between groups, then concatenate.

But in an interview, you need to reconstruct all of this from scratch. You need to remember the virtual group trick, build two separate graphs, run Kahn's algorithm twice, and stitch the results together. That is a lot of moving parts to hold in your head under pressure.

Spaced repetition bridges the gap. You practice writing the graph construction, the group-edge extraction, and the double topological sort at increasing intervals. After a few rounds, the two-level structure becomes second nature. You do not need to re-derive it each time.

Course Schedule - The foundation for this problem, teaching cycle detection and basic topological sort with Kahn's algorithm
Course Schedule II - Returns the actual topological ordering, which is exactly what we do at both levels in this problem
Alien Dictionary - Another problem where you build a graph from constraints and topologically sort it, good practice for graph construction from implicit edges

CodeBricks breaks Sort Items by Groups into its topological-sort and graph-construction building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the double-sort pattern is automatic. When a layered dependency problem shows up in your interview, you do not hesitate. You just write it.

Try CodeBricks free