Course Schedule: Cycle Detection with DFS

May 1, 20269 min read

leetcodeproblemmediumgraphs

You are given numCourses and a list of prerequisite pairs. Each pair [a, b] means you must take course b before course a. Can you finish all courses?

This is LeetCode 207: Course Schedule, and it boils down to one question: does the dependency graph have a cycle? If it does, you are stuck in an infinite loop of prerequisites and can never finish. If it does not, there is a valid ordering and you can complete everything.

Valid schedule (no cycle)

Courses 0 → 1 → 2 → 3. All prerequisites satisfied in order.

Impossible (cycle detected)

0 requires 1, 1 requires 2, 2 requires 0. Circular dependency. Impossible to finish.

Why this problem matters

The course schedule LeetCode problem is your introduction to directed graphs and cycle detection DFS. Unlike grid problems where the graph structure is implicit (each cell connects to its neighbors), here you build the graph yourself from an edge list. That is a different skill, and it is the foundation for problems involving dependencies, task scheduling, and compilation order.

Topological sort and cycle detection show up everywhere in software engineering. Package managers use topological sort to install dependencies in the right order. Build systems like Make use it to figure out which files to compile first. The concept is not just an interview trick.

The key insight

Model each course as a node. For every prerequisite pair [a, b], draw a directed edge from b to a (meaning "b must come before a"). Now the question becomes: does this directed graph contain a cycle?

If there is a cycle, it is impossible to finish all courses. If there is no cycle, the graph is a DAG (directed acyclic graph), and a valid ordering exists.

Two classic approaches work here:

DFS with three-color marking to detect back edges (cycles)
BFS with Kahn's algorithm (topological sort via in-degree counting)

Both run in O(V + E) time. Let's walk through each.

Solution 1: DFS cycle detection

The idea is to run DFS from every unvisited node. Track each node's state with three colors:

White (unvisited): have not touched this node yet
Gray (visiting): currently in the DFS call stack, meaning we are exploring its descendants
Black (visited): fully processed, all descendants explored

If DFS ever reaches a node that is currently gray, we have found a back edge, which means there is a cycle. A gray node is an ancestor in the current DFS path, so reaching it again means we have looped back.

def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
    # Build adjacency list
    graph: dict[int, list[int]] = {i: [] for i in range(num_courses)}
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    # 0 = white, 1 = gray, 2 = black
    color = [0] * num_courses

    def has_cycle(node: int) -> bool:
        color[node] = 1  # Mark gray (visiting)

        for neighbor in graph[node]:
            if color[neighbor] == 1:
                return True  # Back edge = cycle!
            if color[neighbor] == 0:
                if has_cycle(neighbor):
                    return True

        color[node] = 2  # Mark black (done)
        return False

    for i in range(num_courses):
        if color[i] == 0:
            if has_cycle(i):
                return False

    return True

Let's break this down.

First, we build an adjacency list from the prerequisites. For each pair [course, prereq], we add an edge from prereq to course. This means "after finishing prereq, you can take course."

Then we run DFS from every unvisited (white) node. When we enter a node, we color it gray. When we finish exploring all its neighbors, we color it black. If we ever try to visit a gray node, that is a cycle.

The outer loop handles disconnected components. A graph with 5 courses might have two separate groups with no edges between them. We need to check each group independently.

The three-color approach is sometimes called "white-gray-black DFS." You can simplify it to two states (visited/visiting) using a visited set and a rec_stack set, but the three-color version maps directly to the textbook definition and is easier to reason about. Both detect cycles correctly.

Practice cycle detection with spaced repetition

Solution 2: BFS with Kahn's algorithm (topological sort)

Kahn's algorithm takes the opposite approach. Instead of looking for cycles, it tries to build a valid topological ordering. If it can process all nodes, there is no cycle. If some nodes remain unprocessed, a cycle exists.

The algorithm:

Compute the in-degree (number of incoming edges) for every node
Add all nodes with in-degree 0 to a queue (these have no prerequisites)
Process the queue: dequeue a node, add it to the ordering, and decrement the in-degree of all its neighbors
When a neighbor's in-degree reaches 0, add it to the queue
If the final ordering contains all nodes, there is no cycle

from collections import deque

def can_finish(num_courses: int, prerequisites: list[list[int]]) -> bool:
    graph: dict[int, list[int]] = {i: [] for i in range(num_courses)}
    in_degree = [0] * num_courses

    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1

    # Start with all zero in-degree nodes
    queue = deque(i for i in range(num_courses) if in_degree[i] == 0)
    processed = 0

    while queue:
        node = queue.popleft()
        processed += 1

        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    return processed == num_courses

This is elegant. You peel off nodes with no remaining dependencies, one layer at a time. If a cycle exists, the nodes in the cycle will always have in-degree > 0 because they depend on each other. They never make it into the queue, so processed ends up less than num_courses.

Visual walkthrough: Kahn's algorithm

Let's trace through a 4-course example with prerequisites [[1,0],[2,0],[3,1],[3,2]]. This means course 0 is a prerequisite for courses 1 and 2, and both 1 and 2 are prerequisites for course 3.

Initial: Build adjacency list and compute in-degrees.

in-degree: {0:0, 1:1, 2:1, 3:2}

queue: [0]

order: []

Node 0 has in-degree 0, so it goes into the queue first.

Step 1: Dequeue node 0. Add to order. Decrement neighbors.

in-degree: {0:0, 1:0, 2:0, 3:2}

queue: [1, 2]

order: [0]

Removing 0's edges drops in-degree of 1 and 2 to 0. Both enter the queue.

Step 2: Dequeue node 1. Add to order. Decrement neighbors.

in-degree: {0:0, 1:0, 2:0, 3:1}

queue: [2]

order: [0, 1]

Removing 1's edge drops in-degree of 3 from 2 to 1. Not zero yet.

Step 3: Dequeue node 2. Add to order. Decrement neighbors.

in-degree: {0:0, 1:0, 2:0, 3:0}

queue: [3]

order: [0, 1, 2]

Removing 2's edge drops in-degree of 3 to 0. Node 3 enters the queue.

Step 4: Dequeue node 3. Add to order. Done!

in-degree: {0:0, 1:0, 2:0, 3:0}

queue: []

order: [0, 1, 2, 3]

Queue is empty. Order has all 4 nodes, so all courses can be finished.

The algorithm peels off nodes layer by layer. Course 0 has no prerequisites, so it goes first. Removing it unlocks courses 1 and 2. Removing those unlocks course 3. Every node gets processed, confirming there is no cycle.

Kahn's algorithm naturally produces a topological ordering, not just a yes/no answer. If you need the actual order (like in Course Schedule II), collect the nodes as you dequeue them. The DFS approach can also produce an ordering by recording nodes in reverse post-order.

DFS vs BFS: which should you use?

Both approaches have the same time and space complexity. Here is when to pick each one.

DFS (three-color) is better when:

You only need to detect if a cycle exists (yes/no)
You are already comfortable with recursive DFS
The problem asks about paths or reachability (DFS gives you the current path for free)

BFS (Kahn's) is better when:

You need the actual topological ordering
You want to avoid recursion (no stack overflow risk)
The problem involves "levels" or "layers" of dependencies

In an interview, I would default to Kahn's algorithm. It is iterative, produces the ordering directly, and avoids the subtle off-by-one mistakes that come with recursive DFS coloring.

Drill topological sort building blocks

Complexity analysis

Approach	Time	Space
DFS (three-color)	O(V + E)	O(V + E)
BFS (Kahn's)	O(V + E)	O(V + E)

V is the number of courses, E is the number of prerequisite pairs.

Time: Both approaches visit every node and edge exactly once. Building the adjacency list takes O(E), and the traversal takes O(V + E).

Space: The adjacency list uses O(V + E). DFS uses O(V) for the color array and O(V) for the call stack in the worst case. Kahn's uses O(V) for the in-degree array and O(V) for the queue. Total is O(V + E) for both.

Edge cases

No prerequisites: prerequisites = []. Every course is independent. Return True. Kahn's processes all nodes immediately since every in-degree is 0.
Self-loop: [[0, 0]] means course 0 requires itself. That is a cycle. Return False.
Two-node cycle: [[0, 1], [1, 0]]. Course 0 needs 1, course 1 needs 0. Cycle. Return False.
Disconnected graph: Multiple independent groups of courses with no edges between them. Both approaches handle this correctly.
Single course: numCourses = 1, prerequisites = []. Trivially true.

The building blocks

This problem is built on two reusable pieces that CodeBricks drills independently:

1. Adjacency list construction

Converting an edge list into an adjacency list representation:

graph: dict[int, list[int]] = {i: [] for i in range(n)}
for u, v in edges:
    graph[v].append(u)

This is the starting point for virtually every graph problem. You get edges as pairs, and you need them organized by source node for efficient neighbor lookup. The direction matters: graph[v].append(u) means "v points to u." Getting this backwards is one of the most common bugs in graph problems. Always think carefully about which node is the source and which is the destination.

2. DFS with state tracking (three-color marking)

The pattern of tracking node states during DFS to detect back edges:

def dfs(node):
    state[node] = VISITING
    for neighbor in graph[node]:
        if state[neighbor] == VISITING:
            # cycle detected
        if state[neighbor] == UNVISITED:
            dfs(neighbor)
    state[node] = VISITED

This three-state DFS is different from the simple "visited or not" DFS you see in Number of Islands. In flood fill, you only need two states because you are exploring an undirected graph. In directed graphs, you need three states to distinguish between "I am still exploring this node's subtree" (gray/visiting) and "I am completely done with this node" (black/visited). That distinction is what lets you detect cycles.

A common mistake is using a simple visited set instead of three colors. With just two states, you cannot tell the difference between a back edge (cycle) and a cross edge (visiting a node that was fully processed by a different DFS branch). Two states work for undirected graphs but fail for directed cycle detection.

Common mistakes

1. Building the graph backwards. The pair [a, b] means "to take a, you need b." The edge goes from b to a, not a to b. Read the problem statement carefully.

2. Using two states instead of three for DFS. A visited set alone will give false positives for cycles. You need to distinguish "currently in the DFS stack" from "fully processed."

3. Forgetting disconnected components. If you only run DFS from node 0, you miss cycles that exist in a separate component. The outer loop that checks all unvisited nodes is essential.

4. Not handling the no-prerequisites case. When the prerequisite list is empty, the answer is always True. Both solutions handle this naturally, but it is worth checking mentally.

When to use this pattern

Look for these signals:

The problem involves dependencies or ordering constraints between items
You need to determine if a valid ordering exists
The problem mentions prerequisites, dependencies, or schedules
You need to detect circular dependencies

Problems that use the same cycle detection DFS or topological sort pattern:

Course Schedule II (LeetCode 210): same problem but return the actual ordering
Alien Dictionary (LeetCode 269): build a graph from character ordering constraints, then topological sort
Minimum Height Trees (LeetCode 310): peel off leaf nodes layer by layer (similar to Kahn's)
Parallel Courses (LeetCode 1136): Kahn's with level tracking to find the minimum semesters

From understanding to recall

You have seen both the DFS three-color approach and Kahn's algorithm. The logic makes sense when you read it. But in an interview, you need to reconstruct the adjacency list, set up the in-degree array, write the BFS loop, and check the final count, all from memory. Those details are easy to trip on under pressure.

The gap between understanding and fluency is exactly what spaced repetition is for. You practice writing the adjacency list construction and the Kahn's BFS loop from scratch at increasing intervals. After a few rounds, your fingers type in_degree[neighbor] -= 1 without your brain needing to think about it.

Graph problems feel intimidating because there are multiple pieces to wire together. But each piece (build adjacency list, compute in-degrees, BFS peel-off loop) is a small, learnable building block. Drill them separately, then combine.

Number of Islands - The essential grid DFS problem, teaches the flood fill pattern that is the simpler cousin of directed graph DFS
Word Search - Backtracking DFS on a grid, showing the choose-explore-unchoose pattern that contrasts with the permanent marking used in cycle detection
Climbing Stairs - Builds the recursive thinking that makes DFS intuition click

CodeBricks breaks Course Schedule into its adjacency-list-construction and DFS-cycle-detection building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a graph dependency problem shows up in your interview, you do not hesitate. You just write it.

Try CodeBricks free