Skip to content
← All posts

Largest Color Value in a Directed Graph

7 min read
leetcodeproblemhardgraphdynamic-programming

You are given a directed graph of n colored nodes and m edges. Each node has a lowercase letter as its color, given by a string colors where colors[i] is the color of node i. You also get an array edges where edges[j] = [a, b] means there is a directed edge from node a to node b.

A color value of a valid path is the number of nodes in that path that have the most frequent color. Return the largest color value of any valid path in the graph, or -1 if the graph contains a cycle.

This is LeetCode 1857: Largest Color Value in a Directed Graph, a hard problem that combines topological sort with dynamic programming.

anode 0bnode 1anode 2cnode 3anode 4path 0 → 2 → 3 → 4 has 3 nodes colored "a"
colors = "abaca", edges = [[0,1],[0,2],[2,3],[3,4]]. The largest color value is 3 (color "a" appears 3 times on the path 0 → 2 → 3 → 4).

Why this problem matters

This problem tests whether you can combine two fundamental techniques: topological sort and DP on a DAG. Each technique is useful on its own, but this problem requires wiring them together. Topological sort gives you the correct processing order (so you always process parents before children), and DP propagates color frequency counts along every path.

The pattern of "DP on a DAG in topological order" appears in many scheduling, dependency analysis, and optimization problems. If you can solve this one, you have the toolkit for a whole class of hard graph problems.

The key insight

For every node u, track how many times each color appears on the best path ending at u. Store this in a 2D array: dp[u][c] = the maximum count of color c across all paths that end at node u.

When processing node u in topological order, propagate its dp values to each neighbor v:

dp[v][c] = max(dp[v][c], dp[u][c])   for every color c

After propagation, increment dp[v] by 1 for node v's own color. This accounts for v itself being on the path.

If topological sort cannot process all nodes (because a cycle exists), return -1. Otherwise, the answer is the maximum value across the entire dp table.

The solution

from collections import deque

def largest_path_value(colors: str, edges: list[list[int]]) -> int:
    n = len(colors)
    graph = [[] for _ in range(n)]
    in_degree = [0] * n

    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    dp = [[0] * 26 for _ in range(n)]
    queue = deque()

    for i in range(n):
        if in_degree[i] == 0:
            queue.append(i)
            dp[i][ord(colors[i]) - ord('a')] = 1

    processed = 0
    result = 0

    while queue:
        u = queue.popleft()
        processed += 1
        result = max(result, max(dp[u]))

        for v in graph[u]:
            for c in range(26):
                dp[v][c] = max(dp[v][c], dp[u][c])
            dp[v][ord(colors[v]) - ord('a')] += 1
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)

    if processed < n:
        return -1

    return result

Let's break this down.

First, build the adjacency list and compute in-degrees from the edge list. This is the same setup you see in Course Schedule and other topological sort problems.

The dp table is a 2D array where dp[i][c] tracks the maximum count of color c on any path ending at node i. We use 26 slots (one per lowercase letter).

Initialize by enqueuing all nodes with in-degree 0. For each of these source nodes, set dp[node][own_color] = 1 because the node itself starts a path of length 1 with its own color.

In the BFS loop, dequeue a node u and propagate its dp values to every neighbor v. For each of the 26 colors, take the max of dp[v][c] and dp[u][c]. Then increment dp[v] at v's own color index by 1. This "+1" accounts for v adding itself to the path.

After propagation, decrement v's in-degree. When it reaches 0, enqueue v. This is standard Kahn's algorithm.

If we process fewer than n nodes, the graph has a cycle. Return -1. Otherwise, the result is the maximum value found across all dp entries.

The "+1 for the neighbor's own color" happens when we propagate to the neighbor, not when we dequeue it. This means the dp values are fully computed by the time a node leaves the queue. You could also initialize dp values when enqueuing (set own color to 1) and only propagate without the +1. Both approaches work, but the version shown here keeps the logic in one place.

Step-by-step walkthrough

Let's trace the algorithm on colors = "abaca" with edges = [[0,1],[0,2],[2,3],[3,4]]. Node 0 is the only source (in-degree 0). The path 0 -> 2 -> 3 -> 4 has three nodes colored "a", giving the answer 3.

Step 1: Initialize. Compute in-degrees. Enqueue node 0 (in-degree 0).

Each node gets a dp row of [0, 0, 0] for colors a, b, c. Node 0 has in-degree 0, so it enters the queue first. Increment dp[0][a] = 1 because node 0 is colored 'a'.

abcnode 0 (a)100node 1 (b)000node 2 (a)000node 3 (c)000node 4 (a)000
queue: [0]
processed: []

Step 2: Process node 0. Propagate dp values to neighbors 1 and 2.

For each neighbor, update dp[neighbor][c] = max(dp[neighbor][c], dp[0][c]). Then add 1 for the neighbor's own color. Node 1 gets dp[1] = [1, 1, 0] (inherits a=1 from node 0, plus its own b). Node 2 gets dp[2] = [2, 0, 0] (inherits a=1, plus its own a).

abcnode 0 (a)100node 1 (b)110node 2 (a)200node 3 (c)000node 4 (a)000
queue: [1, 2]
processed: [0]

Step 3: Process node 1 (no outgoing edges). Process node 2, propagate to node 3.

Node 1 is a leaf, nothing to propagate. For node 2, update dp[3][c] = max(dp[3][c], dp[2][c]). Then add 1 for node 3's own color 'c'. Node 3 gets dp[3] = [2, 0, 1].

abcnode 0 (a)100node 1 (b)110node 2 (a)200node 3 (c)201node 4 (a)000
queue: [3]
processed: [0, 1, 2]

Step 4: Process node 3, propagate to node 4.

Update dp[4][c] = max(dp[4][c], dp[3][c]). Then add 1 for node 4's own color 'a'. Node 4 gets dp[4] = [3, 0, 1]. The value 3 at dp[4][a] is our answer.

abcnode 0 (a)100node 1 (b)110node 2 (a)200node 3 (c)201node 4 (a)301
queue: []
processed: [0, 1, 2, 3, 4]

The dp table builds up from source nodes outward. By the time we reach node 4, it has accumulated the color counts from every path that leads to it. The maximum across all entries (dp[4][a] = 3) is our answer.

The inner loop over 26 colors might look expensive, but 26 is a constant. The total work is O(26 * (V + E)), which simplifies to O(V + E) in big-O terms. In practice, the constant factor of 26 is small enough to pass within time limits.

Complexity analysis

ApproachTimeSpace
Topological sort + DPO(26 * (V + E)) = O(V + E)O(26 * V) = O(V)

V is the number of nodes, E is the number of edges.

Time: Topological sort visits each node and edge once. For each edge, we do a constant-time loop over 26 colors. Total: O(26 * (V + E)), which is O(V + E) since 26 is a constant.

Space: The dp table has 26 entries per node: O(26V) = O(V). The adjacency list and in-degree array add O(V + E). Total: O(V + E).

The building blocks

This problem combines two reusable building blocks:

1. Topological sort (Kahn's algorithm)

The BFS-based topological sort using in-degree counting:

queue = deque(i for i in range(n) if in_degree[i] == 0)
processed = 0

while queue:
    u = queue.popleft()
    processed += 1
    for v in graph[u]:
        in_degree[v] -= 1
        if in_degree[v] == 0:
            queue.append(v)

This is the same algorithm used in Course Schedule. It processes nodes in dependency order and detects cycles when processed is less than n. The key property is that when a node is dequeued, all of its predecessors have already been processed. That guarantee is what makes the DP propagation correct.

2. DP on a DAG

The pattern of propagating computed values along edges in topological order:

for v in graph[u]:
    for c in range(26):
        dp[v][c] = max(dp[v][c], dp[u][c])
    dp[v][own_color] += 1

DP on a DAG works because topological order ensures that when you process a node, all incoming paths have already been accounted for. You can compute longest paths, shortest paths, or (as in this problem) maximum color frequencies. The recurrence changes, but the processing pattern stays the same.

A common mistake is trying to propagate dp values when dequeuing a node instead of when processing its outgoing edges. If you wait until dequeue time to set dp[v] based on predecessors, you need to re-examine all predecessors. Propagating forward along edges (as shown) is simpler and avoids that issue.

Edge cases

  • Graph contains a cycle: return -1. Kahn's algorithm detects this when processed is less than n.
  • Single node, no edges: the answer is 1. That node's own color counts once.
  • All nodes have the same color: the answer is the length of the longest path in the DAG.
  • Disconnected graph: multiple source nodes with in-degree 0. Kahn's handles this naturally since all sources enter the queue at initialization.
  • Self-loop: edges = [[0, 0]]. Node 0 has in-degree 1 and is never enqueued. processed stays at 0, which is less than n. Return -1.
  • Linear chain: nodes 0 -> 1 -> 2 -> ... -> n-1. The answer depends on how many nodes share the most common color along this single path.

From understanding to recall

You have seen how topological sort and DP combine to track color frequencies along every path in a directed graph. The logic is clear: process nodes in topological order, propagate dp values forward along edges, increment for the current node's color.

But in an interview, you need to set up the adjacency list, compute in-degrees, initialize dp for source nodes, write the BFS loop with the inner 26-color propagation, and check for cycles. That is a lot of moving parts under pressure.

Spaced repetition is built for this. You practice writing the Kahn's BFS template and the DP propagation loop from scratch at increasing intervals. After a few rounds, the pieces snap together automatically. You do not re-derive the approach. You just write it.

Related posts

  • Course Schedule - The foundational topological sort problem, teaching Kahn's algorithm and cycle detection that this problem builds on
  • Clone Graph - Graph traversal with DFS and hash maps, building the graph manipulation skills needed for directed graph problems
  • Number of Provinces - Connected components with Union-Find, another core graph technique that complements topological sort

CodeBricks breaks this problem into its topological-sort and DAG-DP building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a hard graph DP problem shows up in your interview, you do not freeze. You just write it.