
Alien Dictionary: Topological Sort on Characters

10 min read
leetcode · problem · hard · graphs

You are given a list of words sorted in an alien language's dictionary order. The characters are lowercase English letters, but the ordering between them is unknown. Return a string of characters in the correct order for this alien alphabet. If the ordering is invalid (contradictions or impossible constraints), return an empty string.

This is LeetCode 269: Alien Dictionary, and it is one of the best graph problems on the platform. The twist is that you do not start with a graph. You have to figure out which edges exist by comparing adjacent words, build a directed graph from those edges, and then run topological sort to find the alphabet order. Three distinct steps, each requiring a different skill.

From word list to character ordering

Step 1: Compare adjacent words to extract edges

wrt vs wrf:  first mismatch 't' vs 'f', edge t -> f
wrf vs er:   first mismatch 'w' vs 'e', edge w -> e
er vs ett:   first mismatch 'r' vs 't', edge r -> t
ett vs rftt: first mismatch 'e' vs 'r', edge e -> r

Step 2: Build graph and run topological sort

Graph: w -> e -> r -> t -> f

Result: w, e, r, t, f. This is the alien alphabet order.

Why this problem matters

The alien dictionary LeetCode problem combines two important skills: extracting hidden structure from data and applying topological sort on that structure. Most graph problems hand you the edges directly. This one makes you derive them.

That extraction step is the part people get wrong in interviews. The topological sort at the end is the same Kahn's algorithm you already know from Course Schedule. The challenge is everything that happens before it: comparing adjacent words character by character, finding the first mismatch, and carefully handling edge cases like a longer word appearing before its own prefix.

LeetCode 269 is also a great test of how well you handle invalid inputs. A cycle in the graph means the ordering is contradictory. A word like "abc" appearing before "ab" violates dictionary rules (a prefix can never come after the full word). These edge cases separate clean solutions from buggy ones.

The key insight

A sorted dictionary encodes ordering constraints between characters. When two adjacent words share a common prefix but differ at some position, the character in the first word must come before the character in the second word in the alien alphabet. That single mismatch gives you one directed edge.

For example, if "wrt" comes before "wrf" in the dictionary, the first two characters ("wr") match. The third characters differ: "t" vs "f". So in this alien alphabet, "t" comes before "f". That is one edge: t -> f.

You repeat this for every pair of adjacent words, collecting all the edges. Then you build a directed graph and run topological sort. If the sort processes all characters, the result is the alien alphabet. If some characters remain (a cycle exists), the ordering is invalid.
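In code, the single-mismatch rule is only a few lines. A minimal sketch, using the "wrt"/"wrf" pair from above:

```python
# The first mismatch between two adjacent dictionary words gives one edge.
w1, w2 = "wrt", "wrf"

edge = None
for a, b in zip(w1, w2):
    if a != b:
        edge = (a, b)  # a comes before b in the alien alphabet
        break

print(edge)  # -> ('t', 'f')
```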

Extracting edges from the word list

This is the step most people rush through. Take it slow.

For every pair of adjacent words words[i] and words[i+1]:

  1. Walk through both words character by character
  2. Find the first position where the characters differ
  3. That gives you one edge: words[i][j] -> words[i+1][j] (the first word's character comes before the second's)
  4. Stop comparing after the first mismatch. Characters after the first difference tell you nothing about ordering.

There is one critical edge case: if words[i] is longer than words[i+1] and words[i+1] is a prefix of words[i], the ordering is invalid. For example, "abc" before "ab" is impossible in any valid dictionary. A shorter prefix always comes first.

def extract_edges(words: list[str]) -> tuple[list[tuple[str, str]], bool]:
    edges = []
    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        min_len = min(len(w1), len(w2))
        found_diff = False

        for j in range(min_len):
            if w1[j] != w2[j]:
                edges.append((w1[j], w2[j]))
                found_diff = True
                break

        # If no difference found and first word is longer, invalid
        if not found_diff and len(w1) > len(w2):
            return [], False  # Invalid ordering

    return edges, True

Do not compare all positions. Only the first mismatch matters. If "abc" comes before "adc", you learn that b comes before d. The third characters ("c" vs "c") are equal and tell you nothing. Even if they differed, they would still tell you nothing: dictionary order is decided entirely at the first position where the two words diverge, so characters after that point can appear in any relative order.
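To see why stopping matters, here is a small sketch with a hypothetical word pair that differs at two positions. Collecting an edge at every mismatch produces a bogus constraint; stopping at the first mismatch does not:

```python
# Hypothetical adjacent words that differ at positions 1 and 2.
w1, w2 = "abz", "ady"

# Buggy: one edge per differing position. (z, y) is NOT a valid constraint.
buggy_edges = [(a, b) for a, b in zip(w1, w2) if a != b]

# Correct: stop at the first mismatch; only (b, d) is implied.
first_edge = next(((a, b) for a, b in zip(w1, w2) if a != b), None)

print(buggy_edges)  # [('b', 'd'), ('z', 'y')] -- the second pair is bogus
print(first_edge)   # ('b', 'd')
```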

The full solution

Once you have the edges, the rest is standard topological sort via Kahn's algorithm. The only addition is collecting all unique characters from all words (not just those that appear in edges) so that isolated characters still appear in the output.

from collections import deque

def alien_order(words: list[str]) -> str:
    # Step 1: Collect all unique characters
    chars = set()
    for word in words:
        for ch in word:
            chars.add(ch)

    # Step 2: Build adjacency list and in-degree map
    graph: dict[str, set[str]] = {ch: set() for ch in chars}
    in_degree: dict[str, int] = {ch: 0 for ch in chars}

    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        min_len = min(len(w1), len(w2))
        found_diff = False

        for j in range(min_len):
            if w1[j] != w2[j]:
                if w2[j] not in graph[w1[j]]:
                    graph[w1[j]].add(w2[j])
                    in_degree[w2[j]] += 1
                found_diff = True
                break

        # Invalid: longer word is prefix of shorter word that follows
        if not found_diff and len(w1) > len(w2):
            return ""

    # Step 3: Kahn's algorithm (BFS topological sort)
    queue = deque(ch for ch in chars if in_degree[ch] == 0)
    result = []

    while queue:
        ch = queue.popleft()
        result.append(ch)

        for neighbor in graph[ch]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    # If not all characters are in the result, there is a cycle
    if len(result) != len(chars):
        return ""

    return "".join(result)

Let's break this down step by step.

Step 1 collects every character that appears in any word. Even characters with no ordering constraints (they appear in no edges) need to be in the final output.

Step 2 compares adjacent words to build the graph. We use a set for each adjacency list to avoid duplicate edges. If we added the same edge t -> f twice, we would double-count the in-degree, and the topological sort would fail.

Step 3 is Kahn's algorithm, identical to Course Schedule. Start with all zero in-degree characters, peel them off one by one, decrement neighbors. If the result contains all characters, we have a valid ordering. If not, a cycle made it impossible to process some characters, and we return an empty string.

Using a set for the adjacency list (instead of a list) prevents duplicate edges. This matters because two different word pairs can yield the same edge. For example, both "ab" vs "ac" and "xb" vs "xc" tell you b comes before c. Without deduplication, you would increment c's in-degree twice for the same relationship, and the topological sort would stall because c's in-degree never reaches zero.
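A quick sketch of that failure mode, counting c's in-degree with and without deduplication (hypothetical word pairs from the example above):

```python
# Both word pairs imply the same edge b -> c.
pairs = [("ab", "ac"), ("xb", "xc")]

edges = []
for w1, w2 in pairs:
    for a, b in zip(w1, w2):
        if a != b:
            edges.append((a, b))
            break

# Without dedup, c's in-degree is counted twice for one relationship.
naive_in_degree = sum(1 for _, v in edges if v == "c")

# With a set, the duplicate collapses and the count is correct.
dedup_in_degree = sum(1 for _, v in set(edges) if v == "c")

print(naive_in_degree, dedup_in_degree)  # 2 1
```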

Visual walkthrough: topological sort on characters

Let's trace through the classic example: words = ["wrt", "wrf", "er", "ett", "rftt"].

From the word comparisons, we extract edges: t->f, w->e, r->t, e->r. Now watch Kahn's algorithm find the ordering.

Initial: Build graph and compute in-degrees from word comparisons.

in-degree: {w:0, e:1, r:1, t:1, f:1}
queue: [w]
order: []

Only 'w' has in-degree 0, so it enters the queue first.

Step 1: Dequeue 'w'. Add to order. Decrement neighbors.

in-degree: {w:0, e:0, r:1, t:1, f:1}
queue: [e]
order: [w]

Removing w's edge drops e's in-degree to 0. It enters the queue.

Step 2: Dequeue 'e'. Add to order. Decrement neighbors.

in-degree: {w:0, e:0, r:0, t:1, f:1}
queue: [r]
order: [w, e]

Removing e's edge drops r's in-degree to 0.

Step 3: Dequeue 'r'. Add to order. Decrement neighbors.

in-degree: {w:0, e:0, r:0, t:0, f:1}
queue: [t]
order: [w, e, r]

Removing r's edge drops t's in-degree to 0.

Step 4: Dequeue 't'. Add to order. Decrement neighbors.

in-degree: {w:0, e:0, r:0, t:0, f:0}
queue: [f]
order: [w, e, r, t]

Removing t's edge drops f's in-degree to 0.

Step 5: Dequeue 'f'. Add to order. Done!

in-degree: {w:0, e:0, r:0, t:0, f:0}
queue: []
order: [w, e, r, t, f]

Queue is empty. Order has all 5 characters: 'wertf' is the alien alphabet.

The algorithm peels off one character at a time. "w" has no incoming edges, so it goes first. Removing it unlocks "e", which unlocks "r", and so on. The final ordering "wertf" is the alien alphabet.

Complexity analysis

Metric | Value
Time   | O(C)
Space  | O(U + E)

C is the total number of characters across all words (the sum of word lengths). U is the number of unique characters. E is the number of edges (at most the number of word pairs, so at most len(words) - 1).

Time: We iterate through all characters in adjacent word pairs to extract edges. That costs O(C) in the worst case. The topological sort processes each node and edge once, costing O(U + E). Since E is at most len(words) - 1 and U is at most 26 (lowercase letters), the edge extraction dominates. Total: O(C).

Space: The adjacency list and in-degree map each store up to U entries. The edge set stores up to E entries. Total: O(U + E). Since both U and E are bounded by constants (26 letters, at most 26*25 edges), you could argue the space is O(1), but O(U + E) is more precise for general input.

Edge cases

  • Single word: ["abc"]. No adjacent pairs to compare, so no edges. All characters have in-degree 0. Any ordering of the characters is valid. The algorithm returns one such ordering.
  • All identical words: ["abc", "abc", "abc"]. No mismatches found. No edges. Return any ordering of the characters.
  • Prefix violation: ["abc", "ab"]. The second word is a prefix of the first but comes after it. Invalid. Return "".
  • Contradictory ordering (cycle): ["a", "b", "a"]. This gives edges a->b and b->a, forming a cycle. Topological sort will not process all characters. Return "".
  • Single character repeated: ["z", "z"]. No edges. Return "z".
  • Two words, one character each: ["z", "x"]. One edge: z->x. Return "zx".
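All of these cases can be exercised with a condensed restatement of the alien_order solution above (same logic, shortened so the snippet runs standalone):

```python
from collections import deque

def alien_order(words):
    # Condensed restatement of the full solution above.
    chars = {ch for w in words for ch in w}
    graph = {c: set() for c in chars}
    indeg = {c: 0 for c in chars}

    for w1, w2 in zip(words, words[1:]):
        for a, b in zip(w1, w2):
            if a != b:
                if b not in graph[a]:
                    graph[a].add(b)
                    indeg[b] += 1
                break
        else:
            # No mismatch found: invalid if the longer word comes first.
            if len(w1) > len(w2):
                return ""

    queue = deque(c for c in chars if indeg[c] == 0)
    order = []
    while queue:
        c = queue.popleft()
        order.append(c)
        for n in graph[c]:
            indeg[n] -= 1
            if indeg[n] == 0:
                queue.append(n)

    return "".join(order) if len(order) == len(chars) else ""

print(alien_order(["wrt", "wrf", "er", "ett", "rftt"]))  # wertf
print(alien_order(["abc", "ab"]))                        # "" (prefix violation)
print(alien_order(["a", "b", "a"]))                      # "" (cycle)
print(alien_order(["z", "x"]))                           # zx
```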

The building blocks

This problem is built on two reusable pieces that CodeBricks drills independently:

1. Adjacency list construction

Building a directed graph from derived constraints:

graph: dict[str, set[str]] = {ch: set() for ch in all_chars}
in_degree: dict[str, int] = {ch: 0 for ch in all_chars}

for u, v in edges:
    if v not in graph[u]:
        graph[u].add(v)
        in_degree[v] += 1

This is the same adjacency list pattern from Course Schedule, but with one twist: the nodes are characters instead of integers, and you derive the edges yourself instead of receiving them. The deduplication using a set is the key difference. In Course Schedule, the problem guarantees unique edges. Here, duplicate edges are possible and must be handled.

2. Topological sort (Kahn's algorithm)

The BFS peeling pattern for finding a valid ordering:

queue = deque(node for node in nodes if in_degree[node] == 0)
result = []

while queue:
    node = queue.popleft()
    result.append(node)
    for neighbor in graph[node]:
        in_degree[neighbor] -= 1
        if in_degree[neighbor] == 0:
            queue.append(neighbor)

This is identical to the Kahn's algorithm in Course Schedule. Start with zero in-degree nodes, process them, decrement neighbors, add newly zero-degree nodes to the queue. If the result is shorter than the total node count, a cycle exists. This exact code template works for any topological sort problem, whether the nodes are courses, characters, build targets, or anything else.
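For instance, the same template run end to end on a tiny hypothetical chain of three characters (graph and in-degrees filled in by hand):

```python
from collections import deque

# Hypothetical three-node chain: a -> b -> c.
graph = {"a": {"b"}, "b": {"c"}, "c": set()}
in_degree = {"a": 0, "b": 1, "c": 1}
nodes = graph

queue = deque(node for node in nodes if in_degree[node] == 0)
result = []

while queue:
    node = queue.popleft()
    result.append(node)
    for neighbor in graph[node]:
        in_degree[neighbor] -= 1
        if in_degree[neighbor] == 0:
            queue.append(neighbor)

print(result)  # ['a', 'b', 'c']
```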

The hard part of this problem is not the topological sort. It is the edge extraction. If you can confidently write Kahn's algorithm from memory (the building block above), you can focus all your interview energy on the word-comparison logic and edge case handling.

Common mistakes

1. Forgetting the prefix-length check. If word A is longer than word B and B is a prefix of A, but B comes after A in the list, the input is invalid. Many people skip this check and return a wrong answer instead of an empty string.

2. Adding duplicate edges. If two different word pairs produce the same edge, adding it twice inflates the in-degree count. The topological sort will then fail to process that character because its in-degree never reaches 0. Use a set for the adjacency list.

3. Missing isolated characters. Some characters might appear in the words but never in any edge (they have no ordering constraint relative to other characters). You still need to include them in the output. Collect all characters from all words at the start, not just from edges.

4. Comparing beyond the first mismatch. Only the first differing character in two adjacent words gives a valid ordering constraint. Characters after the first mismatch are irrelevant. A common bug is extracting multiple edges from a single word pair.

5. Getting the edge direction wrong. If word A comes before word B and the first mismatch is A[j] vs B[j], the edge is A[j] -> B[j] (A's character comes first). Flipping this gives the wrong alphabet.

When to use this pattern

Look for these signals:

  • The problem gives you sorted data from which you need to infer an ordering
  • You need to determine a total ordering from pairwise comparisons
  • The problem involves dependencies between characters, tasks, or items that form a directed graph
  • You need to detect if the constraints are contradictory (cycle detection)

Problems that use the same edge-extraction-then-topological-sort pattern:

  • Course Schedule (LeetCode 207): direct edge list with topological sort
  • Course Schedule II (LeetCode 210): same but return the actual ordering
  • Sequence Reconstruction (LeetCode 444): verify a topological order from subsequences
  • Minimum Height Trees (LeetCode 310): peel off leaves layer by layer (similar BFS peeling)

From understanding to recall

You have walked through the entire Alien Dictionary solution: extracting edges from word comparisons, building the graph, and running Kahn's algorithm. The logic makes sense when you read it. But this problem has three phases, and fumbling any one of them means a failed interview.

That is a lot of detail to reconstruct under pressure: the edge extraction loop, the prefix length check, the deduplication with sets, the in-degree initialization, the Kahn's BFS loop, and the final cycle check. The gap between "I understand it" and "I can write it cold in 15 minutes" is real, and spaced repetition is how you close it.

Each building block (adjacency list construction and topological sort) is a reusable piece that appears across many graph problems. Drill them separately until they are automatic. Then when the Alien Dictionary shows up, you only need to think about the word-comparison logic. The graph machinery is already in muscle memory.

Related posts

  • Course Schedule - The foundational topological sort problem, teaching Kahn's algorithm and cycle detection that this problem builds on directly
  • Hash Map Patterns - Understanding hash maps is essential for the adjacency list, in-degree tracking, and deduplication in this problem

CodeBricks breaks Alien Dictionary into its adjacency-list-construction and topological-sort building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a character ordering or dependency graph problem shows up in your interview, you do not hesitate. You just write it.