Skip to content
← All posts

Minimum Distance to Type a Word Using Two Fingers: DP on Finger State

8 min read
leetcodeproblemharddynamic-programming

You have a keyboard laid out as a 6-column grid, with the 26 uppercase letters arranged in order: A through F on row 0, G through L on row 1, and so on. You are typing a word using two fingers. Each finger starts at any position for free (the first time it presses a key, it costs nothing). After that, the cost to move a finger from one key to another is the Manhattan distance between them. Find the minimum total cost to type the entire word.

This is LeetCode 1320: Minimum Distance to Type a Word Using Two Fingers, and it is one of the harder DP problems because the state space is not obvious. You are not just tracking a single position or a single pointer. You are tracking two fingers simultaneously, and at each step you choose which finger types the next letter. The challenge is defining the right DP state so that you capture all the information needed to make optimal decisions without exploding the state space.

ABCDEFGHIJKLMNOPQRSTUVWXYZFinger 1Finger 2Word letterKeyboard: A-Z in rows of 6. Distance = |row1 - row2| + |col1 - col2|
6x5 keyboard layout for the word "CAKE". Each finger starts at any key for free. The cost to move a finger is the Manhattan distance between keys.

Why this problem matters

Two-finger typing is a disguised assignment problem. At each letter, you pick one of two resources (fingers) to handle it, and the cost depends on that resource's current position. This pattern appears in scheduling, routing, and resource allocation problems. If you can model the state as "which resource is where," you can apply the same DP framework.

In interviews, this problem tests whether you can identify a non-obvious state representation. Many candidates try to track the full history of which finger typed which letter. The key realization is that you only need to know where each finger currently sits, not how it got there. That compression from history to position is what makes the DP tractable.

The key insight

At any point during typing, the only information that matters for future decisions is: where is finger 1, and where is finger 2? The next letter in the word is fixed (you must type it), so the state is just the two finger positions.

One of the two fingers must always be on the most recently typed letter. If you just typed letter i, then either finger 1 or finger 2 is sitting on word[i]. That means you only need to track the position of "the other finger," the one that did not just type. The finger that just typed is always known: it is on word[i].

This reduces the state from tracking two arbitrary positions to tracking just one: the position of the idle finger. At each step, you decide whether the current finger continues (moving it to the next letter) or the idle finger takes over (moving the idle finger to the next letter instead, and the previous finger becomes idle).

The solution

def minimumDistance(word):
    n = len(word)

    def pos(c):
        idx = ord(c) - ord('A')
        return idx // 6, idx % 6

    def dist(a, b):
        r1, c1 = pos(a)
        r2, c2 = pos(b)
        return abs(r1 - r2) + abs(c1 - c2)

    dp = {None: 0}

    for i in range(n):
        new_dp = {}
        prev = word[i - 1] if i > 0 else None

        for other, cost in dp.items():
            move_same = dist(prev, word[i]) if prev is not None else 0
            key = other
            if key not in new_dp or new_dp[key] > cost + move_same:
                new_dp[key] = cost + move_same

            move_other = dist(other, word[i]) if other is not None else 0
            key2 = prev
            if key2 not in new_dp or new_dp[key2] > cost + move_other:
                new_dp[key2] = cost + move_other

        dp = new_dp

    return min(dp.values())

Let's walk through the logic.

The pos function converts a letter to its (row, col) position on the 6-column keyboard. The dist function computes the Manhattan distance between two letters.

The DP dictionary maps the position of the idle finger (the one that did not just type) to the minimum cost to reach this configuration. At the start, there is one state: both fingers are unplaced (None), cost 0.

For each letter in the word, you have two choices:

  1. Move the active finger (the one that typed the previous letter) to the next letter. The idle finger stays put. Cost: Manhattan distance from the previous letter to the current letter. If this is the first letter, the cost is 0 (free placement).

  2. Move the idle finger to the next letter. The previously active finger becomes the new idle finger. Cost: Manhattan distance from the idle finger's position to the current letter. If the idle finger has not been placed yet, the cost is 0 (free first placement).

After processing the entire word, the answer is the minimum value across all states.

The trick of tracking only the "other" finger works because one finger is always pinned to the last typed letter. This cuts the state space from 27 x 27 to just 27 states per letter (26 possible positions plus the unplaced state).

Visual walkthrough

Let's trace through the word "CAKE" on the keyboard to see how the DP builds up the optimal finger assignments step by step.

Step 1: Type "C" (first letter). Place finger 1 on C for free.

ABCDEFGHIJKLMNOPQRSTUVWXYZF1

F1=C

F2=-

0

First letter placement costs 0. The DP state is (F1=C, F2=none). Finger 2 has not been placed yet.

Step 2: Type "A". Move F1 from C to A, or place F2 on A for free.

ABCDEFGHIJKLMNOPQRSTUVWXYZF1F2

F1=C

F2=A

0

F1=A

F2=-

2

Option 1: Move F1 from C(row 0, col 2) to A(row 0, col 0), cost = 2. Option 2: Place F2 on A for free, cost = 0. Best: place F2, total cost = 0.

Step 3: Type "K". Move F1 from C to K (cost 2), or F2 from A to K (cost 2).

ABCDEFGHIJKLMNOPQRSTUVWXYZF1F2

F1=K

F2=A

3

F1=C

F2=K

5

F1=K

F2=-

7

From state (C, A): Move F1 C to K = |0-1|+|2-4| = 3, total 3. Move F2 A to K = |0-1|+|0-4| = 5, total 5. From state (A, -): Move F1 A to K = 5, total 7. Best: move F1 from C to K, total cost = 3.

Step 4: Type "E". Move F1 from K to E, or F2 from A to E.

ABCDEFGHIJKLMNOPQRSTUVWXYZF1F2

F1=E

F2=A

4

F1=K

F2=E

7

From best state (K, A) cost 3: Move F1 K to E = |1-0|+|4-4| = 1, total 4. Move F2 A to E = |0-0|+|0-4| = 4, total 7. Best: move F1 from K to E, total cost = 4. Answer: 4.

The walkthrough shows how each decision point offers two choices: move the active finger or swap to the idle finger. The DP explores all possibilities and keeps only the best cost for each idle-finger position. For "CAKE," the optimal strategy is to place F1 on C, place F2 on A for free, move F1 from C to K (cost 3), then move F1 from K to E (cost 1), for a total of 4.

Complexity analysis

ApproachTimeSpace
DP on finger statesO(n * 27)O(27)

Time: O(n * 27). For each of the n letters in the word, you iterate over at most 27 possible idle-finger positions (26 letters plus the unplaced state). Each transition is O(1) thanks to the Manhattan distance formula.

Space: O(27). You only keep the current DP dictionary, which has at most 27 entries. The previous dictionary is discarded after each letter. This is constant space relative to the alphabet size.

Note: you might also see this expressed as O(n * 26 * 26) for the version that tracks both finger positions explicitly. The "track only the idle finger" optimization reduces it to O(n * 27), which is equivalent to O(n) since 27 is a constant.

The building blocks

1. State compression via invariant

The most important building block here is recognizing an invariant that compresses the state. One finger always sits on the last typed letter. That is guaranteed by the problem structure: somebody just typed that letter, so one of the two fingers must be there. Exploiting this invariant cuts the state space from 27 x 27 down to 27.

This technique appears whenever you have multiple agents or pointers and one of them is constrained by the most recent action. Look for it in problems involving two pointers on a sequence, two robots on a grid, or any assignment problem where one resource is "pinned" to the latest task.

dp = {None: 0}

for i in range(n):
    new_dp = {}
    prev = word[i - 1] if i > 0 else None

    for other, cost in dp.items():
        move_same = dist(prev, word[i]) if prev is not None else 0
        if other not in new_dp or new_dp[other] > cost + move_same:
            new_dp[other] = cost + move_same

        move_other = dist(other, word[i]) if other is not None else 0
        if prev not in new_dp or new_dp[prev] > cost + move_other:
            new_dp[prev] = cost + move_other

    dp = new_dp

2. Free first placement

Each finger's first key press costs nothing. This is handled by treating None as a valid position with distance 0 to any target. Without this, you would need special cases for the first two letters. The None sentinel unifies the logic.

move_other = dist(other, word[i]) if other is not None else 0

This pattern generalizes to any DP where agents start at an undefined position. Instead of hardcoding the start, let the first transition be free and track "not yet placed" as a valid state.

Edge cases

  • Single letter: both fingers can go there for free. Answer is always 0.
  • Two letters: place one finger on the first letter for free, then either move it to the second letter or place the second finger on the second letter for free. Answer is 0 if you use both fingers, or the Manhattan distance if you insist on using one.
  • All same letter: e.g., "AAAA". Once a finger is on A, every subsequent press costs 0. Answer is 0.
  • Two alternating letters: e.g., "ABABAB". Place one finger on A and one on B. Every press costs 0 because the right finger is already in position. Answer is 0.
  • Worst case distance: letters at opposite corners of the keyboard (A at row 0, col 0 and Z at row 4, col 1). Manhattan distance is 4 + 1 = 5.
  • Long words (up to 300 characters): the O(n * 27) solution handles this easily. No performance concerns.

From understanding to recall

You now know how to model the two-finger typing problem as a DP on idle-finger position, why one finger is always pinned to the last typed letter, and how free first placement simplifies the state transitions. But reading this once will not make you fluent enough to reproduce it under interview pressure.

The state compression insight is the crux. In an interview, you need to quickly see that tracking both fingers independently wastes states, identify the invariant that one finger is always on the previous letter, and set up the DP dictionary with None as the sentinel for unplaced fingers. That chain of reasoning needs to be automatic, not something you reconstruct from scratch.

Spaced repetition locks this in. You revisit the problem at increasing intervals, each time defining the state, writing the transitions, and handling the None case from memory. After a few reps, you see "two resources, one always at the latest position" and the state compression clicks instantly.

The finger-state DP pattern is one of roughly 60 reusable building blocks in the CodeBricks system. You practice each block individually until writing the solution from scratch is second nature. Problems that look overwhelming at first become routine once you recognize which building blocks they combine.

Related posts

  • Edit Distance - Another DP where the state tracks positions in two sequences, with three choices at each step
  • Dungeon Game - Grid DP that requires careful state definition and a non-obvious fill direction
  • Coin Change - Classic DP with a dictionary-based state transition, showing the pattern of iterating over choices at each step