Predict the Winner: Game Theory DP

May 19, 20265 min read

leetcodeproblemmediumdynamic-programming

Two players take turns picking numbers from either end of an integer array. Player 1 goes first. Both players play optimally. Return true if Player 1 can win or tie (score greater than or equal to Player 2's score).

This is LeetCode 486: Predict the Winner, and it is a clean introduction to game theory DP. The trick is reframing "who gets what score" into a single recurrence that tracks how far ahead the current player can get.

Players take turns picking from either end of [1, 5, 2]. The DP table stores the maximum score advantage the current player can achieve. dp[0][2] = -2 means Player 1 ends up behind, so the answer is false.

Why the greedy approach fails

Your first instinct might be: always pick the larger of the two ends. But greedy falls apart fast. Consider [1, 5, 233, 7]. If Player 1 greedily takes 7 (the bigger end), Player 2 gets access to 233. The optimal play for Player 1 is actually to take 1, forcing a position where they can grab 233 later.

The issue is that each pick changes what both ends look like for the opponent. You cannot evaluate a pick without considering the entire chain of future moves. That is exactly where DP comes in.

The key insight: track the score difference

Instead of tracking each player's score separately, define a single value:

dp[i][j] = the maximum score difference the current player can achieve when choosing from nums[i..j].

"Current player" means whoever's turn it is. This is elegant because both players use the same strategy (play optimally), so the recurrence is symmetric. When it is your turn on nums[i..j], you have two choices:

Pick left (nums[i]): you gain nums[i], then your opponent faces nums[i+1..j] and plays optimally. Their best score difference from that subarray is dp[i+1][j]. Since their gain is your loss, your net advantage is nums[i] - dp[i+1][j].
Pick right (nums[j]): same logic. Your net advantage is nums[j] - dp[i][j-1].

You pick whichever is better:

dp[i][j] = max(nums[i] - dp[i+1][j], nums[j] - dp[i][j-1])

Base case: when i == j, there is only one number. The current player takes it. dp[i][i] = nums[i].

At the end, dp[0][n-1] gives Player 1's best net advantage over the full array. If it is zero or positive, Player 1 wins (or ties). If negative, Player 1 loses.

Practice this problem with spaced repetition

The bottom-up solution

Fill the table by subarray length, starting with length 1 (the diagonal) and working up to length n.

def predict_the_winner(nums):
    n = len(nums)
    dp = [[0] * n for _ in range(n)]

    for i in range(n):
        dp[i][i] = nums[i]

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = max(
                nums[i] - dp[i + 1][j],
                nums[j] - dp[i][j - 1]
            )

    return dp[0][n - 1] >= 0

Let's trace through nums = [1, 5, 2] step by step:

Step 1: Base cases (single elements, length 1)

When only one number remains, the current player takes it. dp[i][i] = nums[i].

Step 2: dp[0][1] — choose from [1, 5]

Pick left: 1 - dp[1][1] = 1 - 5 = -4. Pick right: 5 - dp[0][0] = 5 - 1 = 4. Max is 4.

Step 3: dp[1][2] — choose from [5, 2]

Pick left: 5 - dp[2][2] = 5 - 2 = 3. Pick right: 2 - dp[1][1] = 2 - 5 = -3. Max is 3.

Step 4: dp[0][2] — choose from [1, 5, 2] (full array)

Pick left: 1 - dp[1][2] = 1 - 3 = -2. Pick right: 2 - dp[0][1] = 2 - 4 = -2. Max is -2.

Final answer: dp[0][2] = -2

Since dp[0][2] is negative, Player 1 cannot win or tie. Return false. A positive or zero value would mean Player 1 wins (or ties).

The subtraction nums[i] - dp[i+1][j] is the crux. You take nums[i], but then your opponent plays optimally on the remaining subarray. Their advantage dp[i+1][j] works against you, so you subtract it. This single recurrence captures the entire back-and-forth of the game.

Space-optimized version

Since we fill the table by increasing subarray length, and each cell dp[i][j] only depends on dp[i+1][j] (one row below) and dp[i][j-1] (one column left), you can compress the 2D table into a 1D array.

def predict_the_winner(nums):
    n = len(nums)
    dp = list(nums)

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i] = max(nums[i] - dp[i + 1], nums[j] - dp[i])

    return dp[0] >= 0

Here dp[i] stores the current player's best advantage for the subarray starting at i with the current length being processed. Before the update, dp[i+1] still holds the value for the shorter subarray (i+1, j) and dp[i] holds the value for (i, j-1).

Build your pattern library with CodeBricks

Complexity analysis

Approach	Time	Space
2D DP	O(n^2)	O(n^2)
1D DP (space-optimized)	O(n^2)	O(n)

With n up to 20 (per LeetCode constraints), both versions run instantly. The 2D version is clearer for understanding. The 1D version is a nice bonus to show you understand the dependency structure.

The building blocks

This problem is built on interval DP with a minimax twist. The core building block is the same one you see in Burst Balloons: define a 2D table over subarrays, fill it by increasing length, and at each cell consider all choices for the current interval.

The difference here is the adversarial element. In Burst Balloons, you maximize a single player's total. In Predict the Winner, the subtraction nums[i] - dp[i+1][j] encodes the fact that your opponent is also playing optimally against you. This "score difference" trick is a standard technique in game theory DP:

Stone Game (LeetCode 877): nearly identical structure, but the array length is always even and values are distinct, which guarantees Player 1 always wins.
Stone Game II/III/IV: extensions that change how many elements a player can take per turn, but the underlying DP pattern remains the same.
Nim Game (LeetCode 292): a simpler game-theory problem solvable with pure math, but the "think about what position you leave for your opponent" logic is the same.

Once you internalize the "track the difference, subtract the opponent's optimal play" framework, the entire family of two-player game DP problems opens up.

Edge cases to watch for

Single element: nums = [5]. Only one pick available, Player 1 takes it. dp[0][0] = 5 >= 0, return true.
Two elements: nums = [1, 5]. Player 1 picks the larger end (5). dp[0][1] = max(1-5, 5-1) = 4 >= 0, return true.
All equal values: nums = [3, 3, 3, 3]. The total is always split evenly, so it is a tie. Return true.
Odd vs even length: with an even number of elements, Player 1 can always choose to take all even-indexed or all odd-indexed elements (by controlling which end to pick). This guarantees Player 1 can at least tie. With odd length, Player 2 may have the advantage.

From understanding to recall

Reading through this walkthrough, the recurrence probably makes sense. But will you remember the subtraction trick a week from now? Will you set up the interval lengths correctly under time pressure?

Understanding and recall are different skills. The gap between "I see how it works" and "I can write it from scratch" is real. Spaced repetition closes that gap. You revisit the score-difference recurrence at increasing intervals, write it from memory each time, and after a few reps the pattern becomes automatic.

The interval DP with minimax pattern is one of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Drilling them individually with spaced repetition beats random problem grinding every time.

Coin Change - Classic DP where each state considers multiple choices, the same "try all options, pick the best" structure
Burst Balloons - Interval DP over subarrays with a 2D table filled by increasing length
Nim Game - A simpler game-theory problem that introduces the idea of forcing your opponent into a losing position

Try CodeBricks free