Best Time to Buy and Sell Stock with Cooldown: State Machine DP

May 18, 20264 min read

leetcodeproblemmediumdynamic-programming

309. Best Time to Buy and Sell Stock with Cooldown adds one rule to the classic stock problem: after you sell, you must sit out for one day before you can buy again. That single constraint turns the problem from a greedy scan into a proper state machine. You need to track not just whether you hold a stock, but how you came to be in your current position.

Prices: [1, 2, 3, 0, 2]. Buy day 0, hold day 1, sell day 2 (+2), cooldown day 3, rest day 4. Profit = 3.

The state machine approach

The key insight is that at any point in time, you are in exactly one of three states:

held: you are currently holding a stock (you bought it on some earlier day and have not sold yet)
sold: you just sold your stock today (you trigger the cooldown)
rest: you are not holding a stock, and today is a cooldown day or you are simply waiting

Each state has a value: the maximum cash you can have in that state on day i.

The recurrences express how each state on day i can be reached from the previous day:

held[i] = max(held[i-1], rest[i-1] - prices[i])
- You were already holding from yesterday, or you bought today (which requires coming from rest, not sold)
sold[i] = held[i-1] + prices[i]
- You must have been holding yesterday, and you sell at today's price
rest[i] = max(rest[i-1], sold[i-1])
- You were already resting, or you just came off a cooldown (sold yesterday)

The cooldown rule is encoded in held[i]: you can only buy from the rest state, not from sold. If you could buy from sold, there would be no cooldown.

At the end, the answer is max(sold, rest). You never want to end while still holding stock, since that means you never collected the profit.

The code

def maxProfit(prices: list[int]) -> int:
    if not prices:
        return 0
    held = -prices[0]
    sold = 0
    rest = 0
    for price in prices[1:]:
        prev_held, prev_sold, prev_rest = held, sold, rest
        held = max(prev_held, prev_rest - price)
        sold = prev_held + price
        rest = max(prev_rest, prev_sold)
    return max(sold, rest)

The variables prev_held, prev_sold, and prev_rest capture the state from the previous iteration so that each update uses yesterday's values, not today's (which would be a circular dependency).

Practice DP state machines

Walking through it step by step

Let's trace through prices = [1, 2, 3, 0, 2], which has an optimal profit of 3 (buy at 1, sell at 3, cooldown, rest at 0, done).

Initialize: price = 1 (day 0). held = -1, sold = 0, rest = 0.

Set held = -prices[0] = -1. We bought at day 0. sold and rest start at 0.

Day 1: price = 2.

held = max(-1, rest - 2) = max(-1, -2) = -1. sold = held + 2 = -1 + 2 = 1. rest = max(0, 0) = 0.

Day 2: price = 3.

held = max(-1, rest - 3) = max(-1, -3) = -1. sold = held + 3 = -1 + 3 = 2. rest = max(0, sold_prev=1) = 1.

Day 3: price = 0.

held = max(-1, rest - 0) = max(-1, 1) = 1. sold = held_prev + 0 = -1 + 0 = 0 (no improvement). rest = max(1, sold_prev=2) = 2.

Day 4: price = 2.

held = max(1, rest - 2) = max(1, 0) = 1. sold = held_prev + 2 = 1 + 2 = 3. rest = max(2, sold_prev=0) = 2. Answer = max(sold=3, rest=2) = 3.

At the end of day 4, sold = 3 and rest = 2. The answer is max(3, 2) = 3.

Notice how held jumps to 1 on day 3 (price = 0) because rest - 0 = 2 - 0 = 2 beats the previous held of -1. But even though we could buy at 0, selling at 2 on day 4 gives 1 + 2 = 3, the same profit as our earlier sell. The state machine finds the optimal path automatically.

Drill this pattern with spaced repetition

Complexity analysis

	Complexity
Time	O(n)
Space	O(1)

Time: O(n). You scan through the prices array once. Each day requires a fixed number of comparisons and arithmetic operations.

Space: O(1). You only keep three rolling variables (held, sold, rest) plus three temporaries for the previous values. There are no arrays indexed by day, no memoization table. This is the rolling variable optimization: once you express the recurrence in terms of "yesterday's values only," you can throw away everything older.

Building blocks

This problem is built on two reusable ideas:

State machine DP: when a problem has a small number of distinct modes (held, sold, rest) with defined transitions between them, model those modes as DP states. The recurrences fall out naturally from "what states could I have come from yesterday to be in this state today?" You will see the same structure in paint fence, string edit distance, and many other problems.

Rolling variable optimization: when a DP recurrence only looks one step back, you do not need the full table. Three variables replace three arrays of length n. This always applies when the recurrence is purely in terms of i-1 (not i-2, i-k, etc.) and you do not need random access to old values.

Edge cases to watch for

Single price like [5]: the loop body never runs. held = -5, sold = 0, rest = 0. max(sold, rest) = 0. Correct since you cannot sell what you bought on the same day you return.
All decreasing like [5, 4, 3, 2, 1]: sold stays at 0 throughout (every potential sell price is lower than the previous held cost). The answer is 0.
Cooldown at the end: if you sell on the second-to-last day and the last day would be profitable, the cooldown forces you into rest. The state machine handles this without any special case because held on the cooldown day cannot come from sold.

From understanding to recall

You have seen why the three states exist and how the recurrences encode the cooldown constraint. But there is a gap between understanding and being able to derive the recurrences from scratch in an interview. The state machine structure is not something you want to re-invent under pressure.

Spaced repetition closes that gap. Instead of reading through this post once and moving on, you revisit the core pattern at increasing intervals until deriving held[i] = max(held[i-1], rest[i-1] - prices[i]) becomes as natural as writing a for loop. The three-state machine becomes a building block you reach for automatically whenever you see a sequence problem with constraints on consecutive actions.

Best Time to Buy and Sell Stock - The single-transaction baseline with the running minimum trick
Best Time to Buy and Sell Stock III - Two transactions with a four-state machine in one pass
House Robber - The classic "skip adjacent" DP, the same rolling variable pattern with a different decision

Try CodeBricks free