Linked List Random Node: Reservoir Sampling in One Pass

May 18, 20266 min read

leetcodeproblemmediumlinked-listsmath

Linked List Random Node is LeetCode problem 382. You are given a singly linked list and need to return a random node's value such that every node has an equal probability of being chosen. The twist: you do not know the length of the list ahead of time.

You could walk the list once to count nodes, then walk again to pick a random index. That works, but it requires two passes. Reservoir sampling does it in a single pass, and it generalizes to streams where you truly cannot know the size upfront. That is the technique interviewers want to see.

The problem

You are given the head of a singly linked list. Implement a class with:

__init__(self, head) - stores the head of the linked list
getRandom() - returns a random node's value where each node must be equally likely to be chosen

You must solve this without converting the list to an array first (or assume you cannot for the follow-up about very large or streaming data).

Each of the 5 nodes has an equal 1/5 probability of being selected after one pass through the list.

Why not just count and index?

The simplest approach: walk the list to find its length n, generate a random number between 0 and n - 1, then walk again to that index. Two passes, O(n) time, O(1) space.

def getRandom(self):
    n = 0
    node = self.head
    while node:
        n += 1
        node = node.next
    idx = random.randint(0, n - 1)
    node = self.head
    for _ in range(idx):
        node = node.next
    return node.val

This is fine for a fixed list. But consider the follow-up: what if the list is extremely large, or the data is streaming and you only get one pass? That is where reservoir sampling shines.

Practice this pattern

Reservoir sampling: Algorithm R

Reservoir sampling is a family of algorithms for choosing a random sample from a stream of unknown length. The simplest version, Algorithm R, keeps a "reservoir" of size 1 (one chosen element). As you scan through items:

For the i-th item (1-indexed), replace the current choice with probability 1/i

After scanning all n items, every item has exactly 1/n probability of being the final choice.

The intuition: the first item always gets picked. The second item replaces it with probability 1/2. The third replaces whatever is there with probability 1/3. And so on. The probabilities compound in exactly the right way so that every item ends up with equal weight.

Why does every node get 1/n?

Let's prove it for a list of length 5. For node k to be the final result, two things must happen:

Node k must be picked when we visit it (probability 1/k)
No later node must replace it (nodes k+1 through 5 must all fail to replace)

The probability of not being replaced at step j is (j - 1)/j. So the combined probability for node k is:

(1/k) * (k / (k+1)) * ((k+1) / (k+2)) * ... * (4/5)

This is a telescoping product. Everything cancels, leaving 1/5. The same argument works for any list length n - every node ends up with probability 1/n.

The telescoping cancellation is the heart of reservoir sampling. If an interviewer asks you to prove correctness, this is the argument. Write out the product for one specific node and watch the terms cancel.

Visual walkthrough

Let's trace through the list 1 -> 2 -> 3 -> 4 -> 5 with one specific run. At each node, we roll a random number from 1 to i. If we roll 1, we replace our current result. Watch how the result evolves.

Step 1: Visit node 1 (i = 1)

rand(1, 1) = 1. Only one option, so result = 1. This always happens for the first node.

Step 2: Visit node 2 (i = 2)

rand(1, 2) = 1. We rolled 1, so replace result with 2. Probability of replacing: 1/2.

Step 3: Visit node 3 (i = 3)

rand(1, 3) = 2. Not 1, so we keep result = 2. Probability of replacing was 1/3.

Step 4: Visit node 4 (i = 4)

rand(1, 4) = 1. We rolled 1, so replace result with 4. Probability of replacing: 1/4.

Step 5: Visit node 5 (i = 5)

rand(1, 5) = 3. Not 1, so we keep result = 4. Final answer: 4.

In this particular run, the final result was 4. A different set of random rolls would give a different result, but over many trials each node appears exactly 1/5 of the time.

The code

import random

class Solution:
    def __init__(self, head):
        self.head = head

    def getRandom(self):
        result = 0
        node = self.head
        i = 1
        while node:
            if random.randint(1, i) == 1:
                result = node.val
            node = node.next
            i += 1
        return result

Four lines of real logic inside getRandom. Let's break them down:

Initialize: Start at the head with i = 1 and result = 0.
Loop: For each node, generate a random integer from 1 to i.
Replace: If the roll equals 1 (probability 1/i), update result to the current node's value.
Advance: Move to the next node and increment i.

When the loop ends, result holds a uniformly random node value.

Why check random.randint(1, i) == 1 rather than == i or some other value? It does not matter which specific value you check for, as long as the probability is 1/i. Checking for 1 is the convention, but checking for i works identically.

Drill this solution with spaced repetition

Complexity

	Time	Space
`__init__`	O(1)	O(1)
`getRandom`	O(n)	O(1)
Per call	O(n)	O(1)

Each call to getRandom walks the entire list once. There is no way around this without storing extra data, because you cannot index into a linked list. The space is O(1) since we only keep a single variable for the result.

Building blocks

This problem is built from one core technique:

Reservoir sampling (Algorithm R). Maintain a reservoir of size k (here k = 1). For each new element at position i, include it in the reservoir with probability k/i. For k = 1, this simplifies to replacing the current choice with probability 1/i.

This technique appears in more contexts than you might expect:

Random node from a stream. If data arrives one item at a time and you cannot store it all, reservoir sampling is the standard approach. This is the canonical use case.
Random sampling from a database. When you need k random rows from a table of unknown size without counting first, Algorithm R (generalized to k > 1) gives you a uniform sample in one pass.
Shuffle an Array (LeetCode 384) also relies on randomization guarantees. Fisher-Yates shuffle and reservoir sampling share the same underlying probability argument about each element having equal likelihood of landing in any position.
Weighted random selection. Reservoir sampling extends to weighted variants where each item has a different probability of being chosen, useful in load balancing and recommendation systems.

Once you understand why the telescoping product yields uniform probability, the rest is just applying the pattern to different container sizes.

Edge cases

Single node. i starts at 1, randint(1, 1) always returns 1, so the single node is always chosen. Correct.
Two nodes. The first node is picked, then the second replaces it with probability 1/2. Each has 50% chance. Correct.
Very long list. No issue. The algorithm is O(n) time, O(1) space regardless of length. The random number generator handles large i values fine.
Multiple calls to getRandom. Each call is independent. The list is traversed fresh each time, and a new sequence of random numbers is generated.

From understanding to recall

Reservoir sampling is one of those algorithms that is easy to understand but tricky to reproduce from memory. The core loop is only a few lines, but getting the indexing right (1-indexed counter, checking for 1, incrementing after the replacement) requires practice.

The mental anchor: "for the i-th element, roll a die with i faces. If you roll a 1, swap it in." That single sentence encodes the entire algorithm. Practice typing it from that prompt until the code flows automatically.

Reverse Linked List - fundamental linked list traversal
Linked List Cycle - another single-pass linked list algorithm
Shuffle an Array - another randomization problem using Fisher-Yates

Try CodeBricks free