Skip to content
← All posts

Shortest Path Visiting All Nodes: BFS with Bitmask

8 min read
leetcodeproblemhardgraphbit-manipulationdynamic-programming

You are given an undirected, connected graph of n nodes labeled 0 to n - 1. You can start from any node and revisit nodes and edges as many times as you want. Return the length of the shortest path that visits every node at least once.

This is LeetCode 847: Shortest Path Visiting All Nodes, a hard problem that combines BFS with bitmask state tracking. The trick is encoding "which nodes have I visited so far" as a bitmask, then running BFS over (node, mask) pairs to find the shortest traversal.

graph012visited bitmask1n01n10n2mask = 0b110 = 3state: (node=1, mask=3)visitedcurrentunvisited
A 3-node graph with BFS state tracking. The bitmask encodes which nodes have been visited. The goal is to reach a mask where all bits are 1.

Why this problem matters

This problem teaches you how to combine graph traversal with state compression. In a standard BFS, you track which nodes you have visited with a simple set or boolean array. Here, the same node might need to be visited multiple times (once while heading toward node 3, once while heading toward node 5), so a simple visited set does not work. You need to track the combination of your current position and the set of nodes you have already reached.

Bitmask state tracking is a powerful technique that appears in traveling salesman variants, Hamiltonian path problems, and any graph problem where you need to remember a subset of visited elements. Once you understand how to encode sets as integers and run BFS over (position, set) pairs, a whole class of problems opens up.

The brute force approach

The most direct approach is to try every possible path permutation and find the shortest one. You could use DFS to explore all orderings of node visits.

def shortest_path_brute(graph):
    n = len(graph)
    if n == 1:
        return 0

    from itertools import permutations
    from collections import deque

    def bfs_dist(src, dst):
        if src == dst:
            return 0
        visited = {src}
        queue = deque([(src, 0)])
        while queue:
            node, dist = queue.popleft()
            for nei in graph[node]:
                if nei == dst:
                    return dist + 1
                if nei not in visited:
                    visited.add(nei)
                    queue.append((nei, dist + 1))
        return float('inf')

    best = float('inf')
    for perm in permutations(range(n)):
        total = 0
        for i in range(len(perm) - 1):
            total += bfs_dist(perm[i], perm[i + 1])
        best = min(best, total)
    return best

This generates all n! permutations of nodes and computes the shortest path between consecutive nodes in each ordering. The time complexity is O(n! * n * (n + E)), which is completely impractical for anything beyond tiny inputs. We need a way to avoid redundant exploration.

The key insight: BFS over (node, bitmask) states

The core idea is that the "state" of your traversal is not just which node you are on, but also which nodes you have already visited. You can encode the set of visited nodes as a bitmask: an integer where bit i is 1 if node i has been visited, and 0 otherwise.

Think of each BFS state as a pair: (current node, visited bitmask). Two states are different even if they are at the same node, as long as they have visited different subsets of nodes. BFS over these states guarantees the shortest path because every edge has weight 1.

The target state is any (node, mask) where the mask has all n bits set, meaning mask == (1 << n) - 1. Since you can start from any node, you initialize the BFS queue with one entry per node, each with a mask that has only that node's bit set.

Walking through it step by step

Let's trace through graph = [[1,2],[0,2],[0,1]], a triangle of 3 nodes. The answer is 2 because you can visit all 3 nodes in just 2 edges (for example, 0 to 1 to 2).

Step 1: Initialize BFS from every node.

graph012visited mask0n00n10n2queue(node=0, mask=001)(node=1, mask=010)(node=2, mask=100)

Each node is a starting point. The initial mask for node i has only bit i set, meaning only that node has been visited so far.

Step 2: Process (node=0, mask=001). Expand to neighbors 1 and 2.

graph012visited mask1n00n10n2queue(node=1, mask=011)(node=2, mask=101)

From node 0, visit neighbor 1 (mask becomes 011) and neighbor 2 (mask becomes 101). Each new state is a unique (node, mask) pair.

Step 3: Process (node=1, mask=010). Expand to neighbors 0 and 2.

graph012visited mask0n01n10n2queue(node=0, mask=011)(node=2, mask=110)

From node 1, visit neighbor 0 (mask becomes 011) and neighbor 2 (mask becomes 110). Different starting nodes produce different mask progressions.

Step 4: Process (node=1, mask=011). Expand to neighbor 2.

graph012visited mask1n01n10n2queue(node=2, mask=111)

From node 1 with nodes 0 and 1 already visited, we reach node 2. The new mask is 111, meaning all nodes visited.

Step 5: Reach mask=111. All nodes visited. Return distance = 2.

graph012visited mask1n01n11n2queuemask = 0b111 = 7 = (1 << 3) - 1

The mask equals (1 << n) - 1, so all nodes have been visited. BFS guarantees this is the shortest path: 2 edges.

BFS processes states level by level. Each level corresponds to one additional edge traversed. The first time we reach a mask with all bits set, we know we have found the shortest path. No backtracking or re-exploration is needed because BFS explores in order of increasing distance.

The BFS + bitmask solution

Here is the complete solution in Python:

def shortestPathLength(graph):
    n = len(graph)
    if n == 1:
        return 0

    target = (1 << n) - 1
    queue = collections.deque()
    visited = set()

    for i in range(n):
        mask = 1 << i
        queue.append((i, mask, 0))
        visited.add((i, mask))

    while queue:
        node, mask, dist = queue.popleft()
        for nei in graph[node]:
            new_mask = mask | (1 << nei)
            if new_mask == target:
                return dist + 1
            if (nei, new_mask) not in visited:
                visited.add((nei, new_mask))
                queue.append((nei, new_mask, dist + 1))

    return -1

Here is what each piece does:

  1. Compute the target mask (1 << n) - 1. For 3 nodes this is 0b111 = 7. When the mask equals this value, all nodes have been visited.
  2. Initialize the queue with every node as a potential starting point. Each entry is (node, mask, distance). The initial mask for node i is 1 << i, meaning only that node is visited.
  3. BFS loop: for each state, try moving to every neighbor. Compute the new mask by OR-ing in the neighbor's bit. If the new mask equals the target, return dist + 1. Otherwise, if this (neighbor, new_mask) state has not been seen before, add it to the queue.
  4. Visited set prevents processing the same (node, mask) pair twice. Without this, the BFS would loop forever in cyclic graphs.

Complexity analysis

MetricValue
TimeO(n * 2^n * n), processing up to n * 2^n states, each with up to n neighbors
SpaceO(n * 2^n), for the visited set and queue

The state space is n * 2^n: there are n possible current nodes and 2^n possible visited masks. Since n is at most 12 in this problem, 12 * 4096 = 49,152 states is very manageable. The exponential factor in 2^n means this approach is only practical for small n, but the problem constraints guarantee n is at most 12.

Building blocks

This problem is built on two reusable patterns that CodeBricks drills independently.

1. BFS for shortest path in unweighted graphs

BFS explores nodes level by level, guaranteeing that the first time you reach a state, you have found the shortest path to it. The template is always the same: initialize a queue with starting states, process each state by expanding neighbors, and track visited states to avoid cycles.

queue = deque(starting_states)
visited = set(starting_states)

while queue:
    state, dist = queue.popleft()
    for next_state in expand(state):
        if is_goal(next_state):
            return dist + 1
        if next_state not in visited:
            visited.add(next_state)
            queue.append((next_state, dist + 1))

In Number of Islands, the state is a grid cell. In Word Ladder, the state is a word. Here, the state is a (node, bitmask) pair. The BFS skeleton is identical across all of these.

2. Bitmask state compression

When you need to track a subset of up to ~20 elements, encoding the subset as an integer bitmask is far more efficient than using sets or tuples. Each bit represents membership, and set operations become bitwise operations:

  • Add element i: mask | (1 << i)
  • Check element i: mask & (1 << i)
  • Full set of n elements: (1 << n) - 1

This technique appears in problems like Partition to K Equal Sum Subsets, Can I Win, and any DP problem where the state involves a subset of items. The bitmask turns O(2^n * n) set operations into O(2^n) integer operations.

The bitmask approach works because Python integers can be dictionary keys and set members, making (node, mask) pairs hashable. In languages like C++ or Java, you can use a 2D visited array indexed by [node][mask] for even faster lookups.

Edge cases

Before submitting, make sure your solution handles these:

  • Single node graph = [[]]: the answer is 0 because the only node is already "visited." The early return handles this.
  • Linear graph [[1],[0,2],[1]]: nodes form a line 0-1-2. You must traverse from one end to the other in 2 steps.
  • Star graph where one central node connects to all others: BFS from the center visits each leaf in one step, so the answer is n-1.
  • Complete graph where every node connects to every other: the answer is n-1 because you can visit one new node per step.
  • Disconnected-looking paths where the shortest visiting path requires revisiting nodes: the bitmask tracks which nodes have been visited regardless of revisits, so this is handled automatically.

From understanding to recall

You have read the BFS + bitmask solution and it makes sense. A queue of (node, mask) pairs, an OR operation to update the mask, and a target check for all bits set. But can you write it from scratch in an interview without looking at it?

The details matter: initializing the queue with every node as a starting point, computing 1 << i for each starting mask, using mask | (1 << nei) to update the visited set, and checking new_mask == (1 << n) - 1 as the termination condition. These are small but critical, and they are easy to get wrong under pressure if you have not practiced writing them from memory.

Spaced repetition closes that gap. You practice writing the BFS bitmask loop from scratch at increasing intervals. After a few rounds, the pattern is automatic. You see "visit all nodes in shortest path" and the code flows out without hesitation.

The BFS + bitmask pattern is one of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping they stick.

Related posts

  • Number of Islands - BFS/DFS on a grid, the foundational graph traversal pattern
  • Clone Graph - Graph traversal with a hash map to track visited nodes
  • Word Ladder - BFS for shortest transformation sequence, another shortest-path-in-unweighted-graph problem

CodeBricks breaks the shortest path visiting all nodes problem into its BFS shortest-path and bitmask state compression building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a bitmask BFS question shows up in your interview, you do not think about it. You just write it.