Number of Ways to Wear Different Hats: Bitmask DP Pattern
There are n people and 40 different hats labeled 1 through 40. Each person has a list of hats they are willing to wear. You need to assign exactly one hat to each person so that no two people wear the same hat, and return the number of valid assignments modulo 10^9 + 7.
This is LeetCode 1434: Number of Ways to Wear Different Hats to Each Other, and it is a textbook application of bitmask DP. The critical insight is choosing what to iterate over and what to encode in the bitmask.
Why iterate over hats, not people?
Your first instinct might be to iterate over people and try assigning each one a hat. The problem is that there are up to 40 hats, so encoding "which hats have been used" would require a bitmask of 40 bits. That is 2^40 states, roughly a trillion, which is far too many.
Flip it around. Iterate over the 40 hats instead, and for each hat, decide which person (if any) gets it. Now the bitmask tracks which people have been assigned hats, and since n is at most 10, the bitmask has at most 10 bits. That is 2^10 = 1024 states. With 40 hats and 1024 states, the total work is roughly 40 * 1024 * 10 = 409,600 operations. That fits comfortably.
The first step is to invert the input. The problem gives you a mapping from people to hats. Build the reverse mapping: for each hat, store which people can wear it. Then iterate hat by hat.
The solution
def numberWays(hats):
n = len(hats)
MOD = 10**9 + 7
hat_to_people = [[] for _ in range(41)]
for person in range(n):
for hat in hats[person]:
hat_to_people[hat].append(person)
full_mask = (1 << n) - 1
dp = [0] * (full_mask + 1)
dp[0] = 1
for hat in range(1, 41):
for mask in range(full_mask, -1, -1):
if dp[mask] == 0:
continue
for person in hat_to_people[hat]:
if mask & (1 << person) == 0:
dp[mask | (1 << person)] = (
dp[mask | (1 << person)] + dp[mask]
) % MOD
return dp[full_mask]
How the code works
The DP array has 2^n entries. dp[mask] counts the number of ways to assign hats to exactly the people whose bits are set in mask, using only hats processed so far.
For each hat (1 through 40), you scan every mask in reverse order. If dp[mask] is positive, you try giving the current hat to each person who wants it and whose bit is not already set. Giving hat h to person p transitions from mask to mask | (1 << p).
The reverse iteration on masks prevents double-counting within the same hat. Since a single hat can only go to one person, you need to ensure you do not assign the same hat to multiple people in a single pass. Iterating from full_mask down to 0 guarantees that when you update dp[mask | (1 << p)], the value of dp[mask] you read is from before the current hat's round, not from an already-updated state.
After processing all 40 hats, dp[full_mask] holds the answer: the number of ways to assign hats so that every person has exactly one hat.
The reason you iterate over hats (the larger set) rather than people (the smaller set) is that the bitmask must represent the smaller dimension. With n up to 10 people, the bitmask fits in 10 bits. If you tried to bitmask over 40 hats instead, you would need 2^40 states. Always let the bitmask track the dimension with the smaller upper bound.
Step 1: Invert the mapping (hat to people)
Instead of iterating over people and trying hats, iterate over hats and decide which person (if any) gets each hat. This keeps the bitmask small (n people, not 40 hats).
Step 2: Initialize dp[mask]. dp[000] = 1 (no one has a hat yet)
dp[mask] counts the number of ways to assign hats so that exactly the people indicated by set bits in mask have hats. Starting state: nobody has a hat, and there is exactly 1 way to achieve that.
Step 3: Process hat 3 (wearable by P0)
Hat 3 can go to P0. For each mask where P0's bit is off and dp[mask] > 0, add dp[mask] to dp[mask | (1 shifted left 0)]. Mask 000 has value 1, so dp[001] gets +1.
Step 4: Process hat 4 (wearable by P0, P1)
Hat 4 can go to P0 or P1. From mask 000 (dp=1): assign to P0 gives dp[001]+=1, assign to P1 gives dp[010]+=1. From mask 001 (dp=1, P0 already has hat): assign to P1 gives dp[011]+=1.
Step 5: Process hat 5 (wearable by P1, P2)
Hat 5 can go to P1 or P2. We look at every mask and try each eligible person. The key transition: from mask 011 (P0 and P1 have hats, dp=1), assign hat 5 to P2 gives dp[111]+=1. The full mask 111 means all 3 people have hats.
Step 6: Read dp[111] for the answer
The full mask (all bits set) means every person has been assigned a hat. dp[111] = 1, so there is exactly 1 valid assignment. In general, return dp[(1 shifted left n) - 1] mod 10^9+7.
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Brute force (backtracking) | O(40^n) | O(n) |
| Bitmask DP | O(40 * 2^n * n) | O(2^n) |
With n at most 10, the DP approach processes at most 40 * 1024 * 10 = 409,600 transitions. That runs in milliseconds. The space is 2^10 = 1024 entries, which is negligible.
Building blocks
Bitmask to track subset membership
A bitmask of n bits encodes a subset of n elements. Bit i being set means person i has been assigned a hat. The operation mask | (1 << p) adds person p to the subset, and mask & (1 << p) checks whether person p is already in it.
full_mask = (1 << n) - 1
has_hat = mask & (1 << person) != 0
new_mask = mask | (1 << person)
This is the same bitmask machinery used in traveling salesman, Hamiltonian path, and other assignment problems where you need to track "which items have been used."
Inverting the mapping for efficient iteration
The input gives you hats[person] = [list of hats]. You need hat_to_people[hat] = [list of people] so that the outer loop iterates over hats while the inner loop only touches relevant people.
hat_to_people = [[] for _ in range(41)]
for person in range(n):
for hat in hats[person]:
hat_to_people[hat].append(person)
Without this inversion, you would need to search through every person's preference list for each hat, which adds unnecessary overhead and makes the DP transitions harder to express cleanly.
Edge cases
- Single person:
n = 1. The answer is simply the length of that person's preference list. Each hat is a valid assignment. - All people want the same single hat: e.g.,
hats = [[5], [5], [5]]. Only one person can get hat 5, so the answer is 0 (not enough hats for everyone). - Maximum
n = 10: the bitmask has 1024 states. Combined with 40 hats, this is the upper bound on work and still runs fast. - Disjoint preference lists: if no two people share any hat preference, the answer is the product of each person's preference list length. The DP handles this correctly without special casing.
- Large answer: return the result modulo
10^9 + 7. Apply the modulus at every addition to avoid overflow.
From understanding to recall
The core idea here, choosing which dimension to bitmask based on its upper bound, is a pattern that shows up in many bitmask DP problems. The mechanical steps are: invert the mapping, iterate over the larger set, and let the bitmask track the smaller set. If you can recall that framework, the code follows directly.
Spaced repetition helps cement this kind of structural decision. The first time you see it, the "iterate over hats, not people" trick feels surprising. After a few spaced reps, it becomes the first thing you check whenever you see an assignment problem with two sets of different sizes.
Related posts
- Partition Equal Subset Sum - Another DP problem where the state representation (which elements have been included) drives the solution. Same idea of encoding subset membership, but with a sum-based DP instead of a bitmask.
- Counting Bits - Builds intuition for how bit operations connect to DP recurrences. Understanding
i >> 1andi & 1helps when reasoning about bitmask transitions. - Target Sum - Reduces a sign-assignment problem to subset sum DP. Similar structural trick of reframing the problem before applying DP.
CodeBricks helps you internalize patterns like bitmask DP through spaced repetition. Instead of re-reading solutions, you practice writing them from scratch at increasing intervals until the pattern is automatic.