Sum of Subsequence Widths: Sorting and Combinatorics
The width of a subsequence is defined as the difference between its maximum and minimum elements. Given an integer array nums, return the sum of the widths of all non-empty subsequences of nums. Since the answer may be very large, return it modulo 10^9 + 7.
This is LeetCode 891: Sum of Subsequence Widths, a hard problem that looks intimidating at first but collapses into a clean formula once you ask the right question: "How many times does each element appear as the max versus the min across all subsequences?"
Why brute force fails
An array of length n has 2^n - 1 non-empty subsequences. For n = 20, that is already over a million. For the constraint of n up to 20,000, enumerating every subsequence is completely out of the question. You need a way to compute the total without ever generating a single subsequence.
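For tiny inputs, direct enumeration is still useful as a reference to check a faster solution against. The sketch below is a minimal brute force, assuming the input is small enough to enumerate; the name brute_force_widths is hypothetical and not part of the original solution, and it skips the modulus because small inputs never overflow.

```python
from itertools import combinations

def brute_force_widths(nums):
    """Sum max - min over every non-empty subsequence by direct enumeration."""
    total = 0
    for size in range(1, len(nums) + 1):
        # combinations picks by position, so equal values stay distinct elements
        for combo in combinations(nums, size):
            total += max(combo) - min(combo)
    return total

print(brute_force_widths([1, 2, 3]))  # 6
```

The loop visits all 2^n - 1 non-empty subsequences, so it is only practical for roughly n <= 20.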
The trick is to stop thinking about subsequences and start thinking about individual elements.
Key insight: count contributions after sorting
Sort the array. After sorting, each element's role in every subsequence becomes predictable.
Consider element nums[i] in the sorted array. It is the maximum of any subsequence that includes index i and otherwise picks only from indices 0 through i - 1. How many such subsequences are there? You must include nums[i], and each of the i elements before it can be included or left out. That gives 2^i subsequences where nums[i] is the max. (With duplicate values, ties are broken by index: nums[i] counts as the max whenever i is the highest index chosen, which still assigns every subsequence exactly one max.)
By the same logic, nums[i] is the minimum of any subsequence that only picks from indices i through n - 1 (and must include i). There are 2^(n - 1 - i) such subsequences.
Each time nums[i] appears as a max, it contributes +nums[i] to the total. Each time it appears as a min, it contributes -nums[i]. So the net contribution of nums[i] is:
nums[i] * (2^i - 2^(n - 1 - i))
The total answer is the sum of these contributions across all i from 0 to n - 1.
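Sorting is safe because a subsequence's width depends only on which elements it contains, not on their order. Every choice of elements from the original array corresponds to the same choice from the sorted array, with the same max and min. What sorting buys you is position: once the array is ordered, you can count how often each element plays each role using simple combinatorics instead of brute force enumeration.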
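The counts 2^i and 2^(n - 1 - i) can be checked empirically for a small n. The following is a sketch rather than part of the solution: role_counts is a hypothetical helper that enumerates index subsets of a sorted array and tallies how often each index is the highest or the lowest one chosen.

```python
from itertools import combinations

def role_counts(n):
    """Count, for each index of a sorted length-n array, how many non-empty
    index subsets have it as the highest index (max) or lowest index (min)."""
    as_max = [0] * n
    as_min = [0] * n
    for size in range(1, n + 1):
        # combinations yields each index tuple in increasing order
        for idxs in combinations(range(n), size):
            as_max[idxs[-1]] += 1
            as_min[idxs[0]] += 1
    return as_max, as_min

as_max, as_min = role_counts(4)
print(as_max)  # [1, 2, 4, 8]
print(as_min)  # [8, 4, 2, 1]
```

Because the tally works on indices rather than values, it holds even when the array contains duplicates.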
The solution
def sumSubseqWidths(nums):
    MOD = 10**9 + 7
    nums.sort()
    n = len(nums)
    # Precompute powers of 2
    pow2 = [1] * n
    for i in range(1, n):
        pow2[i] = pow2[i - 1] * 2 % MOD
    result = 0
    for i in range(n):
        result = (result + nums[i] * (pow2[i] - pow2[n - 1 - i])) % MOD
    return result % MOD
Here is what each piece does:
- Sort the array. This is what lets you use index-based counting.
- Precompute pow2[i] = 2^i mod MOD for all i. This avoids recomputing powers inside the loop.
- Loop through each element. For index i, add nums[i] * (2^i - 2^(n-1-i)) to the running total, all under the modulus.
- Return the result modulo 10^9 + 7.
Step-by-step walkthrough
Let's trace through the example nums = [1, 2, 3] (already sorted, n = 3). The non-empty subsequences and their widths are: [1] width 0, [2] width 0, [3] width 0, [1,2] width 1, [1,3] width 2, [2,3] width 1, [1,2,3] width 2. Total = 6.
Now let's verify with the formula:
Element 1 at index 0
1 * (2^0 - 2^2) = 1 * (1 - 4) = 1 * -3 = -3
Element 2 at index 1
2 * (2^1 - 2^1) = 2 * (2 - 2) = 2 * 0 = 0
Element 3 at index 2
3 * (2^2 - 2^0) = 3 * (4 - 1) = 3 * 3 = 9
Total = (-3) + (+0) + (+9) = 6
The formula gives 6, which matches the brute-force enumeration. But the formula needs only O(n) work after an O(n log n) sort, instead of enumerating O(2^n) subsequences.
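The same walkthrough can be reproduced in a few lines. This is a sketch for small inputs only: contributions is a hypothetical helper that uses exact integer powers and no modulus, so the individual terms stay readable.

```python
def contributions(nums):
    """Return each sorted element's net contribution nums[i] * (2^i - 2^(n-1-i))."""
    ordered = sorted(nums)
    n = len(ordered)
    return [x * (2**i - 2**(n - 1 - i)) for i, x in enumerate(ordered)]

parts = contributions([1, 2, 3])
print(parts)       # [-3, 0, 9]
print(sum(parts))  # 6
```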
Complexity analysis
| Approach | Time | Space |
|---|---|---|
| Sort + contribution counting | O(n log n) | O(n) for powers array |
Sorting dominates at O(n log n). The contribution loop is O(n). The powers array uses O(n) space, though you could compute powers on the fly and reduce to O(1) extra space if needed.
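One way to drop the powers array is to reindex the min term. Substituting j = n - 1 - i in the second sum turns the total into the sum over i of (nums[i] - nums[n - 1 - i]) * 2^i, so a single running power is enough. The sketch below assumes that rearrangement; the function name is hypothetical, and the sort itself may still use working memory of its own.

```python
def sum_subseq_widths_running_power(nums):
    """Variant that pairs nums[i] with nums[n-1-i] so only one running power is kept."""
    MOD = 10**9 + 7
    nums.sort()  # in place, like the original solution
    n = len(nums)
    result = 0
    power = 1  # holds 2^i mod MOD at the top of each iteration
    for i in range(n):
        result = (result + (nums[i] - nums[n - 1 - i]) * power) % MOD
        power = power * 2 % MOD
    return result

print(sum_subseq_widths_running_power([1, 2, 3]))  # 6
```

For [1, 2, 3] the terms are (1 - 3) * 1, (2 - 2) * 2, and (3 - 1) * 4, which sum to 6 as before.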
Building blocks
1. Contribution counting
Instead of iterating over all subsets, ask: "How many subsets does this element contribute to, and in what role?" This reframing turns an exponential problem into a linear one.
You will see this same pattern in problems like Count Unique Characters of All Substrings, where you ask how many substrings each character is unique in, and Sum of Subarray Minimums, where you count how many subarrays each element is the min of. The question is always the same: flip the perspective from "enumerate all groups" to "count each element's participation."
2. Powers of two precomputation
When you need 2^0, 2^1, ..., 2^(n-1) under a modulus, precompute them in an array with one multiplication per entry. This is a standard technique for combinatorics problems involving subset counting.
pow2 = [1] * n
for i in range(1, n):
    pow2[i] = pow2[i - 1] * 2 % MOD
Prefer this to calling pow(2, i, MOD) inside the loop. Python's built-in modular exponentiation is efficient, but each call still costs O(log i) multiplications, whereas the table costs one multiplication per entry and makes the code easier to reason about.
3. Modular arithmetic
Since the answer can be astronomically large, you work modulo 10^9 + 7 throughout. The key rule: you can take the modulus after every addition and multiplication, but subtraction can produce negative values. In Python, the % operator with a positive modulus always returns a non-negative result, so (a - b) % MOD works as expected. In languages like Java or C++, % can return a negative remainder when the left operand is negative, so use ((a - b) % MOD + MOD) % MOD instead.
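The difference between the two remainder conventions is easy to see on the first term of the walkthrough, 1 - 4 = -3. In the sketch below, math.fmod stands in for the truncated remainder that % produces in C++ and Java, and sub_mod is a hypothetical helper showing the portable form.

```python
import math

MOD = 10**9 + 7

def sub_mod(a, b, mod=MOD):
    """Portable modular subtraction: the extra + mod keeps the result
    non-negative even where % may return a negative remainder."""
    return ((a - b) % mod + mod) % mod

print((1 - 4) % MOD)               # 1000000004: Python's % follows the sign of the divisor
print(int(math.fmod(1 - 4, MOD)))  # -3: truncated remainder, as in C++ and Java
print(sub_mod(1, 4))               # 1000000004
```

In Python the extra + mod is redundant but harmless, so the same expression can be carried over unchanged when porting.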
Edge cases
- Single element [5]: only one subsequence with width 0. The formula gives 5 * (2^0 - 2^0) = 0. Correct.
- All same elements [3, 3, 3]: every subsequence has width 0 because max equals min. The formula gives 3 * (1 - 4) + 3 * (2 - 2) + 3 * (4 - 1) = -9 + 0 + 9 = 0. Correct.
- Two elements [1, 5]: subsequences are [1] width 0, [5] width 0, [1,5] width 4. Total = 4. Formula: 1 * (1 - 2) + 5 * (2 - 1) = -1 + 5 = 4. Correct.
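These cases can also be checked mechanically by comparing the formula against direct enumeration. The sketch below assumes small inputs; formula and brute are hypothetical helper names, and the extra cases [2, 1, 3] and [4, 1, 4, 2] are added here to cover unsorted input and duplicates.

```python
from itertools import combinations

MOD = 10**9 + 7

def formula(nums):
    """Closed-form sum of widths, using exact powers since inputs are tiny."""
    ordered = sorted(nums)
    n = len(ordered)
    return sum(x * (2**i - 2**(n - 1 - i)) for i, x in enumerate(ordered)) % MOD

def brute(nums):
    """Enumerate every non-empty subsequence and add up max - min."""
    return sum(max(c) - min(c)
               for size in range(1, len(nums) + 1)
               for c in combinations(nums, size)) % MOD

for case in ([5], [3, 3, 3], [1, 5], [2, 1, 3], [4, 1, 4, 2]):
    assert formula(case) == brute(case), case
print("all edge cases agree")
```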
From understanding to recall
You have read the contribution counting formula and it makes sense. Sorting lets you compute each element's role as max versus min using powers of two. Clean and elegant. But can you reproduce it in an interview without looking at it?
The details matter: remembering to sort first, getting the exponents right (2^i for max, 2^(n-1-i) for min), precomputing powers under the modulus, and handling the subtraction inside the modular arithmetic. These are small details, but they are easy to fumble when you are under time pressure and trying to recall the formula from memory.
Spaced repetition closes that gap. You practice writing the sort-then-contribution-count pattern from scratch at increasing intervals. After a few rounds, sorting and summing nums[i] * (2^i - 2^(n-1-i)) is automatic. You see "sum over all subsets" and the contribution counting approach flows out without hesitation.
Contribution counting is one of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping the patterns stick.
Related posts
- Maximum Subarray - Another problem where thinking about contributions simplifies the solution
- Subarray Sum Equals K - Contribution counting with prefix sums