Create Sorted Array through Instructions: Fenwick Tree for Insertion Costs

June 2, 20266 min read

leetcodeproblemhardarraysbinary-searchsorting

LeetCode Create Sorted Array through Instructions (Problem 1649) asks you to build a sorted array one element at a time and accumulate the total insertion cost. For each new element, the cost is the minimum of two counts: how many already-inserted elements are strictly less than it, and how many are strictly greater. You return the total cost modulo 10^9 + 7.

The problem

You are given an integer array instructions. Starting from an empty array nums, for each instructions[i] you insert it into nums while keeping nums sorted. Before each insertion, you compute two values:

strictly_less: the number of elements already in nums that are strictly less than instructions[i]
strictly_greater: the number of elements already in nums that are strictly greater than instructions[i]

The cost of that insertion is min(strictly_less, strictly_greater). You sum all costs and return the result modulo 10^9 + 7.

For instructions = [1, 5, 6, 2], the first three insertions all have cost 0 because they go at the end of the sorted array. Only inserting 2 into [1, 5, 6] costs anything: 1 element is smaller (the 1) and 2 elements are larger (5 and 6), so the cost is min(1, 2) = 1. The total is 1.

For instructions = [1, 5, 6, 2], only the last insertion (value 2) has a nonzero cost. A BIT tracks prefix counts so you can compute "strictly less" and "strictly greater" in O(log m) per insertion.

The brute force approach

The simplest solution maintains a sorted list. For each new element, you binary-search for its position to count how many elements fall on each side. That gives you strictly_less and strictly_greater in O(log n) time, but the actual insertion into the sorted list takes O(n) because you need to shift elements. Over n insertions, the total time is O(n^2). When instructions can have up to 100,000 elements, that is too slow.

You could also skip the binary search and just scan the list each time, but that is also O(n) per insertion and does not improve the situation.

The key insight

You do not actually need to maintain the sorted array itself. All you need for each insertion is two counts: how many values less than x and how many values greater than x exist so far. If you maintain a frequency array where freq[v] records how many times value v has been inserted, then strictly_less is the sum freq[1] + freq[2] + ... + freq[x-1] and strictly_greater is the sum freq[x+1] + ... + freq[max_val].

A Binary Indexed Tree (BIT), also called a Fenwick tree, lets you update a single frequency and query any prefix sum in O(log m) time, where m is the maximum value in the array. That turns the entire problem into O(n log m).

Think of this as a frequency counting problem, not an array insertion problem. The BIT is a compact data structure that answers "how many values in range [1, x] have been inserted so far?" in logarithmic time.

Step-by-step walkthrough

Here is how the algorithm processes instructions = [1, 5, 6, 2] using a BIT of size 7 (indices 0 through 6, since the max value is 6).

Setup: Initialize BIT of size max(instructions) + 1 = 7

BIT[0..6] = all zeros. Each index i will track how many times value i has been inserted.

Step 1: Insert 1 into empty array

Sorted array before: [ ] (empty)

strictly_less = 0, strictly_greater = 0, cost = min(0, 0) = 0

BIT after update:

No elements yet, so strictly_less = 0, strictly_greater = 0. Cost = min(0, 0) = 0. Update BIT at index 1.

Step 2: Insert 5 into [1]

inserting 5

strictly_less = 1, strictly_greater = 0, cost = min(1, 0) = 0

BIT after update:

Query BIT: prefix_sum(4) = 1 element less than 5. prefix_sum(6) - prefix_sum(5) = 0 greater. Cost = min(1, 0) = 0.

Step 3: Insert 6 into [1, 5]

inserting 6

strictly_less = 2, strictly_greater = 0, cost = min(2, 0) = 0

BIT after update:

Query BIT: prefix_sum(5) = 2 elements less than 6. prefix_sum(6) - prefix_sum(6) = 0 greater. Cost = min(2, 0) = 0.

Step 4: Insert 2 into [1, 5, 6]

inserting 2

strictly_less = 1, strictly_greater = 2, cost = min(1, 2) = 1

BIT after update:

Query BIT: prefix_sum(1) = 1 element less. Total inserted so far = 3. strictly_greater = 3 - prefix_sum(2) = 3 - 1 = 2. Cost = min(1, 2) = 1.

Total cost: 0 + 0 + 0 + 1 = 1

Return 1 mod (10^9 + 7) = 1. The BIT gave us O(log m) queries per insertion instead of O(n) scanning, turning the O(n^2) brute force into O(n log m).

Each step queries the BIT twice (once for strictly_less, once to derive strictly_greater), then updates the BIT at the inserted value. Every query and update is O(log m), so the full pass over n instructions takes O(n log m).

Practice the BIT pattern

The code

def createSortedArray(instructions):
    MOD = 10**9 + 7
    m = max(instructions)
    bit = [0] * (m + 2)

    def update(i):
        while i <= m:
            bit[i] += 1
            i += i & (-i)

    def query(i):
        s = 0
        while i > 0:
            s += bit[i]
            i -= i & (-i)
        return s

    cost = 0
    for i, x in enumerate(instructions):
        less = query(x - 1)
        greater = i - query(x)
        cost = (cost + min(less, greater)) % MOD
        update(x)

    return cost

The BIT array is sized m + 2 to avoid boundary issues when querying or updating at index m. The update function increments every BIT node responsible for index i by walking up the tree. The query function sums all values from index 1 to i by walking down.

For strictly_greater, notice the trick: after inserting i elements (0-indexed), the total count of elements already in the array is i. The count of elements with value x or less is query(x). So strictly_greater = i - query(x). This avoids a second range query.

The BIT uses 1-based indexing. Index 0 is unused. Make sure your values start at 1 (they do in this problem, since instructions[i] is at least 1).

Complexity analysis

Approach	Time	Space
Brute force	O(n^2)	O(n)
BIT / Fenwick tree	O(n log m)	O(m)

Here, n is the length of instructions and m is the maximum value in the array. Each of the n insertions performs a constant number of BIT operations, each taking O(log m). The BIT itself uses O(m) space. For this problem, m can be up to 100,000, so the space is manageable.

The building blocks

Binary Indexed Tree (Fenwick tree)

A BIT supports two operations in O(log n): point update and prefix sum query. It works by decomposing indices into powers of two. The key formula i & (-i) extracts the lowest set bit of i, which determines the range each BIT node is responsible for. If you have never seen a BIT before, practice it on a simpler problem first (like counting inversions or range sum queries) before tackling this one.

Frequency counting with prefix sums

Instead of maintaining a sorted array, you maintain frequencies. The number of elements strictly less than x is just the prefix sum up to x - 1. The number strictly greater is the total count minus the prefix sum up to x. This reframing is what makes the BIT applicable. Many "counting in a sorted structure" problems can be transformed into prefix sum queries on a frequency array.

Edge cases

All identical values. If every instruction is the same value, then strictly_less and strictly_greater are both 0 for every insertion. The total cost is 0.

Already sorted input. If the instructions arrive in increasing order, strictly_greater is always 0, so every insertion has cost 0. The same applies to strictly decreasing order, where strictly_less is always 0.

Large values with small n. The BIT size depends on the max value m, not on n. If m is 100,000 but n is 10, the BIT is still 100,000 entries. This is fine for this problem's constraints, but for extremely large value ranges you would need coordinate compression.

Modular arithmetic. The problem asks for the result mod 10^9 + 7. Apply the modulo after each addition to avoid overflow in languages with fixed-width integers. In Python, this is mostly a formality since integers have arbitrary precision, but it keeps the output correct.

From understanding to recall

The core pattern here has three pieces you need to internalize. First, reframe insertion cost as a frequency counting problem. Second, use a BIT to answer prefix sum queries in O(log m). Third, compute strictly_greater as total_inserted - query(x) instead of doing a separate range query.

The part that usually trips people up is the BIT mechanics. Practice writing update and query from scratch until the i & (-i) bit manipulation feels automatic. Once the BIT is second nature, the rest of the solution is just a loop that queries twice and updates once.

Space your practice sessions out over a few days. Each time, start from a blank editor and write the full solution without looking at notes. Pay special attention to the off-by-one details: querying x - 1 for strictly_less, using 1-based indexing, and sizing the BIT array to m + 2.

Build recall with CodeBricks

Reverse Pairs - Another hard counting problem that tracks relationships between elements during insertion, solved with a different divide-and-conquer approach.
Count of Smaller Numbers After Self - Counts elements smaller than each value to its right, using a BIT or merge sort, the same frequency-based counting pattern.
Kth Largest Element - Uses partitioning to find order statistics efficiently, a related skill for understanding element rankings in unsorted arrays.

Try CodeBricks free

The problem

The brute force approach

The key insight

Step-by-step walkthrough

The code

Complexity analysis

The building blocks

Binary Indexed Tree (Fenwick tree)

Frequency counting with prefix sums

Edge cases

From understanding to recall

Related posts