Skip to content
← All posts

Minimum Deletions to Make Character Frequencies Unique: Greedy Frequency Counting

6 min read
leetcodeproblemmediumstringssortinggreedy

A string s is called "good" if no two different characters have the same frequency. Given a string s, return the minimum number of characters you need to delete to make s good.

This is LeetCode 1647: Minimum Deletions to Make Character Frequencies Unique, a medium-difficulty problem that tests your ability to combine frequency counting with a greedy strategy. Once you see the right approach, the solution is short and clean.

s = "aaabbbcc"3a3b2cconflict: both 3
Frequencies for "aaabbbcc": a=3, b=3, c=2. The duplicate frequency 3 creates a conflict that requires deletions to resolve.

Why this problem matters

Frequency counting is one of the most common building blocks in string problems. You will see it in anagram detection, top-k element queries, and many sliding window variants. This problem adds a twist: after counting frequencies, you need to make them all unique using the fewest removals possible.

The greedy layer on top of frequency counting is what makes this problem valuable. It teaches you to sort frequencies, iterate through them, and resolve conflicts by decrementing. This pattern transfers to problems where you need to assign unique values to a set of items with minimal cost.

The key insight

Once you have the frequency counts, the original characters do not matter. All you care about is the list of frequencies. To make them unique with the fewest deletions, sort them in descending order and greedily assign each frequency the highest available value that does not conflict with any frequency you have already used. If a frequency is already taken, decrement it until you find one that is free (or reach 0, meaning you delete all occurrences of that character).

This greedy choice is optimal because reducing a frequency by 1 costs exactly 1 deletion, and you always want to keep frequencies as high as possible to minimize total deletions.

The solution

def minDeletions(s):
    freq = [0] * 26
    for ch in s:
        freq[ord(ch) - ord('a')] += 1

    freq.sort(reverse=True)
    used = set()
    deletions = 0

    for f in freq:
        if f == 0:
            continue
        while f > 0 and f in used:
            f -= 1
            deletions += 1
        used.add(f)

    return deletions

Here is what each piece does:

  1. Count frequencies into a fixed array of size 26 (one slot per lowercase letter).
  2. Sort descending so you process the most frequent characters first. This ensures that when a conflict arises, you push the later character down rather than the earlier one, minimizing total deletions.
  3. Iterate through frequencies. For each non-zero frequency, check whether it is already in the used set. If it is, decrement it (costing one deletion each time) until you find an unused value or reach 0.
  4. Add the final frequency to used and continue.
  5. Return the total deletions.

When a frequency gets decremented all the way to 0, that means every occurrence of that character is deleted. Adding 0 to the set is fine because only one character can have frequency 0 (it just means it is fully removed). If another character also needs to go to 0, the while loop stops at 0 and you simply do not add it again.

Visual walkthrough

Let's trace through s = "aaabbbcc" step by step. The frequencies are a=3, b=3, c=2.

Step 1: Count and sort frequencies descending.

3a3b2c

Frequencies sorted: [3, 3, 2]. We process from largest to smallest.

Step 2: Process frequency 3 (char a). 3 is not in used set. Add it.

3a3b2c

used = {3}. deletions = 0.

Step 3: Process frequency 3 (char b). 3 is already used. Decrement to 2.

3a3→2b2c

3 is taken, so try 2. But 2 is also taken (we will add c's 2 next). Keep decrementing.

Step 4: Continue. 2 is not yet used, so b takes frequency 2. Process c next. 2 is now taken, decrement c to 1.

3a2b2→1c

Actually we process in sorted order. b's 3 decrements: 3 taken, try 2. Add 2. deletions = 1. Then c's 2: 2 taken, try 1. Add 1. deletions = 1 + 1 = 2.

Step 5: All frequencies are unique. Total deletions = 2.

3a2b1c

Final frequencies: a=3, b=2, c=1. used = {3, 2, 1}. Answer = 2 deletions.

The greedy approach processes frequencies from largest to smallest. When it encounters a duplicate, it decrements until finding a free slot. For b with frequency 3, it drops to 2 (one deletion). For c with frequency 2, that slot is now taken by b, so it drops to 1 (one more deletion). Total: 2 deletions.

Complexity analysis

MetricValue
TimeO(n + k^2) where n is the string length and k is 26 (alphabet size). Counting takes O(n). The greedy loop runs at most k times, and each frequency can be decremented at most k times. Since k is fixed at 26, this simplifies to O(n).
SpaceO(k) for the frequency array and the used set, which is O(1) since k = 26.

The dominant cost is the single pass through the string to count frequencies. Everything after that operates on at most 26 elements, so it is effectively constant time.

The building blocks

This problem is built on two reusable patterns that CodeBricks drills independently.

1. Frequency counting

Counting how often each element appears is the foundation of many string and array problems:

freq = [0] * 26
for ch in s:
    freq[ord(ch) - ord('a')] += 1

You will use this exact pattern in problems like Group Anagrams, Valid Anagram, Top K Frequent Elements, and dozens more. The key detail is choosing between a fixed-size array (when the alphabet is known and small) and a hash map (when values are unbounded).

2. Greedy conflict resolution with a set

When you need to assign unique values and resolve duplicates, a set tracks what is already used while a greedy loop finds the next available slot:

used = set()
for value in sorted_values:
    while value > 0 and value in used:
        value -= 1
        cost += 1
    used.add(value)

This pattern appears whenever you need to make a collection of values unique with minimum adjustments. The descending sort ensures you always keep the largest values intact and push smaller duplicates down, which is provably optimal.

Edge cases

Before submitting, make sure your solution handles these:

  • All characters the same "aaaa": only one frequency (4), no conflicts. Answer is 0.
  • All characters appear once "abcde": frequencies are [1,1,1,1,1]. You keep one at 1 and push the rest to 0. Answer is 4.
  • Already good "aab": frequencies are [2,1], already unique. Answer is 0.
  • Single character "z": one frequency, no conflicts. Answer is 0.
  • Two characters with same frequency "aabb": frequencies [2,2]. Push one to 1. Answer is 1.
  • Long chains of duplicates "aaabbbccc": frequencies [3,3,3]. First takes 3, second decrements to 2 (1 deletion), third decrements to 1 (2 deletions). Answer is 3.

From understanding to recall

You have read the greedy frequency counting solution and it makes sense. Sort, iterate, decrement duplicates, done. But can you write it from scratch in an interview without looking back?

The details matter: using a fixed-size array versus a hash map for counting, sorting in descending order, the while loop condition that checks both f > 0 and f in used, and remembering to add the final decremented value to the set. These are small but critical, and they slip away under pressure if you have not practiced writing them from memory.

Spaced repetition closes that gap. You practice writing the frequency counting and greedy conflict resolution from scratch at increasing intervals. After a few rounds, the pattern is automatic. You see "make frequencies unique" and the code flows out without hesitation.

The greedy frequency counting pattern is one of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping they stick.

Related posts

  • Group Anagrams - Another frequency counting problem where character counts form the key for grouping strings
  • Top K Frequent Elements - Frequency counting paired with sorting or heap selection to find the most common elements
  • Sort Colors - A greedy partitioning problem where you classify and rearrange elements in a single pass

CodeBricks breaks this problem into its frequency counting and greedy conflict resolution building blocks, then drills them independently with spaced repetition. You type each piece from scratch until the pattern is automatic. When a greedy frequency question shows up in your interview, you do not think about it. You just write it.