Sort Characters By Frequency: Bucket Sort Approach

May 19, 20264 min read

leetcodeproblemmediumstringshash-mapheap

Sort Characters By Frequency (LeetCode 451) asks you to rearrange a string so that characters appearing most often come first. Given the string "tree", you return "eert" or "eetr" because e appears twice while t and r each appear once.

The most natural approach uses a heap, but bucket sort gives you O(n) time with simpler code. Let's break it down.

The problem

Given a string s, sort it in decreasing order based on the frequency of the characters. The frequency of a character is the number of times it appears in the string. If multiple answers are valid, return any of them.

Examples:

s = "tree" returns "eert" (or "eetr")
s = "cccaaa" returns "aaaccc" (or "cccaaa")
s = "Aabb" returns "bbAa" (or "bbaA")

Count character frequencies, then arrange characters from highest to lowest frequency.

Why this problem matters

Sort Characters By Frequency combines two fundamental techniques: frequency counting with a hash map and bucket sort for linear-time ordering. These same building blocks appear in dozens of other problems. Mastering this pattern gives you a reusable tool for any problem where you need to rank items by how often they appear.

Practice frequency-based sorting with spaced repetition

The key insight

The maximum frequency any character can have is n (the length of the string). That means you can create an array of n + 1 buckets, where bucket i holds all characters that appear exactly i times. Then you walk from the highest bucket down, appending each character repeated by its frequency. No comparison sorting needed.

This is the same bucket sort trick used in Top K Frequent Elements. The only difference is that here you collect every character instead of stopping at k.

Python solution

from collections import Counter

def frequency_sort(s: str) -> str:
    counts = Counter(s)
    buckets = [[] for _ in range(len(s) + 1)]

    for ch, freq in counts.items():
        buckets[freq].append(ch)

    result = []
    for freq in range(len(buckets) - 1, 0, -1):
        for ch in buckets[freq]:
            result.append(ch * freq)

    return "".join(result)

Here is what each part does:

Counter(s) builds the frequency map in one pass. For "tree" this gives {'t': 1, 'r': 1, 'e': 2}.
buckets = [[] for _ in range(len(s) + 1)] creates empty lists indexed by frequency. Index 0 stays empty since no character has frequency 0.
The first loop places each character into the bucket matching its count. Character e with frequency 2 goes into buckets[2]. Characters t and r with frequency 1 go into buckets[1].
The second loop walks from the highest bucket down to 1. For each character in a bucket, it appends that character repeated by its frequency. From buckets[2] we get "ee", then from buckets[1] we get "t" and "r".

Visual walkthrough

Step 1: Count character frequencies with a hash map

Walk through the string once. For each character, increment its count in the hash map.

Step 2: Place characters into frequency buckets

Create n+1 buckets indexed by frequency. Place each character in the bucket matching its count.

Step 3: Build the result string from highest to lowest frequency

For each bucket from highest to lowest, append the character repeated by its frequency.

Drill frequency sorting with CodeBricks

Complexity analysis

Metric	Value	Reason
Time	O(n)	One pass to count, one pass to fill buckets, one pass to build result
Space	O(n)	Frequency map plus bucket array, both proportional to n

The bucket sort approach avoids the O(n log n) cost of sorting the frequency map entries. Since character frequencies are bounded by n, we can use frequency as an array index and skip comparison sorting entirely.

Building blocks

Frequency counter

Count how many times each character appears. This is the same building block used in Group Anagrams, Valid Anagram, and First Unique Character in a String. The template:

from collections import Counter

counts = Counter(s)

Bucket sort by bounded value

When the value you want to sort by falls in a known range (here, 1 to n), you can place items into buckets indexed by that value. Walking the buckets in order gives you a sorted result without any comparison sort.

buckets = [[] for _ in range(max_val + 1)]
for item, val in mapping.items():
    buckets[val].append(item)

Edge cases

All characters are the same. s = "aaaa" produces "aaaa". One entry in the frequency map, one bucket used.
All characters are unique. s = "abcd" can return any permutation since every character has frequency 1. Bucket 1 holds all four characters and you append them in whatever order they appear.
Case sensitivity. s = "Aabb" treats A and a as different characters. b appears twice, so it comes first: "bbAa".
Single character. s = "z" returns "z". The frequency map has one entry and the result is trivial.

From understanding to recall

The frequency counting step is second nature for most people. You have written Counter(s) many times. The part that fades is the bucket sort step. Under pressure, you might reach for sorted() with a key function and end up with O(n log n).

The insight to lock in is: "frequency is bounded by n, so I can use frequency as an array index." Once that becomes automatic, the rest of the solution writes itself. Create buckets, fill by frequency, walk backwards, repeat characters.

Spaced repetition turns this from something you understand into something you recall. Practice writing the bucket sort version from scratch at increasing intervals. After a few reps, the pattern becomes second nature and you never waste time on the suboptimal approach.

Try CodeBricks free

Group Anagrams uses frequency counting to generate canonical keys for grouping strings
Top K Frequent Elements uses the exact same bucket sort trick, but stops after collecting k items
First Unique Character in a String builds a frequency map and scans for the first character with count 1