Sort Characters By Frequency: Bucket Sort Approach
Sort Characters By Frequency (LeetCode 451) asks you to rearrange a string so that characters appearing most often come first. Given the string "tree", you return "eert" or "eetr" because e appears twice while t and r each appear once.
The most natural approach uses a heap, but bucket sort gives you O(n) time with simpler code. Let's break it down.
The problem
Given a string s, sort it in decreasing order based on the frequency of the characters. The frequency of a character is the number of times it appears in the string. If multiple answers are valid, return any of them.
Examples:
s = "tree"returns"eert"(or"eetr")s = "cccaaa"returns"aaaccc"(or"cccaaa")s = "Aabb"returns"bbAa"(or"bbaA")
Why this problem matters
Sort Characters By Frequency combines two fundamental techniques: frequency counting with a hash map and bucket sort for linear-time ordering. These same building blocks appear in dozens of other problems. Mastering this pattern gives you a reusable tool for any problem where you need to rank items by how often they appear.
The key insight
The maximum frequency any character can have is n (the length of the string). That means you can create an array of n + 1 buckets, where bucket i holds all characters that appear exactly i times. Then you walk from the highest bucket down, appending each character repeated by its frequency. No comparison sorting needed.
This is the same bucket sort trick used in Top K Frequent Elements. The only difference is that here you collect every character instead of stopping at k.
Python solution
from collections import Counter
def frequency_sort(s: str) -> str:
counts = Counter(s)
buckets = [[] for _ in range(len(s) + 1)]
for ch, freq in counts.items():
buckets[freq].append(ch)
result = []
for freq in range(len(buckets) - 1, 0, -1):
for ch in buckets[freq]:
result.append(ch * freq)
return "".join(result)
Here is what each part does:
Counter(s)builds the frequency map in one pass. For"tree"this gives{'t': 1, 'r': 1, 'e': 2}.buckets = [[] for _ in range(len(s) + 1)]creates empty lists indexed by frequency. Index 0 stays empty since no character has frequency 0.- The first loop places each character into the bucket matching its count. Character
ewith frequency 2 goes intobuckets[2]. Characterstandrwith frequency 1 go intobuckets[1]. - The second loop walks from the highest bucket down to 1. For each character in a bucket, it appends that character repeated by its frequency. From
buckets[2]we get"ee", then frombuckets[1]we get"t"and"r".
Visual walkthrough
Step 1: Count character frequencies with a hash map
Walk through the string once. For each character, increment its count in the hash map.
Step 2: Place characters into frequency buckets
Create n+1 buckets indexed by frequency. Place each character in the bucket matching its count.
Step 3: Build the result string from highest to lowest frequency
For each bucket from highest to lowest, append the character repeated by its frequency.
Complexity analysis
| Metric | Value | Reason |
|---|---|---|
| Time | O(n) | One pass to count, one pass to fill buckets, one pass to build result |
| Space | O(n) | Frequency map plus bucket array, both proportional to n |
The bucket sort approach avoids the O(n log n) cost of sorting the frequency map entries. Since character frequencies are bounded by n, we can use frequency as an array index and skip comparison sorting entirely.
Building blocks
Frequency counter
Count how many times each character appears. This is the same building block used in Group Anagrams, Valid Anagram, and First Unique Character in a String. The template:
from collections import Counter
counts = Counter(s)
Bucket sort by bounded value
When the value you want to sort by falls in a known range (here, 1 to n), you can place items into buckets indexed by that value. Walking the buckets in order gives you a sorted result without any comparison sort.
buckets = [[] for _ in range(max_val + 1)]
for item, val in mapping.items():
buckets[val].append(item)
Edge cases
- All characters are the same.
s = "aaaa"produces"aaaa". One entry in the frequency map, one bucket used. - All characters are unique.
s = "abcd"can return any permutation since every character has frequency 1. Bucket 1 holds all four characters and you append them in whatever order they appear. - Case sensitivity.
s = "Aabb"treatsAandaas different characters.bappears twice, so it comes first:"bbAa". - Single character.
s = "z"returns"z". The frequency map has one entry and the result is trivial.
From understanding to recall
The frequency counting step is second nature for most people. You have written Counter(s) many times. The part that fades is the bucket sort step. Under pressure, you might reach for sorted() with a key function and end up with O(n log n).
The insight to lock in is: "frequency is bounded by n, so I can use frequency as an array index." Once that becomes automatic, the rest of the solution writes itself. Create buckets, fill by frequency, walk backwards, repeat characters.
Spaced repetition turns this from something you understand into something you recall. Practice writing the bucket sort version from scratch at increasing intervals. After a few reps, the pattern becomes second nature and you never waste time on the suboptimal approach.
Related posts
- Group Anagrams uses frequency counting to generate canonical keys for grouping strings
- Top K Frequent Elements uses the exact same bucket sort trick, but stops after collecting k items
- First Unique Character in a String builds a frequency map and scans for the first character with count 1