Skip to content
← All posts

Sort Characters By Frequency: Bucket Sort Approach

4 min read
leetcodeproblemmediumstringshash-mapheap

Sort Characters By Frequency (LeetCode 451) asks you to rearrange a string so that characters appearing most often come first. Given the string "tree", you return "eert" or "eetr" because e appears twice while t and r each appear once.

The most natural approach uses a heap, but bucket sort gives you O(n) time with simpler code. Let's break it down.

The problem

Given a string s, sort it in decreasing order based on the frequency of the characters. The frequency of a character is the number of times it appears in the string. If multiple answers are valid, return any of them.

Examples:

  • s = "tree" returns "eert" (or "eetr")
  • s = "cccaaa" returns "aaaccc" (or "cccaaa")
  • s = "Aabb" returns "bbAa" (or "bbaA")
Input: "tree"treefreq:t1r1e2sort by count descendingOutput:eert"eert"
Count character frequencies, then arrange characters from highest to lowest frequency.

Why this problem matters

Sort Characters By Frequency combines two fundamental techniques: frequency counting with a hash map and bucket sort for linear-time ordering. These same building blocks appear in dozens of other problems. Mastering this pattern gives you a reusable tool for any problem where you need to rank items by how often they appear.

The key insight

The maximum frequency any character can have is n (the length of the string). That means you can create an array of n + 1 buckets, where bucket i holds all characters that appear exactly i times. Then you walk from the highest bucket down, appending each character repeated by its frequency. No comparison sorting needed.

This is the same bucket sort trick used in Top K Frequent Elements. The only difference is that here you collect every character instead of stopping at k.

Python solution

from collections import Counter

def frequency_sort(s: str) -> str:
    counts = Counter(s)
    buckets = [[] for _ in range(len(s) + 1)]

    for ch, freq in counts.items():
        buckets[freq].append(ch)

    result = []
    for freq in range(len(buckets) - 1, 0, -1):
        for ch in buckets[freq]:
            result.append(ch * freq)

    return "".join(result)

Here is what each part does:

  • Counter(s) builds the frequency map in one pass. For "tree" this gives {'t': 1, 'r': 1, 'e': 2}.
  • buckets = [[] for _ in range(len(s) + 1)] creates empty lists indexed by frequency. Index 0 stays empty since no character has frequency 0.
  • The first loop places each character into the bucket matching its count. Character e with frequency 2 goes into buckets[2]. Characters t and r with frequency 1 go into buckets[1].
  • The second loop walks from the highest bucket down to 1. For each character in a bucket, it appends that character repeated by its frequency. From buckets[2] we get "ee", then from buckets[1] we get "t" and "r".

Visual walkthrough

Step 1: Count character frequencies with a hash map

s = "tree"treet1r1e2Counter("tree")

Walk through the string once. For each character, increment its count in the hash map.

Step 2: Place characters into frequency buckets

freq 0freq 1trfreq 2efreq 3freq 4buckets[freq].append(char)walk right to left (highest frequency first)

Create n+1 buckets indexed by frequency. Place each character in the bucket matching its count.

Step 3: Build the result string from highest to lowest frequency

bucket 2erepeat 2xee+bucket 1trtresult = "eert"Each character repeated by its frequency count

For each bucket from highest to lowest, append the character repeated by its frequency.

Complexity analysis

MetricValueReason
TimeO(n)One pass to count, one pass to fill buckets, one pass to build result
SpaceO(n)Frequency map plus bucket array, both proportional to n

The bucket sort approach avoids the O(n log n) cost of sorting the frequency map entries. Since character frequencies are bounded by n, we can use frequency as an array index and skip comparison sorting entirely.

Building blocks

Frequency counter

Count how many times each character appears. This is the same building block used in Group Anagrams, Valid Anagram, and First Unique Character in a String. The template:

from collections import Counter

counts = Counter(s)

Bucket sort by bounded value

When the value you want to sort by falls in a known range (here, 1 to n), you can place items into buckets indexed by that value. Walking the buckets in order gives you a sorted result without any comparison sort.

buckets = [[] for _ in range(max_val + 1)]
for item, val in mapping.items():
    buckets[val].append(item)

Edge cases

  • All characters are the same. s = "aaaa" produces "aaaa". One entry in the frequency map, one bucket used.
  • All characters are unique. s = "abcd" can return any permutation since every character has frequency 1. Bucket 1 holds all four characters and you append them in whatever order they appear.
  • Case sensitivity. s = "Aabb" treats A and a as different characters. b appears twice, so it comes first: "bbAa".
  • Single character. s = "z" returns "z". The frequency map has one entry and the result is trivial.

From understanding to recall

The frequency counting step is second nature for most people. You have written Counter(s) many times. The part that fades is the bucket sort step. Under pressure, you might reach for sorted() with a key function and end up with O(n log n).

The insight to lock in is: "frequency is bounded by n, so I can use frequency as an array index." Once that becomes automatic, the rest of the solution writes itself. Create buckets, fill by frequency, walk backwards, repeat characters.

Spaced repetition turns this from something you understand into something you recall. Practice writing the bucket sort version from scratch at increasing intervals. After a few reps, the pattern becomes second nature and you never waste time on the suboptimal approach.

Related posts