K-th Smallest in Lexicographical Order: Virtual Trie Traversal

May 19, 20267 min read

leetcodeproblemhardtrie

K-th Smallest in Lexicographical Order is LeetCode 440, rated Hard. Given two integers n and k, return the k-th smallest integer in the range [1, n] when the numbers are sorted in lexicographical (dictionary) order.

Example: for n = 13 and k = 2, the lexicographic ordering is [1, 10, 11, 12, 13, 2, 3, 4, 5, 6, 7, 8, 9], and the 2nd number is 10.

The brute force approach would be to generate all numbers, sort them as strings, and pick the k-th one. That is O(n log n) and far too slow when n can be up to 10^9. You need a way to jump directly to the k-th number without listing them all.

Numbers 1 to 13 as a virtual trie. The key operation is counting how many nodes sit under a prefix without visiting them one by one.

Why this problem matters

This problem teaches you virtual trie traversal, one of the most elegant tricks in competitive programming. You treat the numbers 1 through n as if they were inserted into a 10-ary trie, but you never actually build the trie. Instead, you count how many nodes fall under each prefix using arithmetic. That counting step lets you skip entire subtrees in a single operation, turning what would be an O(n) walk into an O(log^2 n) search.

The same "count without enumerating" idea shows up in many other problems. Anytime you can calculate the size of a range or subtree directly, you can avoid materializing it.

Practice this pattern with spaced repetition

The key insight

Numbers 1 through n form a virtual trie where the root has children 1 through 9, and each node with value v has children v*10 through v*10 + 9 (as long as they do not exceed n). A DFS preorder traversal of this trie produces the lexicographic order.

To find the k-th number, you do not walk every node. Instead, at each prefix you ask: "How many numbers exist in the subtree rooted at this prefix?" If the count is smaller than k, you skip the entire subtree and move to the next sibling prefix. If the count is large enough, the answer is somewhere inside, so you go one level deeper.

The helper function count_steps(prefix, next_prefix, n) calculates how many numbers fall between prefix (inclusive) and next_prefix (exclusive), capped at n. It works level by level: at the first level you have just the prefix itself, at the next level you have up to 10 children, then up to 100 grandchildren, and so on. At each level, you take the minimum of the actual range boundary and n + 1 to avoid counting numbers beyond n.

The solution

def find_kth_number(n, k):
    current = 1
    k -= 1  # current = 1 accounts for the first number

    while k > 0:
        steps = count_steps(current, current + 1, n)
        if steps <= k:
            # Skip this subtree, move to next sibling
            current += 1
            k -= steps
        else:
            # Answer is in this subtree, go deeper
            current *= 10
            k -= 1  # account for the current node

    return current


def count_steps(prefix, next_prefix, n):
    steps = 0
    while prefix <= n:
        steps += min(n + 1, next_prefix) - prefix
        prefix *= 10
        next_prefix *= 10
    return steps

Let us trace through each piece.

find_kth_number starts at current = 1 and immediately decrements k by 1, because 1 itself is the first number in lex order. Then it loops:

Count: call count_steps(current, current + 1, n) to figure out how many numbers live in the subtree rooted at current.
Skip or descend: if the subtree is not large enough to contain the k-th number (steps is at most k), skip the whole subtree by moving to the next sibling (current += 1) and subtracting steps from k. Otherwise, descend into the subtree (current *= 10) and subtract 1 from k to account for the node you just passed through.

count_steps counts nodes level by level. At the first level, there is one node (the prefix itself). At the next level, there are up to 10 nodes (prefix*10 through prefix*10 + 9). It adds min(n + 1, next_prefix) - prefix at each level, which correctly caps the count when n cuts off some of the children. Then it moves both prefix and next_prefix down one level by multiplying by 10.

The min(n + 1, next_prefix) - prefix expression is the heart of this solution. Using n + 1 instead of n ensures you count node n itself when it falls exactly at the boundary. Getting this off by one wrong is the most common mistake.

Drill this pattern from memory

Visual walkthrough

Step 1: Start at prefix 1

We begin with prefix = 1 and k = 2 (we want the 2nd number). Count how many numbers exist under prefix 1: the numbers 1, 10, 11, 12, 13 give us count = 5. Since 5 >= 2, the answer is inside this subtree. Subtract 1 for the node 1 itself (k becomes 1), then go deeper by setting prefix = 10.

Step 2: Deeper at prefix 10

Now prefix = 10 and k = 1. Count how many numbers exist under prefix 10: only the number 10 itself, so count = 1. Since 1 >= 1, the answer is inside this subtree. Subtract 1 for node 10 itself (k becomes 0).

Step 3: k = 0, found the answer

When k reaches 0, the current prefix is our answer. The 2nd smallest number in lexicographical order for n = 13 is 10. We found it by counting subtree sizes instead of generating all numbers.

The walkthrough for n = 13, k = 2 shows how quickly we reach the answer. We start at prefix 1, count 5 nodes in its subtree, see that 5 is enough to contain the 2nd number, then descend to prefix 10. At prefix 10, we count 1 node, see that 1 is enough to contain the remaining k = 1, and descend again. Now k = 0, and the current prefix 10 is our answer. Only two counting operations, not 13 steps.

Complexity analysis

Approach	Time	Space
Virtual trie	O(log^2 n)	O(1)

Time: O(log^2 n). The outer loop runs at most O(log n) times because in the worst case you traverse from the root to a leaf of the virtual trie, which has depth O(log10 n). At each step, count_steps iterates through at most O(log10 n) levels. Multiplying these gives O(log^2 n). For n = 10^9, that is roughly 10 * 10 = 100 operations, which is almost instant.

Space: O(1). Only a handful of integer variables are used. No trie is built, no recursion stack is needed, and no extra data structures are allocated.

Edge cases

k = 1. The answer is always 1, since 1 is the first number in lex order for any n >= 1. The loop body never executes because k is decremented to 0 before the loop.
k = n. The answer is 9 when n >= 9, because 9 is the last number in lex order among single-digit numbers, and all multi-digit numbers starting with lower prefixes come before it. For larger n, the answer depends on the structure.
n is a power of 10. The subtree under prefix 1 is very deep. The algorithm handles this correctly because count_steps loops through all levels until prefix > n.
n = 1. The only number is 1, and k must be 1. The function returns 1 immediately.
Large n (up to 10^9). The O(log^2 n) complexity means this runs in under 100 operations even for the maximum input.

The building blocks

Counting without enumerating

The core reusable pattern is computing the size of a range or subtree using arithmetic instead of iterating through every element. In this problem, you count how many integers fall between two prefixes by expanding level by level and capping at n. This same idea appears whenever you need to find the k-th element in a structured sequence: count how many elements are in each partition, then recurse into the right one.

Virtual trie navigation

The prefix-based navigation (multiply by 10 to go deeper, add 1 to move to a sibling) is the same pattern used in Lexicographical Numbers. The difference is that in that problem you visit every node (O(n) time), while here you skip entire subtrees (O(log^2 n) time). Recognizing when you can upgrade a full traversal into a skip-based search is a valuable skill.

Prefix partitioning

Breaking a problem into "which prefix does the answer fall under" is a form of divide and conquer. You narrow the search space by a factor at each step. Binary search does this by halving; here you narrow by choosing one of up to 10 children. The principle is the same: eliminate large chunks of the search space with each decision.

From understanding to recall

The hard part of this problem is not the idea (count nodes, skip or descend) but the implementation details. The count_steps function with its min(n + 1, next_prefix) - prefix formula and the prefix *= 10, next_prefix *= 10 loop are easy to get wrong under pressure. You also need to remember when to subtract 1 from k (when descending) versus subtracting steps (when skipping).

Spaced repetition locks in these details. Write the solution from scratch after one day, then three days, then a week. Each time, pay attention to the off-by-one in count_steps and the two branches of the main loop. After a few reps, the implementation becomes muscle memory rather than something you rederive each time.

Lexicographical Numbers - The O(n) DFS traversal of the same virtual trie, visiting every node instead of skipping subtrees
Implement Trie - Build the trie data structure that this problem navigates virtually

Try CodeBricks free