Skip to content
← All posts

GCD Sort of an Array: Union-Find with Prime Factorization

6 min read
leetcodeproblemhardarraysmathunion-findsorting

GCD Sort of an Array is LeetCode 1998. You are given an integer array nums, and you can swap nums[i] and nums[j] if gcd(nums[i], nums[j]) > 1. Return true if you can sort the array in non-decreasing order using any number of swaps, or false otherwise.

A few examples:

  • [7, 21, 3] returns true. 7 and 21 share factor 7, 21 and 3 share factor 3. Through transitive swaps, you can rearrange the array to [3, 7, 21].
  • [5, 2, 6, 2] returns false. 5 shares no common factor with any other element, so it is stuck at index 0. The sorted order needs 2 at index 0.
  • [8, 9, 4, 2, 3] returns true. All elements are connected through shared prime factors and can be rearranged into sorted order.

The key insight is that swap eligibility is transitive. If a can swap with b, and b can swap with c, then through a series of swaps, a, b, and c can all reach any position among themselves. This is exactly the structure that union-find was built for.

Input: nums = [7, 21, 3]7i=021i=13i=2Build union-find graph from shared factorsAll in one component: can swap any pair transitivelygcd=7gcd=37213Sorted = [3, 7, 21]. Each can reach its target position.3i=07i=121i=2Answer: true
Elements sharing a common factor greater than 1 can be swapped. Through transitive connections, 7, 21, and 3 are all in one group and can be rearranged into sorted order.

The approach

Checking every pair for gcd > 1 would be O(n^2), which is too slow when n can be up to 30,000. Instead, you use the same trick from Largest Component Size by Common Factor: union each number with its prime factors rather than comparing pairs.

Here is the plan:

  1. Factorize each number into its prime factors using trial division.
  2. Union each number with its primes. Two numbers sharing a prime factor end up in the same union-find component automatically.
  3. Sort a copy of the array. This gives you the target position for each element.
  4. Check each index. For every index i, verify that nums[i] and sorted_nums[i] are in the same union-find component. If they are, nums[i] can reach its sorted position through transitive swaps. If any index fails this check, return false.

That is the entire algorithm. The union-find groups tell you which elements can freely rearrange among themselves. If every element can reach where it needs to go, the array is sortable.

Think of union-find components as "swap neighborhoods." Any element can move to any position within its neighborhood through a chain of valid swaps. The array is sortable if and only if every element's target position is in the same neighborhood.

Python solution

def gcdSort(nums):
    parent = {}
    rank = {}

    def find(x):
        if x not in parent:
            parent[x] = x
            rank[x] = 0
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        if rank[ra] == rank[rb]:
            rank[ra] += 1

    def prime_factors(n):
        factors = []
        d = 2
        while d * d <= n:
            if n % d == 0:
                factors.append(d)
                while n % d == 0:
                    n //= d
            d += 1
        if n > 1:
            factors.append(n)
        return factors

    for num in nums:
        for p in prime_factors(num):
            union(num, p)

    sorted_nums = sorted(nums)
    for a, b in zip(nums, sorted_nums):
        if find(a) != find(b):
            return False
    return True

Visual walkthrough

Let's trace the algorithm on nums = [7, 21, 3]. We factorize each number, union it with its primes, then check whether each element can reach its sorted position.

Step 1: Factorize each number and union with its primes

numroot72137213

7 has prime factor {7}. Union(7, prime 7) is a self-link. 21 = 3 * 7. 3 is prime. Start processing.

Step 2: Process 7. Union 7 with prime 7 (no-op)

numroot72137213

7 is its own prime. Union(7, 7) does nothing. Component: {7}.

Step 3: Process 21. Union 21 with prime 3 and prime 7

numroot7213773

Union(21, 3) links 21 to 3. Union(21, 7) merges {21, 3} with {7}. Component: {7, 21, prime 3}.

Step 4: Process 3. Union 3 with prime 3

numroot7213777

Union(3, prime 3). Prime 3 is already in the {7, 21} component. Now all three are connected.

Step 5: Compare to sorted array

numroot3721777

Sorted = [3, 7, 21]. At each index, nums[i] and sorted[i] share the same root. Answer = true.

After processing all numbers, every element shares the same root. The sorted array is [3, 7, 21]. At each index, the original value and the sorted value have the same root, so the answer is true.

Complexity analysis

MetricValueWhy
TimeO(n * sqrt(M) + n log n)Factorizing each of n numbers costs O(sqrt(M)) per number where M is the max value. Sorting costs O(n log n). Union-find operations are nearly O(1) each.
SpaceO(n + M)The dictionary-based union-find stores entries for each number and each prime factor encountered.

The dominant cost is either the factorization pass or the sort, depending on the input. For typical constraints (n up to 30,000 and values up to 100,000), this runs comfortably within time limits.

Building blocks

GCD Sort is built from two reusable bricks that appear across many problems.

1. Union-Find with lazy initialization

When nodes span a wide numeric range or are not known ahead of time, use a dictionary-based union-find. Nodes are created on first access. This avoids allocating a massive array and naturally handles mixed node types, since both the original numbers and their prime factors live in the same structure.

parent = {}
rank = {}

def find(x):
    if x not in parent:
        parent[x] = x
        rank[x] = 0
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

This same lazy union-find pattern appears in Largest Component Size by Common Factor, Accounts Merge, and any problem where the node space is dynamic.

2. Prime factorization by trial division

Extract the unique prime factors of a number in O(sqrt(n)) time. The inner loop divides out all copies of each prime, so each factor is collected exactly once.

def prime_factors(n):
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

This is the standard approach when values are not too large (up to around 10^5 or 10^6). For larger values, you might switch to a sieve-based factorization, but trial division is sufficient here.

The "union through shared attributes" pattern is the real technique to internalize. Instead of comparing all pairs (O(n^2)), you union each element with its attributes. Elements sharing an attribute end up in the same component automatically. This reduces grouping problems from quadratic to nearly linear time.

Edge cases

All elements are the same. For example, nums = [5, 5, 5]. The array is already sorted. Every pair has gcd = 5, so they are all connected. Return true.

Element is 1. The number 1 has no prime factors greater than 1, so it cannot swap with anything. If 1 is already in its correct sorted position, that is fine. But if 1 needs to move, the answer is false. For instance, nums = [3, 1] returns false because 1 is stuck at index 1 but needs to be at index 0.

Already sorted. If the array is already in order, each nums[i] == sorted_nums[i], and find(a) == find(a) is always true. Return true without needing any swaps.

Two large primes. If nums = [99991, 99989], these are both prime and share no common factor. If the array needs them in the opposite order, the answer is false.

All even numbers. Every element shares the factor 2, so they all end up in one component. The array can always be sorted. Return true.

Duplicate values. Duplicates share all the same prime factors and end up in the same component. They do not cause issues because nums[i] == sorted_nums[i] for duplicates that are already in place, and duplicates that need to move share the same root.

From understanding to recall

You have seen how prime factorization feeds into union-find and how the final comparison against the sorted array determines the answer. The algorithm is clean: factorize, union, sort, compare roots. But there are details that slip away. The lazy dictionary-based union-find, the trial division loop with its inner while to divide out repeated primes, the final zip comparison. These are small pieces, but under interview pressure, any one of them can trip you up.

The good news is that both building blocks, union-find and prime factorization, are reusable across many problems. If you drill them individually with spaced repetition, GCD Sort becomes an assembly problem rather than a derivation problem. You pull out two bricks you already own, connect them, and add the sorted-comparison check. That is a five-minute solution instead of a thirty-minute struggle.

Related posts

  • Largest Component Size by Common Factor - The closest sibling problem, using the same prime factorization and union-find technique to find component sizes
  • Number of Provinces - A clean introduction to union-find for counting connected components
  • Accounts Merge - Union-find applied to grouping elements by shared attributes, the same "union through shared properties" pattern