
The Master-Worker Pattern in Software Architecture

8 min read
software-engineering · patterns

Every time you write a recursive function that splits a problem in half, delegates each half, and combines the results, you are using the master-worker pattern. You just might not realize it yet.

The master-worker pattern is one of the most fundamental coordination patterns in software. It shows up in distributed systems, parallel computing, operating systems, and, maybe most surprisingly, in the coding problems you solve every day. Once you see it, you start recognizing it everywhere.

What is the master-worker pattern?

The master-worker pattern (historically called "master-slave") divides work between two roles:

  • The master breaks a problem into sub-tasks, assigns each sub-task to a worker, and combines the results. The master orchestrates but does not do the heavy computation itself.
  • The workers each receive a sub-task, process it independently, and return a result to the master.

That is the entire pattern. One coordinator, many workers, clean separation between "deciding what to do" and "doing it."

The master's job is planning and aggregation. The workers' job is execution. Neither role needs to understand the other's internals. The master does not care how a worker processes its task. The worker does not care how the master decided what to assign it.

How it works

The pattern follows four steps:

  1. Decompose. The master takes the full problem and splits it into smaller, independent sub-tasks.
  2. Assign. The master distributes those sub-tasks to available workers.
  3. Process. Each worker executes its sub-task independently. Workers do not communicate with each other.
  4. Aggregate. The master collects results from all workers and combines them into the final answer.

This flow is the same whether you are running a MapReduce job across a thousand servers or making two recursive calls inside a sorting function. The scale changes, but the structure stays the same.
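The four steps can be sketched in a few lines. This toy sums a list sequentially; the function names (`run_master`, `worker`) are illustrative, not any library API:

```python
def worker(chunk):
    # Worker: does the actual computation, independently of other workers
    return sum(chunk)


def run_master(data, num_workers=4):
    # 1. Decompose: split the input into independent chunks
    chunk_size = (len(data) + num_workers - 1) // num_workers
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    # 2. Assign + 3. Process: hand each chunk to a worker
    # (sequential here, but nothing stops these from running in parallel)
    partial_results = [worker(chunk) for chunk in chunks]

    # 4. Aggregate: combine the partial results into the final answer
    return sum(partial_results)


print(run_master(list(range(100))))  # same answer as sum(range(100))
```

The master never touches the data itself; swap in a different `worker` and the coordination logic is unchanged.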

Key properties

The master-worker pattern gives you three important properties.

Parallelism. Because workers operate independently, they can run concurrently. Two recursive calls can execute on separate threads. Ten map tasks can process ten data chunks at the same time. The work is naturally parallelizable because the master already divided it into independent pieces.

Fault tolerance. If one worker fails, the master can reassign that worker's task to a different worker. The other workers are unaffected because they do not depend on each other. This is why distributed systems like MapReduce can handle machine failures gracefully. The master just re-dispatches the failed task.
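The re-dispatch idea can be sketched with a thread pool. Here `flaky_square` and its simulated transient failures are invented for illustration; the point is the master's retry loop:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

_attempts = {}  # attempt counter, only to simulate transient failures


def flaky_square(n):
    # Worker: fails the first time it sees an even input, then succeeds
    _attempts[n] = _attempts.get(n, 0) + 1
    if n % 2 == 0 and _attempts[n] == 1:
        raise RuntimeError(f"transient failure on {n}")
    return n * n


def resilient_master(tasks):
    results = {}
    pending = list(tasks)
    with ThreadPoolExecutor(max_workers=4) as pool:
        while pending:
            futures = {pool.submit(flaky_square, t): t for t in pending}
            pending = []
            for fut in as_completed(futures):
                task = futures[fut]
                try:
                    results[task] = fut.result()
                except RuntimeError:
                    pending.append(task)  # re-dispatch the failed task
    return results


print(resilient_master([1, 2, 3, 4]))  # all four results, despite failures
```

The other workers never notice the failure; the master simply puts the task back in the queue.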

Scalability. Need more throughput? Add more workers. The master's decomposition logic stays the same. You just spread the sub-tasks across a larger pool. This is horizontal scaling, and it works because the pattern keeps coordination separate from computation.

Master-worker in code: merge sort

Merge sort is the classic example of master-worker in an algorithm. The function acts as a master that splits the array, delegates sorting of each half to recursive calls (workers), and merges the sorted results.

def merge_sort(arr):
    if len(arr) <= 1:
        return arr

    # Master: decompose the problem
    mid = len(arr) // 2

    # Master: delegate to workers (recursive calls)
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])

    # Master: aggregate results
    return merge(left, right)


def merge(left, right):
    result = []
    i = j = 0

    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1

    result.extend(left[i:])
    result.extend(right[j:])
    return result

Look at merge_sort through the master-worker lens. It does not sort anything itself. It splits the array (decompose), calls itself on each half (assign to workers), and merges the sorted halves (aggregate). The actual comparison work happens in merge, but even that is orchestrated by the master. Each recursive call is a worker that returns a sorted sub-array. The master trusts those results and combines them.

This is divide-and-conquer, and divide-and-conquer is master-worker recursion.

Master-worker in code: maximum depth of binary tree

Tree recursion is master-worker at every single node. Each node acts as a master that delegates to its children and combines their answers.

def max_depth(root):
    if root is None:
        return 0

    # Delegate to left child (worker 1)
    left_depth = max_depth(root.left)

    # Delegate to right child (worker 2)
    right_depth = max_depth(root.right)

    # Master combines results
    return 1 + max(left_depth, right_depth)

The root node does not compute the depth of the entire tree by itself. It asks the left subtree "how deep are you?" and the right subtree "how deep are you?", then combines with 1 + max(left_depth, right_depth). Each child does the same thing, all the way down to the leaves.

Every node is simultaneously a master (delegating to its children) and a worker (returning a result to its parent). This recursive nesting is what makes tree problems elegant once you trust the pattern.

Master-worker in code: concurrent processing

The pattern is not just theoretical. Python's concurrent.futures module gives you a literal master-worker implementation with thread pools and process pools.

from concurrent.futures import ProcessPoolExecutor
import math


def is_prime(n):
    """Worker function: check if a number is prime."""
    if n < 2:
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True


def find_primes(numbers):
    """Master function: distribute work and collect results."""
    primes = []

    # Master creates a pool of workers
    with ProcessPoolExecutor() as executor:
        # Master assigns each number to a worker
        futures = {executor.submit(is_prime, n): n for n in numbers}

        # Master collects results
        for future in futures:
            if future.result():
                primes.append(futures[future])

    return sorted(primes)


candidates = [112272535095293, 112582705942171, 115280095190773, 4, 15]
print(find_primes(candidates))

The master (find_primes) does not check any number itself. It submits each number to a worker process, waits for results, and aggregates them. The workers (is_prime) each handle one number independently. If you add more CPU cores, the executor spreads work across more workers automatically. Same pattern, different scale.

Real-world examples

The master-worker pattern powers some of the most important systems in software engineering.

MapReduce. Google's MapReduce framework is the textbook real-world example. The master splits input data into chunks and assigns each chunk to a map worker. Map workers process their chunks in parallel, producing intermediate key-value pairs. The master then assigns those intermediate results to reduce workers, which combine them into the final output. The master handles scheduling, fault recovery, and data routing. The workers just process data.
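The map/reduce split can be sketched in-process with a word count, the canonical MapReduce example. This runs sequentially and all names are my own; a real framework would run the map calls on separate machines:

```python
from collections import Counter


def map_worker(chunk):
    # Map phase: one worker turns its chunk into intermediate word counts
    return Counter(chunk.split())


def reduce_worker(counters):
    # Reduce phase: combine intermediate results into the final answer
    total = Counter()
    for c in counters:
        total += c
    return total


def mapreduce_master(document, num_chunks=3):
    # Master: decompose the document, assign chunks, aggregate results
    lines = document.splitlines()
    chunks = [" ".join(lines[i::num_chunks]) for i in range(num_chunks)]
    intermediate = [map_worker(c) for c in chunks]  # parallelizable
    return reduce_worker(intermediate)


doc = "the cat sat\nthe dog sat\nthe cat ran"
print(mapreduce_master(doc))
```

Decompose, assign, process, aggregate, in exactly the order described above.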

Database replication. In a primary-replica database setup, the primary server (master) handles all writes and coordinates replication to replicas (workers). The replicas serve read queries independently. If a replica goes down, the primary keeps working and read traffic shifts to the remaining replicas. This is master-worker applied to data replication and read scaling.

Load balancers. A load balancer is a master that distributes incoming HTTP requests across a pool of backend servers (workers). The workers handle the requests independently. The master tracks which workers are healthy and routes traffic accordingly. Need to handle more load? Add more worker servers behind the balancer.
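The routing logic can be sketched as a toy round-robin balancer. This is a deliberately minimal model (the class and server names are invented, and it assumes at least one backend stays healthy):

```python
import itertools


class RoundRobinBalancer:
    """Toy master: routes each request to the next healthy backend."""

    def __init__(self, backends):
        self.healthy = set(backends)           # master tracks worker health
        self._cycle = itertools.cycle(backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def route(self, request):
        # Skip unhealthy workers; assumes at least one backend is healthy
        for backend in self._cycle:
            if backend in self.healthy:
                return backend, request


balancer = RoundRobinBalancer(["srv-a", "srv-b", "srv-c"])
balancer.mark_down("srv-b")
print([balancer.route(f"req-{i}")[0] for i in range(4)])
```

The backends never talk to each other; all coordination (health tracking, routing) lives in the master.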

CI/CD pipelines. A CI orchestrator like GitHub Actions takes your pipeline definition (master logic), breaks it into jobs, and assigns each job to a runner agent (worker). Test suites run in parallel across multiple runners. The orchestrator collects pass/fail results from every runner and determines the overall pipeline status.

Connection to coding problems

If you have been solving coding problems, you have already been using master-worker without naming it. Here is how it maps to specific problems.

Divide-and-conquer is master-worker recursion. Any time a function splits its input, delegates to recursive calls, and combines the results, that is the pattern. Merge sort, quicksort, and most tree recursions all follow this structure.

Merge Intervals has a master that sorts the intervals (preprocessing, like a master organizing work before delegation) and then iterates through them, making a merge-or-append decision at each step. The overall coordination logic is the master. Each comparison is a small worker task.
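The Merge Intervals coordination looks like this in code (a standard solution, included here to make the master's sort-then-decide structure concrete):

```python
def merge_intervals(intervals):
    # Master: sort first, organizing the work before any merging happens
    intervals = sorted(intervals)
    merged = []
    for start, end in intervals:
        # Worker-sized decision: extend the last interval or append a new one
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


print(merge_intervals([[1, 3], [2, 6], [8, 10], [15, 18]]))
# [[1, 6], [8, 10], [15, 18]]
```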

Maximum Depth of Binary Tree is the purest master-worker recursion. Every node delegates to its left and right children, then combines results with 1 + max().

Course Schedule has a master function that coordinates DFS across all nodes in the graph. It does not do all the traversal in one giant loop. Instead, it iterates over every node and delegates a DFS call for each one. Each DFS call is a worker exploring one connected component. The master tracks which nodes have been visited and determines the final answer.
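A standard cycle-detection solution makes the master/worker split visible: the outer loop is the master, each `dfs` call is a worker, and the shared `state` array is the master's bookkeeping:

```python
def can_finish(num_courses, prerequisites):
    # Build the dependency graph: course -> list of prerequisites
    graph = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[course].append(prereq)

    # 0 = unvisited, 1 = on the current DFS path, 2 = fully processed
    state = [0] * num_courses

    def dfs(node):
        # Worker: explore one node's dependency chain, detecting cycles
        if state[node] == 1:
            return False  # back edge: a cycle
        if state[node] == 2:
            return True   # already verified by an earlier worker
        state[node] = 1
        for nxt in graph[node]:
            if not dfs(nxt):
                return False
        state[node] = 2
        return True

    # Master: delegate a DFS from every node, sharing the visited state
    return all(dfs(node) for node in range(num_courses))


print(can_finish(2, [[1, 0]]))          # True: take 0, then 1
print(can_finish(2, [[1, 0], [0, 1]]))  # False: circular dependency
```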

Advantages

Natural parallelism. The decomposition step produces independent sub-tasks that can run concurrently. You do not have to redesign anything to parallelize. The structure already supports it.

Scalability. Adding more workers increases throughput without changing the master's logic. Whether you have 2 workers or 200, the coordination pattern stays the same.

Clean separation of concerns. The master handles strategy (what to do and how to combine results). The workers handle execution (how to do it). You can change the worker implementation without touching the master, and vice versa.

Fault isolation. A failing worker does not bring down the whole system. The master can detect the failure, reassign the task, and continue. Other workers are not affected.

Disadvantages

The master is a bottleneck. All communication flows through the master. If the master is slow at decomposing, assigning, or aggregating, the entire system slows down, even if the workers are fast. In distributed systems, the master can become a throughput ceiling.

Single point of failure. If the master crashes, the entire system stops. Workers cannot coordinate on their own. This is why production systems often add master redundancy (like a standby master that takes over on failure), but that adds complexity.

Communication overhead. Every task assignment and result collection requires communication between the master and workers. In a distributed system, that means network calls. In a recursive algorithm, that means function call overhead and stack frames. For very small tasks, the overhead of coordination can exceed the cost of just doing the work directly.
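This overhead is why practical divide-and-conquer implementations often stop delegating below a size cutoff and just do the work directly. A sketch built on the merge sort from earlier (the cutoff value of 16 is illustrative, not tuned):

```python
CUTOFF = 16  # illustrative threshold, not a tuned value


def insertion_sort(arr):
    # Direct work: cheap for small inputs, no coordination overhead
    arr = list(arr)
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr


def merge(left, right):
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def hybrid_sort(arr):
    # Below the cutoff, coordination costs more than the work itself,
    # so the master skips delegation and sorts directly.
    if len(arr) <= CUTOFF:
        return insertion_sort(arr)
    mid = len(arr) // 2
    return merge(hybrid_sort(arr[:mid]), hybrid_sort(arr[mid:]))


print(hybrid_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
```

The same trick appears at system scale: batching many tiny tasks into one worker assignment amortizes the communication cost.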

Load balancing is hard. If some sub-tasks take much longer than others, fast workers sit idle while slow workers are still processing. The master finishes only when the last worker finishes. Uneven task sizes mean wasted capacity. Designing a decomposition that produces evenly-sized sub-tasks is a real engineering challenge.

The takeaway

The master-worker pattern is about one thing: separating coordination from computation. The master decides what to do. The workers do it. Results flow back to the master for aggregation.

You have been using this pattern every time you write a recursive function that splits, delegates, and combines. Merge sort, tree traversals, and divide-and-conquer algorithms are all master-worker at their core. And the same pattern, scaled up, powers MapReduce clusters, load balancers, and CI pipelines.

Understanding the pattern by name gives you a vocabulary for reasoning about both algorithms and systems. When you see a tree recursion problem, you can think "each node is a master delegating to child workers." When you design a distributed pipeline, you can think about decomposition, assignment, processing, and aggregation as separate concerns.

The pattern is simple. Its applications are everywhere.
