
Incremental Testing: Adding One Module at a Time

12 min read
software-engineering · patterns

Integration testing verifies that your modules work together. But how do you actually perform integration testing on a system with many modules? You have two broad choices: combine everything at once and hope for the best, or add modules one at a time and test at each step. Incremental testing is the second approach, and it is almost always the better one.

The problem with big bang integration

Big bang integration takes all your modules, wires them together in one shot, and runs tests against the full system. When it works, it feels efficient. When it fails, it is a nightmare.

Suppose you have a web application with a UI layer, a service layer, an authentication module, a payment processor, and a database layer. You finish building all five, plug them together, and run the tests. Something fails. Where is the bug?

It could be in any of the five modules. It could be in the interface between any two of them. With five modules, there are ten possible pairwise interactions. You are now debugging the entire system at once with no way to narrow down which connection is broken.
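The pairwise count grows quadratically: with n modules there are n(n-1)/2 potential interfaces, each a place a bug can hide. A quick sketch (the module names are illustrative):

```python
from itertools import combinations
from math import comb

modules = ["ui", "service", "auth", "payments", "database"]

# Every pair of modules is a potential interface where a bug can hide.
interfaces = list(combinations(modules, 2))
print(len(interfaces))            # 10 pairs for 5 modules
print(comb(10, 2), comb(20, 2))   # 45 pairs for 10 modules, 190 for 20
```

Doubling the module count roughly quadruples the number of interactions you have to rule out when a big bang test fails.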

Big bang integration has a few specific problems:

  • Fault isolation is hard. When a test fails, the bug could be anywhere in the system. You spend more time debugging than testing.
  • You cannot start early. You need all modules to be complete before you can test anything. If one module is delayed, all integration testing is delayed.
  • Failures cascade. One broken interaction can cause dozens of test failures, making it difficult to tell which failure is the root cause and which ones are symptoms.

Incremental testing solves all three of these problems.

What incremental testing is

Incremental testing is an integration testing strategy where you add and test one module at a time. You start with one module, verify it works, add a second module, verify the pair works together, add a third, and so on. At each step, if something breaks, you know the most recently added module (or its interaction with the existing ones) is the cause.

This approach gives you two things big bang integration does not: early testing and precise fault isolation.

There are three main flavors of incremental testing: top-down, bottom-up, and sandwich (hybrid). Each one starts from a different end of the system and works toward the other.

Top-down incremental testing

Top-down testing starts with the highest-level module and works downward. You test the top module first, using stubs to stand in for the lower-level modules it depends on. Then you replace one stub with the real module, test again, replace another stub, test again, and continue until the entire system is integrated.

How it works

Consider a 3-tier web application:

UI Layer  ->  Service Layer  ->  Database Layer

With top-down testing, you proceed like this:

Step 1: Test UI Layer
        Service Layer = STUB (returns hardcoded responses)
        Database Layer = STUB

Step 2: Test UI Layer + Service Layer
        Database Layer = STUB (returns hardcoded data)

Step 3: Test UI Layer + Service Layer + Database Layer
        Everything is real

At step 1, the UI layer calls the service layer stub. The stub returns hardcoded data so you can verify that the UI renders correctly, handles errors, and sends the right requests. At step 2, you replace the service stub with the real service layer. If something breaks, the problem is in the service layer or its interface with the UI. At step 3, you replace the database stub with the real database. If something breaks now, it is the database layer or its interface with the service.

What is a stub?

A stub is a simplified fake implementation of a module. It accepts the same inputs as the real module but returns hardcoded or minimal responses instead of doing real work.

# Real database module
class UserDatabase:
    def get_user(self, user_id):
        # Connects to actual database, runs SQL query
        return self.db.execute("SELECT * FROM users WHERE id = ?", user_id)

# Stub for top-down testing
class UserDatabaseStub:
    def get_user(self, user_id):
        # Returns hardcoded data, no database involved
        return {"id": user_id, "name": "Test User", "email": "test@example.com"}

The stub has the same interface as the real module (get_user takes a user_id and returns a user dict), but it skips all the real work. This lets you test the modules above it without needing the database to be set up and running.
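One common way to make the swap painless is constructor injection: the module under test receives its dependency as an argument, so the stub and the real database are interchangeable. A minimal sketch (the UserService class and its display_name method are hypothetical additions, not part of the example above):

```python
class UserDatabaseStub:
    def get_user(self, user_id):
        # Hardcoded response, same shape as the real module's return value
        return {"id": user_id, "name": "Test User", "email": "test@example.com"}

class UserService:
    """Hypothetical module under test; receives its database via injection."""
    def __init__(self, db):
        self.db = db

    def display_name(self, user_id):
        user = self.db.get_user(user_id)
        return f"{user['name']} <{user['email']}>"

# Top-down: test the service against the stub before the real database exists.
service = UserService(UserDatabaseStub())
assert service.display_name(7) == "Test User <test@example.com>"
```

When the real UserDatabase is ready, the only change is the constructor argument; the tests above it stay the same.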

Advantages of top-down

Test user-facing behavior early. You can verify that the UI works correctly before the lower layers are even built. This is valuable because user-facing bugs are the ones that matter most to stakeholders.

Early feedback on system architecture. By testing the top-level module first, you validate that the overall structure and control flow make sense before investing time in lower-level details.

Natural for teams working top-down. If your team builds features starting from the UI and working toward the backend, top-down testing matches your development flow.

Disadvantages of top-down

Stubs can mask lower-level bugs. The stub always returns the "happy path" response. It does not simulate database errors, network timeouts, or malformed data. Bugs in the lower layers stay hidden until you finally replace the stubs.

Writing good stubs takes effort. A stub that is too simple does not test anything meaningful. A stub that is too complex becomes a maintenance burden. Finding the right level of fidelity is tricky.

Lower layers are tested last. If the database layer has a serious bug, you will not discover it until the very end of the integration process, when it is most expensive to fix.

Bottom-up incremental testing

Bottom-up testing starts with the lowest-level modules and works upward. You test the foundation first, using drivers (test harnesses) to simulate the higher-level modules that would normally call into them. Then you add higher-level modules one at a time, replacing drivers with real code as you go.

How it works

Using the same 3-tier application:

Step 1: Test Database Layer
        (use a test DRIVER to call database functions directly)

Step 2: Test Service Layer + Database Layer
        (use a test DRIVER to simulate API requests)

Step 3: Test UI Layer + Service Layer + Database Layer
        Everything is real

At step 1, you write a driver that calls the database layer directly, inserting data, querying it, and verifying results. At step 2, you add the real service layer on top and write a driver that sends requests to it (simulating what the UI would do). If something breaks, the problem is in the service layer or how it calls the database. At step 3, you add the real UI layer. If something breaks, it is in the UI or its connection to the service layer.

What is a driver?

A driver is a piece of test code that calls a module from above. It simulates the higher-level module that would normally invoke the one you are testing.

# Driver for testing the service layer (before UI is integrated)
def test_driver_for_service_layer():
    service = OrderService(real_database)

    # Simulate what the UI would do
    result = service.create_order(
        user_id=1,
        items=[{"product_id": 101, "qty": 2}, {"product_id": 102, "qty": 1}]
    )

    assert result["status"] == "created"
    assert result["total"] == 35.00

    # Verify the order was actually saved
    saved_order = service.get_order(result["order_id"])
    assert saved_order is not None
    assert saved_order["total"] == 35.00

The driver calls the service layer the same way the UI would, but it is just test code. It does not render any interface. It simply invokes methods and checks results.

Advantages of bottom-up

Solid foundation first. You know the database layer works before you build on top of it. You know the service layer works before the UI depends on it. Each layer is proven reliable before anything depends on it.

No stubs needed for lower layers. You are testing the real modules from the start, which means you catch real bugs in the real code. There is no risk of a stub hiding a problem.

Easier to test utility and infrastructure code. Libraries, data access layers, and shared services are naturally at the bottom of the dependency tree. Bottom-up testing lets you validate them thoroughly.

Disadvantages of bottom-up

No user-facing testing until the end. Stakeholders cannot see a working UI until the very last step. If there is a fundamental problem with the user experience, you will not discover it until most of the system is already built.

Drivers can be tedious to write. You need to write test harnesses that simulate the behavior of modules that do not exist yet. If you do not know exactly how the higher-level module will call the lower one, the driver might not test the right scenarios.

Does not validate top-level design early. You might build a perfect database layer and a perfect service layer, only to discover that the overall architecture does not support the UI flows you need.

Stubs vs. drivers: a quick comparison

Both stubs and drivers are temporary pieces of code used during incremental testing. They serve opposite purposes:

                Stub                                    Driver
Used in         Top-down testing                        Bottom-up testing
Replaces        A lower-level module                    A higher-level module
Purpose         Provides fake responses to the          Calls the module being tested
                module being tested                     with simulated inputs
Direction       Sits below the module under test        Sits above the module under test
Example         A fake database that returns            A test script that calls the
                hardcoded data                          service layer directly

Think of it this way: a stub answers calls coming from above it, while a driver makes calls from above. A stub pretends to be a dependency; a driver pretends to be a caller.
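The contrast is easy to see in code. Here the same toy service is exercised both ways, with a stub below it and a driver above it (all names are illustrative, not from the earlier examples):

```python
class PriceStub:
    """Stub: sits BELOW the module under test and answers its calls."""
    def lookup(self, product_id):
        return 10.00  # hardcoded; no real price lookup happens

class CartService:
    """The module under test."""
    def __init__(self, prices):
        self.prices = prices

    def total(self, product_ids):
        return sum(self.prices.lookup(pid) for pid in product_ids)

def driver_test_cart():
    """Driver: sits ABOVE the module under test and makes calls into it."""
    cart = CartService(PriceStub())
    assert cart.total([1, 2, 3]) == 30.00

driver_test_cart()
```

The stub fakes the answers; the driver fakes the questions. Both disappear once the real neighboring modules are integrated.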

Sandwich (hybrid) testing

Sandwich testing, also called hybrid testing, combines top-down and bottom-up approaches. You test the top layers with stubs going downward while simultaneously testing the bottom layers with drivers going upward. The two fronts meet in the middle.

How it works

Top-down front:                       Bottom-up front:
  UI Layer (with service stub)          Database Layer (with driver)
         |                                      |
         v                                      v
  UI + Service Layer (with DB stub)     Service + Database Layer (with driver)
         |                                      |
         +---------- MEET HERE ----------------+
         |
  UI + Service + Database (fully integrated)

The top-down team verifies that the UI works correctly using service stubs. The bottom-up team verifies that the database and service layers work correctly using drivers. When both sides have sufficient confidence, you connect the real service layer in the middle and test the full stack.

Why sandwich testing works well

Parallel progress. Two teams (or two test efforts) can work simultaneously. The top-down effort does not block the bottom-up effort and vice versa.

Early feedback at both ends. You get user-facing feedback early (from the top-down side) and foundation confidence early (from the bottom-up side).

Smaller risk window. The only "blind spot" is the middle integration point where the two halves meet. And by the time you reach that point, both halves have been tested thoroughly.

Sandwich testing is the most practical approach for large systems where waiting to test from one direction would take too long.

Practical example: integrating a 3-tier web app

Let's walk through a concrete example. You are building an e-commerce application with three layers:

  • UI Layer: A React frontend that displays products and a shopping cart.
  • Service Layer: An API that handles business logic like calculating totals, applying discounts, and managing inventory.
  • Database Layer: A PostgreSQL database with an ORM for storing products, orders, and users.

Top-down approach

# Step 1: Test UI with stubbed service
class ProductServiceStub:
    def get_products(self):
        return [
            {"id": 1, "name": "Widget", "price": 9.99},
            {"id": 2, "name": "Gadget", "price": 19.99},
        ]

    def create_order(self, items):
        return {"order_id": "fake-123", "total": 29.98, "status": "confirmed"}

# Test: Does the UI correctly display products?
# Test: Does the UI correctly show the order confirmation?
# Test: Does the UI handle an empty product list?

# Step 2: Replace ProductServiceStub with real service, stub DB
class DatabaseStub:
    def query_products(self):
        return [("Widget", 9.99), ("Gadget", 19.99)]

    def insert_order(self, order):
        return "fake-order-id"

# Test: Does the service correctly calculate totals?
# Test: Does the service apply discount rules?
# Test: Does the UI + service integration produce correct results?

# Step 3: Replace DatabaseStub with real PostgreSQL
# Test: Does the full stack work end to end?
# Test: Do database constraints (unique keys, foreign keys) hold?

You discover UI bugs at step 1, service logic bugs at step 2, and database bugs at step 3. At each step, you know where to look.

Bottom-up approach

# Step 1: Test database layer with a driver
def test_database_layer():
    db = Database("postgresql://localhost/test_db")

    # Driver: directly call database functions
    product_id = db.insert_product("Widget", 9.99)
    assert db.get_product(product_id)["name"] == "Widget"

    order_id = db.insert_order(user_id=1, items=[product_id], total=9.99)
    assert db.get_order(order_id)["total"] == 9.99

# Step 2: Add service layer, use a driver for the API
def test_service_with_real_db():
    db = Database("postgresql://localhost/test_db")
    service = OrderService(db)

    # Driver: simulate what the UI would do
    products = service.get_products()
    assert len(products) > 0

    order = service.create_order(
        user_id=1,
        items=[{"product_id": 1, "qty": 2}]
    )
    assert order["total"] == 19.98
    assert order["status"] == "confirmed"

# Step 3: Add real UI on top
# Test: Does the full stack work end to end?

You have high confidence in the database and service layers before the UI is even connected. But stakeholders do not see a working UI until step 3.

Sandwich approach

# Top-down front (run in parallel with bottom-up front)
# UI + Service stub: verify UI behavior, rendering, error handling

# Bottom-up front (run in parallel with top-down front)
# Database + driver: verify data storage, constraints, queries
# Service + Database + driver: verify business logic with real data

# Final merge: connect UI to real Service to real Database
# Test the complete flow end to end

The sandwich approach lets you catch UI issues and database issues at the same time, then merge in the middle for the final integration.
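The merge step itself can be as small as swapping the service stub for the real service in the UI-side tests and re-running both suites. A schematic sketch of that final wiring, using in-memory stand-ins for each layer (all class and function names here are hypothetical):

```python
class InMemoryDatabase:
    """Bottom-up side: already exercised directly by a driver."""
    def __init__(self):
        self.products = {1: {"id": 1, "name": "Widget", "price": 9.99}}

    def query_products(self):
        return list(self.products.values())

class ProductService:
    """Middle layer: the real module where the two fronts meet."""
    def __init__(self, db):
        self.db = db

    def get_products(self):
        return self.db.query_products()

def render_product_list(service):
    """Top-down side: UI logic previously tested against a service stub."""
    return [f"{p['name']} (${p['price']})" for p in service.get_products()]

# Final merge: the same UI-side check, now wired through the real stack.
assert render_product_list(ProductService(InMemoryDatabase())) == ["Widget ($9.99)"]
```

Because both halves were tested on their own fronts first, a failure at this point almost certainly lives at the newly connected seam in the middle.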

When to use each approach

Use top-down when:

  • Stakeholder demos matter early. If you need to show progress to non-technical stakeholders, top-down lets you demonstrate a working UI quickly.
  • The UI is the riskiest part. If you are unsure about the user experience or workflow, testing it early reduces risk.
  • You are building a prototype or MVP. Top-down testing aligns naturally with building the visible parts first and filling in the backend later.

Use bottom-up when:

  • The infrastructure is complex or risky. If your database schema, message queues, or third-party integrations are the biggest risk, test them first.
  • You are building a library or API. When there is no UI, bottom-up is the natural approach. Test the core logic first and layer the API on top.
  • Correctness of the foundation matters more than UX. Financial calculations, security systems, and data pipelines need a rock-solid base.

Use sandwich when:

  • The system is large. With many modules, testing from one direction takes too long. Sandwich testing lets you parallelize the effort.
  • Multiple teams are working in parallel. The frontend team can test top-down while the backend team tests bottom-up.
  • You want the benefits of both. Early user-facing feedback and a solid foundation, with the tradeoff of coordinating the merge in the middle.

For most real-world projects, sandwich testing is the most practical choice. It gives you the early UI feedback of top-down and the solid foundation of bottom-up, while letting different team members work in parallel.

The takeaway

Big bang integration is tempting because it seems simpler. Just wire everything together and test it. But when tests fail, you have no idea where the bug is. You end up spending more time debugging than testing.

Incremental testing trades that false simplicity for real productivity. By adding one module at a time, you always know what changed, you always know where to look when something breaks, and you can start testing before the entire system is complete.

Top-down gives you early user-facing feedback. Bottom-up gives you a solid foundation. Sandwich gives you both. Pick the approach that matches your project's risks, your team's workflow, and what matters most to your stakeholders.

The goal is not to avoid bugs entirely. It is to find them quickly, isolate them precisely, and fix them cheaply. Incremental testing makes all three of those easier.

Related posts

  • Software Testing Types covers unit testing, integration testing, white box testing, black box testing, and more.
  • Integration Testing dives deeper into integration testing strategies and how they fit into the test pyramid.
  • Verification vs. Validation explains the difference between checking that your software works correctly and checking that you built the right software.