Skip to content
← All posts

Set Intersection Size At Least Two: Greedy Interval Covering

5 min read
leetcodeproblemhardarraysgreedysorting

You are given a list of closed intervals, and you need to find the smallest set of integers that shares at least two elements with every interval. This is LeetCode 757: Set Intersection Size At Least Two, a hard greedy problem that becomes surprisingly clean once you see the right sorting strategy.

The key insight is to sort intervals by their right endpoint ascending (breaking ties by left endpoint descending), then greedily pick points from the right end of each interval. By processing the tightest intervals first, the points you choose have the best chance of overlapping with future intervals.

0123456[1, 3][1, 4][2, 5][3, 5]S2345|S| = 4
intervals = [[1,3],[1,4],[2,5],[3,5]]. The set S = {2, 3, 4, 5} has at least 2 points inside each interval. Green dots mark chosen points.

Problem statement

Given a 2D integer array intervals where intervals[i] = [start_i, end_i] represents all integers from start_i to end_i inclusive, find the minimum size of a set S such that for every interval, the intersection of S with the integers in that interval contains at least 2 elements.

In other words, for every [a, b] in the input, the set S must contain at least two integers x and y where a <= x <= b and a <= y <= b.

Approach: greedy with endpoint sorting

The algorithm has four parts:

  1. Sort by right endpoint ascending. For ties (same right endpoint), sort by left endpoint descending. This ensures you process the tightest (shortest) intervals first. When two intervals share the same right endpoint, the one with the larger left endpoint is more constrained, so you want to handle it first.

  2. Track the two largest chosen points. Maintain two variables, p1 and p2, representing the two largest points currently in your set (with p1 < p2). These are the only points that matter when checking coverage of the next interval, because the intervals are sorted by right endpoint.

  3. For each interval, count how many of p1 and p2 fall inside it. Since p2 >= p1 and the interval's right endpoint is at least as large as all previous right endpoints, you only need to check whether p1 and p2 are >= start of the current interval.

  4. Add new points as needed:

    • If 0 points are covered, add end - 1 and end.
    • If 1 point is covered, add end.
    • If 2 are already covered, skip the interval.
def intersectionSizeTwo(intervals):
    intervals.sort(key=lambda x: (x[1], -x[0]))
    res = []
    p1, p2 = -1, -1

    for start, end in intervals:
        if start > p2:
            p1 = end - 1
            p2 = end
            res.append(p1)
            res.append(p2)
        elif start > p1:
            p1 = p2
            p2 = end
            res.append(p2)

    return len(res)

Visual walkthrough

Step 1: Sort intervals by end ascending, then by start descending for ties.

Interval:[1, 3](end = 3)
Chosen so far:none
Points in interval:0 covered (need 2 more)
Add:23
Chosen after:23

No points chosen yet, so 0 are covered. Add end - 1 = 2 and end = 3. Sorted order: [1,3], [1,4], [2,5], [3,5].

Step 2: Process [1, 4].

Interval:[1, 4](end = 4)
Chosen so far:23
Points in interval:2 covered (skip)
Chosen after:23

Both 2 and 3 fall within [1, 4]. Already 2 points covered, so skip this interval.

Step 3: Process [2, 5].

Interval:[2, 5](end = 5)
Chosen so far:23
Points in interval:2 covered (skip)
Chosen after:23

Both 2 and 3 fall within [2, 5]. Already 2 points covered, so skip this interval.

Step 4: Process [3, 5].

Interval:[3, 5](end = 5)
Chosen so far:23
Points in interval:1 covered (need 1 more)
Add:5
Chosen after:235

Only 3 falls within [3, 5] (2 is out of range). 1 point covered, need 1 more. Add end = 5. Final set size = 3.

After processing all four intervals, the set contains 3 points: (2, 3, 5). Every interval has at least 2 points from S inside it. Notice how sorting by right endpoint meant the first interval [1, 3] forced us to pick 2 and 3, which happened to cover [1, 4] and [2, 5] for free. Only [3, 5] needed an extra point.

Why does sorting by right endpoint work? By processing the interval that ends earliest first, you are forced to pick points near that tight boundary. Those points are as far left as possible on the number line, which maximizes the chance they also fall inside later intervals (which all end at the same position or further right). This is the same principle behind the classic activity selection problem.

Complexity analysis

MetricValue
TimeO(n log n) for sorting, plus O(n) for the greedy pass
SpaceO(n) for the result set in the worst case, O(1) extra if you only track the two largest points

Building blocks

1. Interval sorting by endpoint

Sorting intervals by their right endpoint is the foundation of many greedy interval problems. The idea is that the interval ending soonest is the most constrained, so handling it first produces the best global outcome.

intervals.sort(key=lambda x: (x[1], -x[0]))

The secondary sort by left endpoint descending handles ties: among intervals ending at the same point, the narrower one is more constrained and should be processed first. If you skip the tie-breaker, you might pick points that satisfy a wider interval but miss the narrower one.

2. Greedy point selection with coverage tracking

Once intervals are sorted, you walk through them one by one and maintain just enough state to decide whether new points are needed. Here, that state is the two largest chosen points p1 and p2.

p1, p2 = -1, -1
for start, end in intervals:
    if start > p2:
        p1 = end - 1
        p2 = end
    elif start > p1:
        p1 = p2
        p2 = end

The reason you only need to track the two largest is that all future intervals have end >= current end. So the only question is whether the current interval's start is small enough to capture p1 and p2. Points smaller than p1 are irrelevant because if p1 is not in range, nothing smaller would be either.

Edge cases

  • Single interval [[1, 5]]: you must pick at least 2 points. The answer is always 2 (pick end - 1 and end).
  • All intervals identical [[2, 4], [2, 4], [2, 4]]: the first interval forces you to pick 3 and 4. All others are already covered. Answer is 2.
  • Non-overlapping intervals [[1, 2], [4, 5], [7, 8]]: no overlap at all, so each interval needs its own 2 points. Answer is 6.
  • Intervals of length 1 (containing only 2 integers) like [[3, 4]]: you must pick both 3 and 4. There is no room to choose, so every such interval adds exactly 2 points unless they overlap with existing ones.
  • Large overlap [[1, 10], [2, 9], [3, 8], [4, 7]]: the innermost interval [4, 7] is processed first (ends at 7, starts at 4). Picking 6 and 7 covers all four intervals since they all contain 6 and 7. Answer is 2.

From understanding to recall

You have read through the greedy sorting approach and it makes sense. Sort by end, track two points, update as needed. But the details matter under pressure: the tie-breaking rule (-x[0]), checking start > p2 versus start > p1, and remembering to shift p1 = p2 before assigning p2 = end. These are the kind of details that slip away after a few days.

Spaced repetition fixes that. You write the algorithm from scratch at increasing intervals until the pattern is automatic. After a few rounds, you see "minimum covering set with intersection constraints" and the sorting plus greedy template flows out without hesitation.

The greedy interval covering pattern is one of roughly 60 reusable building blocks that cover hundreds of LeetCode problems. Learning them individually and drilling them with spaced repetition is far more effective than grinding random problems and hoping they stick.

Related posts