Dynamic Programming

Minimum Window Subsequence

hard

DESCRIPTION (inspired by Leetcode.com)

Given a source string s1 and a pattern string s2, find the shortest contiguous portion of s1 that contains s2 as a subsequence.

Return the shortest such substring. If no valid substring exists, return an empty string.

When multiple substrings of equal minimum length exist, return the one that appears first (leftmost).

Reminder: A subsequence maintains the relative order of characters but doesn't require them to be consecutive.

Example 1

Input:

s1 = "hellointerview"
s2 = "her"

Output:

"hellointer"

Explanation: We need 'h' → 'e' → 'r' in order. The shortest window containing this subsequence starts at 'h' (index 0), includes 'e' (index 1), and ends at 'r' (index 9), giving us "hellointer".

Example 2

Input:

s1 = "codingisfun"
s2 = "xyz"

Output:

""

Explanation: None of the characters 'x', 'y', or 'z' appear in s1, so no valid subsequence window exists.

Code Editor

Run your code to see results here

Have suggestions or found something wrong?

Explanation

Building Intuition

We need to find the shortest substring of s1 that contains s2 as a subsequence. What's the difference?

Our goal: find the shortest window in s1 that contains s2 as a subsequence.

Find the shortest substring of s1 containing s2 as a subsequence

Starting Simple: The Brute Force Approach

The most straightforward approach? Try every possible substring of s1 and check if s2 is a subsequence of it.

brute_force(s1, s2)
    for each starting position i in s1
        for each ending position j >= i
            if s2 is subsequence of s1[i:j+1]
                track if this is the shortest so far
    return shortest window

This works, but it's painfully slow. We're checking O(m²) substrings, and each subsequence check takes O(window length) time. Since windows can be up to length m, that's O(m³) total - way too slow for large inputs.

A Smarter Approach: Forward Then Backward

A better approach would be, instead of checking random substrings, what if we:

Find where s2 ends - Scan forward through s1, matching characters of s2 one by one until we've matched all of s2
Shrink from the start - Once we've found a complete match ending at position i, walk backwards to find where this particular match started

This is more efficient because we're not checking unnecessary substrings. But there's still room for improvement - we might re-scan the same portions of s1 multiple times.

The Key Observation

Think about what information we actually need. When we're at position i in s1 and we find a character that matches s2[j]:

If this is the first character of s2 (j=0), we're starting a new potential window right here at position i
If this is a later character of s2 (j>0), we need to know: "Where did the window that matched s2[0..j-1] start?"

That second point is crucial. We don't need to re-scan backwards - we just need to remember where each partial match started.

Why Dynamic Programming?

This is exactly what DP is good for: remembering previous computations to avoid redoing work.

Let's define: dp[j] = the starting index in s1 from which we can form s2[0..j] ending at our current position.

When we find s1[i] == s2[j]:

When dp[n-1] becomes valid (not -1), we've found a complete subsequence! The window goes from dp[n-1] to our current position i.

The DP array stores starting indices, not counts or boolean values. This lets us immediately compute the window length when we find a complete match.

Why Iterate Backwards Through s2?

We iterate j from n-1 down to 0 (backwards) because each dp[j] depends on dp[j-1]. If we iterated forwards, we'd overwrite dp[j-1] before using it for dp[j].

Walkthrough

Let's trace through s1 = "abcdebdde", s2 = "bde":

Initial state: dp = [-1, -1, -1] (all -1 means no matches found yet)

Step 1: At position i=1, we find 'b'

We're scanning through s1 and encounter the character 'b' at index 1. This matches the first character of s2 (which is s2[0] = 'b'). Since this is the start of a potential subsequence match, we record where this match begins by setting dp[0] = 1.

Step 2: At position i=3, we find 'd'

Continuing our scan, we reach index 3 where s1[3] = 'd'. This matches s2[1] = 'd', the second character in our target. Since we already have a partial match for the first character (stored in dp[0] = 1), we now set dp[1] = dp[0] = 1. This means "the window that matches the first two characters of s2 ('bd') starts at index 1".

Step 3: At position i=4, we complete the first match!

We hit index 4 where s1[4] = 'e'. This matches the final character s2[2] = 'e'. Now dp[2] = dp[1] = 1, meaning we've found a complete match for all of "bde" starting at index 1. The window spans from index 1 to index 4: s1[1:5] = "bcde" with length 4. We save this as our current best answer.

Step 4: Continuing the scan, we find another match

The algorithm doesn't stop at the first match - it keeps scanning to see if there's a shorter window. At index 5, we find another 'b', starting a new potential match. Eventually at index 8, we complete another match: s1[5:9] = "bdde" with length 4. Since this is the same length as our first match and we already have one, we stick with "bcde".

Final Answer: "bcde" (length 4)

Both windows have the same length, so we return the first one we found

Visualization


def min_window(s1, s2):
    m, n = len(s1), len(s2)
    dp = [-1] * n
    result = ""
    min_len = float('inf')
    
    for i in range(m):
        for j in range(n - 1, -1, -1):
            if s1[i] == s2[j]:
                if j == 0:
                    dp[j] = i
                else:
                    dp[j] = dp[j - 1]
            
        if dp[n - 1] != -1:
            start = dp[n - 1]
            length = i - start + 1
            if length < min_len:
                min_len = length
                result = s1[start:i + 1]
    
    return result

Finding the shortest substring containing s2 as a subsequence

0 / 50

DP solution tracking start indices

Solution

The solution iterates through s1 once. For each character, we check all positions in s2 (in reverse order) to update our DP array. Whenever the last entry dp[n-1] becomes valid, we've found a complete subsequence and can check if it's the shortest so far.

What is the time complexity of this solution?

O(4^L)

O(n * logn)

O(m * n)

O(2ⁿ)

Alternative Approaches

Two Pointers with Expansion: Find each occurrence of s2 as a subsequence, then try to shrink from the left. Same time complexity but with higher constants.
2D DP: Use dp[i][j] to track starting positions for each (i, j) pair. Uses O(m * n) space but might be conceptually clearer for some.

The 1D DP approach we use is space-optimal while maintaining the same time complexity.

Mark as read

Next: Greedy Algorithms Overview