Experiments with LeetCode – Part 5
We’re onto our next pattern, which turns out to be quite an important one for interviews: the sliding window pattern. We have kind of touched upon this a little bit while solving problems related to the two pointer pattern, but this time we will be completely focused on this.
This pattern comes in handy while dealing with iterables that follow a certain condition. Maybe we want to find the subarray that has the greater sum or something of that nature. The sliding window pattern makes it easy to solve the problem without relying of two for loops.
Problem #27 – Maximum Sum Subarray of Size K
We’re given an array and we want to find the maximum sum any subarray of size K has within the array. Of course, the obvious solution for that would be to have two loops. One loop iterates until N – K + 1, where N is the length of the array. The second loop will iterate until i + K and accumulates the sum. Doing it this way allows us to keep track of every single sum. This approach, however, will be O(N^2)1.
To make this O(N), we can use the sliding window technique. We can do this by defining a variable which’ll represent the start of the window. Our for loop variable will represent the end of the window because we will be shifting both ends of the window by 1 unit every time. We accumulate the window’s sum until we reach element number k – 1, and at that point, we begin subtracting the element pointed by array[start] and incrementing start at every index. By doing so, we are guaranteeing that the sliding window stays at a length of K, and we are able to track the maximum sum occurred so far easily with something like:
max_sum = max(max_sum, current_sum)
Problem #28 – Smallest Subarray with a Greater Sum
Given an array and a value S, we want to find the length of the smallest contiguous subarray with a value that is greater than or equal to S. This problem builds upon the prior problem. Here too we will continuously expand the window towards the right and shrink from the left, but there is one small catch: we need to shrink the window using a nested while loop.
Why do we need that? This is best explained using an example. Take the test case of array = [2, 1, 5, 2, 3, 2] and S = 7. The first subarray we have that is greater than or equal to 7 is [2, 1, 5]. At this point, we need to shrink the window and evaluate the subarray [1, 5] separately. This, of course, does not meet the condition and we must expand the window to the right to continue the algorithm. Now check these two code samples out and see what differences they have!
# Version 1
for end in range(n):
curr_sum += arr[end]
while curr_sum >= s:
smallest_length = min(smallest_length, end - start + 1)2
curr_sum -= arr[start]
start += 1
# Version 2
for end in range(n):
curr_sum += arr[end]
if curr_sum >= s:
smallest_length = min(smallest_length, end - start + 1)
curr_sum -= arr[start]
start += 1
Both of these are identical except the first version has a while loop, as you can see. When I was building the solution to this problem first, I wrote Version 2 while struggling to entertain that this code would solve the problem. Yet, I felt that I was on the right track! Turns out, all I needed to do was change the if to a while. The purpose of the statement is to shrink the window when we find a valid subarray, like [2, 1, 5]. The if statement fails because it shrinks the window but we immediately expand the window in the next iteration of the for loop, hence we miss intermediate subarrays like [1, 5], which could be valid! Having a while loop is the way to go because, if our subarray was valid (example: [1, 6]), we would enter our while loop and record the answer before contract the window and then expanding again in the next iterations.
Problem #29 – Longest Substring with K Distinct Characters
Sliding window comes in handy with string problems as well. They are very much like arrays, after all. This problem is different because we want to build the longest substring with no more than K distinct characters. We also need to figure out how we will distinguish between distinct and non-distinct characters. For the latter, we probably will be using something like a hash table to keep track of visited characters. And for the former, what the problem is suggesting is that we want a maximum of K distinct characters in the substring we find.
When choosing to use a hash table in Python, I always recommend using defaultdict from the collections module because you will be able to avoid key not found errors. So, you will begin adding the counts of all characters you find and calculating the maximum subarray length. The moment you find an element that causes the length of your has table to be greater than K, you will start doing a series of operations within a while loop:
- Remove the element at the start of your window from the hash table
- If that element has 0 frequency after subtracting by 1, delete it from the hash table (since we are relying on the length of our hash table to enter and leave the inner while loop)
- Shrink our window start by incrementing it
And with this, the problem is solved with a time complexity of O(N) and space complexity of O(K) or O(1) in the worst case3!
Problem #30 – Fruits into Basket
I love problems that are phrased in a real world setting but are the same ones we solve programmatically. In this problem, we’re given an array of characters which represent fruits from trees. We need to find the longest contiguous subarray of trees that can have two distinct fruits i.e. two distinct characters. When we break down problems like this, we can see that they are not too different to what we have been doing so far. So step number one is always to try and break down the problem! Otherwise, you could get lost, and worst case scenario, start panicking. Even if the problem is phrased very differently, the solution to this problem is exactly the same as the last one.
Problem #31 – Longest Substring with Same Letters after Replacement
The trickiness of this problem comes from the fact that we need to deal with substrings which can have many repeated elements. Although we don’t need to know which element occurs how many times, we still need to use that information to determine how many ‘replacements’ we can make to find the longest substring we can.
For every window, we must keep a variable for tracking the most frequent character found so far. By doing so, we are able to check if that subtracted from the window size is greater than or smaller than k. What does that give us exactly? Given window, say “aabc”, we know that the size is 4 and the largest frequency of the characters is two. 4 – 2 is 2 which is exactly equal to k, making this a valid window. No problem, we calculate our maximum length as usual. But what if we add another character to the window, making it “aabcc”? Now the calculation becomes 5 – 2 = 3, meaning we must replace 3 characters to make the string letters all equal.
This is the case where we slide our window. We subtract the starting letter’s frequency count, delete the entry if its value is 0, and shrink the window. This clever logic lets us simplify tracking and knowing exactly which letters to replace because it doesn’t matter. All we care out is whether the window is valid or not, and for that, all we need is the size of the window and the frequency of the most commonly occurring character in it. We don’t even need to reset the variable tracking the maximum frequency character per window because our window condition only takes effect if we have an invalid window, so this algorithm is awesome!
Problem #32 – Longest Subarray with Ones after Replacement
This problem is exactly the same as the previous one in its essence, so it will be a good one to practice. We want the length of the longest contiguous subarray of 1’s, after potentially replacing K 0’s.
Since we’re only dealing with 0’s and 1’s for this problem, we don’t need a dictionary and keeping track of the maximum frequency encountered with the dictionary. Now, we can get by with just a variable counting the maximum number of 1’s found in the window.
With that, we do the same routine: shrink the window when we find maximum ones subtracted from (end – start + 1) is greater than k, the maximum number of replacements we can make. We can keep shrinking the window in a while loop or an if statement. Either will pass all the test cases, but, personally, I think that we need a while loop because just shrinking the window once may not guarantee the validity of the window. But, at every end of the iteration, we recalculate the length of the current window and return that as the answer at the end!
Problem #33 – Permutation in a String
In this problem, you have to check if a string contains any valid permutation of another string, called the “pattern”. At first glance, this may seem solvable by just using a hash table, counting the occurrences of the characters in the pattern, and subtracting those numbers as we iterate through the string. It would be O(N) time complexity and be an effective solution! But this doesn’t work. Because we are dealing with substrings, which are contiguous. We need to ensure that we come across contiguous set of characters that form a permutation of the pattern, so we’ll need to mix in the sliding window technique with this train of thought.
We will continuously decrement and increment the characters we encounter in the string from the hash table that keeps track of the pattern. When we find that our window size is greater than the pattern length, we will immediately check if we have matched all the characters first of all. This’ll be done by keeping track of all characters that have a frequency of 0 i.e. we found the number of characters needed in the string that satisfy the pattern’s requirements. If this is true, we have found that a permutation exists! If not, we need to shrink the window by incrementing the start character, decrement our match variable, and slide the pattern forward.
It’s guaranteed that we will find the answer we want with this technique. The solution is O(N) time complexity and O(1) or O(M)4.
Problem #34 – String Anagrams
This problem states that we need to return a list of starting indices of all anagrams of a given pattern within a string. So the problem is clearly a level up from the previous one, as we’re required to remember the starting indices. And how can you save the starting index before traversing further and knowing for sure that it’s an anagram? Welcome to the problem.
The solution is actually pretty simple, once you recognize the similarity of this problem and the last. All we need to do is to record the start position of our sliding window where we returned True in the last problem. Instead of just stating that there are valid permutations available, we get the index of that anagram through the start point of our window!
Problem #35 – Smallest Window Containing Substring
In this problem, we want to find the smallest substring that contains all characters in another given string. We will look to use our sliding window technique straight off the bat and see where it leads us. Given the nature of problems we have solved so far, we’ll also look out to use a hash table or a Python Counter object.
This problem is much trickier than the previous problem because of the “smallest window” component. Here, we need to continuously keep track of the smallest window, whenever we find the right number of matches for our string. Furthermore, we also need to have a variable to saves the starting index of the new smallest window we find, so we can return it appropriately at the end of the function.
Problem #36 – Words Concatenation
This problem is the same as the previous one except:
- We don’t need to find the smallest window
- We have a list containing words to concatenate instead of a substring
This is the trickiest problem we have looked at so far and also the last problem for this pattern. The difficulty of this problem lies in the fact that we want to make sure the substring we find matches the words concatenated exactly, no more no less. You must calculate substrings in fixed batches, use two dictionaries to keep track of words encountered so far in the string and the existing words in the concatenating string. Most tricky of all is knowing when our condition is met successfully. When have we found a valid substring that is a concatenation of the words.
The sliding window pattern in this problem is done differently, as there are no ‘start’ or ‘end’ variables. Instead, we’re sliding our window by two nested for loops, in which the inner loop iterates in batches. It is difficult to recognize that the pattern is being utilized here, but it is. The time complexity of this problem isn’t simple to calculate as there are a lot of things going on. Officially, we end up with an O(M X L X N) time complexity where M=number of words, L=length of each word, N=length of string. This is lesser than O(N^2) for sure. Space complexity wise, we have two dictionaries, leading to an O(M + N) solution.
- Not exactly. Perhaps it will be a little less than that, but we can round it up to be around O(N^2) ↩︎
- This is an easy way to track the length of your sliding windows ↩︎
- While we are populating a hash table, we are continuously shrinking that too and are containing the entries within K, making it a constant space operation ↩︎
- If you want to be super precise in your interview, say the space complexity is O(M) where M is the number of distinct characters encountered within the pattern. However, realistically, the solution really is O(1) because there’s a cap to the number of distinct characters you can save to a dictionary. It doesn’t grow proportionally with input size, so the solution is still O(1) when it comes to characters. ↩︎