
Experiments with LeetCode – Part 6
This week we’ll be talking about merge intervals. These series of problems stick out like a sore thumb when it comes to LeetCode questions. They are oddly specific. It is always about have several “intervals”, represented by a two dimensional array of numbers. Each array within our array contains two numbers or timestamps. The problem generally has us do CRUD operations or merge the intervals, which we will see in detail in later sections.
To solve these questions, we need to understand deeply the array data structure and what “intervals” are in these problems. We’re dealing with 2D arrays and every child array element has two timestamps, the start time and the end time. You can try to visualize these as a line from the start time to the end time. We have several of these within our interval array, and some of these could conflict with each other. For example, what’s wrong with these two intervals?
[[2, 6], [4, 8]]
Imagine trying to schedule two people for appointments with one particular doctor, using these intervals. That would be a problem because the doctor would still be serving patient A when the second appointment time would have begun already. These problems deal with this very real world scenario.1
Algorithmically, no special knowledge is needed, except a logical way to approach the problem in your mind. With that being said, let’s begin!
Problem #37 – Merge Intervals
This is the patient zero of all merge intervals problem. Given a list of intervals, that may possibly be conflicting, merge them and produce a new list that has mutually exclusive intervals. In non nerd terms, create valid timelines of intervals.
The first thing we need to think about is: what is considered a merge conflict with these intervals? First of all, it is probably useful to sort the array. While doing so, it is best to be explicit in how you’re sorting it because your implementation will be different depending on that. In Python, you can sort a list in-place – using the list.sort function – or return a new sorted list through the sorted keyword. In both, we have the ability to give the function a ‘key’ through which to sort the list.
We will sort the list in-place (to prevent additional memory usage) and pass it the interval start time as the key.
intervals.sort(key=lambda x: x.start) # these are called lambda functions. One liner functions that are pretty helpful!
Now we know that the start times of all intervals within our array are sorted. So, we can just compare them and ignore the end time altogether. This makes the problem more simple to think about. If the start time of the current interval is smaller than or equal to the end time of the previous interval, that is a conflict. We’re building our new intervals iteratively and we know that all intervals are already sorted by start times. So the moment we find an interval with start time that is less than or equal to the end of the previous interval, that is an overlap!
Now we get onto the other problem we have. How do we sort the merge conflict? Do we return a new array? Should we do these operations in-place? These are questions you must ask your interviewer. Not only could they change the nature of your solutions, but they also showcase how careful and detail oriented you are in requirement gathering.
In this problem, the description doesn’t mention anything, so we can continue with the assumption that it’s okay to create a new, merged interval array. We will continuously update an ‘end’ variable, which is also used as the main overlap check while we’re iterating through the array. The only time we will insert into our list is when we’ve built an interval that no longer overlaps. One additional thing: we need to add the last interval after the loop finishes. This is because the last interval will not have been processed, so we must make sure to add them!
Problem #38 – Insert Interval
Following the previous problem, we’re given a list of non-overlapping intervals this time. And the goal will be to insert an interval into the correct spot, while merging any overlapping intervals it may create. Looking at the problem constraint, the intervals are already sorted, so no need to worry about that. This problem also shows you how important it is to quickly scan through the problem constraints as it may have information that is important for you to know.
Basically, we will need to find the spot right before the interval which will have a greater start time than our given interval. After that, a ripple effect can potentially occur that will change the rest of the intervals array. So, we need to build the new list iteratively starting then.
Here are the conditions you need.
# Intervals that come before your new interval
intervals[i].end < new_interval.start
# Conflicting intervals that you need to continuously update
intervals[i].start <= new_interval.end
Problem #39 – Intervals Intersection
For this problem, we’re given two lists of intervals, and we need to find all intervals where they conflict. Sometimes, the conflict can be broken apart into multiple intervals. For example, in the intervals [[5, 6], [7, 9]] and [[5, 7]], we have two conflicts [5, 6] and [7, 7].
In this problem, we’re trying to find the intersections of the two lists, which changes the logic slightly. This is slightly tricky so you might need to practice it a couple times to get it! An intersection is when:
a. The start time of interval1 is lesser than the end time of interval2
b. The start time of interval2 is lesser than the end time of interval1
Confusing, isn’t it? But it makes sense if you draw it out. If interval1 begins at 8:00AM and interval2 ends at 12:00PM. And interval2 starts at 10:00AM and interval1 ends at 11:00AM, you can see the conflict occurring, right?
Interval1: 8:00AM – 11:00AM
Interval2: 10:00AM – 12:00PM
The intersection is 10:00AM – 11:00AM, which is the maximum start time and the minimum end time of the two intervals. This leads us to another important point. Draw your problems out. This’ll give you an idea on what the solution will be.
Another way to think about the intersection detecting logic is to find the intervals where interval1’s start time is greater than interval2’s start time and interval1’s start time is lesser than interval2’s end time or vice versa. This logic could come easier to you as it is more straightforward than the other one.
We will use a single while loop to iterate through both arrays, so the time complexity is O(N + M), where N is the length of interval1 and M is the length of interval2. The space complexity is O(1) as we haven’t used any additional data structures.
Problem #40 – Conflicting Appointments
Here comes a real world application of the merge interval pattern, the kind of stuff I was alluding to at the introduction section. This is the reason why I stand by data structures, algorithms, and LeetCode questions in general and scoff at people who think tech interviews are “broken” or LeetCode questions “don’t test engineering ability”. They absolutely do, and, to a degree, they determine how effective you will be on the job. And this is coming from someone who struggles with LeetCode. But this is a slight tangent. Back to the problem!
After having gone through tougher merge interval problems so far, this one will (hopefully) be a piece of cake. All we really need to do is check every interval against the interval prior. And we want to see if we find any appointment’s start time is lesser than the previous appointment’s end time. If that’s the case, we cannot make that appointment, so we return False. At the end of the loop, if no intervals were caught in our if statement, we can safely return True! The intervals array needs to be sorted for this solution to work properly, making this solution O(N Log N).
Problem #41 – Minimum Meeting Rooms (Introduction to Heaps)
Once again, we have a list of intervals for N meetings. We need to find the minimum number of rooms required to hold all meetings. Think about what the question is asking. Part of working through the patterns of LeetCode questions is also to recognize the kind of questions certain patterns ask for. Here, we’re basically being asked for the number of intersections given a list of intervals because we only need extra meeting rooms for intervals that conflict. Otherwise, only 1 room would suffice for all intervals!
Therefore, we will be doing what we did to find conflicts between intervals as we’ve done before. The one big additional difference is that we need to reuse rooms if the prior appointments occurring in them have finished. We cannot keeping adding new rooms whenever we find conflicts; we need to be smarter than that.
This difference makes the problem much more complex because we need to track the end times of appointments that are occupying rooms, and when they will free up to be used for the next series of conflicts. For this, we will introduce a new data structure called the heap, which are of two kinds: minimum and maximum. These structures are binary trees and, depending on the type of heap we have, the properties of all the keys change.
Minimum heap: the smallest element is at the root of the tree. Every child is greater than or equal to the parent
Maximum heap: the largest element is at the root of the tree. The rest is the same
Heaps are typically used when you want to access the maximum, minimum, or things that require them frequently (such as medians, averages, etc). Heaps are efficient as all insertions and deletions require O(Log N) time.
Back to the problem… what’s our game plan? We initialize a list that will act as our heap. Using the heapq module, we will constantly use the heappush or heappop operations while looping through our meetings. If any appointment’s start time is greater than the heap’s root’s end time – meaning this is the earliest ending appointment – we will pop from the heap. This is us reusing that room for that appointment. But we must also add that element’s end time to the heap! All appointment’s end times must be in the heap because that is what determines how many rooms we need.
In this problem, we can also merge overlapping meetings after sorting if there are any.
The time complexity of the algorithm is O (N log N) because of the sorting and doing N log N heap operations within the for loop. But the space complexity will be O(N) for the heap.
Problem #42 – Maximum CPU Load
Given a list of jobs, with start times, ,end times, and CPI loads, find the maximum CPU load at any time if all jobs are running on the same machine.
This problem is similar to the last one, except we also need to track the maximum value of the CPU load. So, we will keep track of the current and maximum CPU loads because these values will be fluctuating as we pop from our heap, meaning jobs finish after they get popped from the heap. We also need to push both the end time and the CPU load time as a tuple into our heap for this problem because at the point of popping, we need to update the value of the current CPU load.
One more observation to make is the difference of natures of this problem and the last and how that changes one more thing within our for loop. Earlier, we used an if statement inside the for loop to check for meetings that can be popped from the heap, but this time we’ll go for a while loop. We do this because multiple jobs can finish before the current job begins!
For the meeting room problem, we only needed the earliest ending meeting room, 1 meeting room, to accommodate the current meeting. That was a 1-1 relationship, but here there can be many jobs that finish before the current one, and we want to calculate the correct cumulative CPU load.
- American Airlines needs to work through some of these problems, given how frequently their connecting flights seem to have time conflicts. Don’t schedule flights which won’t land on time for the customers to make the other flights they have! ↩︎