data codes through eyeglasses
Data Structures & Algorithms

Experiments with LeetCode – Part 1

Subhanga 

These blogs are going to tackle a series of LeetCode problems using the Python programming language. Each part is going to tackle data structure and algorithm patterns used heavily in software engineering interviews. The goal of these blog posts is also to highlight some neat programming tips and tricks, when using Python to solve these questions. All in all, the aim will be to build a deep understanding of what certain types of questions need to be attacked, mental maps to keep in mind while using certain data structures or algorithms, how to handle edge cases, and Pythonic ways of thinking.

Problem #1 – Check if list contains duplicate

You have two ways of solving this. The first solution is O(N log N) time and space complexity since it requires sorting. Once you have sorted a list, all of its duplicates fall right next to each other. So, the solution is to sort the list and check the element to the right of each number in the list. If they’re equal, the list contains duplicates.

You can also solve the problem using a dictionary (or a hash table). You don’t necessarily need a Counter1 because a counter would record the frequency of each item in the list, which isn’t strictly needed. Think about it: in the list [1, 1, 2, 3, 4, 5, …..], we don’t need the rest of the items after the first 1, 1 duplicate we encounter. So, we could just loop through the list and add any item we encounter to the dictionary. If it has been added before, return True immediately. Otherwise, in the worst case, we look through each element in the list – which is needed to solve the problem anyways – and return False. No duplicates were encountered!

Problem #2 – Pangram

A pangram is a sentence in which every letter in the English alphabet occurs. The characters could be lower, upper, spaces, or other special characters. You need to handle those, meaning you must parse the string to only keep the alphabets.

You can solve this problem by using a hash table again (love love LOVE this data structure!). Here, we must take care to process the input string because it can have all sorts of characters, some upper, some lower, some special, all of which would count as a character in our hash table. ‘A’ and ‘a’ would have two entries, which is not what we want as that would not be precise. So, our best best would be to make everything lowercase and only keep alphabetical characters. Lucky for us, Python has a rich set of string processing functions and we’re going to use just that.

string = string.lower()

character.isalpha() -> a boolean function that checks whether or not a character is an alphabetical character. Alternatively, we can also use ASCII to do this!

Every character in our keyword is actually a number that the computer reads. Characters are assigned to these numbers in ranges of decimals or hexadecimal values. For example, ‘a’ through ‘z’ is 97 through 122 and ‘A’ through ‘Z’ is 65 through 90. The 26 characters of the alphabet, 0 indexed. Every other character is also mapped to an int value which will fall outside of the ranges listed above.

In Python, you can inter-convert character to ASCII values by using the functions chr and ord, the former converts an integer value to a character and the latter converts a character to an integer value.

So, instead of using the isalpha function, we would check if the character falls inside 97 through 122, since we’re only checking lowercase letters. But the rest of the solution would be the same! Knowing these kind of techniques is very useful because these are core computer science principles which are language agnostic.

The solution is O(N) because we must look through every character in the string to determine if it is a pangram. There is not going to be a solution faster than that for this problem.

Problem #3 – Reverse vowels

My solution vs cleaner solution

My solution was to use a 2-pointer approach. We start with two pointers, one at the beginning of the string and one at the end. We begin comparing the two characters and checking if they’re both vowels or not. Bear in mind that capitalized vowels count too. If they’re both vowels, we need to swap both and increment the first pointer and decrement the second. Then we need to handle each case of one pointer having a vowel and the other not. In each case, we handle them according, by incrementing or decrementing the pointer which doesn’t contain a vowel. Finally, we can also have a case where neither are vowels. In this case, we increment and decrement the pointers without swapping the characters.

I chose to use 4 if, elif, and else statements to cover each case and this worked perfectly well. However, the cleaner solution uses two inner while loops to continuously move the first pointers to the front and the last pointer to the back until we reach our ideal case where we swap both characters. This eliminates the need for using extra brain cells to cover all cases as we skip directly to the part where we’re most interested: swapping vowels.

You have to be careful while using the while loop 😉 because the choice of your operator could make the solution correct or incorrect. Furthermore, you also need to make sure that your inner loop condition doesn’t make either pointer cross each other. Let’s work through an example to showcase this. Let’s assume the input is “hello” and you want to end up with “holle” because that is the solution. Our while loops look like this:

while ptr1 < ptr2 and s[ptr1] not in vowels:
    ptr1 += 1

while ptr1 < ptr2 and s[ptr2] not in vowels:
    ptr2 -= 1 

If you forget to add the ptr1 < ptr2 check, your pointers may cross each other and you may exit your main loop, which is not what you want. More importantly, you need to take care of your boolean condition because the choice of that makes or breaks your solution. Having ptr1 <= ptr2 actually is incorrect because there are conditions where both the pointers can meet and can cause unnecessary swaps and even make the solution incorrect in some cases!

We’re not using any data structures other than two pointers so the space complexity of this solution is O(1) and the time complexity is O(N), as we need to traverse the entire string.

Problem #4 – Valid palindrome

We have started using the two pointer approach with the above problem and this is one of the patterns which we’re going to use often. This problem is another one where we will do the same. Let’s also remember the mini-pattern of skipping over all non required characters within the two pointer solution because this is often useful. An idea – while formulating these inner while loop or if statement conditions – is to think that we need to cover:

a) The condition we are looking for. Examples, checking whether the current element is not a vowel, non alphanumeric character, etc

b) The variable we are updating. The pointers in particular, since we update that for condition a). We need to make sure this variable will be in range

Python has a nifty function isalnum() to make checking for what we want easier, but once again, we could do this by creating a helper function like so:

def my_is_alnum(self, character):
    return 97 <= ord(character) <= 122 or 48 <= ord(character) <= 57

Once again, the two pointer solution is amazing because it uses no extra space. We end up with O(1) space and O(N) time complexities.

Problem #5 – Valid anagram

This problem has two obvious solutions.

An anagram is basically a different combination of the same series of letters. So, we can use some of the techniques that we’ve tried out so far with the addition of one new thing to add to our arsenal: sorting.

If we sort both the input strings, we will guarantee that what we end up with will be equal to each other if the inputs are anagrams. Otherwise, they will get sorted to different results. The time and space complexity of this solution is O(N log N) and O(N log N).

Since you will likely use a standard sorting function available in your programming language, it is useful to have a different solution in mind for this problem as your interviewer may think that this solution is too easy. In that case, you can write your own sorting function if you’d like. Something efficient like mergesort or quicksort. Otherwise, you can use a hash table solution as well2, since an anagram of one word contains the same letters. Here, the time and space complexity is slightly better, making this the ideal solution: O(N) time complexity3 and O(1) space complexity4!

There’s an elegant way to solve this problem with a hash table which is different than what some people (including me) default to, which is to use two hash tables and check if every element in both are equal. But this actually may not solve the problem since anagrams are ordered differently. That means our resulting dictionary will be ordered differently too! The actual solution is to keep a count of the letters found in string1 and subtract whatever letter occurs in string2. This can be done in one pass, and, even if the ordering is different, we’re adding string1 characters and subtracting string two characters, so it comes out to be 0 at the end if it’s an anagram. If the string is an anagram, all values should come out to be zero at the end. If not, it isn’t an anagram!

Little optimization techniques like checking if the lengths of both strings are equal at the start are really nifty and showcase that you have a good eye for solving the problem in an efficient manner.

Problem #6 – Shortest Word Distance

Given a list of words and two input strings – that are guaranteed to be in the word list – we need to find the shortest distance between the two input strings in the list. The question infers that there could be multiple repetitions of the input strings within the list, so we must account for that.

The default solution that occurs to me is to keep track of the first occurrences of each string and the difference of both their indices. If we encounter any of the input string later on, we update the difference if and only if this distance is smaller than the earlier distance. In problems where we must find the minimum or maximum of anything, it is useful to initialize a variable to be the complete opposite of that i.e. float(‘inf’) or float(‘-inf’) as it is a clean way of starting the solution process. Here, we must ensure that we have found both the strings at least once before starting to calculate the distance. After the first combination is found, we will update the distance anytime we find string1 or string2 and only keep the minimum5.

Time complexity is O(N) as we must loop through the list at least once. Space complexity is O(1).

Problem #7 – Number of Good Pairs

Given a list of numbers, we need to find how many pairs of numbers are equal to each other. The brute-force solution of two for loops is probably the most obvious thing that comes to your head when you read this problem. And, it is fair to start with that, if you’re confident you will not waste too much time writing it out.

The optimal solution for the problem sacrifices some space complexity of the brute force and uses a hash table to store the count of each number that occurs. It isn’t immediately obvious how a hash table can be used to solve the problem, so it is quite elegant and a tiny bit mathematical.

The key to this problem is that we need to realize: every new number can form a good pair with all of its previous numbers. Example: [1, 2, 1, 1, 2] has 4 good pairs. Imagine being the computer and going through the list. The moment we find the second ‘1’, we have found one good pair. But what happens when we find the third ‘1’? The third ‘1’ can make a good pair with either the first or second ‘1’. Meaning, we update the number of good pairs we have found to be frequency of the number – 1. Even for the example input of [1, 1, 1, 1], notice how the good pairs add together. It is 1 when the first pair is found, then it becomes 2, and finally the result is 3, for a total of 6.

Problem #8 – Sqrt

I’ll be the first one to admit it: this solution is not intuitively obvious at all. In this problem, we’re told to not use any default functions obviously (otherwise this wouldn’t be much of a problem at all). So, how on earth are we supposed to find the rounded down square root of a number naturally?!

The answer: binary search. If you don’t know what binary search is, don’t fret. It is quite simple actually. We begin by sorting the list. Binary search only works if the list is sorted, and, if the problem already mentions that the list is sorted, this is actually a big hint that we need to use binary search for the problem. We start off with two variables, low and high that will point to the first element and the last element of the list respectively. We calculate the average of these numbers, round it down, and find what value the number points to. If the number we’re searching for is greater than the mid point number, we will pull the “low” variable up to the mid point plus one. If the number is smaller, we shrink the “high” variable to be the mid point minus one. This algorithm can be implemented iteratively or recursively.

low, high = 0, 0

while low < high:
    mid = (low + high) // 26
    
    if array[mid] < target:
        low = mid + 1

    elif array[mid] > target:
        high = mid - 1

    else:
        return mid     # Our number has been found at index 'mid'

How does this help in finding the rounded down square root of a number?! The fact that our square root will be rounded down is an important bit to note here. This leads us to the fact that any square root of a number greater than or equal to 2 is going to fall within the range of [0, x // 2 + 1]7. Without knowing this, it is hard to know how the search algorithm can possibly lead us to the answer, so this problem is very difficult in my opinion, even if it is just tagged as a ‘medium’.

The best way to think about square roots – in the context of this problem – is to understand that the square roots of 5 through 8 all fall between 2 and 3. This is so because the numbers outside of 5 and 8 are 4 and 9, which are perfect squares. The roots of those are in the ranges [2, 3], so any number between 5 and 8 inclusive is guaranteed to have a root between 2 and 3 non-inclusive. Let’s take another example. What are the square roots of any number between 17 and 24 going to be? Well, the perfect squares right outside these ranges are 16 and 25, the roots of which are 4 and 5. So, any square root from 17 to 24 will be a 4 point something decimal. Mathematically speaking, this idea is expressed by this equation, where the integer square root of x is the largest integer k that follows this relationship:

\[ k^2 \leq x < (k + 1)^2 \]

With that said, we will start searching from 2 – as we will already have taken care of the x = 0 or x = 1 edge cases – till x // 2 + 1. There isn’t going to be a number we’re searching for; rather, the moment our high pointer becomes lower than the low pointer, our mid variable will have converged to the right answer already! This isn’t super obvious either.

That was all for the first part of this series! In this part, we covered basic problems involving strings and saw just how useful hash tables are. If creatively used, they can help with many kinds of problems. We also learnt how computers actually read letters using ASCII. And, we finished up with a difficult problem where we got introduced to the binary search. In upcoming posts, we will start covering independent patterns that are useful to remember for solving LeetCode style problems, and we will be solving questions in order to see these patterns at work. See you soon!

  1. The Counter data structure in Python’s collections module is a very useful dictionary-like structure that counts the frequencies of the values contained within the iterable you pass it ↩︎
  2. In Python, I really encourage you to use Counter from collections ↩︎
  3. Where N is the length of the string ↩︎
  4. Remember that letters have a finite set of combinations. If you only have lowercase English letters, the most a word can contain is 26 letters. This will not grow exponentially with increasing input sizes, so the size of your hash table is actually constant. Hence, the space complexity is constant. ↩︎
  5. Another technique that is super helpful in Python is to create a variable – often the minimum or maximum variable initialized to positive or negative infinity – and use the min/max function to update the same variable. Example, minvar = min(minvar, some_other_calculation) ↩︎
  6. There’s an alternative way to calculate this, which avoids arithmetic overflow errors in certain programming languages, even though this isn’t needed in Python. This way is: low + (high – low) // 2 ↩︎
  7. The square root of any number x cannot exceed x // 2 + 1 ↩︎

Recommended Posts