Crushing Tech Interviews with the Sliding Window Pattern

I'll be basing the next few blog posts around the topics discussed in https://www.educative.io/courses/grokking-the-coding-interview/. This will be my take on the content covered throughout the course. If you find this at all useful you should go and support them on educative.

Alright, so lets get started.

There are quite a few questions that are asked which require you to work out something among a continuous subarray of a given size.

At the beginning of my grind, I struggled to come up with a brute force solution to such questions. So in this article, I'm going to describe both the brute force solution and also the optimal solution to the following question:

Given an array, find the average of all contiguous subarrays of size ‘K’ in it.

Array: [1, 3, 2, 6, -1, 4, 1, 8, 2], K=5

One way of solving this would be to:

  1. Find the average of the first five numbers [0:6]
  2. Find the average of the next five numbers [1:7]
  3. Find the average of the next five number [2:8]
  4. and so on..

1.png

The code for this would look something like:

def find_subarray_averages_of_size_k(k: int, arr: List[float]) -> List[float]:
    result = []

    for i in range(len(arr) - k + 1):
        subarr_sum = 0
        for j in range(i, i + k):
            subarr_sum += arr[j]

        result.append(subarr_sum / k)

    return result

So what's the time complexity for something like this? I'd suggest you try working this out yourself.

Time complexity: O(N * k)

Explanation: We're looping through N - k +1 elements in the input array and for each element we're looping through K elements to add up the sums.

Spoiler: We can do better using the sliding window approach!

The first question to ask ourselves is: Where is the inefficiencies?

Well, if you look at each one of the steps where we're calculating the averages, you can see that most of the elements that we're using are the same! The only differences are at the beginning and the end of subarray. (Or if you prefer Linked List terminology, the head and the tail)

Here's a visual representation of the differences and similarities between step 1 and step 2:

2.png

As you can see, all the elements in blue are the same. The differences are circled in red.

Let's bring in the sliding window.

Essentially, you need to visualise a sliding window of k elements which slides over one element at each iteration. What I mean by 'slides over' is that we remove an element from the beginning of the sliding window, and we add an element at the end of the sliding window.

Have a look at this beautiful diagram:

3.png

Pretty simple right? It's crazy to think that such a small change in the algorithm can lead to such drastic improvements. We've just reduced the space complexity from O(N * k) → O(N) !

Here's some code for you:

def find_subarray_averages_of_size_k(k: int, arr: List[float]) -> List[float]:
    result = []
    window_sum, window_start_idx = 0, 0

    for window_end_idx in range(len(arr)):
        window_sum += arr[window_end_idx] # Add the next element to the sum.

        if window_end_idx > k: # We only want to add to the result when the sliding window has reached size k
            result.append(window_sum / k) 

            window_sum -= arr[window_start_idx] # Take away the first element from the sum
            window_start_idx += 1 # move the start element of the window ahead.

Note: In this case the sliding window was a fixed size. This may not always be the case; you may need to resize based on the constraints of the problem.

Problems to practice:

https://leetcode.com/problems/maximum-average-subarray-i/ (easy)

https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/ (medium)

https://leetcode.com/problems/fruit-into-baskets/ (medium)

https://leetcode.com/problems/max-consecutive-ones-iii/ (hard)

Comments (1)

Alex Zaharia's photo

Nice article. Sliding window is an useful technique. I think though you have some issues in your logic and code. You mention that for k = 5 you should take elements from [0:6]. This range will take the first 6 elements not the first 5, and so on for the rest of the ranges.

Also in the optimzed code you have a problem with the result appending condition: if window_end_idx > k: should be if window_end_idx - window_start_idx == k -1. You want ranges of 5 elements. Also you need to return the results: return result