This interesting and difficult problem was asked by Google recently.

Given K sorted lists of integers, return the smallest interval (inclusive) that contains at least one element from each list. If there are multiple intervals of the same size, return the one that starts at the smallest number.

For example, given:

    [[0, 1, 4, 17, 20, 25, 31],
     [5, 6, 10],
     [0, 3, 7, 8, 12]]

The smallest range here is [3, 5], since it contains 4 from the first list, 5 from the second list, and 3 from the third list.
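As a quick sanity check of that claim, here is a minimal sketch of a coverage test. The helper name `covers` is hypothetical (it is not part of the original problem statement), and it assumes every list is sorted in ascending order so that binary search applies.

```python
from bisect import bisect_left

def covers(lists, lo, hi):
    """Return True if every sorted list has at least one value in [lo, hi]."""
    for lst in lists:
        # Index of the first element >= lo; it must exist and be <= hi.
        i = bisect_left(lst, lo)
        if i == len(lst) or lst[i] > hi:
            return False
    return True

lists = [[0, 1, 4, 17, 20, 25, 31], [5, 6, 10], [0, 3, 7, 8, 12]]
print(covers(lists, 3, 5))  # True
print(covers(lists, 3, 4))  # False: the second list has nothing in [3, 4]
```

Shrinking [3, 5] from either side drops one of the lists, which is why no interval inside it can work.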

Before we dive into the solution, you should take a moment to think of a solution yourself!

Naive Solution

The brute force solution is to compare every pair of elements in the lists and consider their intervals. After finding the interval, traverse every list to make sure there is at least one element contained by this interval. In order to find the smallest such interval, we’ll need to store the smallest seen so far, and update if we see a smaller interval.

This would be an expensive O(N^3), where N is the total number of elements in all K lists. There are O(N^2) candidate intervals, and for each one we need to do a linear scan to determine if the interval contains elements from all K lists. On the bright side, this solution uses O(1) memory, since it only needs to store the current smallest interval.
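The brute force approach might be sketched roughly as follows. The function name `smallest_interval_naive` is an assumption made for illustration. For simplicity this version collects and sorts the distinct values up front, so unlike the O(1)-memory description above it uses O(N) extra space, but the O(N^3) time behaviour is the same.

```python
from itertools import combinations_with_replacement

def smallest_interval_naive(lists):
    """Brute force: try every pair of values as interval endpoints."""
    # Distinct values in ascending order (this is the extra O(N) memory).
    values = sorted({x for lst in lists for x in lst})
    best = None
    for lo, hi in combinations_with_replacement(values, 2):
        # lo <= hi is guaranteed because values is sorted.
        if all(any(lo <= x <= hi for x in lst) for lst in lists):
            # Strict comparison keeps the earliest start among ties,
            # because candidates are visited in increasing order of lo.
            if best is None or hi - lo < best[1] - best[0]:
                best = (lo, hi)
    return best

lists = [[0, 1, 4, 17, 20, 25, 31], [5, 6, 10], [0, 3, 7, 8, 12]]
print(smallest_interval_naive(lists))  # (3, 5)
```

A slow but obviously correct version like this is also handy as a reference when testing the faster solutions.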

Solution 1: K-Pointers

The problem statement itself gives us two hints: the lists are all sorted, and when several intervals tie for smallest we need to return the one that starts at the smallest number. This suggests iterating over the arrays from beginning (smallest elements) to end (largest elements).

Imagine we compared the minimum values of all the arrays. In the example above, these values would be [0, 5, 0], and the interval would be the minimum and maximum of these values: [0, 5]. Note that this is guaranteed to contain an element from each of the arrays.
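To make the starting point concrete, here is a small sketch that computes this first candidate interval for the example input. The variable names `firsts` and `initial` are illustrative, and the snippet assumes every list is non-empty.

```python
lists = [[0, 1, 4, 17, 20, 25, 31], [5, 6, 10], [0, 3, 7, 8, 12]]

# Each list is sorted, so its first element is its minimum.
firsts = [lst[0] for lst in lists]

# The first candidate interval spans the smallest and largest of those minimums.
initial = (min(firsts), max(firsts))

print(firsts)   # [0, 5, 0]
print(initial)  # (0, 5)
```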

This is one such interval, but we’re not sure yet if this is the smallest interval, so we must keep looking. Since the values are already the minimum values of all the arrays, there is no way to shrink the interval by reducing the maximum value, e.g. to [0, 4] or [0, 3]: any smaller maximum would exclude the second list entirely. Thus, we must step along by increasing the minimum. In this case, two lists start at 0, so once both of those pointers have advanced, the next distinct interval to consider is [1, 5].

To translate this into an algorithm:

1. Initialize K pointers, one for each of the K lists, pointing to the minimum element of the list.
2. Initialize variables to track the right and left boundaries of the interval.
3. Find the pointer that points to the minimum and the pointer that points to the maximum of all values pointed to. This is your interval.
4. If this interval is smaller than the current tracked interval, update your tracked interval to be this interval.
5. Increment the pointer that points to the minimum value. Note that after incrementing this pointer, it may not point to a minimum value anymore.
6. Repeat steps 3-5 until we’ve finished scanning one of the lists.

In code, it will look something like this:

    from math import inf

    def smallest_interval(nums):
        # One pointer per list, each starting at that list's smallest element.
        pointers = [0] * len(nums)
        ans = -inf, inf

        while True:
            local_max = -inf
            local_min = inf
            local_min_index = -1
            reached_end = False

            for i in range(len(pointers)):
                # Stop once any list has been fully scanned.
                if pointers[i] >= len(nums[i]):
                    reached_end = True
                    break

                if nums[i][pointers[i]] > local_max:
                    local_max = nums[i][pointers[i]]

                if nums[i][pointers[i]] < local_min:
                    local_min = nums[i][pointers[i]]
                    local_min_index = i

            if reached_end:
                break

            # Strict comparison keeps the earliest interval among ties.
            if local_max - local_min < ans[1] - ans[0]:
                ans = local_min, local_max

            # Advance the pointer that currently points at the minimum.
            pointers[local_min_index] += 1

        return ans

This code runs in O(K * N) time, where K is the number of lists and N is the total number of elements in all the lists. In the worst case, we will need to perform the inner for-loop, which takes O(K) time, for every element in every list. The space complexity is O(K), since we are storing a K-length array of pointers.

Solution 2: Heap

Note that in the above, much of the work in the inner loop is spent trying to find the local maximum and local minimum values. Fortunately, we can use a heap to simplify this!

If we used a heap instead of an array of pointers to track the values we are currently looking at, we would be able to find the local minimum in O(1) time. However, we still need to know which list the local minimum is from: for this, we can make use of Python’s tuple capabilities.

Consider a min-heap (a heap where the first element is guaranteed to be the minimum of all elements in the heap) consisting of tuples that hold the following information: (value, which list it is from, index of value in that list)
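As a small illustration of how such tuples behave in Python’s `heapq` module, here is a sketch built on the example input. Tuples compare element by element, so the heap orders entries by value first and falls back to the list number when two values are equal. The variable names are illustrative only.

```python
import heapq

lists = [[0, 1, 4, 17, 20, 25, 31], [5, 6, 10], [0, 3, 7, 8, 12]]

# One (value, which list it is from, index in that list) tuple per list.
heap = [(lst[0], i, 0) for i, lst in enumerate(lists)]
heapq.heapify(heap)

# The smallest tuple always sits at position 0, so peeking is O(1).
print(heap[0])  # (0, 0, 0)

# Popping returns the minimum along with where it came from.
value, list_index, elem_index = heapq.heappop(heap)
print(heap[0])  # (0, 2, 0)
```

Note that peeking at the minimum is constant time, while popping and pushing each cost time logarithmic in the heap size.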

Now, let’s see how we can adapt the algorithm above to use a heap instead.

1. Initialize a heap of size K, with all the tuples being: (first value of the list, which list it is from, 0). The zero here is because we are starting at all the minimum values, so index 0.
2. Initialize variables to track the right and left boundaries of the interval.
3. Initialize the local_maximum variable to the max of the first set of values. Since we are using a min-heap, there is no easy way to retrieve the maximum value, so we will need to track it.
4. Pop an element from the top of the heap. The element contains the local_minimum, the list it is from, and its index within that list.
5. Compare the new range (local_maximum - local_minimum) and update the current tracked interval if necessary.
6. Increment the local_minimum’s index, and read the value.
7. If the value is larger than the local_maximum, update the local_maximum. This sets it up so that the next iteration has an updated version of local_maximum.
8. Create a heap element using the new value, and insert it into the heap.
9. Repeat steps 4-8 until we’ve exhausted a list.

In code, it will look like this:

    import heapq
    from math import inf

    def smallest_interval(nums):
        # Each heap entry is (value, which list it is from, index in that list).
        heap = [(row[0], i, 0) for i, row in enumerate(nums)]
        heapq.heapify(heap)

        # A min-heap cannot give us the maximum, so track it separately.
        local_maximum = max(row[0] for row in nums)
        ans = [-inf, inf]
        while heap:
            local_minimum, num_list, local_min_index = heapq.heappop(heap)

            if local_maximum - local_minimum < ans[1] - ans[0]:
                ans = [local_minimum, local_maximum]

            # The list holding the minimum is exhausted, so we are done.
            if local_min_index + 1 == len(nums[num_list]):
                return ans

            next_val = nums[num_list][local_min_index + 1]
            local_maximum = max(next_val, local_maximum)

            heapq.heappush(heap, (next_val, num_list, local_min_index + 1))

Popping an element from the heap and pushing one onto it each take O(log n) time, where n is the number of elements in the heap. Since our heap never holds more than K elements (one per list) and in the worst case we iterate over every value in the lists, our total time complexity is O(N log K), where N is the total number of elements in the lists. Our space complexity is O(K), as we are storing at most one element per list in the heap.

Conclusion

This problem definitely looks daunting at first glance, but we can see that there are logical steps to move from the naive solution to the optimal solution. When struggling with a problem, don’t be afraid to work with the naive solution first and think of incremental optimizations to bring you closer to a better answer.
