Ah, the famous “Google-style” algorithmic coding interview.

If you’ve never had one of these interviews before, the idea is to see if you can write code that’s not only correct, but efficient, too. You can expect to spend lots of time diagramming data structures and talking about big O notation. Popular hits include “reverse a linked list in place,” “balance a binary search tree,” and “find the missing number in an array.”

This kind of interview is the standard at all the big tech companies—Facebook, Amazon, Microsoft, Twitter, and Google. And many smaller companies use this kind of interview, as well.

But the algorithmic coding interview is contentious these days. Some people say it’s too academic and tests things that have little to do with real-life software engineering. Not long ago, Max Howell, the author of Homebrew (software that basically every engineer with a Mac uses), famously quipped about being rejected from Google after being unable to invert a binary tree.

Like it or not, a “Google-style” coding interview may stand between you and your dream job. So it’s in your best interest to learn how to beat it.

And you can. The algorithmic coding interview is a game you can learn how to win.

Patterns

One of the biggest fears candidates have is that coding interview questions rely on “blind insights”—fancy tricks you couldn’t possibly know unless you’d seen the answer already.

But these insights aren’t blind. They come from knowledge of a small set of patterns that come up again and again. If you can learn the patterns, you can unlock the insights.

Today, I’m going to teach you the most common pattern: hashing.

First, a couple disclaimers:

To really understand what we’re doing here, you’ll want a working understanding of big O notation. If you’re a bit rusty, you might want to read up on big O.

We’ll be writing Python code, but we’ll avoid some fancy “Pythonisms” in favor of making our procedure as transparent as possible. Think of it as pseudocode that happens to be runnable Python.

Okay, on with the show! We’ll learn about hashing by solving a few problems with it.

Problem 1: Mode

Given an array of numbers, return the mode—the number that appears the most times.

This is a common programming interview question. And as with many great interview questions, there are at least three ways to solve it.

Solution 1: Brute Force

By “brute force” we mean “looking at each possible answer.“ In this case, that’s looking at each number and getting its count by looping through the array again. From all these counts, we return the number with the max count.

def get_mode_brute_force(nums): mode = None max_count = 0 for potential_mode in nums: #get its count count = 0 for num in nums: if num == potential_mode: count += 1 #is it a new mode? if count > max_count: max_count = count mode = potential_mode return mode

This’ll work, and it’s an okay place to start. It’ll take O(n²) time, since we’re nesting two loops that each go through all n items in our array.

How could we bring down that time cost? We’d need to avoid looping through the array twice…

But how can we know how many times each number appears, without looping through the whole array once for each number?

Solution 2: Sorting

If we sort the array, all occurrences of a number will be next to each other. So in just one pass, we can count the occurrences for each number as we encounter it.

def get_mode_sorting(nums): #edge cases if len(nums) == 0: return None elif len(nums) == 1: return nums[0]

sorted_nums = sorted(nums) mode, previous_num = None, None max_count, current_count = 0, 0 for current_num in sorted_nums: #update count if current_num == previous_num: current_count += 1 #new mode? if current_count > max_count: max_count = current_count mode = current_num # reset for next iteration if current_num != previous_num: current_count = 1 previous_num = current_num return mode

This’ll cost O(nlogn) time in total. That’s faster than O(n²), so it’s an improvement. But the driver of our time cost is that first sorting step, which costs O(nlogn) time.

If we avoid sorting, we can bring the time cost all the way down to O(n).

Solution 3: Hashing

Let’s take one walk through the array, collecting the counts for each number in a dictionary (called a “hash map,” “hash table,” or “hash” in many non-Python languages).

That’s the key—using a dictionary. They have O(1)-time insertions and lookups (on average). So we can collect the counts for all of our numbers in just one pass, by updating each number’s count in our dictionary as we go.

def get_mode_hashing(nums): # keys: numbers in our input array # values: the number of times that key number appears # e.g.: if 5 appears 3 times, then counts[5] == 3 counts = {} # step 1: get the counts for num in nums: if num in counts: counts[num] += 1 else: counts[num] = 1 # step 2: get the number with the max count max_count = 0 mode = None for num, count in counts.iteritems(): if count > max_count: max_count = count mode = num return mode

This takes O(n) time, which is the optimal runtime for this problem! And we unlocked it by exploiting the O(1)-time insertions and lookups that dictionaries give us.

Using a dictionary or similar data structure to save time is so often the “trick” or “blind insight” needed to get the optimal answer that it should just always be our first thought in a coding interview.

To help us build an intuition for how to apply hashing to other problems, we’ll look at a couple more examples.

Problem 2: Repeat Number

Given an array of numbers where one number appears twice, find the repeat number.

If we make hashing our first thought, we can derive the optimal solution pretty quickly! We could even use the exact same “counts” dictionary again, returning a number as soon as its count crept above 1.

But we don’t really need the count for each number—we just need to know whether it has appeared at all. So it’s a bit cleaner to use a set in this case. The idea is similar—under the hood, sets use “hashing” to get O(1)-time insertions and lookups, just like hash maps. In fact, the common set data structure in Java is called “HashSet.”

def get_repeat_number(nums): nums_seen = set() for num in nums: if num in nums_seen: return num nums_seen.add(num)

This takes O(n) time, which is the best we can do for this problem!

And just as with our first problem, we could have used a brute force approach (nesting two loops) to get O(n²) time. Or we could have started off by sorting, to get O(nlgn) time. See? There are repeating patterns here!

If the brute force and/or sorting solutions don’t come to you right away, try to derive them. Part of being great at algorithm design is being able to see the different ways to solve the same problem—including the less efficient ways!

Problem 3: Sorting

Given an array of numbers in the range 1..1000, return a new array with those same numbers, in sorted order. There may be repeats in the input array. If there are, you should include those repeats in your sorted answer.

First thought: hash maps!

Let’s use that same “counts” dictionary we used in problem 1. Once we have the count for each number (which we can get in O(n) time) we just need to walk through the range 1..1000 once, adding each number to our “sorted” array x times, where x is its count from the counts dictionary.

def sort_nums(nums): counts = {} # step 1: get the counts for num in nums: if num in counts: counts[num] += 1 else: counts[num] = 1 # step 2: generate the sorted array sorted_nums = [] for num in range(1, 1001): if num in counts: for _ in range(counts[num]): sorted_nums.append(num) return sorted_nums

This is called a counting sort. Importantly, it only works well if we can “bound” our numbers into a certain small-ish range (in this case, 1..1000). Otherwise step 2 above could be quite inefficient.

As with the previous two problems, we could use a “brute force” approach, by nesting two loops—we’d end up with a selection sort or an insertion sort, taking O(n²) time. See? Brute force is a pattern, too!

Practice, And Look for More Patterns

Hashing isn’t always the answer. But it often is: It’s the most common pattern. It should always be your first thought.

If you really want to ace your coding interviews, you’ll need to invest time in running practice problems. There’s no way around it. But each practice problem will take you much farther if you pay careful attention to patterns. Don’t just read the answer and understand why it works—identify the pattern or “blind insight” at play, and think about how it could be applied to other problems.

Once you’ve learned some patterns, those “insights” won’t seem so “blind” anymore. Then you’ll really feel confident solving problems you’ve never seen before!

Parker Phinney founded Interview Cake, a website that teaches software engineering job-seekers how to beat the coding interview. He’s sick of seeing engineers miss out on the jobs they deserve because they’re underprepared for coding interviews. He runs a weekly email list that sends engineers a new practice problem every Tuesday. He lives in San Francisco and loves riding his bicycle.