Spotify is having a coding challenge to find "top-notch talent to join our NYC team". The challenge is to solve the most algorithmic puzzles in four hours... alone. "You may not cooperate with anyone, and you may not publish any discussion of solutions." What sort of developer will win this competition? Someone who is quick and dirty, has a mathematical mindset and is lucky enough to write something that happens to work for the test data set. The "rockstar". Is this somebody you want on your team? Would you want to maintain their code?

Last year, while I was on contract, the company I was working for was passing around the coding problem they used to test new hires. It was pretty typical stuff: they give you the data going in and the data they want out, and you write a little program to do the transform. They even supplied most of the program, including a test; the prospective hire just needed to write one sort subroutine which could deal with "Low", "Medium" and "High" as well as numbers.

Predictably, this halted all coding in the office for a solid half day while everyone figured out the most clever way to sort the data. My opus was to observe that the input data was already sorted, so I redefined the shuffle() routine. The best one was from a co-worker who observed that "Low", "Medium" and "High" sort correctly by their last letter in reverse order. It was fun for us, but it wasn't very useful.
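For the curious, the last-letter trick fits in a couple of lines. This is a minimal Python sketch, not my co-worker's actual code, and the `labels` list is made up for illustration since the real test data isn't reproduced here.

```python
# Hypothetical sample data; the real test file is not reproduced here.
labels = ["High", "Low", "Medium", "Low", "High"]

# Last letters are "w" (Low), "m" (Medium) and "h" (High).
# Reverse alphabetical order on that letter puts Low first and High last.
ranked = sorted(labels, key=lambda label: label[-1], reverse=True)

print(ranked)
```

Clever, and exactly the sort of thing that breaks the moment somebody adds "Critical" to the list.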

This is a pretty typical coding problem used to judge potential hires, and it sucks. All it tells you is that the candidate is not completely incompetent. Why do we keep using problems like this? They're easy. They're easy to think up, easy to judge and easy to administer. They're also the sort of clean, algorithmic problems a stereotypical programmer loves to solve. Do they have anything to do with detecting a good developer? No. Can we fix it? Yes!

Before this can be fixed, first we have to work out what a project wants in a developer. What do developers do all day? We can work this out by reversing every contrived element of our typical programming contest: clean, well defined inputs; clean, well defined expected output; a clear description of the algorithm; pre-existing template code; pre-existing acceptance tests. When's the last time you were handed a problem like that in the real world?

Instead, we get poorly defined inputs riddled with mistakes; sample input that has anything to do with reality is a luxury. Expected behavior and output are vaguely defined. As developers, we're presented with a clean sheet: no template, no tests, just a blinking cursor and a blank page.

How does one solve the contrived example? First the specification, inputs and outputs are carefully examined. If there's any ambiguity, it's discussed up front with the person who probably wrote the problem and understands it perfectly. Then the candidate goes off and writes some code until it passes the tests. There is little or no interaction with people; everything is handed over on a silver platter in unambiguous terms.

How does one solve the real example? First you have to find somebody who understands the problem, usually not a programmer, and discuss it with them. Then you drag some samples out of them and convert those into a format you can actually use. The user probably doesn't know what they really want, so the behavior and output will be ill defined. Pressed, the user will make up something that you know will be wrong as soon as you show it to them. Armed with this "information", you hammer it into some sort of algorithm and now need to write some code.

But you don't just hammer out a script. You need to do it in a way that matches the team's coding style. You'll probably want to make the meat of it reusable, so you need to write it as a library not just a one-off script. It'll have to deal with the inevitable bad input, which means good error handling and recovery. Other people will have to use it, which means good documentation. It needs tests written in a way which works with whatever integration server the team is using. And, of course, it should all be checked into version control with well defined and logged commits.
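To make that concrete, here is a rough sketch of what the library version of the sort routine might look like in Python. Everything in it is an assumption made for illustration: the names `LEVELS`, `priority_key` and `sort_priorities`, the ranking of the levels, and the decision to sort level names ahead of numbers. Those are precisely the details you'd have to confirm with the user.

```python
"""Sort keys for mixed priority data (illustrative sketch)."""

# Assumed ranking for the textual levels; confirm with the user before trusting it.
LEVELS = {"low": 0, "medium": 1, "high": 2}


def priority_key(value):
    """Return a sortable key for a level name or a number.

    Assumption for this sketch: level names sort before numbers.
    Raises ValueError, naming the offending value, for anything else.
    """
    text = str(value).strip()
    if text.lower() in LEVELS:
        return (0, LEVELS[text.lower()])
    try:
        return (1, float(text))
    except ValueError:
        raise ValueError(f"unrecognised priority: {value!r}") from None


def sort_priorities(values):
    """Return a new list sorted by priority_key, leaving the input untouched."""
    return sorted(values, key=priority_key)


def test_sort_priorities():
    """A plain-assert test that a runner such as pytest can pick up."""
    mixed = ["High", 3, "low", "10", "Medium"]
    assert sort_priorities(mixed) == ["low", "Medium", "High", 3, "10"]
```

It's several times longer than the one-liner, and almost none of the extra length is algorithm. It's documentation, error handling and a test, which is the point.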

And then, when you've written all that and got it working, you take it back to the user and they tell you it's not what they wanted. Or they show you some new input that doesn't match what they originally said. If you're good... and lucky, your code is robust enough to handle it. If you're not... back to coding with you! Repeat until dead.

In that light, a good developer is one who works well with others, but can also make a multitude of small, detailed decisions about a problem they know very little about. They need to take vague requirements and work with a user to turn them into something a computer can do repeatedly. A good developer looks beyond the immediate requirements and thinks ahead, ensuring the code is flexible enough to withstand future change. A good developer writes code not for themselves, but for everyone else on the team.

So... how do you test all that? And in less than an hour? Turns out it's pretty easy with some simple modifications to the classic example. You keep the same basic algorithmic problem, but you give it to the candidate the way a user would. It can be as simple as this:

Could you sort this data please? [Excel document attached]

That's the crux of it. You can be a bit more clever and feed the candidate what seems like enough information to do the work but is full of subtle ambiguities. That tempts them to take the easy route and just get to coding, which weeds out those with the undesirable tendency to program without questioning what they're doing and why.

Now sit back and see what the candidate does with that. Respond to their questions, but remain in the persona of a user. What you're looking for here is how they push back. What sort of questions do they ask to gather and clarify the requirements? What sort of assumptions do they make? How well do they bridge the communications gap between customer and programmer?

Given a blank page to code on, what do they do with that? Do they write tests and documentation? Do they write a quick script or a library? Do they use version control? Do they ask about your code standards? Do they use pre-existing libraries? Do they write to the letter of the requirements or leave some flexibility? And again, what are their communications like through this process?

Once they've submitted their first solution, change the requirements on them by offering a second set of inputs. These will be almost the same as the first, but subtly different. Maybe throw in some Unicode or deliberately malformed lines. Maybe change the sort criteria. Does their code fail gracefully? Messily? Silently? Do they notice the changes? What shape does the discussion over the change take? Are they indignant about the changing requirements, or do they take them in stride? How much work does it take them to adapt their code to the new requirements?
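As one picture of what failing gracefully could mean here, the Python sketch below keeps the good lines and reports the bad ones instead of crashing or silently dropping them. The "name,level" line format, the `parse_lines` helper and the sample batch are all invented for this example; your own second input will look different.

```python
def parse_lines(lines):
    """Split 'name,level' lines into good rows and a list of problems."""
    rows, problems = [], []
    for number, line in enumerate(lines, start=1):
        parts = [part.strip() for part in line.split(",")]
        # Exactly two non-empty fields are expected; anything else is reported.
        if len(parts) != 2 or not all(parts):
            problems.append(f"line {number}: expected 'name,level', got {line!r}")
            continue
        rows.append(tuple(parts))
    return rows, problems


# Hypothetical second batch: Unicode names plus two malformed lines.
second_batch = ["Zoë,High", "no comma here", "Åsa,Low", ",Medium"]
rows, problems = parse_lines(second_batch)

for problem in problems:
    print(problem)
```

A candidate whose code produces something like the problem list, and who then asks you what to do about those lines, is telling you a lot more than one whose script dies on line 2.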

This test will take more time than a traditional puzzle test, but it can be administered easily enough over email and doesn't involve more than a few minutes' attention at any given time. A dud developer can sink your team and do far more harm than good, so it's worth the little bit of extra effort to find out if they can actually do what developers do, and not just solve clever puzzles.