How to hire your first data scientist has come up in a lot of my conversations lately, so I figured I would share my perspective. Hiring your first data scientist is critical; that point cannot be stressed enough. The right type of individual, what I am calling a type-E, will launch you ahead of the competition, give you the intellectual property to buffer you against imitators, explode the scope of your potential data problems beyond your original hopes, and break through every damn obstacle put in their path.

The "skills needed for a data scientist" chart has spread like a virus all over the internet:

This chart will do absolutely nothing to help you hire your first data scientist. So how do you hire one of these rare powerhouses to launch the new data science arm of your business? Consider these 6 points.

1. 30,000 hour expert

Everyone has heard of the 10,000-hour rule, popularized by Malcolm Gladwell, on what it takes to become an expert. For a type-E individual you have to raise the bar to ridiculous. If I had to throw a number out there, I would say a 30,000-hour expert. The main point is that you need to find an individual for whom data science is not a career, curriculum, or hobby, but an addiction. They hack every day, and going on vacation without their laptop or a high-speed internet connection gives them anxiety. What experience does someone need? We are talking 10 years of serious daily hacking, not necessarily school. Advanced degrees are optional; it depends on the merit of the individual. Below is a study from Burtch Works of 171 data scientists that shows advanced degrees are common but not required.

Likewise, the majority of PhD graduates from top schools will fail this screen because they lack the addiction and time factor. They have chosen data science because it is a great career choice.

2. Reality Distortion On High

Steve Jobs is associated with the term reality distortion. The individual you want as your first data scientist has such a high degree of unstoppability that they begin to bleed into the category of reality distortion. You also want to make sure you are looking at someone who has been burned before (some experience when things don't work out as expected). They will succeed and break through any barrier in their path. So many data scientists out there are constrained to similar problems, classes, and types they have seen before. #LimitedToTheBook

The individual with reality distortion on high doesn't give a !$%@ about what has been done before; they are not constrained by it.

3. High On Crazy

The best data scientists are the crazy ones: crazy passionate and willing to do whatever it takes to succeed. Someone exceeding the new 30,000-hour rule has a ridiculous amount of self-taught experience. Their toolkit is massive, and the problems they have encountered are very diverse. Across the common data scientist branches (coding, hacking, stats/math) they should be very strong in all of them compared to their data science peers. A high degree of crazy/hyper-confidence is often found in the type-E individual.

4. Previous Intellectual Property Experience

If you are in a market where you have imitators and you need intellectual property development to keep a buffer against them, consider hiring someone with previous intellectual property experience. LaTeX or Markdown experience is a common filter for finding someone who might have it. Also, someone who is solid on the first three points may be able to stretch into this area even if they lack examples. So I wouldn't use this as a hard filter, but as a bonus.

5. Large Social Footprint/Skills

Someone with a large social footprint is most likely actively engaged in the community. Claiming to be the smartest individual in a room of peers is a dramatically different statement if they know their peer pool well. Also, after you hire this individual you will most likely need them to build out a team of top data scientists. That will be hard for them to do if they don't know what they are looking for, or if they don't have strong ties with the data science community. Another thing to consider is that your first data scientist will need to be customer facing. They will likely be interacting with your largest customers around the data solutions or algorithms they are developing, and beta testing MVPs with them. So during the screening phase ask yourself, "Do I feel comfortable with this individual leading a technical call with my most valuable customer?" If the answer is no, move on.

6. Volatile Ambition

A type-E individual doesn't settle anywhere. If you ask an individual "Where do you see yourself in 5 years?" and they respond "Not working here", you have found a real winner. If they see themselves retiring with you, move on. Type-E individuals are also rare because their ambition pulls them into full-time independent consulting roles where they make more than you are willing to pay them, or toward launching a billion-dollar startup. Most of the type-Es I would tap have already become independent consultants. So you are looking for that rare transition where someone is willing to transform your company's data in exchange for the public recognition that comes from the success of your company, so they can move on to their full potential. This last point makes finding legit type-Es extremely difficult.

Title response:

So before I get the hate comments :) the title is of course cheeky, but it does hit on a sad realization. If you have a single data scientist and you already think they should be delivering more to your bottom line than they are, news flash: "They suck", and you hired the wrong caliber of individual for the job. You may still be able to keep them if they are good, but you need to bring in a type-E rockstar to cement your data arm and redirect the unstoppable ship. Your new type-E might fire them BTW ;)

