Microsoft Data Science Interview

Microsoft’s dominance in the enterprise is well known, and the company has ridden the cloud-computing wave. In the fiscal first quarter, its Azure services and its Office 365 online-productivity business saw revenue soar 90% and 42%, respectively.

In a recent letter to all Microsoft employees, CEO Satya Nadella announced two new teams, Intelligent Cloud and Intelligent Edge, to shape the next phase of innovation. The letter signals a profound shift to weave Artificial Intelligence into all that Microsoft does. Following this announcement, Microsoft may well increase its AI-related hiring.

Interview Process

Microsoft’s interview process is typical of companies that hire engineers. For Data Science roles the process is tweaked a little to reflect the relative importance of the different areas under the Data Science umbrella. There are usually phone interviews (which involve coding) followed by onsite interviews. Onsite there are about 4–5 interviews: 2–3 of them may go deep on Data Science questions, research and models, and the rest test coding skills.

Important Reading

Source: Microsoft Blog

Like Google, Microsoft has its own AI School, which was released very recently. Its core AI platform is divided into three components: services, infrastructure and tools.

AI/Data Science Related Questions

Merge k (in this case k=2) arrays and sort them.
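
One way to approach the merge question above is a k-way merge over sorted runs. This is only a minimal sketch: the function name `merge_sorted` is my own, and it sorts each input first because the question does not say whether the arrays arrive sorted.

```python
import heapq


def merge_sorted(*arrays):
    """Merge any number of arrays into one sorted list."""
    # Sort each input in case it is not already ordered, then let
    # heapq.merge perform a k-way merge of the sorted runs.
    return list(heapq.merge(*(sorted(a) for a in arrays)))


print(merge_sorted([1, 4, 9], [2, 3, 10]))  # [1, 2, 3, 4, 9, 10]
```

If the inputs are known to be sorted already, the `sorted` calls can be dropped and the merge runs in O(n log k) for n total elements.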

How best to select a representative sample of search queries from 5 million?

Three friends in Seattle each told you it’s rainy. Each has a probability of 1/3 of lying. What’s the probability that Seattle is rainy?
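
The rain question is a Bayes' rule exercise, and the answer depends on a prior the question does not give. The sketch below makes two assumptions: the friends answer independently, and the prior probability of rain is something you choose (the 1/2 and 1/4 values are purely illustrative). The name `prob_rain` is hypothetical.

```python
from fractions import Fraction


def prob_rain(prior, n_friends=3, p_lie=Fraction(1, 3)):
    """Posterior P(rain | all n friends say 'rain') via Bayes' rule."""
    p_truth = 1 - p_lie
    # If it is raining, every friend must be telling the truth.
    says_rain_given_rain = p_truth ** n_friends
    # If it is dry, every friend must be lying.
    says_rain_given_dry = p_lie ** n_friends
    numerator = prior * says_rain_given_rain
    return numerator / (numerator + (1 - prior) * says_rain_given_dry)


print(prob_rain(Fraction(1, 2)))  # 8/9 with a 50/50 prior
print(prob_rain(Fraction(1, 4)))  # 8/11 with an assumed 25% prior
```

In an interview it is probably worth stating the prior out loud, since the final number moves with it.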

Can you explain the fundamentals of Naive Bayes? How do you set the threshold?

Can you explain what MapReduce is and how it works?

Can you explain SVM?

How do you detect whether a new observation is an outlier? What is the bias-variance trade-off?

Discuss how to randomly select a sample from a product user population.
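
If the user population is too large to hold in memory, or arrives as a stream of unknown length, reservoir sampling is one option worth discussing for the sampling question above. This is a sketch of the classic Algorithm R; `reservoir_sample` and its `seed` parameter are names I made up for illustration.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return k items chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1_000_000), 5, seed=0)
print(sample)
```

A uniform sample like this is not automatically representative of every user segment. If segments matter, stratifying first and sampling within each stratum may be the better answer.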

How do you implement autocomplete?
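
A common starting point for the autocomplete question is a prefix tree (trie). The sketch below is deliberately small and assumes suggestions are returned alphabetically; a production system would likely rank by popularity instead. The class and method names are my own.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False


class Autocomplete:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, limit=10):
        """Return up to `limit` stored words starting with `prefix`, alphabetically."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push children in reverse order so the smallest letter is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results


ac = Autocomplete(["data", "database", "date", "dog"])
print(ac.suggest("dat"))  # ['data', 'database', 'date']
```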

Describe how gradient boosting works.

Find the maximum of sub sequence in an integer list.

What would you do to summarize a Twitter feed?

Explain the steps for data wrangling and cleaning before applying machine learning algorithms.

How to deal with unbalanced binary classification?

How do you measure the distance between data points?
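
For the distance question, three measures tend to come up: Euclidean, Manhattan and cosine distance. The helpers below are a plain-Python sketch with names of my choosing, and they assume both points have the same number of dimensions (and, for cosine distance, that neither is the zero vector).

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus the cosine of the angle between the two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


print(euclidean((0, 0), (3, 4)))  # 5.0
print(manhattan((0, 0), (3, 4)))  # 7
```

Which one fits depends on the data: cosine distance ignores magnitude, which often suits text vectors, while Euclidean distance is sensitive to feature scale.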

Define variance.

What is the difference between a box plot and a histogram?

How do you solve the L2-regularized regression problem?
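
For the L2-regularized (ridge) regression question, the usual starting point is the closed-form solution. The symbols here are my own notation, not from the interview: X is the design matrix, y the target vector, w the weights and λ > 0 the regularization strength.

```latex
\min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2

% Setting the gradient to zero:
-2 X^\top (y - Xw) + 2 \lambda w = 0
\;\Longrightarrow\;
(X^\top X + \lambda I)\, w = X^\top y
\;\Longrightarrow\;
\hat{w} = (X^\top X + \lambda I)^{-1} X^\top y
```

Because λ > 0 makes the matrix being inverted positive definite, the solution exists even when the columns of X are collinear. For very large problems an iterative method such as gradient descent is a reasonable alternative to mention.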

How to compute an inverse matrix faster by playing around with some computational tricks?

How to perform a series of calculations without a calculator. Explain the logic behind the steps.

What is the difference between good and bad data visualization?

How do you find percentile? Write the code for it.
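
There are several percentile conventions, so it helps to say which one you are coding. The sketch below uses linear interpolation between the two nearest ranks of the sorted data; the function name and the 0 to 100 scale for `p` are assumptions on my part.

```python
def percentile(data, p):
    """p-th percentile (0 <= p <= 100) using linear interpolation between ranks."""
    if not data:
        raise ValueError("data must be non-empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(data)
    # Fractional position of the percentile in the sorted list.
    rank = (p / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


print(percentile([1, 2, 3, 4], 50))  # 2.5
```

Sorting makes this O(n log n). If the interviewer pushes on performance, a selection algorithm such as quickselect is the usual follow-up.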

Find max sum subsequence from a sequence of values.
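
Assuming "subsequence" here means a contiguous run of values, which is how this question is usually posed, Kadane's algorithm solves it in a single pass. The function name is hypothetical.

```python
def max_subarray_sum(values):
    """Largest sum over all non-empty contiguous runs (Kadane's algorithm)."""
    if not values:
        raise ValueError("values must be non-empty")
    best = current = values[0]
    for x in values[1:]:
        # Either extend the running subarray or start fresh at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


print(max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6, from 4, -1, 2, 1
```

If the interviewer really does mean a non-contiguous subsequence, the problem becomes simpler: sum the positive values, or take the single largest value when all are negative.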

What are the L1 and L2 regularization metrics, and how do they differ?

Create a function that checks if a word is a palindrome.
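
The palindrome check is short enough to write in a couple of lines. This sketch assumes the comparison should ignore case; whether to ignore spaces or punctuation is worth clarifying with the interviewer.

```python
def is_palindrome(word):
    """True if `word` reads the same forwards and backwards, ignoring case."""
    normalized = word.casefold()
    return normalized == normalized[::-1]


print(is_palindrome("Level"))      # True
print(is_palindrome("Microsoft"))  # False
```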

Reflecting on the Questions

Microsoft interviews include many open-ended questions whose solutions are open to interpretation. Many questions are also based on data presentation and visualization, which sets Microsoft apart from the other companies we have looked at previously. Data presentation and visualization is explained in this article, where I talk about how to prepare for such interviews.