As a Software Engineer Co-op at SnapTravel, my project was to use machine learning to introduce intent detection into the chatbot. By experimenting with different advanced ML algorithms (such as those used by teams at Google and Facebook), my project would allow the chatbot to determine the intent of natural language (such as when a user wanted to search for a hotel or cancel their booking) and respond accordingly, without the need for human involvement.

Like most machine learning work, my project was broken up into 4 steps:

1. data collection/data labelling
2. feature engineering
3. training of the machine learning model
4. model evaluation

Data collection/labelling produces the examples that are later used to train the machine learning model: phrases are gathered and then ‘tagged’ with labels. To do this I created an internal tool that allowed humans to review phrases the chatbot did not understand and tag each one with a label identifying its intent.

Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. Engineers usually put a lot of effort into feature engineering and try their best to select features that can uniquely represent each sentence.
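To make the idea of a feature concrete, here is a minimal sketch of one common text representation, a bag-of-words count vector. It is only an illustration: the example sentences and the helper names `build_vocabulary` and `bag_of_words` are made up for this post, and they are not the features actually used in the project.

```python
from collections import Counter

def build_vocabulary(sentences):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for s in sentences for tok in s.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def bag_of_words(sentence, vocabulary):
    """Return one count per vocabulary token; unknown tokens are ignored."""
    counts = Counter(sentence.lower().split())
    return [counts.get(tok, 0) for tok in vocabulary]

# Hypothetical phrases a user might send to the chatbot.
sentences = ["find me a hotel in Toronto", "cancel my booking"]
vocabulary = build_vocabulary(sentences)
vectors = [bag_of_words(s, vocabulary) for s in sentences]
```

Each sentence becomes a fixed-length list of numbers, which is the form a learning algorithm can consume. Real systems typically go further (n-grams, weighting, embeddings), but the principle is the same.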

Training (or experimenting with) machine learning models means taking the set of features (created through feature engineering) and the labels (from data labelling) for each sentence and using them to fit a machine learning algorithm.
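As a rough sketch of what this step looks like, the example below fits a tiny binary logistic regression classifier with plain stochastic gradient descent. Everything in it is assumed for illustration: the four labelled phrases, the two intents, the word-count features, and the learning rate and epoch count. A real project would more likely use an established ML library and far more data.

```python
import math

# Hypothetical labelled phrases: 1 = "search for a hotel", 0 = "cancel a booking".
training_data = [
    ("find me a hotel in toronto", 1),
    ("search hotels near the airport", 1),
    ("cancel my booking", 0),
    ("please cancel the reservation", 0),
]

# Features: one word-count column per distinct training token.
vocabulary = sorted({tok for sentence, _ in training_data for tok in sentence.split()})

def featurize(sentence):
    tokens = sentence.lower().split()
    return [float(tokens.count(tok)) for tok in vocabulary]

def score(features, weights, bias):
    """Linear score; positive means the model leans towards label 1."""
    return bias + sum(w * f for w, f in zip(weights, features))

def train(examples, learning_rate=0.5, epochs=200):
    """Fit weights by stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(vocabulary)
    bias = 0.0
    for _ in range(epochs):
        for sentence, label in examples:
            features = featurize(sentence)
            probability = 1.0 / (1.0 + math.exp(-score(features, weights, bias)))
            error = probability - label  # gradient of the loss w.r.t. the score
            weights = [w - learning_rate * error * f for w, f in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

def predict(sentence, weights, bias):
    return 1 if score(featurize(sentence), weights, bias) > 0 else 0

weights, bias = train(training_data)
```

After training, `predict("please cancel my booking", weights, bias)` can be used to classify a phrase the model has not seen in exactly that form; words outside the vocabulary are simply ignored.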

Finally, model evaluation is used to measure precision and recall. For a given intent, precision is the percentage of the model's predictions of that intent that are correct, and recall is the percentage of the sentences that truly have that intent that the model predicts correctly. In my project I applied two separate ML algorithms to the model, and then measured their precision and recall for intent detection in three different categories. As you can see, the machine learning model using features generated from Algo 1 performed better than the one using features from Algo 2.
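The two definitions can be written out directly. The sketch below computes precision and recall for one intent at a time; the label lists are invented purely to show the arithmetic and are not results from the project.

```python
def precision_recall(true_labels, predicted_labels, intent):
    """Return (precision, recall) for a single intent label."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(1 for t, p in pairs if t == intent and p == intent)
    predicted_positives = sum(1 for _, p in pairs if p == intent)
    actual_positives = sum(1 for t, _ in pairs if t == intent)
    # Guard against division by zero when an intent is never predicted or never present.
    precision = true_positives / predicted_positives if predicted_positives else 0.0
    recall = true_positives / actual_positives if actual_positives else 0.0
    return precision, recall

# Made-up example: what humans tagged versus what a model predicted.
true_labels = ["search", "search", "cancel", "cancel", "search", "other"]
predicted_labels = ["search", "search", "cancel", "search", "search", "other"]

for intent in ("search", "cancel", "other"):
    print(intent, precision_recall(true_labels, predicted_labels, intent))
```

In this toy data the model predicts "search" four times and is right three times (precision 0.75), and it finds all three real "search" phrases (recall 1.0). For "cancel" it is the other way around: its one prediction is right (precision 1.0), but it finds only one of the two real "cancel" phrases (recall 0.5).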

In addition to these machine learning models & algorithms, used by teams at Google and Facebook, I also had the opportunity to apply statistics theory (like chi-square and information gain) to conduct feature selection. I solved classification problems with everything from Logistic Regression models to Support Vector Machine and Tree algorithms, and experimented with Neural Network and deep learning algorithms. This project, and my co-op placement in general, gave me the opportunity to learn an interesting field of engineering, apply industry best practices, and have a significant impact on a product serving millions of users worldwide.