To avoid “reinventing the wheel,” programmers typically search popular discussion forums such as Stack Overflow to find examples that can guide them on particular coding tasks. This approach works most of the time, and especially when the coding task is a popular one. However, when it comes to a niche task or uncommon programming languages or APIs, a simple keyword search on Google or Stack Overflow may not provide the answers programmers are looking for.

Last week, Facebook published a blog article demonstrating their newly developed code search tool, Neural Code Search (NCS). Powered by neural networks, NCS takes natural language queries and returns relevant code snippets retrieved directly from a large codebase. Facebook introduced the NCS model along with a supervised extension, UNIF.

As the diagram below illustrates, the NCS model uses embeddings to map code snippets and natural language queries into the same vector space. It then computes the cosine similarity between the query vector and each embedded code snippet, and returns the highest-scoring snippets as the output.

NCS model generation and search retrieval processes
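The retrieval step can be sketched in a few lines of Python. This is a minimal illustration rather than Facebook's implementation: the three-dimensional embedding table, the snippet names, and the plain averaging of token vectors are all assumptions made for the example (the NCS model learns its embeddings from the search corpus).

```python
import math

# Hypothetical toy embeddings; a real model learns these from the corpus.
TABLE = {
    "close": [1.0, 0.0, 0.1],
    "hide": [0.9, 0.1, 0.0],
    "keyboard": [0.8, 0.2, 0.0],
    "input": [0.7, 0.3, 0.1],
    "read": [0.0, 1.0, 0.1],
    "file": [0.1, 0.9, 0.0],
    "stream": [0.0, 0.8, 0.2],
}

# Hypothetical code snippets, already reduced to bags of tokens.
SNIPPETS = {
    "hideKeyboard": ["hide", "keyboard", "input"],
    "readFile": ["read", "file", "stream"],
}


def embed(tokens, table):
    """Average the vectors of known tokens (a simplifying assumption)."""
    vectors = [table[t] for t in tokens if t in table]
    dim = len(next(iter(table.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, snippets, table, top_k=1):
    """Rank snippets by cosine similarity to the embedded query."""
    q = embed(query.lower().split(), table)
    scored = [(cosine(q, embed(tokens, table)), name)
              for name, tokens in snippets.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


print(search("close keyboard", SNIPPETS, TABLE))
```

Note that the query word "close" never appears in the `hideKeyboard` snippet; the match comes from the two vectors lying close together in the shared space.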

A newly created dataset based on public queries on Stack Overflow was used to evaluate NCS performance, which was comparable to the traditional IR (Information Retrieval) benchmark method BM25.

Evaluation results for NCS and BM25. “Answered@n” indicates whether a correct answer was retrieved among the top 1, 5, or 10 results. MRR (mean reciprocal rank) is a popular IR evaluation metric that averages, over all queries, the reciprocal of the rank at which the first correct answer was found.
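Both metrics are simple to compute once the rank of the first correct result is known for each query. The sketch below uses made-up ranks, not figures from the evaluation, and assumes that a query with no correct result contributes zero to MRR; the papers may handle that case differently.

```python
def answered_at(ranks, n):
    """Count queries whose first correct result is within the top n."""
    return sum(1 for r in ranks if r is not None and r <= n)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank; unanswered queries (None) contribute 0."""
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Hypothetical ranks of the first correct answer for five queries.
# None means no correct answer was retrieved at all.
ranks = [1, 3, None, 7, 2]

for n in (1, 5, 10):
    print(f"Answered@{n}: {answered_at(ranks, n)}")
print(f"MRR: {mean_reciprocal_rank(ranks):.3f}")
```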

As an unsupervised model, NCS can be trained quickly and easily to learn embeddings directly from the search corpus. NCS, however, struggles when the query and the source code have no words in common. To solve that problem, the researchers added the Embedding Unification model (UNIF).

There are two main differences between NCS and UNIF processes:

First, instead of using a single token embedding matrix (T) for both code and query input, UNIF produces two embedding matrices, one for code tokens (Tc) and one for query tokens (Tq). Second, instead of using TF-IDF weighting for the code token embeddings, UNIF adopts a learned attention-based weighting scheme.

NCS and UNIF model architecture comparisons
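The difference between the two weighting schemes can be shown with a small sketch. Everything here is illustrative: the two-dimensional embedding tables, the IDF values, and the attention vector are invented, and the softmax over dot products is one common way to build attention weights, assumed here rather than taken from the article. In UNIF the attention parameters would be learned from labelled data.

```python
import math


def normalize(weights):
    """Scale weights to sum to 1; fall back to uniform if all are zero."""
    total = sum(weights)
    if total == 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def tfidf_weights(tokens, idf):
    """NCS-style weights: each token occurrence is weighted by its IDF."""
    return normalize([idf.get(t, 0.0) for t in tokens])


def attention_weights(vectors, attention_vector):
    """UNIF-style weights: softmax over dot products with an attention vector."""
    scores = [sum(x * y for x, y in zip(v, attention_vector)) for v in vectors]
    peak = max(scores)  # subtract the max for numerical stability
    return normalize([math.exp(s - peak) for s in scores])


def combine(vectors, weights):
    """Weighted sum of token vectors, giving one document vector."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


# Hypothetical values. UNIF keeps separate tables for code and query tokens.
T_CODE = {"get": [1.0, 0.0], "file": [0.0, 1.0]}
T_QUERY = {"read": [0.1, 0.9], "file": [0.0, 1.0]}
IDF = {"get": 0.1, "file": 2.0}
ATTENTION = [0.0, 2.0]  # would be learned during supervised training

code_tokens = ["get", "file"]
code_vectors = [T_CODE[t] for t in code_tokens]

ncs_doc = combine(code_vectors, tfidf_weights(code_tokens, IDF))
unif_doc = combine(code_vectors, attention_weights(code_vectors, ATTENTION))

# The query side is a plain average over the query table (an assumption).
query_vectors = [T_QUERY[t] for t in ["read", "file"]]
query_vec = combine(query_vectors, normalize([1.0] * len(query_vectors)))

print(ncs_doc, unif_doc, query_vec)
```

The TF-IDF weights are fixed by corpus statistics, whereas the attention vector is a trainable parameter, which is why UNIF can benefit from labelled query and code pairs.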

UNIF’s advantage is that it improves on NCS when good labelled data is available for training. As the comparison demonstrates, UNIF significantly outperforms NCS in “number of questions answered.”

NCS and UNIF model evaluation results

As the number of publicly available code repositories continues to grow, we can expect machine learning to play an increasing role in improving the coding experience. Facebook previously released its code recommendation tool Aroma and its automated bug-fixing tool Getafix, and we can expect these, along with NCS and UNIF, to be integrated into popular IDEs in the near future.

More details on the neural code search models can be found in the papers “Retrieval on Source Code: A Neural Code Search” and “When Deep Learning Met Code Search,” and on the Facebook Artificial Intelligence blog.