The hype around ML has led to a dramatic increase in research and interest in the field. In this post, I’m sharing my way of keeping track of the latest research and trends.

The number of publications in ML is growing exponentially. The following chart from Jeffrey Dean, 2020, shows a roughly 30x increase since 2009. It has become harder than ever to keep up with the trends and research progress in the field. How do you stay up to date?

Number of ML publications on arXiv, taken from Dean, 2020

In the following, we will first have a look at how others keep track of the field, and then I’ll share the tips and tricks I have collected over the last few years working at WhatToLabel.

How do other researchers keep up with the latest research?

In 2019 we conducted a survey with authors of publications accepted to conferences such as CVPR, ICML, ICCV, NeurIPS. Fifty researchers from academia, as well as industry research labs, took part in the survey.

From our survey conducted in March 2019

Three sources stand out: arXiv, conferences, and Google Scholar.

arXiv hosts preprints of publications and is publicly accessible. Combined with ML researchers’ strong interest in sharing and spreading their work openly, this makes it one of the top places to find papers.
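As a side note, arXiv also exposes a public Atom API, which makes it easy to script a quick check for the newest submissions in a category. Below is a minimal sketch in Python; the function names are my own, and the category `cs.LG` and result count are just example parameters:

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def build_query_url(category="cs.LG", max_results=5):
    """Build an arXiv API URL for the newest submissions in a category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": str(max_results),
    }
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_titles(atom_xml):
    """Extract the paper titles from the Atom feed the API returns."""
    root = ET.fromstring(atom_xml)
    return [entry.findtext(ATOM_NS + "title")
            for entry in root.iter(ATOM_NS + "entry")]


# Fetching the feed is then a single call, e.g.:
#   import urllib.request
#   feed = urllib.request.urlopen(build_query_url("cs.CV", 10)).read()
#   print(parse_titles(feed))
```

Wrapping something like this in a daily cron job gives you a lightweight, personal "new papers" digest without visiting the site.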

Conferences are probably the best place to meet and exchange ideas with other people interested in the field. Additionally, they list all accepted publications and make them publicly accessible. I have added the links to CVPR 2019, ICML 2019, NeurIPS 2019, and ICLR 2020 for you. You can also find recordings of the conference presentations online. I listed some here: CVPR has its own YouTube channel, for ICML you can simply search on YouTube for “ICML 2019” to find relevant presentations, and the same goes for NeurIPS.

Google Scholar is probably the most extensive database of publications related to ML. It allows us to search within citations (as shown in one of my tips and tricks below) as well as to filter by publication date. For example, this makes it easy to find all publicly available papers citing BERT that appeared after 2020.

One must not forget the importance of colleagues working in the same field. I have regular exchanges with friends from all over the world to discuss recent papers we found interesting. One way to find like-minded people is by attending local meetup groups.

What I find interesting is the growing relevance of GitHub. I’ve seen more and more repositories curating lists of “Awesome papers …”, such as awesome-deep-vision and awesome-deep-learning-papers. I don’t expect these sources to always be up to date, but it can be convenient to have a “summary” of the more impactful papers in one place.

Tips and Tricks I find very useful

I will summarize some points which helped me increase my efficiency. The main channels I use are Twitter, Reddit, and Google Scholar’s citation search.

Create a Twitter account and follow other researchers

Even though Twitter only got a few votes in our 2019 survey, I’ve found that it helps me a lot to stay up to date. Most ML researchers relevant to my field tweet about their latest research papers. By simply following them and checking my Twitter account a couple of times a week, I can keep track of their work. Additionally, they often retweet new papers from the field that they like. So by following just a few dozen researchers, you already get a good Twitter feed of new and interesting papers. If you’re a researcher and don’t have a Twitter account yet, create one and let your colleagues stay updated about your work.

Use Reddit, not only to find new papers but also to discuss them

One of the things I like about Reddit is that people are more direct and honest when it comes to giving feedback on other people’s work. I want to highlight the Machine Learning subreddit, which has almost one million subscribers. Not only will you find many new and interesting publications, but also critiques and thoughts from others in the comment section. There is another cherry on top when using Reddit: you often find papers that are not directly related to your field. As someone working in computer vision, I also come across papers about NLP or speech recognition. On a personal level, I very much appreciate this, since it allows me to see research patterns across data types and industries. Additionally, it gives me an overview of the general advancements in self-supervised learning. In 2018, BERT showed huge success in NLP using self-supervised pre-training. In 2019, that breakthrough crossed over into computer vision, where it proved just as invaluable.
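If you prefer a scripted digest over browsing, Reddit serves any listing as JSON when you append `.json` to the URL. The sketch below is one possible way to pull the week’s top posts from the Machine Learning subreddit; the helper name and the User-Agent string are my own choices:

```python
import json
import urllib.request

# Top posts of the week from r/MachineLearning as a JSON listing.
LISTING_URL = "https://www.reddit.com/r/MachineLearning/top.json?t=week&limit=10"


def extract_posts(listing_json):
    """Pull (title, score) pairs out of a Reddit listing payload."""
    data = json.loads(listing_json)
    return [(post["data"]["title"], post["data"]["score"])
            for post in data["data"]["children"]]


# Fetching requires a descriptive User-Agent, otherwise Reddit tends to
# reject the request:
#   req = urllib.request.Request(
#       LISTING_URL, headers={"User-Agent": "paper-digest/0.1"})
#   print(extract_posts(urllib.request.urlopen(req).read()))
```

The score gives a rough crowd signal for which papers sparked the most discussion that week.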

Use Google Scholar’s “search within citing articles” feature

Google Scholar is, for me, one of the most important tools for finding other papers relevant to my current research. Let’s assume we want to search for a specific text within articles citing SimCLR on Google Scholar. Simply toggle the checkbox “Search within citing articles” and you’ll be searching within articles citing SimCLR. arXiv has a somewhat similar feature to search within citations; however, my personal preference has settled on Google Scholar search.

Toggle the checkbox to search within citing articles on Google Scholar

Check whether the paper has been accepted at a conference

I often fall into this trap myself. arXiv has become one of the hot spots for ML papers. However, submitting to arXiv is very easy, and sometimes even too easy: there is no peer-review process as you would have at well-known conferences; you simply need an “endorsement” from someone already registered. This has its pros and cons. On one hand, experiments whose results would not be sufficient for a conference can still appear on arXiv. Such less successful experiments do not enjoy the same exposure as their successful counterparts, yet they can provide valuable learnings, and this simple platform makes them publicly accessible. On the other hand, work with flawed experiments or wrong numbers and results can also appear on arXiv. To stay cautious of this disadvantage, it’s always good to quickly check whether the paper you found on arXiv has also been accepted at a conference in the field.

Check paper reviews

For some papers, you’ll find feedback on OpenReview. As an example, I’ve linked the feedback on Progressive GAN from Karras et al., 2018, for you here. Not all of the feedback may be useful to you, since you’re not an author of the publication. But it can still help you understand some parts in more detail and see how other work differs from it.

How do you keep track of relevant research and trends in your field? Please share your tips and tricks in the comments.

Igor, co-founder

whattolabel.com