DAVIS-2017: a multi-instance Video Object Segmentation challenge

In the previous post, Video Object Segmentation: The Basics, we went through the problem definition of Video Object Segmentation, its metrics and its nuances. We then covered the two main approaches that emerged to deal with the DAVIS-2016 video object segmentation dataset: MaskTrack and OSVOS. In this post we’ll see how these algorithms evolved to handle the more challenging DAVIS-2017 dataset.

In terms of accuracy, 2017 brought a significant leap in performance. For reference: OSVOS, the state of the art in 2016, got a Region Similarity score of ~46 on the 2017 challenge (with our best implementation), while the 2017 challenge winner achieved an impressive score of 67.9!
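As a reminder of what these numbers measure: Region Similarity is the Jaccard index (intersection over union) between a predicted mask and the ground-truth mask, reported on a 0-100 scale. The snippet below is a minimal pure-Python sketch of that idea, not the official evaluation code. The function names are our own, the masks are plain nested lists of 0/1 values, and treating two empty masks as a perfect match is an assumption made here for illustration.

```python
def region_similarity(pred, gt):
    """Jaccard index (intersection over union) of two binary masks.

    Both masks are equally sized nested lists of 0/1 values.
    """
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            intersection += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumption for this sketch: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_region_similarity(pairs):
    """Average J over (prediction, ground truth) pairs, scaled to 0-100."""
    scores = [region_similarity(p, g) for p, g in pairs]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [0, 0]]
    print(region_similarity(pred, gt))  # one shared pixel out of two covered
```

In a multi-instance setting such as DAVIS-2017, a score like this would be computed per object and then averaged; the exact averaging order used by the benchmark is not reproduced here.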

Breaking down the top 9 published works in the DAVIS-2017 challenge

Out of 22 participating teams, here are the top 9 with published results: