is changing fast

We propose a novel approach based on recurrent neural networks for the challenging task of answering of questions about images. It combines a CNN with a LSTM into an end-to-end architecture that predict answers conditioning on a question and an image.

To align movies and books we exploit a neural sentence embedding that is trained in an unsupervised way from a large corpus of books, as well as a video-text neural embedding for computing similarities between movie clips and sentences in the book.

We show that using the same number of training images, features learnt using egomotion as supervision compare favourably to features learnt using class-label as supervision on the tasks of scene recognition, object recognition, visual odometry and keypoint matching.

We introduce a deep convolutional architecture that yields patch-level descriptors, as an alternative to the popular SIFT descriptor for image retrieval.

We show that a sparse coding model particularly designed for super-resolution can be incarnated as a neural network, and trained in a cascaded structure from end to end.

In this work we show how to predict boundaries by exploiting object level features from a pretrained object-classification network.

A novel deep visual correspondence embedding model is trained via Convolutional Neural Network on a large set of stereo images with ground truth disparities. This deep embedding model leverages appearance data to learn visual similarity relationships between corresponding image patches, and explicitly maps intensity values into an embedding feature space to measure pixel dissimilarities.

We present a system which can recognize the contents of your meal from a single image, and then predict its nutritional contents, such as calories.

How can one write an objective function to encourage a representation to capture, for example, objects, if none of the objects are labeled?

We introduce a stochastic and differentiable decision tree model, which steers the representation learning usually conducted in the initial layers of a (deep) convolutional network.

Computer Vision used to be cleanly separated into two schools: geometry and recognition. Geometric methods like structure from motion and optical flow usually focus on measuring objective real-world quantities like 3D "real-world" distances directly from images and recognition techniques like support vector machines and probabilistic graphical models traditionally focus on perceiving high-level semantic information (i.e., is this a dog or a table) directly from images.The world of computer visionhas changed. We now have powerful convolutional neural networks that are able to extract just about anything directly from images. So if your input is an image (or set of images), then there's probably a ConvNet for your problem. While you do need a large labeled dataset, believe me when I say that collecting a large dataset is much easier than manually tweaking knobs inside your 100K-line codebase. As we're about to see, the separation between geometric methods and learning-based methods is no longer easily discernible.By 2016 just about everybody in the computer vision community will have tasted the power of ConvNets, so let's take a look at some of the hottest new research directions in computer vision.This December in Santiago, Chile, the International Conference of Computer Vision 2015 is going to bring together the world's leading researchers in Computer Vision, Machine Learning, and Computer Graphics.To no surprise, this year's ICCV is filled with lots of ConvNets, but this time the applications of these Deep Learning tools are being applied to much much more creative tasks. Let's take a look at the following, which will hopefully give you a taste of where the field is going.Mateusz Malinowski, Marcus Rohrbach, Mario FritzYukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja FidlerPulkit Agrawal, Joao Carreira, Jitendra MalikMattis Paulin, Matthijs Douze, Zaid Harchaoui, Julien Mairal, Florent Perronin, Cordelia SchmidZhaowen Wang, Ding Liu, Jianchao Yang, Wei Han, Thomas HuangGedas Bertasius, Jianbo Shi, Lorenzo TorresaniZhuoyuan Chen, Xun Sun, Liang Wang, Yinan Yu, Chang HuangAustin Meyers, Nick Johnston, Vivek Rathod, Anoop Korattikara, Alex Gorban, Nathan Silberman, Sergio Guadarrama, George Papandreou, Jonathan Huang, Kevin P. MurphyCarl Doersch, Abhinav Gupta, Alexei A. EfrosPeter Kontschieder, Madalina Fiterau, Antonio Criminisi, Samuel Rota Bulò