Researchers from Tel Aviv University have developed a deep learning-based system that can automatically generate pictures of a finished meal from a simple text-based recipe.

The system is intended to help accelerate research in this field rather than to be used 'for real', but it is still an interesting use of AI. The problem faced by the researchers, Ori Bar El, Ori Licht and Netanel Yosephian, is that the relation between the recipe text (without its title) and the visual content of the image is vague, and the textual structure of recipes is complex, consisting of two sections (ingredients and instructions), both containing multiple sentences.

The data set consists of 52,000 written recipes and their corresponding images. Once trained, the system generated images of what the finished dish might look like from a long recipe text that described neither the visual content nor the title of the dish. Ori Bar El, one of the co-authors of the paper, said:

"Our system takes a recipe as an input and generates, from scratch, an image that reflects the food that the system ‘believes’ this recipe describes."

He added that because the text of the recipe is both long and does not describe the visual content of the image directly, a human would find the task very hard, never mind a computer.

The researchers used the recipe1M dataset to train and evaluate their model, which is based on the StackGAN-v2 architecture. This is a Stacked Generative Adversarial Network. A GAN combines two models that are trained to compete with each other: during training, both the generator G and the discriminator D are updated. G is optimized to produce images that follow the original data distribution, by generating images that the discriminator D finds difficult to tell apart from the true images. D is trained to distinguish between real images and the synthetic ones generated by G. The hardware chosen for the system was NVIDIA Titan X GPUs with the cuDNN-accelerated PyTorch deep learning framework.
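To make the competition between G and D concrete, here is a minimal sketch of the standard GAN losses in plain Python. It is an illustration under stated assumptions, not the researchers' code: the function names are hypothetical, the inputs are simply lists of probabilities that D assigns to images being real, and StackGAN-v2 itself uses several generator/discriminator pairs and extra conditioning terms that are not shown here.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail of this sketch


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss for D (hypothetical helper).

    d_real: D's probabilities of "real" for true images (ideally near 1).
    d_fake: D's probabilities of "real" for generated images (ideally near 0).
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating loss for G (hypothetical helper).

    The loss is low when D scores the generated images as real,
    i.e. when G has succeeded in fooling D.
    """
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # Made-up scores: D is fairly confident about both real and fake images.
    print(discriminator_loss([0.9, 0.8], [0.2, 0.1]))
    print(generator_loss([0.2, 0.1]))
```

In a real training loop the two losses are minimized in alternation, so an improvement in one model raises the loss of the other.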

The quality of the images generated by the system was tested using human judges who ranked photos of the actual food created by the recipes against the generated images. In some cases the real images were given a mark that was less than or equal to the mark given to the generated images. The researchers say the system is better at dishes such as pasta, rice or soups, and doesn't do as well at recipes where the end result has a distinctive shape such as a hamburger or chicken.
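The comparison described above can be sketched as a simple tally. The function name and the marks below are invented purely for illustration; they are not the paper's evaluation code or its results.

```python
def share_real_not_better(real_marks, generated_marks):
    """Fraction of recipes where the real photo's mark was less than
    or equal to the generated image's mark (hypothetical helper)."""
    if not real_marks or len(real_marks) != len(generated_marks):
        raise ValueError("need two non-empty lists of equal length")
    count = sum(1 for r, g in zip(real_marks, generated_marks) if r <= g)
    return count / len(real_marks)


if __name__ == "__main__":
    # Made-up marks for four recipes, one pair per recipe.
    real = [5, 4, 3, 4]
    generated = [3, 4, 4, 2]
    print(share_real_not_better(real, generated))
```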

More Information

Abstract On Arxiv



GILT: Generating Images from Long Text by Ori Bar El, Ori Licht, Netanel Yosephian Tel-Aviv University (pdf)

Related Articles

Better Face Detection in Amazon Rekognition

Pyro Now On Watson Machine Learning

iNaturalist Launches Deep Learning-Based Identification App

Facebook Shares Deep Learning Tools

Deep Learning Chess

ConvNetJS - Deep Learning In The Browser

Google's Deep Learning AI Knows Where You Live And Can Crack CAPTCHA
