ELI5: Give Pythia an image, fire questions at it, and get them answered. Sounds interesting, doesn’t it? But Pythia can do more than this…

Pythia is a new multimodal research framework, built on top of PyTorch, for supercharging vision and language tasks.

Is there a reason behind naming it Pythia?

‘The name ‘Pythia’ is an homage to the Oracle of Apollo at Delphi, who answered questions in Ancient Greece.’

Lycurgus Consulting the Pythia

Ever wondered how great it would be to have a single framework that handles both NLP and vision tasks easily? That is what Pythia does. It is designed for answering questions about visual data and for automatically generating image captions.

CloudCV has a demo of Pythia up and running, which you can try out.

Pythia incorporates elements of Facebook AI Research (FAIR)’s winning entries in recent AI competitions (the VQA Challenge 2018 and the VizWiz Challenge 2018), built by FAIR’s A-STAR (Agents that See, Talk, Act, and Reason) team.

Visual Question Answering (VQA) Challenge: Given an image and a natural language question about the image, the task is to provide an accurate natural language answer.
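
To give a feel for what "accurate" can mean here: the VQA benchmark is usually described as comparing a predicted answer against ten human answers for the same question, with full credit once at least three humans gave that answer. The sketch below is a simplified version of that idea and is not Pythia's own code. The function name `vqa_accuracy` and the sample answers are made up for illustration, and the official evaluation applies more answer normalization than the lowercase-and-strip used here.

```python
from collections import Counter


def vqa_accuracy(predicted, human_answers):
    """Simplified VQA-style score: min(number of matching humans / 3, 1).

    Assumption: answers are compared after lowercasing and trimming
    whitespace only; the official evaluation normalizes more aggressively.
    """
    def normalize(text):
        return text.strip().lower()

    # Count how many annotators gave each (normalized) answer.
    counts = Counter(normalize(answer) for answer in human_answers)
    matches = counts[normalize(predicted)]
    # Full credit once three or more humans agree with the prediction.
    return min(matches / 3.0, 1.0)


# Hypothetical annotations for the question "What color is the car?"
human_answers = ["blue"] * 8 + ["red"] * 2

print(vqa_accuracy("Blue", human_answers))   # eight humans agree
print(vqa_accuracy("red", human_answers))    # only two humans agree
print(vqa_accuracy("green", human_answers))  # nobody agrees
```

An answer that only two annotators gave earns partial credit, which is how the metric tolerates questions where several answers are reasonable.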