Recent encouraging advances in computer vision and natural language understanding shed light on a very interesting yet challenging task: asking and answering questions about a given image (VQA). The study of this research problem is still in its infancy. Most existing VQA methods are neural network-based (NN-based) solutions that pursue...