Hi! I am Makarand Tapaswi. I will be joining the University of Toronto in October 2016 as a post-doctoral fellow. I was a PhD student at the Computer Vision for Human Computer Interaction (CVHCI) lab, Karlsruhe Institute of Technology, Germany.
My work is on helping machines analyze and understand stories seen in videos, particularly TV series and films. Since all stories revolve around characters, we focus strongly on analyzing them, answering typical questions through popular vision tasks (detection, clustering, identification) to break the complex video into understandable meta-data (e.g., who appears when).
I work particularly on augmenting such video meta-data with rich textual descriptions in the form of plot synopses (example) or books (for film adaptations) to perform semantic tasks such as story-based retrieval or weak label mining. Inspired by XKCD, we also worked on visualizing the storyline through character interactions. We have also built a large benchmark for evaluating machine understanding of stories by asking machines to answer questions about them. This benchmark, MovieQA, is now available for download.
If you have any questions, please feel free to contact me.
Two papers at CVPR 2016! Understanding movie stories through Question-Answering (Paper, arXiv, Project). Predicting class-attribute associations for zero-shot learning (Paper).
Paper on identifying actors using only subtitles. Download
MovieQA was featured on MIT Technology Review!