Deep learning in vision and language intelligence

Xiaodong He
Microsoft Research
Thu, 05/25/2017 - 4:00pm - 5:00pm
Jack Xin
RH 306

Deep learning, which exploits multiple levels of data representation that give rise to hierarchies of concept abstraction, has been the driving force in the recent resurgence of Artificial Intelligence (AI). In this talk, I will summarize rapid advances in cognitive AI, particularly comprehension, reasoning, and generation across vision and natural language, and their applications in vision-to-text captioning, text-to-image synthesis, and reasoning grounded in images for question answering and dialog. I will also discuss future AI breakthroughs that will benefit from multi-modal intelligence, which empowers communication between humans and the real world and enables a wide range of scenarios such as a universal chat-bot and intelligent augmented reality.