Shuhao Cao


University of Missouri–Kansas City


Monday, June 5, 2023 - 4:00pm to 5:00pm



RH 306

GPT, Stable Diffusion, AlphaFold 2, and other state-of-the-art deep learning models all use a neural architecture called the "Transformer". Since the publication of "Attention Is All You Need", the Transformer has become a ubiquitous architecture in deep learning. At the heart of the Transformer is the "attention mechanism". In this talk, we give a specific example addressing a fundamental and critical question: whether and how one can exploit the theoretical structure of a mathematical problem to develop task-oriented, structure-conforming deep neural networks. An attention-based deep direct sampling method is proposed for solving Electrical Impedance Tomography, a class of boundary value inverse problems. Progress made in different communities toward answering open problems on the mathematical properties of the attention mechanism in Transformers will be briefly reviewed. This is joint work with Ruchi Guo (UC Irvine) and Long Chen (UC Irvine).