Deep learning is a rapidly developing area of machine learning. It uses artificial neural networks to perform learning tasks. The architecture of these networks is inspired by the neural networks found in the human brain. Modern algorithms based on deep learning often outperform all other available machine learning methods. Deep learning is at the core of much of today's software and hardware used in speech recognition, natural language processing (e.g. Siri, Alexa), recommendation systems, image processing, and many other practical applications.
The mathematical description of neural networks is simple and elegant: neural networks compute iterated compositions of linear and non-linear maps. However, a mathematical theory of deep learning remains elusive. Even the most basic questions about neural networks remain open. For example, how many different functions can a neural network compute? This problem is loosely inspired by questions about the brain: how many thoughts can the brain process? How many things can we remember? What neuronal connections should form in the brain to maximize its capacity? Pierre Baldi (UCI CS) and I proved a general capacity formula, which is valid for all fully connected networks. The formula predicts, counterintuitively, that shallow networks have greater capacity than deep ones. So, the mystery remains.
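To make the phrase "compositions of linear and non-linear maps" concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the threshold activation, the small grid of weights, and all names (`linear`, `threshold`, `network`, `XOR_LAYERS`, `count_single_neuron_functions`) are hypothetical choices, not taken from the work described above. The brute-force count at the end is a toy version of the question "how many functions can a network compute?" and is not the capacity formula itself.

```python
from itertools import product


def linear(weights, bias, x):
    """Affine map: one output coordinate per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def threshold(v):
    """Non-linear map applied coordinate-wise (assumed activation): 1 if positive, else 0."""
    return [1 if t > 0 else 0 for t in v]


def network(layers, x):
    """Iterated composition: apply (linear, then non-linear) once per layer."""
    for weights, bias in layers:
        x = threshold(linear(weights, bias, x))
    return x


# A two-layer network computing XOR: the hidden units are OR and NAND,
# and the output unit is their AND.
XOR_LAYERS = [
    ([[1, 1], [-1, -1]], [-0.5, 1.5]),
    ([[1, 1]], [-1.5]),
]


def count_single_neuron_functions():
    """Count distinct Boolean functions of two inputs that one threshold
    neuron computes as its weights range over a small illustrative grid."""
    inputs = list(product([0, 1], repeat=2))
    seen = set()
    for w1, w2 in product([-1, 0, 1], repeat=2):
        for b in [-1.5, -0.5, 0.5, 1.5]:
            layers = [([[w1, w2]], [b])]
            table = tuple(network(layers, list(x))[0] for x in inputs)
            seen.add(table)
    return len(seen)


if __name__ == "__main__":
    print([network(XOR_LAYERS, [a, b])[0] for a, b in product([0, 1], repeat=2)])
    print(count_single_neuron_functions())
```

In this toy setting a single neuron realizes 14 of the 16 Boolean functions of two inputs. The two it misses are XOR and its negation, and the two-layer network above shows that adding one layer recovers XOR. Exhaustive counting of this kind becomes infeasible very quickly as networks grow, which is one reason a general formula is of interest.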