Many problems of contemporary interest in signal processing and machine learning involve highly non-convex optimization problems. While nonconvex problems are known to be intractable in general, simple local search heuristics such as (stochastic) gradient descent are often surprisingly effective at finding global optima on real or randomly generated data. In this talk I will discuss some results explaining the success of these heuristics by connecting convergence of nonconvex optimization algorithms to supremum of certain stochastic processes. I will focus on two problems.
The first problem, concerns the recovery of a structured signal from under-sampled random quadratic measurements. I will show that projected gradient descent on a natural nonconvex formulation finds globally optimal solutions with a near minimal number of samples, breaking through local sample complexity barriers that have emerged in recent literature. I will also discuss how these new mathematical developments pave the way for a new generation of data-driven phaseless imaging systems that can utilize prior information to significantly reduce acquisition time and enhance image reconstruction, enabling nano-scale imaging at unprecedented speeds and resolutions. The second problem is about learning the optimal weights of the shallowest of neural networks consisting of a single Rectified Linear Unit (ReLU). I will discuss this problem in the high-dimensional regime where the number of observations are fewer than the ReLU weights. I will show that projected gradient descent on a natural least-squares objective, when initialization at 0, converges at a linear rate to globally optimal weights with a number of samples that is optimal up to numerical constants.
Variance components (a.k.a. random/mixed effects) models are commonly used to determine genetic variance-covariance matrices of quantitative phenotypic traits in a population. The eigenvalue spectra of such matrices describe the evolutionary response to selection, but may be difficult to estimate from limited samples when the number of traits is large. In this talk, I will discuss the eigenvalues of classical MANOVA estimators of these matrices, including a characterization of the bulk empirical eigenvalue distribution, Tracy-Widom fluctuations at the spectral edges under a ``sphericity'' null hypothesis, the behavior of outlier eigenvalues under spiked alternatives, and a statistical procedure for estimating true population spike eigenvalues from the sample. These results are established using tools of random matrix theory and free probability.
We will discuss recent developments relating Poissonian soups of random walks à la Lawler-Werner, Le Jan and Sznitman, with Gaussian free fields. We will show how the underlying correspondence, which can be traced back to early work of Symanzik in constructive field theory, can be used to effectively study phase transitions in such systems.
Capital Fund Management (Paris) and UCLA Applied Mathematics
Tuesday, October 17, 2017 - 11:00am
Modern financial portfolio construction uses mean-variance optimisation that requiers the knowledge of a very large covariance matrix. Replacing the unknown covariance matrix by the sample covariance matrix (SCM) leads to disastrous out-of-sample results that can be explained by properties of large SCM understood since Marcenko and Pastur. A better estimate of the true covariance can be built by studying the eigenvectors of SCM via the average matrix resolvent. This object can be computed using a matrix generalisation of Voiculescu’s addition and multiplication of free matrices. The original result of Ledoit and Peche on SCM can be generalise to estimate any rotationally invariant matrix corrupted by additive or multiplicative noise. Note that the level of rigor of the seminar will be that of statistical physics.
Previous works have shown that arctic circle phenomenons and limiting behaviors of some integrable discrete systems can be explained by a variational principle. In this talk we will present the first results of the same type for a non-integrable discrete system: graph homomorphisms form Z^d to a regular tree. We will also explain how the technique used could be applied to other non-integrable models.