Rob Webber




Monday, November 21, 2022 - 4:00pm



RH 306

Kernel computations are used for prediction, interpolation, and clustering in data science and scientific computing applications, but applying kernel methods to a large number of data points N is expensive due to the high cost of manipulating the N x N kernel matrix. A basic approach for speeding up kernel computations is low-rank approximation, in which we replace the kernel matrix A with a factorized approximation that can be stored and manipulated more cheaply. When the kernel matrix A has rapidly decaying eigenvalues, mathematical existence proofs guarantee that A can be accurately approximated using a constant number of columns (without ever looking at the full matrix). Nevertheless, for a long time designing a practical and provably justified algorithm to select the appropriate columns proved challenging.

Recently, we introduced RPCholesky ("randomly pivoted" or "rocket-propelled" Cholesky decomposition), a natural algorithm for approximating an N x N positive semidefinite matrix using k adaptively sampled columns. RPCholesky can be implemented with just a few lines of code; it requires only (k + 1) N entry evaluations and O(k^2 N) additional arithmetic operations. In experiments, RPCholesky matches or improves on the performance of alternative algorithms for low-rank psd approximation. Moreover, RPCholesky provably achieves near-optimal approximation guarantees. The simplicity, effectiveness, and robustness of this algorithm strongly support its use for large-scale kernel computations.

Joint work with Yifan Chen, Ethan Epperly, and Joel Tropp. Available at arXiv:2207.06503.