High-Dimensional Probability

An Introduction with Applications in Data Science

2019 PROSE Award for Mathematics

Who is this book for?

This textbook is aimed at doctoral students, advanced master's students, and beginning researchers in mathematics, statistics, computer science, electrical engineering, and related fields, who seek to deepen their understanding of probabilistic methods commonly used in modern data science research. It can be used for self-study or as a textbook for a second probability course with data science applications.

Why this book?

Data science is evolving rapidly, and probabilistic methods are key to these advances. A typical graduate probability course no longer provides the mathematical sophistication needed for early-career data science researchers. This book aims to fill that gap, presenting essential probabilistic methods and results for mathematical data scientists.

Are you ready?

To read this book, you will need a solid knowledge of probability theory (at the master's or doctoral level), strong undergraduate linear algebra, and some familiarity with metric, normed, and Hilbert spaces. Measure theory is not required.

Roman Vershynin

I am Professor of Mathematics at the University of California, Irvine. My research spans high-dimensional probability and mathematical data science.

Take a look at my webpage to learn more. E-mail me.

NEW! The draft of the second edition is now online:

Download the Second Edition

Want to get notified when the printed version is out? Email me: rvershyn@uci.edu

The first edition in printed form can be purchased on Amazon and from Cambridge University Press. It is freely available online:

Download the First Edition

News

November 28, 2025. A couple more typos are corrected.
October 24, 2025. The proofs are returned to the publisher.
October 20, 2025. More typos and inaccuracies have been fixed, and the references have been cleaned. Thank you for your input!
August 20, 2025. I've fixed a few more typos and inaccuracies -- thanks again for your feedback! The book is now in the copy-editing stage. The printed version is expected in March 2026.
May 27, 2025. A few typos and inaccuracies are fixed -- thanks for the feedback! The book is about to go into copy-editing.
May 12, 2025. The draft of the second edition is posted.
May 20, 2024. Lots of minor corrections made.
June 9, 2020. Further typos and inaccuracies have been fixed.
February 12, 2020. More typos, inaccuracies and gaps have been fixed in the electronic version of the book. The printed version sold on Amazon and the Cambridge University Press website will also be updated accordingly.
May 24, 2019. Lots of typos, inaccuracies and gaps have been fixed in the electronic version of the book.
September 30, 2018. The book is published. Buy it on Amazon.
August 21, 2018. The book is in press now. It is going to be released in September. Stay tuned!
June 7, 2018. Minor corrections made in the first proofs.
May 25, 2018. The book is available for pre-order on Amazon.
April 18, 2018. Multiple minor corrections at the copy-editing stage. The book is going to production now.
February 9, 2018. Final polishing (including more references) in response to reader feedback. The book is about to go into copy-editing.
January 23, 2018. More polishing was done. Many figures look nicer thanks to my student Jennifer Bryson.
December 27, 2017. Thanks to the feedback of the readers, multiple clarifications and corrections were incorporated.
August 24, 2017. The entire book is now ready to be published. The preface may still be expanded a bit, but the technical material is complete.
June 8, 2017. Chapter 11, and thus the whole book, is now polished. I will make one more (third) pass over the book, adding some exercises and the preface.
June 7, 2017. Chapter 10 is now polished. Section 10.5.2 on the restricted isometry property is added.
June 2, 2017. Chapters 8 and 9 are now polished. Section 9.2.3 is added, where we quickly derive Koltchinskii-Lounici bounds on covariance estimation from the matrix deviation inequality.
May 23, 2017. An "Appetizer" has been added to the front of the book. It presents Maurey's empirical method, an elegant and elementary application of probability to bound covering numbers of sets. Chapter 7 is now polished.
April 27, 2017. Chapter 6 is now polished.
April 20, 2017. Chapter 5 is now polished. I cleaned up the guarantees of covariance estimation, both in this chapter and those that appeared earlier in Chapter 4.
February 23, 2017. Chapter 4 is now polished. I added an application to error-correcting codes in Section 4.3 and rewrote the application to covariance estimation in Section 4.7.
February 9, 2017. Chapter 3 is now polished. I added a section (3.7) on kernel methods and Krivine's proof of Grothendieck's inequality, which gives (almost) the best known bound on the constant.
January 20, 2017. Chapter 2 is now polished.
January 4, 2017. Chapter 1 has been polished. The difficulty of exercises will be indicated by the number of coffee cups one may need to solve them.
December 21, 2016. Numerous typos and inaccuracies fixed throughout the book. It was then converted into the publisher's style, which miraculously reduced the number of pages by 50!
December 20, 2016. A short version of this book, condensed into just four lectures, can be found here.
November 15, 2016. Two big sections are added in Chapter 8: VC dimension and applications in statistical learning theory.
October 24, 2016. A few applications are added to Chapter 3: Grothendieck's inequality, semidefinite programming, and maximum cut for graphs.