High Dimensional Probability and Statistics

Course Description

This course covers non-asymptotic, high-dimensional probability and statistical theory, which plays a fundamental role in modern data analysis, machine learning, and scientific computing. Typical applications include sparse linear regression, principal component analysis, and randomized numerical algorithms. The course is designed to give graduate students a thorough grounding in the statistical tools for high-dimensional data analysis. Examples and applications in data analysis are also provided.

Content

  • Concentration inequalities for random variables via the Chernoff method, the entropy method, and the transportation method

  • Bounds for the expectation of suprema

  • Uniform law of large numbers via VC dimension

  • Random matrix concentration and applications

  • Information theory basics and minimax lower bounds
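
As a small illustration of the first topic, the sketch below compares Hoeffding's inequality (a standard consequence of the Chernoff method) with a simulated tail probability for the mean of fair coin flips. It is not taken from the course materials: the function names, the sample size n = 200, the deviation t = 0.1, and the number of trials are all assumptions chosen for illustration.

```python
import math
import random


def hoeffding_bound(n, t):
    """Two-sided Hoeffding bound for the mean of n independent variables in [0, 1]:
    P(|mean - E[mean]| >= t) <= 2 * exp(-2 * n * t^2)."""
    return 2.0 * math.exp(-2.0 * n * t * t)


def empirical_tail(n, t, trials, seed=0):
    """Fraction of trials in which the mean of n fair coin flips
    deviates from 1/2 by at least t (illustrative helper)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        # |heads/n - 1/2| >= t, written in terms of counts
        if abs(heads - n / 2) >= t * n:
            hits += 1
    return hits / trials


# Assumed illustrative parameters, not values from the course.
n, t = 200, 0.1
bound = hoeffding_bound(n, t)
freq = empirical_tail(n, t, trials=2000)
print(f"empirical tail: {freq:.4f}, Hoeffding bound: {bound:.4f}")
```

The bound holds for every finite n, which is the non-asymptotic viewpoint the course takes. The simulated frequency should sit well below the bound, since Hoeffding's inequality is not tight for this distribution.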

Main References

  • High-Dimensional Statistics: A Non-Asymptotic Viewpoint by Martin J. Wainwright

  • High-Dimensional Probability: An Introduction with Applications in Data Science by Roman Vershynin

  • Probability in High Dimension by Ramon van Handel

  • Concentration Inequalities: A Nonasymptotic Theory of Independence by Stéphane Boucheron, Gábor Lugosi, and Pascal Massart