AeroAstro-CSE PhD Thesis Defense: Aimee Maurais
Abstract:
Sampling from a target probability distribution is fundamental to modern computational science and machine learning. Sampling is the essence of Monte Carlo integration, enables uncertainty quantification in Bayesian inference, and underlies generative models that can synthesize convincing text, images, and far more. A powerful, emerging approach to sampling is dynamic measure transport (DMT): the idea is to design an ordinary or stochastic differential equation that evolves samples from a tractable reference distribution (e.g., a Gaussian) to the desired target distribution. DMT is the state of the art in generative modeling and underlies techniques such as diffusion models and flow matching, but DMT pipelines for density-driven sampling tasks, such as those arising in computational chemistry and Bayesian inference, have seen significantly less development. In this thesis we develop tractable algorithms for density- and likelihood-driven dynamic measure transport and introduce frameworks to enable effective design and structure exploitation within these and other DMT approaches.
In the first chapter, we introduce a family of gradient-free interacting particle systems for density-driven sampling that are available in closed form and can be used even in Bayesian inference settings where only samples from a prior and evaluations of the likelihood (rather than access to the full posterior density) are available. We demonstrate that these particle systems effectively sample from distributions with features such as multimodality, anisotropy, and concentration on manifolds, which comparable interacting particle system algorithms that rely on parametric approximations of the target cannot sample faithfully. Mathematically, our interacting particle systems possess Fisher-Rao gradient flow characterizations and, at the same time, can be interpreted through the lens of optimal transport.
In the second chapter, we study choices of paths of distributions for density-driven DMT and highlight that certain off-the-shelf paths can be highly unsuitable for DMT when the reference distribution is unimodal and the target is multimodal. The presence of “teleportation behavior” along such paths can lead, via mode collapse, to catastrophically poor sampling performance in practice. We introduce a PDE-constrained optimization framework to enable principled design of paths and demonstrate that our framework can correct teleportation behavior and yield paths corresponding to smooth velocity fields that are tractable to approximate and avoid mode collapse.
Finally, in the third chapter, we demonstrate how sparse conditional dependence structure can be exploited to scale DMT-based sampling approaches to high dimensions. We show, under certain regularity assumptions, that sparse conditional dependence in the target and reference distributions implies approximately sparse variable dependence in the velocity fields for transport over two commonly used paths of measures, and we introduce computational approaches for learning velocities with sparse variable dependence in practice. We demonstrate that imposing this sparse dependence structure on DMT velocity fields can lead to a favorable bias-variance tradeoff when the data or computational resources available for learning the velocity fields are limited, and can result in better sampling performance.
Thesis Committee Members:
- Professor Youssef M. Marzouk, Breene M. Kerr (1951) Professor, Department of Aeronautics & Astronautics, MIT (Chair)
- Professor Philippe Rigollet, Cecil and Ida Green Distinguished Professor, Department of Mathematics, MIT
- Professor Justin Solomon, Associate Professor, Department of Electrical Engineering and Computer Science, MIT
- Professor Benjamin Peherstorfer, Associate Professor, Courant Institute of Mathematical Sciences, New York University (Reader)
- Professor Bamdad Hosseini, Assistant Professor, Department of Applied Mathematics, University of Washington (Reader)