CSE Community Seminar | April 24, 2026
Dimensionless learning based on information
Yuan Yuan, Graduate student, Aeronautics and Astronautics, MIT
Abstract: Dimensional analysis is one of the most fundamental tools for understanding physical systems. However, the construction of dimensionless variables, as guided by the Buckingham-π theorem, is not uniquely determined. Here, we introduce IT-π, a model-free method that combines dimensionless learning with the principles of information theory. Grounded in the irreducible error theorem, IT-π identifies dimensionless variables with the highest predictive power by measuring their shared information content. The approach can rank variables by predictability, identify distinct physical regimes, uncover self-similar variables, determine the characteristic scales of the problem, and extract its dimensionless parameters. IT-π also provides a bound on the minimum predictive error achievable across all possible models, from simple linear regression to advanced deep learning techniques, naturally enabling a definition of model efficiency. We benchmark IT-π across different cases and demonstrate that it offers superior performance and capabilities compared to existing tools. The method is also applied to dimensionless learning for supersonic turbulence, aerodynamic drag on both smooth and irregular surfaces, magnetohydrodynamic power generation, and laser-metal interaction.
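As a rough illustration of the information-theoretic idea (not the IT-π implementation itself), the sketch below ranks candidate variable groupings for a toy problem by a simple histogram-based mutual-information estimate; the Reynolds-number setup, the candidate groups, and the estimator are all assumptions made for illustration only.

```python
# Hedged sketch: rank candidate variable groupings by how much information they
# carry about an output quantity. This is NOT the IT-pi method, only a toy
# illustration of scoring candidates by shared information content.
import numpy as np

def mutual_information(x, y, bins=32):
    """Estimate I(X; Y) in bits from samples via a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
n = 50_000
# Toy dimensional inputs: velocity U, length L, viscosity nu (assumed ranges).
U = rng.uniform(0.1, 10.0, n)
L = rng.uniform(0.01, 1.0, n)
nu = rng.uniform(1e-6, 1e-4, n)
# Toy output that, by construction, depends only on the Reynolds number.
Re = U * L / nu
y = np.log(Re) + 0.05 * rng.standard_normal(n)

# Candidate groupings (log-transformed to tame the dynamic range); only the
# first is the "right" dimensionless group, and it should score highest.
candidates = {
    "U*L/nu (Reynolds)": np.log(Re),
    "U*L": np.log(U * L),
    "L/nu": np.log(L / nu),
}
for name, pi in candidates.items():
    print(f"I({name}; y) ~ {mutual_information(pi, y):.3f} bits")
```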
Generative AI for weather data assimilation
Ruizhe Huang, MechE-CSE SM student
Abstract: To anchor weather products in reality, data assimilation integrates observational data into physical simulations of the atmosphere. Traditional approaches do this by using numerical model forecasts as a prior, which is computationally expensive. Today, researchers are exploring deep generative models, such as diffusion models, as emulators that reconstruct full weather fields directly from sparse observations, but existing guidance-based approaches can be unstable or have not been evaluated under real-world conditions. We introduce GLaD-Flow (Guided Latent D-Flow), which combines guidance and D-Flow within the latent space: it first uses Latent D-Flow to optimize the initial latent noise against an observation loss, then generates full fields with observation guidance starting from that optimized noise. We conduct a comprehensive benchmark over the Continental United States (CONUS), training the flow model on data from 2017 to 2022 and testing on 2023. We generate full ERA5-like fields for four surface variables (10-meter wind, 2-meter temperature, and 2-meter dewpoint) from sparse ground-station observations, and we test the generalizability of our method by evaluating performance on held-out test weather stations. Our results show that GLaD-Flow reduces the root mean square error (RMSE) relative to ERA5 by over 31% on average across 1,778 test stations, while retaining ERA5 physics. We estimate that GLaD-Flow reduces ERA5 error by 22.0% at median-distance locations across the CONUS, demonstrating meaningful generalization beyond the immediate vicinity of observation stations. Our work demonstrates that unconditional generative models, particularly the GLaD-Flow framework, provide a promising tool for reducing the cost and improving the accuracy of weather products.
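As a rough illustration of the two-stage idea described in the abstract (not the authors' code), the sketch below optimizes an initial latent vector so that a decoded field matches sparse observations, standing in for the D-Flow-style first stage; the linear "decoder", station indices, and step sizes are toy assumptions in place of a trained latent flow model, and a real sampler would then apply observation guidance at each generation step.

```python
# Hedged sketch of (1) optimizing initial latent noise against a sparse
# observation loss, followed by (2) a placeholder for guided generation.
import numpy as np

rng = np.random.default_rng(0)
latent_dim, field_dim, n_stations = 64, 1024, 40

G = rng.standard_normal((field_dim, latent_dim)) / np.sqrt(latent_dim)  # toy decoder
decode = lambda z: G @ z                                                 # latent -> full field

true_field = decode(rng.standard_normal(latent_dim))                     # synthetic "truth"
obs_idx = rng.choice(field_dim, n_stations, replace=False)               # sparse station locations
obs = true_field[obs_idx] + 0.01 * rng.standard_normal(n_stations)       # noisy observations

def obs_loss_grad(z):
    """Gradient and value of 0.5 * ||decode(z)[obs_idx] - obs||^2 w.r.t. z."""
    resid = decode(z)[obs_idx] - obs
    return G[obs_idx].T @ resid, 0.5 * float(resid @ resid)

# Stage 1: optimize the initial latent noise against the observation loss.
z = rng.standard_normal(latent_dim)
for _ in range(500):
    grad, loss = obs_loss_grad(z)
    z -= 0.1 * grad

# Stage 2 (placeholder): a real flow/diffusion sampler would start from the
# optimized z and apply observation guidance during sampling; here we simply
# decode once to show the reconstruction the optimized noise already achieves.
recon = decode(z)
rmse_full = np.sqrt(np.mean((recon - true_field) ** 2))
print(f"final observation loss: {loss:.4f}, full-field RMSE: {rmse_full:.4f}")
```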