Numerical approaches for sequential Bayesian optimal experimental design

See more on DSpace@MIT


Experimental data play a crucial role in developing and refining models of physical systems. Some experiments can be more valuable than others, however. Well-chosen experiments can save substantial resources, and hence optimal experimental design (OED) seeks to quantify and maximize the value of experimental data. Common current practice for designing a sequence of experiments uses suboptimal approaches: batch (open-loop) design that chooses all experiments simultaneously with no feedback of information, or greedy (myopic) design that optimally selects the next experiment without accounting for future observations and dynamics. In contrast, sequential optimal experimental design (sOED) is free of these limitations. With the goal of acquiring experimental data that are optimal for model parameter inference, we develop a rigorous Bayesian formulation for OED using an objective that incorporates a measure of information gain. This framework is first demonstrated in a batch design setting, and then extended to sOED using a dynamic programming (DP) formulation. We also develop new numerical tools for sOED to accommodate nonlinear models with continuous (and often unbounded) parameter, design, and observation spaces. Two major techniques are employed to make solution of the DP problem computationally feasible. First, the optimal policy is sought using a one-step lookahead representation combined with approximate value iteration. This approximate dynamic programming method couples backward induction and regression to construct value function approximations. It also iteratively generates trajectories via exploration and exploitation to further improve approximation accuracy in frequently visited regions of the state space. Second, transport maps are used to represent belief states, which reflect the intermediate posteriors within the sequential design process. Transport maps offer a finite-dimensional representation of these generally non-Gaussian random variables, and also enable fast approximate Bayesian inference, which must be performed millions of times under nested combinations of optimization and Monte Carlo sampling. The overall sOED algorithm is demonstrated and verified against analytic solutions on a simple linear-Gaussian model. Its advantages over batch and greedy designs are then shown via a nonlinear application of optimal sequential sensing: inferring contaminant source location from a sensor in a time-dependent convection-diffusion system. Finally, the capability of the algorithm is tested for multidimensional parameter and design spaces in a more complex setting of the source inversion problem.