Optimization Methods for Training Neural Networks

February 8, 2018, 12:00 PM

Most high-dimensional nonconvex optimization problems cannot be solved to optimality. It has been observed, however, that deep neural networks have a benign geometry that permits standard optimization methods to find acceptable solutions. Even so, solution times can be exorbitant. In addition, not all minimizers of neural network loss functions are equally desirable, as some lead to prediction systems that generalize better than others. In this talk we discuss classical and new optimization methods in light of these observations, and conclude with some open questions.

Jorge Nocedal
Walter P. Murphy Professor
Department of Industrial Engineering and Management Sciences
Northwestern University

Speaker: Jorge Nocedal
MIT Distinguished Seminar Series in Computational Science and Engineering