Deep learning as optimal control problems with applications to mechanical systems

Brynjulf Owren
Department of Mathematical Sciences, Norwegian University of Science and Technology (NTNU)

Abstract:

Supervised deep learning can be thought of as a way of approximating an unknown function from a set of data samples. The space of approximating functions is represented by a neural network. Residual neural networks constitute a particular type of network architecture that enjoys a close connection to ordinary differential equations, the so-called neural ODEs. Training the network can in this sense be cast as a continuous optimal control problem. This opens up a range of options for discretising and analysing the resulting dynamical system. We shall discuss some of these features, such as stability and equivariance.
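
The link between residual networks and ODEs can be illustrated with a minimal sketch, which is not the method of the talk: a residual layer x_{k+1} = x_k + h f(x_k; W_k, b_k) is one forward Euler step for dx/dt = f(x; W(t), b(t)), and the layer parameters play the role of controls. The activation tanh, the step size h and the names vector_field and resnet_forward are assumptions chosen for illustration.

```python
import math

def vector_field(x, W, b):
    """Assumed layer map f(x; W, b) = tanh(W x + b), applied componentwise."""
    return [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]

def resnet_forward(x0, params, h):
    """Propagate x0 through the layers: one forward Euler step per layer.

    params is a list of (W, b) pairs, one per layer; these are the controls
    that training would optimise.
    """
    x = list(x0)
    for W, b in params:
        f = vector_field(x, W, b)
        x = [x_i + h * f_i for x_i, f_i in zip(x, f)]
    return x

# Hypothetical 2D example: the same (W, b) in each of 10 layers, step h = 0.1.
W = [[0.0, 1.0], [-1.0, 0.0]]
b = [0.0, 0.0]
params = [(W, b)] * 10
x_final = resnet_forward([1.0, 0.0], params, h=0.1)
```

Other one-step integrators could replace the forward Euler update inside the loop, which is one way of reading the remark above about options for discretisation.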

In the second part of the talk, we shall present the application of deep neural networks to Hamiltonian systems. We assume that the Hamiltonian is unknown and estimate it from observed measurements using a recurrent neural network. In this case there is backpropagation on two levels: first for computing the Hamiltonian vector field from the neural network representation of the Hamiltonian, and second for computing gradients of the network with respect to the parameters. We consider a hybrid model where the kinetic energy is assumed to have a standard form, whereas the potential energy is represented by a residual neural network. We give some ideas about how to handle constrained Hamiltonian problems. Some numerical experiments will be shown.
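
As a sketch of the hybrid model, one may write the Hamiltonian and its vector field as below. This assumes that "standard form" means a kinetic energy quadratic in the momenta with a symmetric positive definite mass matrix M; the abstract does not state this, and the symbols M, V_\theta and H_\theta are introduced here only for illustration.

```latex
% Assumption: the kinetic energy is quadratic in p with a symmetric
% positive definite mass matrix M; V_\theta denotes the neural network
% potential with parameters \theta.
H_\theta(q, p) = \tfrac{1}{2}\, p^{\top} M^{-1} p + V_\theta(q),
\qquad
\dot q = \frac{\partial H_\theta}{\partial p} = M^{-1} p,
\qquad
\dot p = -\frac{\partial H_\theta}{\partial q} = -\nabla_q V_\theta(q).
```

Under this assumption the first level of backpropagation evaluates \nabla_q V_\theta(q), the derivative of the network with respect to its input, and the second differentiates the resulting training loss with respect to \theta.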

The first part of the talk is joint work with Martin Benning, Elena Celledoni, Matthias Ehrhardt, Christian Etmann, Robert McLachlan, Carola Schönlieb and Ferdia Sherry. The second part is done in collaboration with Elena Celledoni, Ergys Cokaj, Andrea Leone and Davide Murari.