Dynamic neural networks and their relation to state-space methods

Dynamic neural networks are networks in which the action of each neuron is described by a scalar ordinary differential equation, which makes it possible to simulate time-dependent phenomena. Such networks are useful in practical applications, e.g. electronic circuit simulation and gravitational N-body simulations (subproject 6).

This type of neural network deviates from the commonly used static networks, in which neurons apply a static activation function such as the sigmoid or ReLU. The dynamic modelling method combines and extends ideas from the popular error backpropagation method and from time-domain extensions of neural networks. The two most prevalent approaches extend either the fully connected Hopfield-type recurrent networks or the feedforward networks used in backpropagation learning. We will pursue extensions along the second line, because the absence of feedback loops greatly facilitates giving theoretical guarantees on several desirable model and modelling properties. Describing neuron action by second-order differential equations is inspired by the electronics industry: depending on the choice of parameters, the neurons act as high-pass or low-pass filters.
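As an illustration, consider a single neuron of this kind. The minimal sketch below assumes a standard second-order low-pass form, y'' + 2*zeta*omega*y' + omega^2*y = omega^2*u(t), with u(t) the (weighted) input signal and zeta, omega the neuron parameters; the exact neuron model and parameterisation used in the project may differ.

```python
# Illustrative sketch only: a single dynamic neuron modelled as a second-order
# low-pass filter. The damping zeta and frequency omega play the role of the
# neuron parameters; their values below are assumptions for the example.
import numpy as np
from scipy.integrate import solve_ivp

def second_order_neuron(t, state, u, zeta, omega):
    """Right-hand side of the assumed second-order neuron ODE."""
    y, ydot = state
    yddot = omega**2 * (u(t) - y) - 2.0 * zeta * omega * ydot
    return [ydot, yddot]

# Example: a unit-step input passed through one dynamic neuron.
u = lambda t: 1.0                     # input signal (here a step)
zeta, omega = 0.7, 2.0 * np.pi * 5.0  # damping and cut-off frequency (assumed values)
sol = solve_ivp(second_order_neuron, (0.0, 1.0), [0.0, 0.0], args=(u, zeta, omega))
print(sol.y[0, -1])                   # the neuron output approaches the step value 1
```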

Previous work on this type of dynamic neural network revealed an intimate relationship between the structure and parameters of the network and the state-space models used to describe the underlying system. The observed relation makes it possible to avoid a trial-and-error approach to the topology of the network, which often leads to suboptimal structures. It implies that the state-space system can be used to predict the topology of the neural network (number of hidden layers, number of neurons per layer), as well as the weights of the connections between layers and the parameters of the differential equations in the neurons. For example, the number of hidden layers was shown to be related to the multiplicity of the eigenvalues of the matrices occurring in the state-space system describing the same input-output problem, whereas the number of neurons in a hidden layer is related to the number of complex eigenvalues of these matrices.
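As a small illustration of how this spectral information can be inspected in practice, the sketch below computes the eigenvalues of a state matrix and summarises the two quantities mentioned above. The exact mapping from these counts to a network topology follows the previous work and is not reproduced here; the matrix and the interpretation of the counts are illustrative only.

```python
# Minimal sketch: summarise the eigenvalue structure of a state matrix A.
# The relation of these counts to layer and neuron numbers is only indicated
# in the comments and follows the previous work referred to in the text.
import numpy as np

def spectral_summary(A, tol=1e-6):
    """Return the eigenvalues, the largest multiplicity and the number of
    complex-conjugate pairs of a real state matrix A."""
    eig = np.linalg.eigvals(A)
    remaining = list(eig)
    max_mult = 0
    while remaining:
        lam = remaining.pop(0)
        group = [mu for mu in remaining if abs(mu - lam) < tol]
        for mu in group:
            remaining.remove(mu)
        max_mult = max(max_mult, 1 + len(group))
    n_complex_pairs = int(np.sum(np.abs(eig.imag) > tol)) // 2
    return eig, max_mult, n_complex_pairs

# Example: a 4x4 state matrix with one complex pair and a repeated real eigenvalue.
A = np.array([[ 0.0,  1.0,  0.0,  0.0],
              [-4.0, -0.4,  0.0,  0.0],
              [ 0.0,  0.0, -2.0,  0.0],
              [ 0.0,  0.0,  0.0, -2.0]])
eig, max_mult, pairs = spectral_summary(A)
print("eigenvalues:", eig)
print("largest multiplicity:", max_mult)    # relates to the number of hidden layers
print("complex-conjugate pairs:", pairs)    # relates to the neurons per hidden layer
```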

We will first carry out a much more detailed study of the relation between dynamic neural networks and state-space formulations in the linear case: what are the precise relations between the two input-output formulations, how are the parameters of both formulations related, and how can this be exploited to define the topology of the neural networks used? Based on those findings we will investigate error estimation for neural networks, allowing us to assess their prediction capabilities. Then, also for the linear case, we will investigate how to construct mimetic neural networks using the relations to the state-space formulations, mainly in port-Hamiltonian form. The observed relations between state-space methods and dynamic neural networks will enable the construction of structure-preserving neural networks, and will yield accurate approximations much more efficiently and with a much smaller dependence on data.

To improve the efficiency even further, methods for model order reduction (MOR) of neural networks will be investigated: how can MOR methods used for reducing state-space systems be translated into MOR methods for neural networks? Answers to this question will open up a large variety of potential methods and initiate an entirely new direction of research within the MOR community. Both Krylov-type methods and balanced truncation methods will be considered; the strong theoretical framework of the latter is attractive, and research will be performed into how to extend it to neural networks.
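To indicate the kind of MOR method that will serve as a starting point, the following sketch implements classical (square-root) balanced truncation for a stable linear state-space system with standard SciPy routines. How this construction carries over to dynamic neural networks is precisely the research question; the code therefore only shows the state-space side, and the example system is synthetic.

```python
# Sketch of balanced truncation for a stable linear system x' = A x + B u, y = C x.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Reduce (A, B, C) to order r by square-root balanced truncation."""
    # Controllability and observability Gramians: A P + P A^T + B B^T = 0, etc.
    P = solve_continuous_lyapunov(A, -B @ B.T)
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)
    # Square-root balancing: SVD of the product of the Cholesky factors.
    Lp = cholesky(P, lower=True)
    Lq = cholesky(Q, lower=True)
    U, s, Vt = svd(Lq.T @ Lp)
    # Projection matrices built from the r dominant Hankel singular values.
    S_r = np.diag(s[:r] ** -0.5)
    T = Lp @ Vt[:r].T @ S_r   # right projector
    W = Lq @ U[:, :r] @ S_r   # left projector
    return W.T @ A @ T, W.T @ B, C @ T, s   # s holds the Hankel singular values

# Example: reduce a random stable single-input single-output system of order 6 to order 2.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6)) - 6.0 * np.eye(6)   # shifted to make A stable
B = rng.standard_normal((6, 1))
C = rng.standard_normal((1, 6))
Ar, Br, Cr, hsv = balanced_truncation(A, B, C, r=2)
print("Hankel singular values:", hsv)
```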

These initial findings and methods will be extended to the important case of linear differential-algebraic equations (DAEs), since many problems are formulated in this way. Here, the index-preserving MOR methodologies developed at TU Eindhoven will be used as a basis. Relevant questions are then how the concept of 'index' (there are several definitions, but we prefer the tractability index based on the theory developed by Maerz) carries over to dynamic neural networks, and how such a theory can be developed for them. Similar questions for the nonlinear case (EIM, DEIM, DMD) will be investigated in the final phases of the project.
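As an indication of the data-driven techniques mentioned for the nonlinear case, the sketch below shows the basic mechanics of (exact) dynamic mode decomposition on synthetic snapshot data. How DMD will eventually be combined with dynamic neural networks is part of the research and is not addressed here.

```python
# Sketch of exact dynamic mode decomposition (DMD) on synthetic snapshot data.
import numpy as np

def dmd(X, Y, r):
    """Fit a rank-r linear operator Y ~ A X from snapshot pairs and return its
    DMD eigenvalues and modes."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Ur, sr, Vr = U[:, :r], s[:r], Vt[:r].T
    Atilde = Ur.T @ Y @ Vr / sr     # projected operator
    eigvals, W = np.linalg.eig(Atilde)
    modes = Y @ Vr / sr @ W         # exact DMD modes
    return eigvals, modes

# Synthetic snapshots from two oscillatory components observed by four "sensors".
t = np.linspace(0.0, 4.0 * np.pi, 200)
data = np.vstack([np.cos(t), np.sin(t), np.cos(2.0 * t), np.sin(2.0 * t)])
X, Y = data[:, :-1], data[:, 1:]
eigvals, modes = dmd(X, Y, r=4)
print("DMD eigenvalues (on the unit circle for purely oscillatory data):", eigvals)
```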

For this subproject, PhD student Anna Shalova has been appointed under the supervision of lead researcher Wil Schilders. Both are based at the Centre for Analysis, Scientific Computing and Applications at Eindhoven University of Technology.