FCS6: Numerical integration of differential equations

David Young, October 1997

This teach file provides an elementary introduction to the ideas behind the numerical integration of differential equations.

Contents

    Introduction
    Euler's method for initial value problems
    More complex one-step methods
    Multistep methods
    A generalisation

Introduction

Often, the physics of a situation - or of a simulation - tells us how variables vary with one another, but does not directly tell us what values the variables have. The rule for how the variables co-vary might be expressed as a differential equation (that is, an equation involving derivatives), or it might be that we simply have some method of obtaining values of derivatives. In either case, what we often want to do is to work out the values of the variables themselves.

For example, a mobile robot might be able to measure its velocity - it knows how position varies with time - but it might need to know its actual position in order to determine when it has reached its target, or which way to turn towards it. To estimate its position, it could use a kind of dead reckoning: it could add up the distance covered in successive steps, assuming each step is made at a constant velocity. Adding up small steps is known as integration; in mathematics it is carried out on infinitesimal steps, and is formally the inverse process to differentiation.

Sometimes, differential equations can be integrated symbolically, so that we obtain a straightforward formula for the variables in question. Here, though, we look at the case where this is not possible, either because the equation is too difficult to integrate symbolically, or because the derivatives are not given by a formula at all, but are tabulated or measured.

Textbooks on numerical analysis deal with the techniques needed in detail. A good example is "Numerical Analysis", by R.L. Burden & J.D. Faires (3rd Edition, PWS Publishers, 1985) - see chapter 5. A summary and practical advice can be found in "Numerical Recipes in C", by W.H. Press, B.P. Flannery, S.A. Teukolsky and W.T. Vetterling (Cambridge University Press 1988, later reprints). Anyone who needs to use these techniques seriously will need to spend time with at least one such book; in this file the aim is only to illustrate the underlying principle - which is in fact very simple.

Euler's method for initial value problems

A mobile robot moves down a corridor (a 1-D space), measuring its speed at successive time steps. We need to know how far down the corridor it is at any given time, perhaps so that its position can be displayed on a monitor, or so that we can stop it when it has gone far enough.

If the robot's position is x(t) at time t, then its speed v(t) is given by the simple differential equation

    v  =  dx/dt

We assume that we know where the robot starts from. This is called an initial value problem because the differential information is used to work out how the state evolves from a specified initial state.

There is an obvious way to integrate the equation to get values of x given values of v. Suppose v is sampled regularly at t = 0, t = 1, t = 2, etc., and that v changes slowly enough that we can assume that it is constant during the time between samples. In fact, we will just assume that v keeps the value it has when one sample is taken until the next sample is taken. Then the distance travelled between time t = 0 and time t = 1 is just v(0), and so on. If the robot starts at x = 0 at time t = 0, we can just compute the distance travelled in the first time step, then in the second, and so on, and add them up to get the current position. Negative speeds mean going backwards.

For example, if the speed is given by the middle column below, then this method gives the positions in the right-hand column

    t           v               x
    0           2               0
    1           4               2
    2           3               6
    3           2               9
    4           0              11
    5          -1              11
    6          -2              10
    7          -3               8

and so on. The calculation is trivial and generalises to

    x(t)  =  x(t-1) + v(t-1)

If the interval between samples is not 1 unit of time, then the velocity must be multiplied by the interval to get the distance covered. We would write

    x(t)  =  x(t-T)   +   T * v(t-T)

where T means the interval between samples.
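
This scheme can be written as a very short program. Here is a sketch in Python (the function name is illustrative) which reproduces the table above:

```python
def euler(x0, vs, T=1.0):
    """Euler's method for x' = v(t): x(t) = x(t-T) + T*v(t-T),
    given the initial position x0 and speeds sampled every T units."""
    xs = [x0]
    for v in vs[:-1]:               # the final speed sample is never used
        xs.append(xs[-1] + T * v)
    return xs

speeds = [2, 4, 3, 2, 0, -1, -2, -3]
positions = euler(0.0, speeds)
print(positions)    # [0.0, 2.0, 6.0, 9.0, 11.0, 11.0, 10.0, 8.0]
```

Note that each speed sample determines the distance covered in the interval that follows it, which is why the last sample plays no part in the result.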

This is Euler's method for integrating the differential equation. It is evident that the shorter the time step that is used, the more accurate the results will be. Also, errors will accumulate, so the overall error will increase as time goes on. It shares these characteristics with all other numerical integration methods, but is extremely easy to implement. In a sense, it is the model for all the more sophisticated algorithms that you will encounter. Although numerical analysis books describe many more complex methods for integrating differential equations, Euler's method and simple variations on it are the workhorse of many physical simulations.

Euler's method extends to higher-order differential equations (those involving second or higher derivatives). For example, we might know the acceleration of the robot, not its speed (though we would need to know the initial speed). Euler's method can be applied twice, first to integrate the acceleration to get the speed, and then to integrate the speed to get the position.
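
The double application can be sketched as follows; the constant-acceleration test case is invented for illustration, chosen because the exact answers (x = t*t and v = 2t) are known:

```python
def euler_second_order(x0, v0, accel, T, steps):
    """Apply Euler's method twice: integrate the acceleration accel(t)
    to get the speed, and the speed to get the position."""
    x, v, t = x0, v0, 0.0
    for _ in range(steps):
        x = x + T * v           # position update uses the current speed
        v = v + T * accel(t)    # speed update uses the current acceleration
        t = t + T
    return x, v

# constant acceleration a = 2, starting from rest at the origin:
# at t = 1 the exact answer is x = 1, v = 2
x, v = euler_second_order(0.0, 0.0, lambda t: 2.0, T=0.001, steps=1000)
```

With a step of 0.001 the position comes out as 0.999 rather than 1 - a small error of the kind discussed above, which shrinks as the step is made shorter.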

Euler's method also applies to vector equations. If vectors are represented using rectangular coordinates, then each component can be integrated independently of the others. Thus a robot that measures its speed and direction can integrate them to find its position in 2-D or 3-D space.
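
For instance, componentwise integration for a robot that measures speed and heading might be sketched like this (the sample values are invented for illustration):

```python
import math

def euler_2d(x0, y0, samples, T=1.0):
    """Integrate (speed, heading) samples componentwise in
    rectangular coordinates, one Euler step per sample."""
    x, y = x0, y0
    for speed, heading in samples:
        x = x + T * speed * math.cos(heading)
        y = y + T * speed * math.sin(heading)
    return x, y

# drive east at unit speed for 3 steps, then north for 4 steps
path = [(1.0, 0.0)] * 3 + [(1.0, math.pi / 2)] * 4
x, y = euler_2d(0.0, 0.0, path)     # ends very close to (3, 4)
```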

A second example of the application of Euler's method occurs in recent work in computer vision. One of the major difficulties of image analysis is to find meaningful structures in an image. A very promising technique is to simulate a physical system which responds to "forces" generated by the image data. For instance, a string which is elastic and stiff, and which is attracted to local changes in image intensity, can be used to find connected smooth contours. Such a simulated string is let loose in the image, and allowed to attach itself to any structure it can find. The behaviour of these computational objects, known as active contours, is described by differential equations, and their evolution in time is simulated using Euler's method.

More complex one-step methods

One-step methods are those which, like Euler's, try to make a prediction for time t+T using information only from time t or later, where T is the interval between times for which the function is estimated.

The limitation of Euler's method is that it assumes that the value of the derivative (the speed in the example above) is constant over each successive interval. More sophisticated methods attempt greater accuracy by weakening this assumption.

For example, in the midpoint method, the derivative (the speed in the robot example) is estimated at the centre of the interval rather than at its beginning, which clearly makes more sense. The problem with this is that the speed might depend not only on elapsed time, but also on the position the robot has reached by the middle of the time interval. In a simulation, we do not have a direct measurement of the speed, so we need to know the robot's position to find the midpoint speed estimate. This seems circular. The way out is to use Euler's method to estimate the position the robot will have reached half-way into the time interval, then get a midpoint estimate of speed from that, and use that to get the position at the end of the interval.
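
The midpoint scheme can be sketched as follows; the test equation dx/dt = x is chosen only because its exact solution, e to the power t, is known:

```python
def midpoint_step(f, t, x, T):
    """One step of the midpoint method for dx/dt = f(t, x)."""
    x_half = x + 0.5 * T * f(t, x)          # Euler estimate at the midpoint
    return x + T * f(t + 0.5 * T, x_half)   # full step using the midpoint slope

def solve(f, x0, T, steps):
    """Repeatedly apply midpoint_step from t = 0."""
    x, t = x0, 0.0
    for _ in range(steps):
        x = midpoint_step(f, t, x, T)
        t = t + T
    return x

x = solve(lambda t, x: x, 1.0, T=0.1, steps=10)   # approximates e = 2.71828...
```

Ten steps of size 0.1 give about 2.714, already much closer to e than plain Euler's 2.594 for the same step size.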

This kind of thing can be elaborated further: a second mid-point estimate of the speed might be made, and combined with start-point and end-point estimates to get an overall speed for the final prediction. A class of methods based on this general idea, but carefully designed using an analysis of the errors that will occur, is the Runge-Kutta methods, one particular case of which forms a kind of standard in the area.
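
The standard fourth-order Runge-Kutta formula combines four slope estimates per step in just this way. A minimal sketch, again using dx/dt = x as the test equation:

```python
def rk4_step(f, t, x, T):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(t, x)                            # slope at the start
    k2 = f(t + 0.5 * T, x + 0.5 * T * k1)   # first midpoint estimate
    k3 = f(t + 0.5 * T, x + 0.5 * T * k2)   # second midpoint estimate
    k4 = f(t + T, x + T * k3)               # slope at the end
    return x + (T / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# dx/dt = x, x(0) = 1: ten steps of size 0.1 should come very close to e
x, t = 1.0, 0.0
for _ in range(10):
    x = rk4_step(lambda t, x: x, t, x, 0.1)
    t = t + 0.1
```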

A different kind of refinement is to vary the step size T according to how rapidly the function seems to be fluctuating. The result obtained by taking two half-size steps can be compared with that from one full step to see whether significant accuracy is being lost on the full step, and the step size adjusted accordingly.
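
A crude sketch of such step-size control, using Euler steps and a simple halving and doubling rule (the thresholds here are arbitrary choices, not standard values):

```python
def euler_step(f, t, x, T):
    """One Euler step for dx/dt = f(t, x)."""
    return x + T * f(t, x)

def adaptive_step(f, t, x, T, tol):
    """Step-doubling sketch: compare one full Euler step with two half
    steps; halve T if the discrepancy exceeds tol, and suggest a larger
    step next time when the discrepancy is comfortably small."""
    full = euler_step(f, t, x, T)
    half = euler_step(f, t + 0.5 * T, euler_step(f, t, x, 0.5 * T), 0.5 * T)
    err = abs(full - half)                  # estimate of the local error
    if err > tol:
        return adaptive_step(f, t, x, 0.5 * T, tol)
    T_next = 2 * T if err < 0.25 * tol else T
    return half, t + T, T_next              # keep the more accurate estimate

x_new, t_new, T_next = adaptive_step(lambda t, x: x, 0.0, 1.0, 1.0, 0.01)
```

Starting from a step of 1, the routine halves down to a step of 0.125 before the two estimates agree to within the tolerance.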

The main point to realise is that all these methods involve trade-offs amongst accuracy, computation time, and ease of implementation. There is no simple rule for choosing an algorithm, since this depends on the details of the problem - every method exploits assumptions about the smoothness of the functions concerned in some way. The books mentioned above give considerable guidance, and give formal analyses of the errors; but the error expressions are in terms of higher-order derivatives, so the assumption of smoothness is still present. In the end, the consistency of the results when parameters are varied across trials is probably the best check on whether a simulation is using sensible methods.

Multistep methods

The one-step methods discard information from before time t when they are calculating the prediction for time t+T. It seems reasonable to suppose that for smooth functions, this information should be retained and used. In general, some weighted sum of the previous estimates of the function and previous values of its derivative is going to be a good start. Methods which use this information are called multistep methods; a common example is the predictor-corrector method, which involves a weighted sum of the previous values and their derivatives to make a prediction, and then uses the derivative at this predicted point to improve the prediction further.
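
One simple predictor-corrector pair - a two-step Adams-Bashforth predictor followed by a trapezoidal corrector - can be sketched as follows. A previous derivative value is needed, so the very first step falls back on Euler's method:

```python
def predictor_corrector(f, x0, T, steps):
    """Two-step Adams-Bashforth predictor with a trapezoidal corrector
    for dx/dt = f(t, x), bootstrapped with one Euler step."""
    t, x = 0.0, x0
    f_prev = f(t, x)
    x = x + T * f_prev                  # Euler step to get history started
    t = t + T
    for _ in range(steps - 1):
        f_now = f(t, x)
        x_pred = x + 0.5 * T * (3 * f_now - f_prev)     # predict from history
        x = x + 0.5 * T * (f(t + T, x_pred) + f_now)    # correct with new slope
        f_prev = f_now
        t = t + T
    return x

x = predictor_corrector(lambda t, x: x, 1.0, T=0.1, steps=10)   # roughly e
```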

A generalisation

The weights for the weighted sums are calculated by considering the theoretical errors, and assuming that higher order derivatives of the functions are relatively small. However, the use of a weighted sum inevitably suggests that there might be something to be gained by making the weights adaptive, if it is the case that we find out what the true value of the function is at some point. For example, in financial forecasting, one might supply a set of recent values of a share price as inputs to a neural network, together with the values of other variables which might be expected to affect the price (and so might be related to its derivative with respect to time). The network's output could be taken as a prediction of the next day's price, and then when the true value came along, the network could be trained using any supervised learning technique. Such a network could learn to simulate at least the predictor step of the predictor-corrector method; since the network would be data adaptive, it should be capable of outperforming other methods for at least some classes of input. For a relatively simple signal, such as the level of the tide measured each hour, a simple weighted sum with adaptive weights can produce good predictions over surprisingly long periods.
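
A minimal sketch of such an adaptive predictor, using the least-mean-squares weight update on an artificial tide-like signal (the number of taps and the learning rate are arbitrary choices):

```python
import math

def lms_predict(signal, n_taps=4, mu=0.05):
    """Predict each sample as a weighted sum of the n_taps previous
    samples, nudging the weights towards each true value as it arrives
    (the least-mean-squares rule, a simple supervised update)."""
    w = [0.0] * n_taps
    errors = []
    for i in range(n_taps, len(signal)):
        recent = signal[i - n_taps:i]
        pred = sum(wk * xk for wk, xk in zip(w, recent))
        err = signal[i] - pred
        w = [wk + mu * err * xk for wk, xk in zip(w, recent)]
        errors.append(abs(err))
    return w, errors

# hourly "tide level": a slow sinusoid with a 12-sample period
tide = [math.sin(2 * math.pi * t / 12.0) for t in range(200)]
w, errors = lms_predict(tide)
```

For this signal the prediction errors shrink markedly as the weights adapt, illustrating the point about simple adaptive weighted sums.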

Copyright University of Sussex 1997. All rights reserved.