« Previous « Start » Next »
4 Basic Use of Forward Mode for First Derivatives
4.1 Differentiating Matlab Expressions
Using forward mode AD,
Mad can easily compute first derivatives of functions defined via
expressions built-up of the arithmetic operations and intrinsic
functions of
Matlab.
For the moment let us simply consider the trivial operation
y=
f(
x)=
x2 for which the derivative is 2
x. Let us determine this derivative at
x=3 for
which
f(3)=9 and
df(3)/
dx=6. We will make use of the
fmad constructor function
fmad, and the functions
getvalue and
getderivs for extracting values and derivatives from
fmad objects.
Example 1
MADEXForward1: Forward Mode Example y=x2
This example demonstrates how to differentiate simple arithmetic expressions in Matlab.
See also MADEXForward2
Contents
-
Differentiating y=f(x)=x2
- Using fmad
- Calculating y
- Extracting the value
- Extracting the derivatives
Differentiating y=f(x)=x2
MAD can easily compute first derivatives of functions defined via expressions built-up of the arithmetic operations and intrinsic functions of Matlab using forward mode AD. For the moment let us simply consider the trivial operation
y=f(x)=x2
for which the derivative is
Let us determine this derivative at x=3 for which f(3)=9 and df(3)/dx=6.
Using fmad
We first define x to be of fmad class and have value 3 and derivative 1 (since the derivative with respect to the independent variable is one, dx/dx=1). The command and Matlab output is as below.
format compact % so there are no blank lines in the output
x=fmad(3,1) % creates the fmad object with value 3 and derivative 1
value =
3
derivatives =
1
Calculating y
The value of the object x is 3 and its derivatives correspond to an object of size [1 1] (a scalar) with one derivative of value 1. We now calculate y exactly as we would normally, allowing the fmad function libraries to take care of the arithmetic associated with values and derivatives.
y=x^2
value =
9
derivatives =
6
Extracting the value
We extract the value of y using the MAD function getvalue.
yvalue=getvalue(y)
yvalue =
9
Extracting the derivatives
Similarly we extract the derivatives using getderivs.
yderivs=getderivs(y)
yderivs =
6
4.2 Vector-Valued Functions
Let us now consider a simple, linear vector-valued function for which we may use
fmad to calculate Jacobian-vector products or the function's Jacobian. We also see, for the first time
fmad's
getinternalderivs function.
Example 2
MADEXForward2: Forward Mode Example y=A*x
This example demonstrates how to calculate Jacobian-vector products and how to calculate the Jacobian for a simple vector-valued function.
See also MADEXForward1
Contents
-
Differentiating y=f(x)=A*x
- Directional derivative in [1;0;0] direction
- Directional derivative in [0;1;0] direction
- Calculating the Jacobian
- But dy is a 3-D array not a matrix!
Differentiating y=f(x)=A*x
MAD easily deals with vector-valued functions. Let us consider the function
y=f(x)=A*x
at x=[1; 1 ;1 ] where A is the matrix,
A=[ 1 4 7
2 5 8
3 6 9]
for which the Jacobian is the matrix A and directional derivatives in the 3 respective coordinate directions are the 3 respective columns of A.
format compact;
A=[1 4 7;2 5 8; 3 6 9];
xval=[1;1;1];
Directional derivative in [1;0;0] direction
Using the fmad constructor function we initialise x with its directional derivative, calculate y and extract the derivative. We find the correct directional derivative [1;2;3].
x=fmad(xval,[1;0;0])
y=A*x;
dy=getderivs(y)
fmad object x
value =
1
1
1
derivatives =
1
0
0
dy =
1
2
3
Directional derivative in [0;1;0] direction
Similarly in the [0;1;0] direction we get [4;5;6].
x=fmad(xval,[0;1;0]);
y=A*x;
dy=getderivs(y)
dy =
4
5
6
Calculating the Jacobian
To obtain the full Jacobian we initialise the derivative with the 3 x 3 identity matrix.
x=fmad(xval,eye(3))
y=A*x
dy=getderivs(y)
fmad object x
value =
1
1
1
derivvec object derivatives
Size = 3 1
No. of derivs = 3
derivs(:,:,1) =
1
0
0
derivs(:,:,2) =
0
1
0
derivs(:,:,3) =
0
0
1
fmad object y
value =
12
15
18
derivvec object derivatives
Size = 3 1
No. of derivs = 3
derivs(:,:,1) =
1
2
3
derivs(:,:,2) =
4
5
6
derivs(:,:,3) =
7
8
9
dy(:,:,1) =
1
2
3
dy(:,:,2) =
4
5
6
dy(:,:,3) =
7
8
9
But dy is a 3-D array not a matrix!
Notice now that we have a 3-dimensional array returned for a result we expect to be a matrix. This is easily remedied using the Matlab intrinsic squeeze, which squeezes out singleton dimensions, or more conveniently using getinternalderivs to extract the derivatives from the fmad object y.
dy=squeeze(dy)
dy=getinternalderivs(y)
dy =
1 4 7
2 5 8
3 6 9
dy =
1 4 7
2 5 8
3 6 9
This example brings us to an important topic, how are the derivatives
stored in
Mad?
4.3 Storage of Derivatives in Mad
Mad stores derivatives in two ways depending upon whether a variable
of
fmad class possesses single, or multiple directional derivatives.
4.3.1 Single Directional Derivatives
For a variable of
fmad class with a single directional
derivatives the derivatives are stored as an array of class
double with the same shape as the variable's value. The
getderivs function will therefore return this array. We have seen
this behaviour in Example
1 and Example
2.
4.3.2 Multiple Directional Derivatives
For a variable of
fmad class with multiple directional
derivatives, the derivatives are stored in an object of class
derivvec.
Mad has been written so that multiple directional derivatives may be thought of as an array of one-dimension higher than that of the corresponding value
using, what we term, the derivative's
external
representation.
Mad also has an
internal
representation of
derivatives that may often be of use, particularly when interfacing
Mad with another
Matlab package. Accessing the internal representation is covered in
section
5.1.
Consider an N-dimensional
Matlab array
A with
size(A)=[n1 n2 n3 ... nN]
, such that
(n1 n2 ... nN)
are positive integers. Then the external representation of
A's derivatives, which we shall call
dA is designed
to have
size(dA)=[n1 n2 n3 ... nN nD], where
nD>1 is
the number of directional derivatives. An example or two may make
things clearer.
Example 3
MADEXExtRep1: External representation of a matrix
This example illustrates MAD's external representation of derivatives for a matrix.
See also MADEXExtRep2, MADEXIntRep1
Contents
-
Define a 2 x 2 matrix A.
- Multiple directional derivatives for A
- The fmad object
- The derivs component
Define a 2 x 2 matrix A.
format compact
A=[1 2; 3 4]
A =
1 2
3 4
Multiple directional derivatives for A
Define a 2 x 2 x 3 matrix (note the transpose) to provide 3 directional derivatives for dA.
dA=reshape([1 2 3 4;5 6 7 8;9 10 11 12]',[2 2 3])
dA(:,:,1) =
1 3
2 4
dA(:,:,2) =
5 7
6 8
dA(:,:,3) =
9 11
10 12
The fmad object
We create the fmad object,
A=fmad(A,dA)
value =
1 2
3 4
Derivatives
Size = 2 2
No. of derivs = 3
derivs(:,:,1) =
1 3
2 4
derivs(:,:,2) =
5 7
6 8
derivs(:,:,3) =
9 11
10 12
The derivs component
The last index of the A's derivs component refers to the directional derivatives and we see that, as expected, we have a 2 x 2 x 3 array which makes up 3 sets of 2 x 2 directional derivative matrices.
Note that
fmad (like
Matlab and Fortran) assumes a
column
major (or first index fastest) ordering for array storage. The
derivatives supplied to the constructor function
fmad are
therefore assumed to be in column major order and the external
representation
reshapes them appropriately. We therefore needn't
bother reshaping the derivatives before passing them to
fmad
as the example below shows.
Example 4
MADEXExtRep2: External Representation Example
This example shows the use of column-major ordering for an input derivative.
See also MADEXExtRep1, MADEXIntRep1
Contents
-
Define a 2 x 2 matrix A.
- Multiple directional derivatives for A
- The fmad object
- The derivs component
Define a 2 x 2 matrix A.
format compact
A=[1 2; 3 4]
A =
1 2
3 4
Multiple directional derivatives for A
Define a vector to provide directional derivatives for dA.
dA=[1 2 3 4 5 6 7 8 9 10 11 12];
The fmad object
We create the fmad object,
A=fmad(A,dA)
value =
1 2
3 4
Derivatives
Size = 2 2
No. of derivs = 3
derivs(:,:,1) =
1 3
2 4
derivs(:,:,2) =
5 7
6 8
derivs(:,:,3) =
9 11
10 12
The derivs component
Notice how the elements of dA are taken in column major order to provide the 3 directional derivatives of A in turn.
4.4 Use of Intrinsics (Operators and Functions)
We have produced overloaded
fmad functions for most commonly used
Matlab operators and functions. For example addition of two
fmad
objects is determined by the function
@fmad\plus
, subtraction
@fmad\minus
, etc. To see which overloaded
intrinsic operators and functions are provided type
help fmad.Contents
at the
Matlab prompt. The help information for individual functions describes the function and any additional information such as restrictions, e.g.
help fmad.plus
4.5 Additional Functions
We have also coded a small number of additional functions which are
not part of
Matlab but application specific. For example, for a
matrix
A,
logdet(A)
calculates
log(abs(det(A))) a function used in
statistical parameter estimation. This composition of functions may
be differentiated very efficiently by considering it in this composite
form [
Kub94, For01]. Note that a version of the
function valid for arguments of classes
double and
sparse is provided in the
MAD/madutil
.
4.6 Differentiating User Defined Functions
In section
4.1 we saw how simple
Matlab
expressions could be differentiated using
Mad. It is usually
straightforward to apply the same methodology to differentiating user
defined functions.
Example 5
Consider the function which defines the Brusselator stiff ODE problem
from the Matlab ODE toolbox, supplied as brussode_f.m
and shown in
Figure 1.
For a given scalar time t (unused) and vector y of
2N rows, this function brussode_f will supply the
right-hand-side function for an ODE of the form,
The Jacobian of the function is
required to solve the ODE efficiently via an implicit stiff ODE solver
(e.g., ode15s in Matlab). The function has actually been
written in so-called vectorised
form, i.e. if y is a
matrix then the dydt supplied will also be a matrix with each
column of dydt corresponding to the appropriate column of
y. This approach allows for the efficient approximation of
the function's Jacobian via finite-differences.
It is trivial to calculate the Jacobian of the function of
figure 1 using Mad. The commands to do so are
given in the script MADEXbrussodeJac.m. The script also includes a simple one-sided finite-difference (FD) calculation of the Jacobian.
MADEXbrussodeJac: Full Jacobian of Brusselator Problem
This example shows how to use the fmad class to calculate the Jacobian of the Brusselator problem and compare the result with those from finite-differencing. More efficient AD approaches are available as shown in MADEXbrussodeJacSparse, MADEXbrussodeJacComp
See also: brussode_f, brussode_S, MADEXbrussodeJacSparse, MADEXbrussodeJacComp, brussode
Contents
-
Initialise variables
- Use fmad to calculate the Jacobian
- Using finite-differencing
- Comparing the AD and FD Jacobian
Initialise variables
We define the input variables.
N=3;
t=0;
y0=ones(2*N,1);
Use fmad to calculate the Jacobian
We initialise y with the value y0 and derivatives given by the identity matrix of size equal to the length of y0. We may then evaluate the function and extract the value and Jacobian.
y=fmad(y0,eye(length(y0)));
F=brussode_f(t,y,N);
F_fmad=getvalue(F); % grab value
JF_fmad=getinternalderivs(F) % grab Jacobian
JF_fmad =
-2.6400 1.0000 0.3200 0 0 0
1.0000 -1.6400 0 0.3200 0 0
0.3200 0 -2.6400 1.0000 0.3200 0
0 0.3200 1.0000 -1.6400 0 0.3200
0 0 0.3200 0 -2.6400 1.0000
0 0 0 0.3200 1.0000 -1.6400
Using finite-differencing
We make 2N extra copies of y, add a perturbation to the copies, perform all function evaluations taking advantage of the vectorization of the function brussode_f, and approximate the derivatives using 1-sided finite-differences.
y=repmat(y0,[1 2*N+1]);
y(:,2:end)=y(:,2:end)+sqrt(eps)*eye(2*N);
F=brussode_f(t,y,N);
F_fd=F(:,1);
JF_fd=(F(:,2:end)-repmat(F_fd,[1 2*N]))./sqrt(eps)
JF_fd =
-2.6400 1.0000 0.3200 0 0 0
1.0000 -1.6400 0 0.3200 0 0
0.3200 0 -2.6400 1.0000 0.3200 0
0 0.3200 1.0000 -1.6400 0 0.3200
0 0 0.3200 0 -2.6400 1.0000
0 0 0 0.3200 1.0000 -1.6400
Comparing the AD and FD Jacobian
The function values computed are, of course, identical. The Jacobians disagree by about 1e-8, this is due to the truncation error of the finite-difference approximation.
disp(['Function values norm(F_fd-F_fmad) = ',num2str(norm(F_fd-F_fmad,inf))])
disp(['Function Jacobian norm(JF_fd-JF_fmad)= ',num2str(norm(JF_fd-JF_fmad,inf))])
Function values norm(F_fd-F_fmad) = 0
Function Jacobian norm(JF_fd-JF_fmad)= 2.861e-008
function dydt=brussode_f(t,y,N)
% brussode_f: defines ODE for the Brusselator problem.
% Taken from matlab 6.5 brussode.
%
% See also MADEXbrussode, brussode_S, brussode
% AUTHOR: S.A.Forth
% DATE: 30/5/06
% Copyright 2006: S.A.Forth, Cranfield University
%
% REVISIONS:
% DATE WHO WHAT
c = 0.02 * (N+1)^2;
dydt = zeros(2*N,size(y,2)); % preallocate dy/dt
% Evaluate the 2 components of the function at one edge of the grid
% (with edge conditions).
i = 1;
dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(1-2*y(i,:)+y(i+2,:));
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(3-2*y(i+1,:)+y(i+3,:));
% Evaluate the 2 components of the function at all interior grid points.
i = 3:2:2*N-3;
dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + ...
c*(y(i-2,:)-2*y(i,:)+y(i+2,:));
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + ...
c*(y(i-1,:)-2*y(i+1,:)+y(i+3,:));
% Evaluate the 2 components of the function at the other edge of the grid
% (with edge conditions).
i = 2*N-1;
dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(y(i-2,:)-2*y(i,:)+1);
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(y(i-1,:)-2*y(i+1,:)+3);
Figure 1: Brusselator function supplied with Matlab ODE Toolbox
We should note that the above approach is not efficient either for FD
or AD for large
N. The Jacobian is sparse and exploiting the
sparsity greatly increases the efficiency of both approaches.
Exploiting sparsity will be covered in Sections
5.3
and
5.4.
4.7 Possible Problems Using the Forward Mode
In many cases
fmad will be able to differentiate fairly involved
functions as shown in Figure
1. However some
problems in terms of robustness and efficiency may occur.
-
Assigning an fmad object to a variable of class double. For example,
>> x=fmad(1,1);
>> y=zeros(2,1);
>> y(1)=x;
??? Conversion to double from fmad is not possible.
See Section 5.2 and Example 4 for a solution to this problem.
- Differentiation for a particular function is not supported. For example,
>> x=fmad(1,1);
>> y=tanh(x)
??? Error using ==> tanh
Function 'tanh' not defined for variables of class 'fmad'.
Please contact the author to arrange for the fmad class to be
extended for your case.
- Derivatives are not accurate when differentiating an iteratively
defined function. See section 5.5 for an approach to
this issue.
- Differentiation is slow compared to finite-differences. The
fmad and derivvec class are designed to be competitive with
finite-differencing in terms of speed of execution. They will of
course return derivatives correct to within machine rounding. If this
is not the case then please contact the author.
- Derivatives are incorrect. Then you have discovered a bug,
please contact the author.
« Previous « Start » Next »