13 Approximation of Derivatives
This section about derivatives is particularly important from a
practical point of view. It is often the case that the user finds
it nearly impossible, or at least difficult, to code the
derivatives.
For information about supplying partial derivatives, see the
Prob parameter CheckNaN in Appendix A.
Options Summary
Observe that the usage depends on which solver is used. If a
global solver such as glbFast is used, derivatives are not used
at all and the following sections do not apply. Some solvers use
only first order information, while others may require second
order information. See 'help iniSolve' and the TOMLAB interface
for each solver (e.g. snoptTL) for information on the derivative
level needed. If a solver requires second order information, it
is highly recommended that at least the first order information
is given analytically, or that the requested accuracy is changed.
Prob.NumDiff and Prob.ConsDiff options 11-15 require that the
patterns (Prob.HessPattern, Prob.JacPattern and Prob.ConsPattern)
are properly set. They should also be set for options 1-5, but
are not required there. The patterns are most easily set by
calling the corresponding user routines with two random sets of
variables and combining the sparse matrices. The functions
estConsPattern, estJacPattern and estHessPattern automate this
process with three safe trials.
If
Prob.LargeScale is set to 1 and the respective user
routines are not given, then
estConsPattern ,
estJacPattern and
estHessPattern are automatically
executed before the solution process is started.
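For example, the patterns can be estimated up front; a minimal
sketch, assuming the calling convention Pattern = est*Pattern(Prob)
and an otherwise fully defined Prob structure:

Prob.ConsPattern = estConsPattern(Prob); % constraint Jacobian pattern
Prob.JacPattern  = estJacPattern(Prob);  % residual Jacobian pattern
Prob.HessPattern = estHessPattern(Prob); % Hessian pattern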
If first order information is provided, the user should put a
minus sign (-) in front of the chosen option. For example, if the
objective function and gradient are given, Prob.NumDiff = -1
makes sure that only the Hessian is estimated by numerical
differentiation.
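As a concrete sketch:

Prob.NumDiff = -1; % f and g supplied; only the Hessian is estimated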
The first table below describes the differentiation options
available in TOMLAB.
The second table shows the flags that are included in the
callbacks to the user routines for the objective, constraints and
other functions. These flags should be used to optimize the
performance of numerical differentiation. For example, if
FDVar = [1 4], then only the constraints that depend on
decision variables 1 and 4 need to be calculated.
The following applies to the different user supplied functions
(a sketch of a constraint routine follows the list):

User c: If Prob.rows is nonempty, then only these constraints
need to be computed; the others can be set to 0. If it is easier
to determine which constraints to compute from dependencies, then
Prob.FDVar says which variables are changed, i.e. for which
variables numerical derivatives are computed. Only constraints
that depend on the variable indices given in FDVar need to be
computed. If FDVar == 0, nothing is known, and all constraints
must be computed.

User dc: If Prob.rows is nonempty, then only these constraint
Jacobian rows need to be computed; the others can be set to 0. If
Prob.cols is nonempty, only these columns in the constraint
Jacobian need to be computed; the others can be set to 0. Only if
Prob.cols is empty and Prob.FDVar(1) > 0 can Prob.FDVar be used
to simplify the computations.

User f: If Prob.FDVar(1) > 0, then only the variables in
Prob.FDVar have changed since the last call with Prob.FDVar == 0.
If knowledge about the functional parts that depend on the
unchanged variables is saved the last time Prob.FDVar == 0, it
may be utilized to speed up the computations.

User g: If Prob.FDVar(1) > 0, the variables in Prob.FDVar are
changed. They influence the variables in Prob.cols (if nonempty),
which means that g(Prob.cols) are the elements accessed to obtain
the numerical Hessian.

User r: If Prob.rows is nonempty, then only these residuals need
to be computed; the others can be set to 0. If it is easier to
determine which residuals to compute from dependencies, then
Prob.FDVar says which variables are changed, i.e. for which
variables numerical derivatives are computed. Only residuals that
depend on the variable indices given in FDVar need to be
computed; the rest can be set to 0. If FDVar == 0, nothing is
known, and all residuals must be computed.
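As an illustrative sketch (the routine name and the constraints
are hypothetical, not part of TOMLAB), a user constraint routine
might exploit these flags as follows:

function c = usercon(x, Prob)
% Hypothetical constraint routine exploiting the callback flags
c = zeros(2,1);          % unneeded constraints remain 0
FDVar = Prob.FDVar;
if FDVar(1) == 0 || any(FDVar == 1) || any(FDVar == 4)
   c(1) = x(1)^2 + x(4); % constraint 1 depends on variables 1 and 4
end
if FDVar(1) == 0 || any(FDVar == 2)
   c(2) = sin(x(2));     % constraint 2 depends on variable 2 only
end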
The differentiation options in TOMLAB.

Prob.ADObj
   1: MAD used for first order information (the gradient is
      automatically calculated with floating point precision).
      Some functions are not supported by MAD.
  -1: MAD used for second order information (the Hessian), i.e.
      the gradient needs to be specified. Users should make sure
      that MAD variables are allocated properly in their files
      and that they are not overwritten.

Prob.ADCons
   1: MAD used for first order information (the Jacobian of the
      constraints is automatically calculated with floating point
      precision). Some functions are not supported by MAD. Users
      should make sure that MAD variables are allocated properly
      in their files and that they are not overwritten.
  -1: MAD used for second order information (d2c, the second part
      of the Hessian of the Lagrangian), i.e. the constraint
      Jacobian needs to be specified. Users should make sure that
      MAD variables are allocated properly in their files and
      that they are not overwritten.

Prob.NumDiff
   1: fdng calculates the gradient. FDJac calculates the Jacobian
      of the residuals. If needed, the Hessian is estimated by
      FDHess. Default in TOMLAB.
  -1: Same as option 1, but the gradient is not estimated, as it
      is given. A negative sign has the same functionality for
      all options.
  11: Same as option 1, but the function findpatt is called from
      iniSolve when using this option. Prob.HessIx and Prob.JacIx
      are set to improve the performance of the differentiation
      routines. These are only for internal use.
 -11: Same as option 11, but the gradient is not estimated, as it
      is given.
   2: Standard Matlab splines (no additional toolboxes needed).
      Prob.optParam.CentralDiff is used by this routine. fdng2
      calculates the gradient. FDJac2 calculates the Jacobian of
      the residuals. If needed, the Hessian is estimated by
      FDHess2.
  12: Same as option 2, but Prob.HessIx and Prob.JacIx are set
      for improved performance.
   3: csaps is used (Spline Toolbox needed).
      Prob.optParam.splineSmooth is used by this routine. fdng2
      calculates the gradient. FDJac2 calculates the Jacobian of
      the residuals. If needed, the Hessian is estimated by
      FDHess2.
  13: Same as option 3, but Prob.HessIx and Prob.JacIx are set
      for improved performance.
   4: spaps is used (Spline Toolbox needed).
      Prob.optParam.splineTol is used by this routine. If
      Prob.optParam.splineTol < 0, then csapi is used instead.
      fdng2 calculates the gradient. FDJac2 calculates the
      Jacobian of the residuals. If needed, the Hessian is
      estimated by FDHess2.
  14: Same as option 4, but Prob.HessIx and Prob.JacIx are set
      for improved performance.
   5: A routine using complex variables. fdng3 calculates the
      gradient. FDJac3 calculates the Jacobian of the residuals.
      This option is not available for solvers requiring second
      order information.
  15: Same as option 5, but Prob.JacIx is set for improved
      performance.
   6: The derivatives are estimated by the solver (only available
      for some solvers).

Prob.ConsDiff
   1: FDJac calculates the Jacobian of the constraints. If
      needed, the nonlinear constraint Hessian is estimated by
      FDcHess.
  11: Same as option 1, but the function findpatt is called from
      iniSolve when using this option. Prob.ConIx is set to
      improve the performance of the differentiation routines.
      This is only for internal use.
   2: Standard Matlab splines (no additional toolboxes needed).
      Prob.optParam.CentralDiff is used by this routine. FDJac2
      calculates the Jacobian of the constraints. If needed, the
      nonlinear constraint Hessian is estimated by FDcHess.
  12: Same as option 2, but Prob.ConIx is set for improved
      performance.
   3: csaps is used (Spline Toolbox needed).
      Prob.optParam.splineSmooth is used by this routine. FDJac2
      calculates the Jacobian of the constraints. If needed, the
      nonlinear constraint Hessian is estimated by FDcHess.
  13: Same as option 3, but Prob.ConIx is set for improved
      performance.
   4: spaps is used (Spline Toolbox needed).
      Prob.optParam.splineTol is used by this routine. If
      Prob.optParam.splineTol < 0, then csapi is used instead.
      FDJac2 calculates the Jacobian of the constraints. If
      needed, the nonlinear constraint Hessian is estimated by
      FDcHess.
  14: Same as option 4, but Prob.ConIx is set for improved
      performance.
   5: A routine using complex variables. FDJac3 calculates the
      Jacobian of the constraints. This option is not available
      for solvers requiring second order information.
  15: Same as option 5, but Prob.ConIx is set for improved
      performance.
   6: The derivatives are estimated by the solver.
Callback flags in TOMLAB.

Prob.FDVar (0 or vector): The variables being perturbed in the
callback to a numerical differentiation routine. The user should
make sure that unnecessary calculations are not made.

Prob.rows (0 or vector): The rows in the user-computed
vector/matrix that will be accessed, and hence need to be set. If
empty, no information is available and all elements need to be
set.

Prob.cols (0 or vector): The columns in the user-computed matrix
that will be accessed, and hence need to be set. If empty, no
information is available and all elements need to be set.
Both numerical differentiation and automatic differentiation are
possible. There are six ways to compute numerical
differentiation. Furthermore, the SOL solvers MINOS, NPSOL,
NLSSOL and SNOPT, as well as some other solvers, include built-in
numerical differentiation.
Numerical differentiation is automatically used for gradient,
Jacobian, constraint Jacobian and Hessian if a user routine is not
present.
Especially for large problems it is important to tell the system
which values are nonzero when estimating the Jacobian, the
constraint Jacobian, or the Hessian matrix. Define a sparse 0-1
matrix, with ones for the nonzero elements. This matrix is set as
input in the Prob structure using the fields Prob.JacPattern,
Prob.ConsPattern, Prob.HessPattern or Prob.d2LPattern. If there
are many zeros in the matrix, this leads to significant savings
in each iteration of the optimization. It is possible to use the
TOMLAB estimation routines mentioned above to set these patterns.
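For example, for an illustrative problem with two constraints and
three variables:

% Constraint 1 depends on x(1) and x(2); constraint 2 on x(2) and x(3)
Prob.ConsPattern = sparse([1 1 0; 0 1 1]);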
The field Prob.FDVar is dynamically set and indicates which
decision variables are being perturbed during a call to the
user's routines. For example, if FDVar = [1,3,5], the variables
with indices 1, 3 and 5 are being used for an estimation. When,
for example, the Jacobian is being calculated by repeated calls
to the constraints (or residuals), the user can make sure that
unnecessary calculations are avoided.
FDVar = Prob.FDVar;
if FDVar(1) == 0 || any(FDVar == 4)
   % Calculate the constraints that depend on variable 4
end
% The constraints are only calculated if FDVar == 0, which means
% that all constraints are required, or if decision variable
% number 4 is among the perturbed variables.
To allow efficient estimation of the derivatives in the five
different options, it is possible to obtain the indices needed
for the constraint (Prob.ConIx), Jacobian (Prob.JacIx) and
Hessian (Prob.HessIx) evaluations. The function findpatt is
called automatically by iniSolve if a 1 is added in front of the
number assigned for the differentiation method. The same
functionality illustrated below applies to Prob.ConsDiff.
Prob.NumDiff = 11; % Use index information for NumDiff = 1
Prob.NumDiff = 12; % Use index information for NumDiff = 2
...
Prob.NumDiff = 15; % Use index information for NumDiff = 5
If a set of problems with identical indices for the
differentiation routines is optimized in sequence, the following
code can be used. This avoids re-generating the indices:
Prob.ConIx = findpatt(Prob.ConsPattern);
Prob.JacIx = findpatt(Prob.JacPattern);
Prob.HessIx = findpatt(Prob.HessPattern);
Prob.NumDiff = 1; % NumDiff = 11 used automatically
Forward or Backward Difference Approximations
Prob.NumDiff = 1 (11) or Prob.ConsDiff = 1 (11).
Default in TOMLAB.
The default is the classical approach with forward or backward
differences, together with an optional automatic step size
selection procedure. Set Prob.GradTolg = -1 to run the procedure.
The differentiation is handled by the routine fdng, which is a
direct implementation of the FD algorithm [31, page 343].
The fdng routine uses the parameter field DiffInt in the
structure optParam, see Table A, as the numerical step size. The
user can either change this field or set the field Prob.GradTolg.
The field Prob.GradTolg may either be a scalar value or a vector
of step sizes of the same length as the number of unknown
parameters x. The advantage is that individual step sizes can be
used when the variables have very different properties or very
different scales. If the field Prob.GradTolg is defined as a
negative number, the fdng routine estimates a suitable step size
for each of the unknown parameters. This is a costly operation in
terms of function evaluations, and is normally not needed for
well-scaled problems.
Similar to fdng, there are two routines FDJac and FDHess. FDJac
numerically estimates the Jacobian in nonlinear least squares
problems or the constraint Jacobian in constrained nonlinear
problems. FDJac checks if the field Prob.GradTolJ is defined,
with the same action as in fdng. FDHess estimates the Hessian
matrix in nonlinear problems and checks for the definition of the
field Prob.GradTolH. Both routines use the field
Prob.optParam.DiffInt as the default tolerance if the respective
field is empty. Note that FDHess has no automatic step size
estimation. The implementations in fdng, FDJac and FDHess avoid
taking steps outside the lower and upper bounds on the decision
variables. This feature is important if going outside the bounds
makes the function undefined.
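As a sketch, the step size fields discussed above might be set as
follows (the numerical values are illustrative only):

Prob.GradTolg = 1e-6;          % step size for the gradient in fdng
Prob.GradTolJ = 1e-6;          % step size for the Jacobian in FDJac
Prob.GradTolH = 1e-5;          % step size for the Hessian in FDHess
Prob.optParam.DiffInt = 1e-6;  % default when the fields above are empty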
Splines
Matlab splines are selected by setting Prob.NumDiff or
Prob.ConsDiff = 2 (12). csaps is used if Prob.NumDiff or
Prob.ConsDiff is set to 3 (13), spaps if set to 4 (14), and
finally csapi if set to 4 (14) with Prob.optParam.splineTol < 0.
The first spline option can be used without the Spline Toolbox
installed. If the Spline Toolbox is installed, gradient, Jacobian,
constraint Jacobian and Hessian approximations can be computed in
three different ways, depending on which of the three routines,
csaps, spaps or csapi, the user chooses to use.
The routines fdng2, FDJac2 and FDHess2 implement the gradient
estimation procedure for the different approximations needed. All
routines use the tolerance in the field
Prob.optParam.CentralDiff as the numerical step length. The basic
principle is central differences, taking a small step in both the
positive and the negative direction.
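A sketch of selecting the spline options (the parameter values
are illustrative, not recommendations):

Prob.NumDiff = 2;                  % standard Matlab splines
Prob.optParam.CentralDiff = 1e-4;  % central difference step (illustrative)

% With the Spline Toolbox installed:
Prob.NumDiff = 3;                  % csaps
Prob.optParam.splineSmooth = 0.4;  % smoothness parameter (illustrative)

Prob.NumDiff = 4;                  % spaps, or csapi if splineTol < 0
Prob.optParam.splineTol = 1e-3;    % tolerance parameter (illustrative)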
Complex Variables
Prob.NumDiff = 5 (15) or
Prob.ConsDiff = 5 (15).
The fifth approximation method is a method by Squire and Trapp
[75], which uses complex variables to estimate the derivative of
real functions. The method is not particularly sensitive to the
choice of step length, as long as it is very small. The routine
fdng3 implements the complex variable method for the estimation
of the gradient, and FDJac3 the similar procedure to estimate the
Jacobian matrix or the constraint Jacobian matrix. The tolerance
is hard coded as 1E-20. There are some pitfalls in using Matlab
code for this strategy. In the paper by Martins et al. [58],
important hints are given about how to implement the functions in
Matlab. They were essential in getting the predefined TOMLAB
examples to work, and the user is advised to read this paper
before attempting to write new code using this differentiation
strategy. However, the insensitivity to the numerical step size
might make the method worthwhile if there are accuracy
difficulties with traditional gradient estimation methods.
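One such hint, given here as an assumption based on the
complex-step literature rather than as a quote from [58],
concerns the Matlab transpose operator:

% The complex-step method adds an imaginary perturbation to x, so
% user code must preserve the imaginary part. The conjugate
% transpose ' destroys it; use the non-conjugate transpose .' instead.
f = x.'*x;    % complex-step safe
% f = x'*x;   % NOT safe: conjugation cancels the imaginary perturbation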
Automatic Differentiation
Automatic differentiation is performed by use of the MAD toolbox.
MAD is a TOMLAB toolbox which is documented in a separate manual,
see http://tomopt.com.
MAD should be initialized by calling madinitglobals before
running TOMLAB with automatic differentiation. Note that in order
for TOMLAB to be fully compatible with MAD, the functions must be
defined according to the MAD requirements and restrictions. Some
of the predefined test problems in TOMLAB do not fulfill those
requirements.
In the Graphical User Interface, the differentiation strategy is
selected from the Differentiation Method menu, reachable in the
General Parameters mode. If the Only 2ndD check-box is set, only
unknown second derivatives are estimated. This is accomplished by
changing the sign of Prob.NumDiff to negative, to signal that
first order derivatives are estimated only if the gradient
routine is empty, not otherwise. The method to obtain derivatives
for the constraint Jacobian is selected in the Constraint
Jacobian diff. method menu in the General Parameters mode.
When running the menu program tomRemote/tomMenu, push/select the
How to compute derivatives button in the Optimization Parameter
Menu.
To choose differentiation strategy when running the driver
routines or when directly calling the actual solver, set
Prob.ADObj equal to -1 or 1 for automatic differentiation, or
Prob.NumDiff to 1 (11), 2 (12), 3 (13), 4 (14) or 5 (15) for
numerical differentiation, before calling drivers or solvers.
Note that Prob.NumDiff = 1 will run the fdng routine.
Prob.NumDiff = 2, 3 or 4 will make the fdng2 routine use standard
Matlab splines or call the Spline Toolbox routines csaps, spaps
and csapi respectively. The csaps routine needs a smoothness
parameter and the spaps routine needs a tolerance parameter.
Default values for these parameters are set in the structure
optParam, see Table A, fields splineSmooth and splineTol. The
user should be aware that there is no guarantee that the default
values of splineSmooth and splineTol are the best for a
particular problem. They work on the predefined examples in
TOMLAB. To use the built-in numerical differentiation in the SOL
solvers MINOS, NPSOL, NLSSOL and SNOPT, set Prob.NumDiff = 6.
Note that the DERIVATIVE LEVEL SOL parameter must be set properly
to tell the SOL solvers which derivatives are known and which are
not. There is a field DerLevel in Prob.optParam that is used by
TOMLAB to send this information to the solver. To select the
method used to obtain derivatives for the constraint Jacobian,
the field Prob.ConsDiff is set to 1-6 (11-15; 16 is not valid),
with the same meaning as for Prob.NumDiff as described above.
Prob.ADCons is the equivalent variable for automatic
differentiation of the nonlinear constraints.
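A sketch for the SOL solvers (the DerLevel value shown is
illustrative; see the SOL solver manuals for its exact meaning):

Prob.NumDiff  = 6;          % objective derivatives estimated by the solver
Prob.ConsDiff = 6;          % constraint derivatives estimated by the solver
Prob.optParam.DerLevel = 0; % DERIVATIVE LEVEL: no derivatives supplied (illustrative)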
Here follow some examples of the use of approximate derivatives
when solving problems with ucSolve and clsSolve. The examples are
part of the TOMLAB distribution in the file diffDemo in the
directory examples.
To be able to use automatic differentiation the MAD toolbox must
be installed.
Automatic Differentiation example
madinitglobals; % Initiate MAD variables
probFile = 'uc_prob'; % Name of Init File
P = 1; % Problem number
Prob = probInit(probFile, P);
Prob.Solver.Alg = 2; % Use the safeguarded standard BFGS
Prob.ADObj = 1; % Use Automatic Differentiation.
Result = tomRun('ucSolve', Prob, 2);
FD example
% Finite differentiation using the FD algorithm
probFile = 'uc_prob'; % Name of Init File
P = 1; % Problem number
Prob = probInit(probFile, P); Prob.Solver.Alg = 2;
Prob.NumDiff = 1; % Use the fdng routine with the FD algorithm.
Result = tomRun('ucSolve', Prob, 2);
% Change the tolerances used by algorithm FD
Prob.GradTolg = [1E-5; 1E-6]; % Change the predefined step size
Result = tomRun('ucSolve', Prob, 1);
% The change leads to worse accuracy
% Instead let an algorithm determine the best possible GradTolg
% It needs some extra f(x) evaluations, but the result is much better.
Prob.GradTolg = -1; % A negative number requests that the step length
% algorithm is run at the initial point to estimate suitable step sizes
% If GradTolg is left empty, Prob.optParam.DiffInt is used instead.
Result = tomRun('ucSolve', Prob, 1);
% Now the result is perfect, very close to the optimal == 0.
Prob.NumDiff = 5; % Use the complex variable technique
% The advantage is that it is insensitive to the choice of step length
Result = tomRun('ucSolve', Prob, 1);
% When it works, like in this case, it gives absolutely perfect result.
Increasing the tolerances used as step sizes for the individual
variables leads to a worse solution being found, but no fewer
function evaluations until convergence. Using the automatic step
size selection method gives very good results. The complex
variable method gives absolutely perfect results; the optimum is
found with very high accuracy.
The following example illustrates the use of spline functions to
approximate derivatives.
Spline example
probFile = 'ls_prob'; % Name of Init File
P = 1; % Problem number
Prob = probInit(probFile, P);
Prob.Solver.Alg = 0; % Use the default algorithm in clsSolve
Prob.NumDiff = 2; % Use standard Matlab splines.
Result = tomRun('clsSolve', Prob, 2);
FD example with patterns
% Finite differentiation using the FD algorithm
%
% This example illustrates how to set nonzeros in HessPattern
% to tell TOMLAB which elements to estimate in the Hessian.
% All elements in Hessian with corresponding zeros in HessPattern are
% set to 0, and no numerical estimate is needed.
%
% This can save a great deal of time for large problems.
% In a similar way, Prob.ConsPattern is set for the constraint gradient
% matrix for the nonlinear constraints, and Prob.JacPattern for the
% Jacobian matrix in nonlinear least squares problems.
probFile = 'con_prob';
P = 12;
Prob = probInit(probFile, P);
Prob.Solver.Alg = 1;
Prob.HessPattern = sparse([ones(6,6), zeros(6,6);zeros(6,12)]);
% Note that if setting Prob.NumDiff = 1, also the gradient would be
% estimated with numerical differences, which is not recommended.
% Estimating two levels of derivatives is an ill-conditioned process.
% Setting Prob.NumDiff = -1, only the Hessian is estimated
% Use qpSolve in base module for QP subproblems
Prob.SolverQP = 'qpSolve';
Prob.NumDiff = -1; % Use the fdng routine with the FD algorithm.
Result = tomRun('nlpSolve',Prob,1);
% Setting Prob.NumDiff = -11 makes TOMLAB analyze HessPattern
% and use a faster algorithm for large-scale problems.
% In this small-scale case there is no advantage;
% it just takes more CPU time.
Prob.NumDiff = -11; % Use the fdng routine with the FD algorithm.
Result = tomRun('nlpSolve',Prob,1);
% Run the same problem estimating the Hessian with Matlab splines.
% Needs more gradient calculations, because it in principle uses
% smooth central differences to obtain the numerical Hessian.
% If the problem is noisy, then this method is recommended.
Prob.NumDiff = -2; % Matlab spline
Result = tomRun('nlpSolve',Prob,1);