Class 8: Roots and minima, more C++, and using external libraries

Overview

Root-finding:

Finding a root of a function (1-d): Solve f(x)=0 for x.

Finding root(s) of N functions of N variables: Solve \underline{f}(\underline{x})=0 for \underline{x}.

Numerical Recipes describes other algorithms, and gives lots of
examples and explanations.

Helping the algorithms get the right answer

Give the algorithm a good starting point.

For 1-d root-finders, provide a bracket for the root.

For the multi-dimensional case, bracketing isn't feasible. The algorithms generally need a good initial guess to converge to the root you want.

Bisection algorithm

Shrink bracket until the root is pinned down, as follows:

Start with caller-provided bracket for the root (a,b), tolerance \epsilon, and function f.
Check sign(f(a)) \neq sign(f(b)).
Loop:
    x \leftarrow (a+b)/2
    y \leftarrow f(x)
    If y has the same sign as f(a), then
        a \leftarrow x
    else
        b \leftarrow x
Repeat until |a-b| < \epsilon.
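
A minimal C++ sketch of this loop (the function and bracket here, f(x) = x^2 - 2 on (1,2), are just an example):

#include <cmath>
#include <cstdio>

// Example function: f(x) = x^2 - 2 has a root at sqrt(2) in (1,2).
double f(double x) { return x * x - 2.0; }

double bisect(double a, double b, double eps) {
  double fa = f(a);
  // The caller must supply a genuine bracket: f(a) and f(b) of opposite sign.
  if (fa * f(b) > 0.0) return NAN;
  while (std::fabs(b - a) > eps) {
    double x = 0.5 * (a + b);          // midpoint
    double y = f(x);
    if ((y > 0.0) == (fa > 0.0)) {     // y has the same sign as f(a)
      a = x;
      fa = y;
    } else {
      b = x;
    }
  }
  return 0.5 * (a + b);
}

int main() {
  std::printf("root = %.10f\n", bisect(1.0, 2.0, 1e-10));  // expect 1.4142135624
  return 0;
}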

Newton's method

Repeatedly "shoot for the zero," using the approximation

f(x+d) \simeq f(x) + f'(x)\,d + O(f''d^2)

Setting this linear approximation to zero and solving for the step gives

d = -\frac{f(x)}{f'(x)}

Problems can arise when the current estimate is far from the root:

(... Draw figures on board, insert in notes later ...)

Simple Newton's algorithm

Start with caller-supplied f and f' functions, initial x, and tolerance \epsilon.
Loop:
    d \leftarrow -\frac{f(x)}{f'(x)}
    x \leftarrow x + d
Repeat until |d| < \epsilon.

Important note: \epsilon should be small enough for your application, but no smaller. Make sure it is large enough that x+\epsilon is different from x in your machine's floating point representation.
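
A minimal C++ sketch of the loop above, again using the example f(x) = x^2 - 2 with root \sqrt{2}:

#include <cmath>
#include <cstdio>

double f(double x)  { return x * x - 2.0; }
double fp(double x) { return 2.0 * x; }   // f'(x)

double newton(double x, double eps) {
  double d;
  do {
    d = -f(x) / fp(x);   // "shoot for the zero" of the tangent line
    x += d;
  } while (std::fabs(d) > eps);
  return x;
}

int main() {
  // Per the note above: eps must be large enough that x + eps
  // differs from x in double precision.
  std::printf("root = %.12f\n", newton(1.0, 1e-12));
  return 0;
}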

Smarter Newton's algorithm

Use bracketing to protect against any craziness: bisect whenever a Newton's-method step fails (i.e., would leave the bracket).

Start with caller-supplied f and f' functions, bracket (a,b), and tolerance \epsilon.
Set initial x \leftarrow (a+b)/2.
Loop:
    d \leftarrow -\frac{f(x)}{f'(x)}
    x_{\text{try}} \leftarrow x + d
    Is x_{\text{try}} in range (a,b)?
        If yes: x \leftarrow x_{\text{try}}
        If not: bisect the range and try again
Repeat until |d| < \epsilon.
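
A compact C++ sketch of this safeguarded loop (same example function as before; the bracket-update rule here is one reasonable choice for a sketch, and the routine cited below is more careful):

#include <cmath>
#include <cstdio>

double f(double x)  { return x * x - 2.0; }
double fp(double x) { return 2.0 * x; }

double safe_newton(double a, double b, double eps) {
  double x = 0.5 * (a + b);
  double d;
  do {
    double xtry = x - f(x) / fp(x);    // proposed Newton step
    if (xtry <= a || xtry >= b)        // step left the bracket:
      xtry = 0.5 * (a + b);            //   bisect instead
    d = xtry - x;
    x = xtry;
    if (f(a) * f(x) < 0.0) b = x;      // keep the sign change inside (a,b)
    else                   a = x;
  } while (std::fabs(d) > eps);
  return x;
}

int main() {
  std::printf("root = %.12f\n", safe_newton(0.0, 2.0, 1e-12));
  return 0;
}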

See full example in Numerical Recipes (sec 9.4 in 2nd or 3rd ed.).

Multidimensional case

We have N functions F_i of N variables x_j, and want to find the zeros.

There's a great illustration in Numerical Recipes showing why this is hard in general. (Fig. 9.6.1.) If you want to seek the nearest zero to some point, Newton's method should work.

F_i(\underline{x}+\underline{d}) = F_i(\underline{x}) + \sum_{j=1}^N \frac{\partial F_i}{\partial x_j} d_j + O(d^2).

In matrix notation,

\underline{F}(\underline{x}+\underline{d}) = \underline{F}(\underline{x}) + \underline{\underline{J}}\cdot\underline{d} + O(d^2),

using the Jacobian matrix

\underline{\underline{J}} =\left[ J_{ij} \right] \equiv \left[  \frac{\partial F_i}{\partial x_j} \right].

To find the root, repeatedly solve

\underline{\underline{J}}\cdot\underline{d} = -\underline{F}.

for the step \underline{d}, update \underline{x} \leftarrow \underline{x} + \underline{d}, and repeat until convergence, |\underline{d}| < \epsilon.
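
A C++ sketch for a made-up 2-dimensional example, F_1 = x^2 + y^2 - 4 and F_2 = x - y, with roots at \pm(\sqrt{2}, \sqrt{2}). For a 2\times 2 system the linear solve can be done with Cramer's rule; larger systems call for an LU decomposition:

#include <cmath>
#include <cstdio>

int main() {
  double x = 1.0, y = 0.5;   // starting guess
  const double eps = 1e-12;
  for (int iter = 0; iter < 50; ++iter) {
    double F1 = x * x + y * y - 4.0;
    double F2 = x - y;
    double J11 = 2.0 * x, J12 = 2.0 * y;        // J[i][j] = dF_i/dx_j
    double J21 = 1.0,     J22 = -1.0;
    double det = J11 * J22 - J12 * J21;
    double dx = (-F1 * J22 + F2 * J12) / det;   // Cramer's rule for J.d = -F
    double dy = (-F2 * J11 + F1 * J21) / det;
    x += dx;
    y += dy;
    if (std::sqrt(dx * dx + dy * dy) < eps) break;
  }
  std::printf("root: (%.12f, %.12f)\n", x, y);  // expect (sqrt(2), sqrt(2))
  return 0;
}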

Numerical Recipes describes a method for "backtracking" if a
multi-dimensional Newton's method step makes things worse instead of better.

Minimization

Finding a local minimum: find the value of x (or \underline{x}) at which f is locally minimized. (To find maxima, just minimize -f(x).)

I will describe one particularly pretty 1-d minimization algorithm that requires no calculation of derivatives and that is guaranteed to find a local minimum.

There are faster algorithms that use derivatives, for 1-d and multi-d problems, including ones that only need the user-supplied function. (These algorithms estimate the derivatives themselves.)

I'll describe the general properties of these, and then we will talk about how to use pre-written implementations.
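
A classic algorithm fitting the derivative-free, guaranteed-bracketing description above is golden-section search; here is a C++ sketch (the example function is made up, and the interval is assumed to contain a single minimum):

#include <cmath>
#include <cstdio>

// Golden-section search: shrink a bracket (a,b) around a minimum by a
// factor of ~0.618 per step. Example: f(x) = (x-1)^2 + 3, minimum at x = 1.
double f(double x) { return (x - 1.0) * (x - 1.0) + 3.0; }

double golden_min(double a, double b, double eps) {
  const double r = (std::sqrt(5.0) - 1.0) / 2.0;   // golden ratio conjugate
  while (std::fabs(b - a) > eps) {
    double c = b - r * (b - a);   // two interior trial points
    double d = a + r * (b - a);
    if (f(c) < f(d)) b = d;       // minimum lies in (a, d)
    else             a = c;       // minimum lies in (c, b)
  }
  // (A production version reuses one of the two function values per step
  // instead of recomputing both.)
  return 0.5 * (a + b);
}

int main() {
  std::printf("minimum at x = %.8f\n", golden_min(-5.0, 5.0, 1e-8));
  return 0;
}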

More general minimization

Near a minimum, to 2nd order,

f(\underline{x}+\underline{d})  = f(\underline{x}) + \sum_j \frac{\partial f}{\partial x_j} d_j  + \frac{1}{2} \sum_{i,j} \frac{\partial^2 f}{\partial x_j \partial x_i} d_i d_j
= c + \underline{b}\cdot\underline{d} + \frac{1}{2}\, \underline{d}\cdot\underline{\underline{A}}\cdot\underline{d}

Terminology: The matrix \underline{\underline{A}} is called the Hessian. The vector \underline{b} is the gradient of f at \underline{x}.

Differentiating this quadratic model with respect to \underline{d} gives

\underline{\nabla} f(\underline{x}+\underline{d}) = \underline{b} + \underline{\underline{A}} \cdot \underline{d}

At an extremum, \underline{\nabla} f = 0. So we just solve

\underline{\underline{A}} \cdot \underline{d} = -\underline{b}.

Essentially the same algorithm as Newton's method works if we know \underline{\underline{A}} and \underline{b} well enough.
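
For concreteness, a made-up quadratic example: take f(x,y) = (x-1)^2 + 2(y+2)^2 and start at \underline{x} = (0,0). Then \underline{b} = \underline{\nabla} f = (2(x-1),\ 4(y+2)) = (-2,\ 8) and \underline{\underline{A}} = \mathrm{diag}(2,4), which is constant because f is quadratic. Solving \underline{\underline{A}}\cdot\underline{d} = -\underline{b} gives \underline{d} = (1,-2), so a single step lands exactly on the minimum at (1,-2), as expected for a purely quadratic f.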

Types of minimization algorithms

Various algorithms differ in how they guard against crazy steps and whether they need the caller to provide a function for the derivatives or Hessian.

What's common among the algorithms:

Numerical Recipes has many different algorithms.

More importantly, there are existing libraries for minimization of functions.

Break

A little more C++: derived classes

C++ lets you define a new class based on an existing class, like this:

class NewClass : public BaseClass {
  // usual class definition
};

This defines NewClass as having all the same data and member functions as BaseClass, with the following enhancements: NewClass can add new data members and member functions of its own, and it can override the base class's virtual functions with its own implementations.

This is not impossible in C, just clunky, as this comparison shows:

C++:

class MyBase {
  int i;
public:
  virtual void set_i(int ii);
};

class MyClass2 : public MyBase {
  int j;
public:
  virtual void set_i(int ii);
  virtual void set_j(int jj);
};

C:

struct MyBase_s {
  int i;
  void (*set_i)(int ii);
};

struct MyClass2_s {
  struct MyBase_s base;
  int j;
  void (*set_j)(int jj);
};
The C++ constructor sets the virtual function table of a new MyClass2 object so that set_i and set_j hold the correct addresses of MyClass2::set_i() and MyClass2::set_j(). In C, for every new MyClass2_s variable, the programmer has to set the .set_j and .base.set_i pointers to the right functions by hand, and the i field must be accessed as base.i.
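
A small C++ sketch of the payoff, virtual dispatch through a base-class pointer (the function bodies are made up here, since the declarations above don't define them):

#include <cstdio>

class MyBase {
  int i;
public:
  virtual void set_i(int ii) { i = ii; std::printf("MyBase::set_i(%d)\n", ii); }
  virtual ~MyBase() {}
};

class MyClass2 : public MyBase {
  int j;
public:
  virtual void set_i(int ii) {
    MyBase::set_i(ii);                 // reuse the base-class version for i
    std::printf("MyClass2::set_i(%d)\n", ii);
  }
  virtual void set_j(int jj) { j = jj; }
};

int main() {
  MyClass2 obj;
  MyBase* p = &obj;   // a base-class pointer may refer to a derived object
  p->set_i(42);       // the virtual table routes this to MyClass2::set_i
  obj.set_j(7);
  return 0;
}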

Example of a minimizer in C using an object-oriented approach: GSL

See the very nice documentation in the GNU Scientific Library Reference Manual, in the section titled "Multidimensional Minimization": http://www.gnu.org/software/gsl/manual/html_node/Multidimensional-Minimization.html

Things to note about how they write documentation:

Things to note about the API:
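
For illustration, a minimal sketch of using this API with the derivative-free simplex minimizer (the objective function and starting point are made up; the manual section above has the authoritative example). Compile with: gcc example.c -lgsl -lgslcblas -lm

#include <stdio.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_multimin.h>

/* Made-up objective with minimum at (1, -2). The void* params argument is
   GSL's C-style "object" hook, carrying any state the function needs. */
double my_f(const gsl_vector* v, void* params) {
  double x = gsl_vector_get(v, 0);
  double y = gsl_vector_get(v, 1);
  return (x - 1.0) * (x - 1.0) + 2.0 * (y + 2.0) * (y + 2.0);
}

int main(void) {
  gsl_multimin_function func;
  func.n = 2;         /* number of variables */
  func.f = my_f;      /* user-supplied function */
  func.params = NULL;

  gsl_vector* x = gsl_vector_alloc(2);     /* starting point */
  gsl_vector_set(x, 0, 5.0);
  gsl_vector_set(x, 1, 7.0);
  gsl_vector* step = gsl_vector_alloc(2);  /* initial step sizes */
  gsl_vector_set_all(step, 1.0);

  gsl_multimin_fminimizer* s =
      gsl_multimin_fminimizer_alloc(gsl_multimin_fminimizer_nmsimplex2, 2);
  gsl_multimin_fminimizer_set(s, &func, x, step);

  int status = GSL_CONTINUE;
  int iter;
  for (iter = 0; iter < 1000 && status == GSL_CONTINUE; ++iter) {
    if (gsl_multimin_fminimizer_iterate(s)) break;
    status = gsl_multimin_test_size(gsl_multimin_fminimizer_size(s), 1e-6);
  }
  printf("minimum at (%g, %g)\n",
         gsl_vector_get(s->x, 0), gsl_vector_get(s->x, 1));

  gsl_multimin_fminimizer_free(s);
  gsl_vector_free(x);
  gsl_vector_free(step);
  return 0;
}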

Example of a minimizer in C++: ROOT

The best documentation from the authors is at http://root.cern.ch/root/html526/TMinuitMinimizer.html
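
A minimal sketch of driving a Minuit-based minimizer through ROOT's ROOT::Math::Minimizer interface, which TMinuitMinimizer implements (the objective here is made up; the examples on the course web page are the authoritative versions):

#include <cstdio>
#include "Math/Factory.h"
#include "Math/Functor.h"
#include "Math/Minimizer.h"

// Made-up objective with minimum at (1, -2).
double my_f(const double* x) {
  return (x[0] - 1.0) * (x[0] - 1.0) + 2.0 * (x[1] + 2.0) * (x[1] + 2.0);
}

int main() {
  // Ask the factory for a Minuit-based minimizer: "Minuit" selects the
  // TMinuit implementation, "Migrad" the minimization algorithm.
  ROOT::Math::Minimizer* min =
      ROOT::Math::Factory::CreateMinimizer("Minuit", "Migrad");

  ROOT::Math::Functor f(&my_f, 2);     // wrap a plain function, dimension 2
  min->SetFunction(f);
  min->SetVariable(0, "x", 5.0, 0.1);  // index, name, start value, step size
  min->SetVariable(1, "y", 7.0, 0.1);
  min->SetTolerance(1e-6);

  min->Minimize();
  const double* xs = min->X();
  std::printf("minimum at (%g, %g)\n", xs[0], xs[1]);

  delete min;
  return 0;
}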

Assignment (easy?)

Get the ROOT examples from the course web page to compile and run.