Nelder–Mead Method

The Nelder–Mead Method is a popular derivative-free optimization algorithm used to find the minimum of a scalar-valued function in multidimensional space. It is based on the concept of a simplex and uses geometrical transformations of the points in simplex to iteratively improve the solution. This method is available in MATLAB as fminsearch function.

2. The Simplex

Lets first define what is simplex. In \( \mathbb{R}^n \) dimensional space, \(n+1\) points form a simplex. For example:

In 1D: a line segment (2 points)
In 2D: a triangle (3 points)
In 3D: a tetrahedron (4 points)

Along with objective function, the initial coordinates of the simplex are provided as input to the algorithm. The algorithm iteratively refines these points to find the minimum of the objective function. In this process, in intial simples transformed into new coordinates based on function value at each vertex of the simplex.

3. Method

Lets consider a objective function \(f(\boldsymbol{X}) \) with \( \boldsymbol{X} \in \mathbb{R}^{n} \) as inputs.

\[ f: \mathbb{R}^{n} \rightarrow \mathbb{R} \]

As mentioned earlier, for \(n\)-dimensional function, a simplex is defined by \(n+1\) points. The algorithm starts with an initial simplex defined by these points, which are the vertices of the simplex. For two dimesional function \(n=2\) and the simplex is a triangle

For \(\mathbb{R}^{n}\) input space, lets define the intial \(n+1\) simplex vertices \(\{\boldsymbol{X}_1, \boldsymbol{X}_2, ... , \boldsymbol{X}_{n+1}\}\)

Evaluate the function at each vertex and sort them according to their function value

\[ f(\boldsymbol{X}_{1}) < f(\boldsymbol{X}_{2}) < ... < f(\boldsymbol{X}_{n+1}) \]

The point with lowest function values is considered as the best point \(\boldsymbol{X}_{1}\) and the point with highest function value, is considered as the worst point \(\boldsymbol{X}_{n+1}\). In order to transform the vertices of the simplex to new optimal coordinates, the algorithm first need to calculate the centroid \( \boldsymbol{\bar{X}} \) of first \(n\) points.

\[ \boldsymbol{\bar{X}} = \frac{1}{n} \sum_{i=1}^{n} \boldsymbol{X}_i \]

New coordinate for simplex is calculated as below:

\[ \boldsymbol{X}(\mu) = (1+\mu) \boldsymbol{\bar{X}} - \mu \boldsymbol{X}_{n+1} \]

The worst point will be transformed to the new coordinate. The choice of \(\mu\) determines transformation types.

In ordered to perform inner contraction \( -1 < \mu < 0 \)
In order to perform outer contraction \( 0 < \mu < 1 \)
In order to perform reflection \( \mu = 1 \)
In order to perform expansion \( \mu > 1 \)

If all the above transformation fails to reach the optimal solution, then all the points move towards the best point, which is called shrinkage. All these operations for a 2D objective function are illustrated in the figure below.

The function value at the best point of updated simplex will be computed and compared with the previous best point. If the difference between the point is less than a predefined threshold \(\varepsilon\), the algorithm terminates. Otherwise, new iteration starts with the updated simplex vertices.

4. Algorithm

Will be added soon.