Iteration | x | y | f(x, y) |
---|
The Nelder–Mead Method is a popular derivative-free optimization algorithm used to find the
minimum of a scalar-valued function in multidimensional space. It is based on the concept of a
simplex and uses geometrical transformations of the points in simplex to iteratively
improve the solution. This method is available in MATLAB as fminsearch
function.
Lets first define what is simplex. In \( \mathbb{R}^n \) dimensional space, \(n+1\) points form a simplex. For example:
Lets consider a objective function \(f(\boldsymbol{X}) \) with \( \boldsymbol{X} \in \mathbb{R}^{n} \) as inputs.
\[ f: \mathbb{R}^{n} \rightarrow \mathbb{R} \]
As mentioned earlier, for \(n\)-dimensional function, a simplex is defined by \(n+1\) points. The algorithm starts with an initial simplex defined by these points, which are the vertices of the simplex. For two dimesional function \(n=2\) and the simplex is a triangle
For \(\mathbb{R}^{n}\) input space, lets define the intial \(n+1\) simplex vertices \(\{\boldsymbol{X}_1, \boldsymbol{X}_2, ... , \boldsymbol{X}_{n+1}\}\)
Evaluate the function at each vertex and sort them according to their function value
\[ f(\boldsymbol{X}_{1}) < f(\boldsymbol{X}_{2}) < ... < f(\boldsymbol{X}_{n+1}) \]
The point with lowest function values is considered as the best point \(\boldsymbol{X}_{1}\) and the point with highest function value, is considered as the worst point \(\boldsymbol{X}_{n+1}\). In order to transform the vertices of the simplex to new optimal coordinates, the algorithm first need to calculate the centroid \( \boldsymbol{\bar{X}} \) of first \(n\) points.
\[ \boldsymbol{\bar{X}} = \frac{1}{n} \sum_{i=1}^{n} \boldsymbol{X}_i \]
New coordinate for simplex is calculated as below:\[ \boldsymbol{X}(\mu) = (1+\mu) \boldsymbol{\bar{X}} - \mu \boldsymbol{X}_{n+1} \]
The worst point will be transformed to the new coordinate. The choice of \(\mu\) determines transformation types.
If all the above transformation fails to reach the optimal solution, then all the points move towards the best point, which is called shrinkage. All these operations for a 2D objective function are illustrated in the figure below.
The function value at the best point of updated simplex will be computed and compared with the previous best point. If the difference between the point is less than a predefined threshold \(\varepsilon\), the algorithm terminates. Otherwise, new iteration starts with the updated simplex vertices.
Will be added soon.