Line search

From Wikipedia, the free encyclopedia
Optimization algorithm
Not to be confused with linear search.

In optimization, line search is a basic iterative approach to find a local minimum $\mathbf{x}^{*}$ of an objective function $f:\mathbb{R}^{n}\to\mathbb{R}$. It first finds a descent direction along which the objective function $f$ will be reduced, and then computes a step size that determines how far $\mathbf{x}$ should move along that direction. The descent direction can be computed by various methods, such as gradient descent or a quasi-Newton method. The step size can be determined either exactly or inexactly.

One-dimensional line search

Suppose f is a one-dimensional function, $f:\mathbb{R}\to\mathbb{R}$, and assume that it is unimodal, that is, contains exactly one local minimum x* in a given interval [a, z]. This means that f is strictly decreasing in [a, x*] and strictly increasing in [x*, z]. There are several ways to find an (approximate) minimum point in this case.[1]: sec.5

Zero-order methods

Zero-order methods use only function evaluations (i.e., a value oracle), not derivatives:[1]: sec.5

  • Ternary search: pick two points b, c such that a < b < c < z. If f(b) ≤ f(c), then x* must be in [a, c]; if f(b) ≥ f(c), then x* must be in [b, z]. In both cases, we can replace the search interval with a smaller one. If we pick b, c very close to the interval center, then the interval shrinks to ~1/2 of its length at each iteration, but we need two function evaluations per iteration. Therefore, counted per function evaluation, the method has linear convergence with rate $\sqrt{0.5}\approx 0.71$. If we pick b, c such that the partition a, b, c, z has three equal-length intervals, then the interval shrinks to 2/3 of its length at each iteration, so the method has linear convergence with rate $\sqrt{2/3}\approx 0.82$.
  • Fibonacci search: This is a variant of ternary search in which the points b, c are selected based on the Fibonacci sequence. At each iteration, only one function evaluation is needed, since the other point was already an endpoint of a previous interval. Therefore, the method has linear convergence with rate $1/\varphi\approx 0.618$.
  • Golden-section search: This is a variant in which the points b, c are selected based on the golden ratio. Again, only one function evaluation is needed in each iteration, and the method has linear convergence with rate $1/\varphi\approx 0.618$. This ratio is optimal among the zero-order methods; a minimal code sketch of this method is given below.

Zero-order methods are very general: they do not assume differentiability or even continuity.
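
For illustration, here is a minimal Python sketch of golden-section search, assuming a unimodal function on [a, z]; the function name golden_section_search and the tolerance parameter tol are illustrative choices, not part of the source.

```python
import math

def golden_section_search(f, a, z, tol=1e-8):
    """Minimize a unimodal function f on [a, z] using only function values."""
    invphi = (math.sqrt(5) - 1) / 2        # 1/phi ~ 0.618: the per-iteration shrink factor
    b = z - invphi * (z - a)               # interior points placed by the golden ratio
    c = a + invphi * (z - a)
    fb, fc = f(b), f(c)
    while z - a > tol:
        if fb <= fc:                       # minimum lies in [a, c]
            z, c, fc = c, b, fb            # reuse b as the new right interior point
            b = z - invphi * (z - a)
            fb = f(b)                      # only one new evaluation per iteration
        else:                              # minimum lies in [b, z]
            a, b, fb = b, c, fc            # reuse c as the new left interior point
            c = a + invphi * (z - a)
            fc = f(c)
    return (a + z) / 2

# Example: the minimum of (x - 2)^2 on [0, 5] is at x = 2.
print(golden_section_search(lambda x: (x - 2) ** 2, 0.0, 5.0))
```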

First-order methods

First-order methods assume that f is continuously differentiable, and that we can evaluate not only f but also its derivative.[1]: sec.5

  • The bisection method computes the derivative of f at the center of the interval, c: if f'(c) = 0, then this is the minimum point; if f'(c) > 0, then the minimum must be in [a, c]; if f'(c) < 0, then the minimum must be in [c, z]. This method has linear convergence with rate 0.5.
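
As a sketch, a bisection-based line search could look as follows, assuming the derivative is available as a callable df; the function name bisection_minimize is an illustrative choice.

```python
def bisection_minimize(df, a, z, tol=1e-8):
    """Minimize a unimodal differentiable function on [a, z], given only its
    derivative df. The bracketing interval halves at each iteration."""
    while z - a > tol:
        c = (a + z) / 2
        slope = df(c)
        if slope == 0:      # stationary point found: this is the minimum
            return c
        elif slope > 0:     # f is increasing at c, so the minimum is in [a, c]
            z = c
        else:               # f is decreasing at c, so the minimum is in [c, z]
            a = c
    return (a + z) / 2

# Example: f(x) = (x - 2)^2 has derivative 2*(x - 2); the minimum is at x = 2.
print(bisection_minimize(lambda x: 2 * (x - 2), 0.0, 5.0))
```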

Curve-fitting methods

Curve-fitting methods try to attain superlinear convergence by assuming that f has some analytic form, e.g. a polynomial of finite degree. At each iteration, there is a set of "working points" at which we know the value of f (and possibly also its derivative). Based on these points, we can compute a polynomial that fits the known values, and find its minimum analytically. The minimum point becomes a new working point, and we proceed to the next iteration:[1]: sec.5

  • Newton's method is a special case of a curve-fitting method, in which the curve is a degree-two polynomial, constructed using the first and second derivatives of f. If the method is started close enough to a non-degenerate local minimum (i.e., one with a positive second derivative), then it has quadratic convergence (a code sketch appears below).
  • Regula falsi is another method that fits the function to a degree-two polynomial, but it uses the first derivative at two points, rather than the first and second derivative at the same point. If the method is started close enough to a non-degenerate local minimum, then it has superlinear convergence of order $\varphi\approx 1.618$.
  • Cubic fit fits to a degree-three polynomial, using both the function values and derivatives at the last two points. If the method is started close enough to a non-degenerate local minimum, then it has quadratic convergence.

Curve-fitting methods have superlinear convergence when started close enough to the local minimum, but might diverge otherwise. Safeguarded curve-fitting methods simultaneously execute a linear-convergence method in parallel to the curve-fitting method. They check in each iteration whether the point found by the curve-fitting method is close enough to the interval maintained by the safeguard method; if it is not, then the safeguard method is used to compute the next iterate.[1]: 5.2.3.4
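
As an illustration of the Newton curve-fitting step mentioned above, here is a minimal, unsafeguarded Python sketch, assuming the first and second derivatives are available as callables df and d2f; a practical implementation would combine it with a safeguard as just described.

```python
def newton_1d_minimize(df, d2f, x0, tol=1e-10, max_iter=50):
    """Newton's method for one-dimensional minimization: at each step, fit a
    quadratic model using f'(x) and f''(x) and jump to the model's minimum.
    Converges quadratically near a non-degenerate local minimum, but may
    diverge if started far away (hence the safeguarded variants)."""
    x = x0
    for _ in range(max_iter):
        g, h = df(x), d2f(x)
        if h <= 0:
            break           # quadratic model has no minimum; a safeguard step would take over here
        step = g / h        # Newton step: minimizer of the local quadratic model
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: f(x) = x^4 - 3x has f'(x) = 4x^3 - 3 and f''(x) = 12x^2;
# the minimum is at x = (3/4)^(1/3) ~ 0.9086.
print(newton_1d_minimize(lambda x: 4 * x**3 - 3, lambda x: 12 * x**2, 1.0))
```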

Multi-dimensional line search

In general, we have a multi-dimensional objective function $f:\mathbb{R}^{n}\to\mathbb{R}$. The line-search method first finds a descent direction along which the objective function $f$ will be reduced, and then computes a step size that determines how far $\mathbf{x}$ should move along that direction. The descent direction can be computed by various methods, such as gradient descent or a quasi-Newton method. The step size can be determined either exactly or inexactly. Here is an example gradient method that uses a line search in step 2.3:

  1. Set iteration counter $k=0$ and make an initial guess $\mathbf{x}_{0}$ for the minimum. Pick a tolerance $\epsilon$.
  2. Loop:
    1. Compute a descent direction $\mathbf{p}_{k}$.
    2. Define a one-dimensional function $h(\alpha_{k})=f(\mathbf{x}_{k}+\alpha_{k}\mathbf{p}_{k})$, representing the function value along the descent direction as a function of the step size.
    3. Find an $\alpha_{k}$ that minimizes $h$ over $\alpha_{k}\in\mathbb{R}_{+}$.
    4. Update $\mathbf{x}_{k+1}=\mathbf{x}_{k}+\alpha_{k}\mathbf{p}_{k}$ and $k=k+1$.
  3. Until $\|\nabla f(\mathbf{x}_{k+1})\|<\epsilon$.

At the line search step (2.3), the algorithm may minimize h exactly, by solving $h'(\alpha_{k})=0$, or approximately, by using one of the one-dimensional line-search methods mentioned above. It can also be solved loosely, by asking only for a sufficient decrease in h that does not necessarily approximate the optimum. One example of the exact approach is the conjugate gradient method. The loose approach is called inexact line search and may be performed in a number of ways, such as a backtracking line search or using the Wolfe conditions.
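
To make the loop concrete, here is a minimal Python sketch of the gradient method above with an inexact (backtracking) line search in step 2.3; the parameters beta and sigma and the function names are illustrative choices, not prescribed by the source.

```python
import numpy as np

def gradient_descent_with_line_search(f, grad_f, x0, eps=1e-6, max_iter=1000,
                                      beta=0.5, sigma=1e-4):
    """Gradient method: steepest-descent direction plus a backtracking
    (Armijo sufficient-decrease) line search for the step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < eps:     # stopping rule: ||grad f(x)|| < eps
            break
        p = -g                          # step 2.1: descent direction (steepest descent)
        alpha = 1.0                     # step 2.3: backtracking line search on h(alpha) = f(x + alpha*p)
        while f(x + alpha * p) > f(x) + sigma * alpha * g.dot(p):
            alpha *= beta               # shrink alpha until sufficient decrease holds
        x = x + alpha * p               # step 2.4: update the iterate
    return x

# Example: minimize f(x, y) = (x - 1)^2 + 10*(y + 2)^2, whose minimum is at (1, -2).
f = lambda v: (v[0] - 1) ** 2 + 10 * (v[1] + 2) ** 2
grad_f = lambda v: np.array([2 * (v[0] - 1), 20 * (v[1] + 2)])
print(gradient_descent_with_line_search(f, grad_f, [0.0, 0.0]))
```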

Overcoming local minima

Like other optimization methods, line search may be combined with simulated annealing to allow it to jump over some local minima.

References

  1. ^ a b c d e Nemirovsky and Ben-Tal (2023). "Optimization III: Convex Optimization" (PDF).

Further reading

  • Dennis, J. E. Jr.; Schnabel, Robert B. (1983). "Globally Convergent Modifications of Newton's Method". Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs: Prentice-Hall. pp. 111–154. ISBN 0-13-627216-9.
  • Nocedal, Jorge; Wright, Stephen J. (1999). "Line Search Methods". Numerical Optimization. New York: Springer. pp. 34–63. ISBN 0-387-98793-2.
  • Sun, Wenyu; Yuan, Ya-Xiang (2006). "Line Search". Optimization Theory and Methods: Nonlinear Programming. New York: Springer. pp. 71–117. ISBN 0-387-24975-3.