(2015-01-30) Some historical cases have beautiful ad hoc solutions.
Without the general methods presented in the rest of this page,every optimization problem appears to call for a brilliant insight. Here are a few classical such solutions, which ought to be remembered not only forthe sheer beauty of the arguments leading to them but also for thesimplicity in which the final results can be described. Arguably, ad hoc methods are best suited for problems whosesolutions are simple (likeDido's problem).
Once guessed, such solutions may have beautiful justifications. In general, however, correctly guessing the solution of a complicated optimization problemis all but impossible and calculus is required for both the discovery and the justificationof the solutions. When calculus itself becomes difficult to apply, the latest trend is to use numerical methods instead of symbolic manipulations (this should only be a last resort, though).
The solution of our first optimization problem depends on what can be construed asEuclid's postulated "optimization", namely:
In a Euclidean plane (or Euclidean space) the shortest path between twopoints is a straight line.
Single-variable optimization: Laws of classical optics.
In the plane, we consider a straight line and two points A and B on the same side of the line. The basic question is to find which point M on the line will yield the shortest two-legged path from A to M and M to B.
The key observation is that any such path has the same length as a path from A to M and from M to the point B', symmetric of B with respect to the line. Conversely, any path from A to B' must cross the line ata point M.
If the path from A to B' is shortest, so is the matchingpath from A to M to B. Bypostulate, the shortest path from A to B' is a straight line and thesolution M is simply found as the intersection of two straight lines.
Heron thus showed, nearly 2000 years ago, that the main law ofmirror optics (the angle of reflection is equal to the angleof incidence) can be deduced from the appealing postulate that light travelsalong the shortest available route. A great intellectual feat for that time, or for any time!
Postulating instead (more generally) that light travels along the fastest route, the assumption that its speed is different on either sideof the line yields the basic law of refraction (Snell's law).
Two-variable optimization: Torricelli points.
The Torricelli point minimizes the sum of the distancesto three given points. (This is a two-dimensional specialization of Plateau's laws.)
(2007-09-23) 'Tis little more than finding where the function'sderivative vanishes.
For a point x in the interior of the domain of f and a small enough valueof , both points x and x are in the allowed domain andyield values of f which are on both sides of f (x) if we just assume that f' is nonzero. Therefore, there can be an extremum at point x only when f' (x) vanishes.
On the other hand, there's no such requirement for a point x at theborder of the allowed domain, because small displacementsof only one sign are allowed.
Away from the border, a saddlepoint x (let's use that general term to indicate that f' (x) vanishes) will bethe location of a minimum (resp. a maximum ) when f'' (x) is positive (resp. negative). If it's zero, furtheranalysis is needed to determine whether x is the location of an extremum or not. (It's indeed an extremum if the first nonzero derivativeis of even order.)
(2016-01-11) An historical example of a single-variable problem of the above type.
Regiomontanus (1436-1476) proposed to discuss how a line segment is viewed at an angle from a variable viewpoint. That angle is maximum (and equal to 180°) from a point inside the segment. It is minimum (and equal to 0°) for a viewpoint on the same line asthe segment but outside of it. The angle is arbitrarily small from viewpoints at large distances.
Regiomontanus considers a line perpendicular to the line of the segment and crossing it ata point outside the segment. He asks when the viewing angle is maximumfrom points on that line.
(E. M. of Wisconsin Rapids, WI.2000-11-21) Determine the points where the [objective] function z = 3x3+3y3-9xy is maximized or minimized. [ Check for second-order condition. ]
Anecessary (but not sufficient) condition for asmooth functionof two variables to be extremal ( "minimized" or "maximized" ) on an open domainis to be at a saddlepoint where both partial derivatives vanish. In this case that means:
9x2-9y = 0 and 9y2-9x = 0
So, extrema of z can only occur when (x,y) is either (0,0) or (1,1).
To see whether a local extremum actually occurs,we must examine the second-order behavior of thefunction about each such saddlepoint (in the rare case where the second-ordervariation vanishes, further analysis would be required).
Well, if the second-order partial derivatives are L, M and N, thenthe second-order variation at point (x+dx, y+dy) is the quantity
½ [ L(dx)2 + 2M(dxdy) + N (dy)2 ]
We recognize the bracket as a quadratic expression whose sign remains the same (regardless of the ratio dy/dx) if and only if its (reduced) discriminant(M2LN) is negative. If it's positive, the point in question isnot an extremum.
In our example, we have L = 18x, M = -9 and N = 18y. Therefore:
M2 LN = 81 (1 - 4xy)
At (0,0) this quantityis positive (+81). Thus, there's no extremum there. On the other hand, at the point (1,1) this quantity is negative (-243) so the point (1,1) doescorrespond to the only local extremum of z.
Is this a maximum or a minimum? Well, just look at the sign of L (which is always the same as the sign of N for an extremum). If it's positive, surrounding points yield higher values and,therefore, you've got a minimum. If it's negative you've got a maximum. Here, L = 18, so a minimum is achieved at (1,1).
To summarize: z has only one local extremum;it's a minimum of -3, reached at x=1 and y=1. Does this mean we have an absolute minimum? Certainly not! (When  x and y are negative,a large enough magnitude of either will make z fall below any preset threshold.)
(2007-09-22) Saddlepoints of a function of several independent variables.
We're seeking saddlepoints (stationary points) of ascalar objective function M of n variables x1 , ... , xn which we view as components of a vector x.
M ( x1 , ... , xn ) = M (x )
If those n variables are independent, then a saddlepoint is obtained only at a point wherethedifferential form dM vanishes. This is to say:
0 = dM = grad M . dx
As this relation must hold for any infinitesimal vectorial displacement dx, such absolute saddlepoints of M are thus characterized by:
grad M = 0
That vectorial relation is shorthand for n separate scalar equations:
M
= 0
xi
(2007-09-22) Optimizing entails one Lagrange multiplier per constraint.
Instead of a set of independent variables (as discussedabove) we may have to deal with severalvariables tied to each other by several constraints. The method of choice was originally introduced byLagrange (before 1788) as an elegant way to deal with mechanical constraints:
For example, the variables may be subject to asingle constraint (e.g., the cartesian equation of a surface on which a point-mass is free to move).
E ( x1 , ... , xn ) = constant
While an unconstrained saddlepoint of M was obtained when dM = 0. A constrained saddlepoint is obtained when dM is proportional to dE.
More generally, when k functions E1 ... Ek are given to be constant,a constrained saddlepoint of M is achieved when thedifferential form dM is a linearcombination of the differentials of E1 ... Ek.
dM = 1 dE1 + 2 dE1 + ... + k dEk
In other words, there are k constants i (each is known as the Lagrange multiplier associated with the corresponding constraint) such that the followingfunction has an unconstrained (orabsolute) saddlepoint.
M (1 E1 + 2 E1 + ... + k Ek )
(2007-09-23) What apex minimizes the lateral area of a cone of given base and volume?
The volume of a cone is one third of the product of its base area by itsheight. So, by imposing the cone's volume, we're actually imposingthe height of the cone and looking for an optimalapex somewhere in a fixed plane parallel to the base...
(2007-09-23) (cf.Lagrangian mechanics) Euler-Lagrange equations hold whenever a path integral is stationary.
Let L be a smooth enough functionof 2n+1 variables:
L = L ( q 1, q 2 ... q n, v1, v2 ... vn, t )
We assume that the first n variables (q) are actually functions of thelast one (t) and also that the subsequent variables (v) are simply their respective derivatives with respect to t:
v i(t) =
d
q i(t)
dt
This makes L a function of t alone and we may considerthe following integral S (called the "action" of L ) for prescribed configurations at both extremities. In this context, a configuration is defined as a complete setof values for the q-variables only (irrespective of what the v-variables may be) so, what we are really considering now are prescribed fixed values of all the 2n quantities q i (a) and q i (b).
S =
b
L dt
a
The fundamental problem of the calculus of variations is tofind what local conditions make S stationary, for optimal functions q i . Namely:
Euler-Lagrange equations
L
=
d
L
q i
dt
vi
From a set of optimal functions q i and arbitrary (sufficiently smooth) functions h i which vanish at both extremities a and b, let's define the following family offunctions, depending on one parameter :
Q i(t) = q i(t) + h i(t)
V i(t) = v i(t) +
d
h i(t)
dt
Those yield a value S() of the action which must bestationary at = 0 (since the functions q i are optimal). Thus, the derivative of S() along does vanishat = 0, which is to say:
0 =
b
i
h i
L
+
d h i
L
dt
a
q i
dt
vi
Since every h i vanishes at both extremities a and b , we may integrate by parts the second term of the square bracket to obtain:
0 =
b
i
h i
L
d
L
dt
a
q i
dt
vi
As h i is arbitrary, the square bracket must vanisheverywhere, for every i. (If this wasn't so, the whole equality would be violated for a choiceof h i vanishing wherever the square bracketdoesn't have a prescribed sign).
Theoretical and Practical Examples :
The above is most commonly used as the basis forvariational mechanics andrelated aspects of theoretical physics based on a principle of least action. However, it's also the correct answer to more practical concerns:
As a pilot, I've always been interested in writing a proper piece offlight-planning software to optimize the plane's path with regard to time, fuel,or any combination thereof. [...]
I've always felt the professionally-provided solutions were crude approximations that donot take into account the full range of possibilities, especially over long distance,as one crosses varying jet streams at different altitudes, etc. [...]What is the correct mathematical approach?
Thanks a lot. Bill Swearer
Well, just express carefully the local cost function (L) as a function of the position of the aicraft and of the related derivatives (to a good approximation, the latter are only useful to compute horizontal speed). Predicted meteorological conditions (changing with timethroughout space) can be used for best planning.
The Euler-Lagrange equations then tell the pilot what to do at all times.
A slight modification of theabove proof yields a straightderivation of one of thegreatest statements of mathematical physics. Let's just do it...
Suppose that, for specific functions hi , a symmetry exists which leaves L  unchanged, (to first-order in about 0) when q i is replaced by
Q i = q i + h i
Formally, this leads to a situation similar to the previous one,since we still know that the derivative of S() vanishes at = 0 (albeit for very different reasons). However, the h functions need not vanish at the extremities a and b. So, an extra "integrated term" appearsin the following expression of the derivative of S() which results from our integration by parts:
0 =
i
h i
L
b
+
b
i
h i
L
d
L
dt
vi
a
a
q i
dt
vi
Now, as the integral still vanishes (becausethe previously established Euler-Lagrange equations make everysquare bracket vanish) the extra term must be zero as well. This means that the following quantity doesn't change:
Conserved Noether Charge
i
h i
L
vi
(2012-04-02) Paths of least length.
On a 2D surface M (u,v) the infinitesimaldistance corresponding to a displacement du,dv is given bythefirst fundamental quadratic form:
1(du,dv) = (dM)2 = E (du)2 + 2 F du dv + G (dv)2
The problem of minimizing the length of the path between two points on that surface isa standard exemple of thecalculus of variations. Let's introduce notations compatible with the above:
q1 = u q2 = v v1 = du/dt v2 = dv/dt L = [1(v1, v2)] = [ E v12 + 2 F v1v2 + G v22]
The first of those (i = 1, qi = u) translates into:
E'u v12 + 2 F'u v1v2 + G'u v22
=
d
E v1 + F v2
2 L
dt
L
Since dX/dt = v1 X/u +v2 X/v the right-hand side is equal to:
(2010-07-06) The curve of fastest descent in a uniform gravitational field.
We are now presenting the brachistochrone problem as a simple applicationof thecalculus of variation. Historically, the relation is reversed,as this problem actually helped define the need for the latter, which was formalizedmany years after the brachistochrone problem was solved.
Along what curve would a point-mass fall from one prescribed point to another lowerone (not directly underneath) in the least possible time?
Johann Bernoulli's own solution was published in the same journala year later, along with 4 of the 5 solutions sent by famous contributors (the solution of Guillaumede l'Hôpital was only published in 1988).
We shall use a coordinate system where the origin is the starting point. We choose to orient the y-axis downward so that y is positive forall target points. Let's call u the slope of the trajectory dy/dx. In a gravitational field g the conservation of mechanical energy for a mass m dropped from the origin at zero speed tells us that:
m g y = ½ m [ (dx/dt)2 + (dy/dt)2] Therefore, ( 2 g y ) ½ dt = ( 1 + u2) ½ dx
To minimize the descent time, we seek to minimize the path integral of dt or, equivalently,to minimize the integral of (2g)½ dt = L dx
That last equality holds only when dy/dx doesn't vanish, which need notalways be true. We put v = u2 and separate thevariables to integrate:
d {Log y} = dy/y = - dv / (1+v) = - d { Log (1+v) }
Therefore, y / 2a = 1/(1+v) which is to say v = 2a / y-1 for some a.
( dy/dx ) 2 = 2a / y - 1
This characterizes acycloidcorresponding to a wheel of radius a.
(2016-01-11) Idealized legendary real-estate deal upon which Carthage was founded.
They purchased a site, which was named 'Bull's Hide' after the bargain by which they should get as much land as they could enclose with a bull's hide. Aeneid by Virgil (70-19 BC)
The land so acquired by queen Dido,founder Carthage, is interpreted to be whatever area could be bounded by the seashore on one side and arope of fixed length on the other. (The preliminary step was to cut the hide into very thin strips to make a long rope out of it.) The problem is to maximize that area, assuming the shore line to be perfectly straight.
To prove that the solution is a half-circle, one considers the two straight lines drawnfrom an arbitrary point on the optimal boundary to either extremity of the rope(on the shoreline). If we assume that both parts of the rope are rigidly attachedto those two lines, there's a constant area between those lines and their respective ropes. What remains is the area of the triangle formedby two sides of fixed length and part of the shore line. This area is greatest when the two sides are perpendicular!
By the second theorem ofThales,a planar curve where an arbitrary point always sees the extremities ata right-angle is a semicircle.
The isometric inequality is a corollary of Dido's solution :
In 1846, the great Swiss geometerJakob Steiner(1796-1863) showed that the above solution of Dido's problem implies the familiar 2-dimensional isoperimetric inequality (i.e., the planar curve of given perimeterwhich encloses the greatest area is a circle). Three steps:
The optimal shape is convex (or else itsconvex hullwould have lesser perimeter and greater area).
If two points cut the perimeter of an optimal convex shape in half,the straight line joining them necessarily cuts the shape into two portions of equal area. (Otherwise a better shape could be formed using two parts congruent to thelarger portion.)
As solutions of Dido's problem, both halvesmust be semicircles. Therefore, the whole shape is a circle.
No such simple geometric solutions are available to justify the higher-dimensionalcounterparts of the isoperimetric inequality, discussednext.
(2008-11-10) Among planar loops of unit length, the circle encloses the largest area.
The surface area S enclosed by a closed planar curveof given perimeter P is largest in the case of a circle. This ancient statement (proved geometrically by Steiner, in 1846) can be expressed by the following relation, known as the isoperimetric inequality:
S ≤ P 2/ 4
Generalizations to Higher Dimensions :
The three-dimensional equivalent of the isoperimetric inequality says that theclosed surface of area S which encloses the largest volume V is a sphere.
V 2 ≤ S 3/ 36
The abovecan be generalized to n dimensions: No hypersurface of hyperarea S encloses a larger hypervolume V than the hypersphere. This makes the relations given in the oldest Numericana article yield:
V n-1 ≤ ( 1 +n/2) [ S/ n ] n
(2008-11-10) The mean curvature of a soap film is constant.
That constant vanishes when both sides of the soap film are at the same pressure,in which case the film is a surface of least area within the given border. However, the non-vanishing case was also considered by Joseph Plateau;it arises when soap film encloses pockets of air...
Besides the familiar spherical soap bubble, the simplest case is a soap bubblemerged within a large planar film. It takes on the form of two sphericalcaps which form an outer angle of 120° (inner angle of 60°) with the planar part of the filmwhwere they meet. The thickness 2h and the equatorial diameter d of thelenticular bubble are proportional to the radius of curvature of the spherical parts:
d
=
R
2h
=
(2-3) R
= 0.267949...R
The Plateau rules (1873) state that the solutions are smooth surfaces of constantmean curvature withonly two possible types of inner singularities:
Lines where 3 such surfaces meet at equal angles (120°).
Isolated points where 4 such lines meet at equal angles .
Jean Taylor (1944-) published a proof of Plateau's laws in 1976.
Euler's catenoid (1741) :
By finding that the catenoid is the only minimalsurface of revolution (besides planes orthogonal to the axis) Euler solved Plateau's problem (well before it was so named) when the given border consists ofa pair of two sufficiently close coaxial circles.
When the parallel planes of the coaxial circles are too far apart, no catenoid can join them and the solution of Plateau's problemjust consists of separate planar surfaces.
(2008-11-18) Minimal surfaces without edges or self-intersections in ordinary space.
Such surfaces are technically called complete embedded minimal surfaces. Before 1982, only four examples were known:
A surface, found byRiemann in 1860, consisting ofinfinitely many horizontal planes with slanted tunnels between adjacent pairs.
In 1982, CelsoJ. Costa (a graduate student from Brazil) stumbled upon aminimal surface topologically equivalent to a 3-hole torus. Costa suspected that his surface had no self-intersections but,at first, couldn't prove it...
David Hoffman teamed up with WilliamH. Meeks III .and Jim Hoffman, (then a graduate student) to produce a computer visualization of thestunning symmetries of the strange surface described by Costa (which contains two straight lines meeting at a right angle). Dividing Costa's surface into 8 congruent pieces,they proved, indeed, that it never intersects itself!
Loosely speaking, Costa's surface features two complementarypairs of "tunnels" through theequatorial plane which connect the inside of some "catenoidal" northern halfand the outside of its southern half, or vice-versa.
Subsequently, Dave Hoffman and Bill Meeks discovered that Costa's surface is just the simplestmember of a whole family of complete minimalembedded surfaces constructed in the same way but with more "tunnels"...
Applied to helicoids, the idea yields yet another family of complete minimalembedded surfaces where tunnels provide shortcuts betweenslices of space which are otherwise connected only by circling around the helicoid's axis.
Denis Viala (2008-02-28) Given n blue dots and n red dots in the plane (no three aligned) there are always n disjoint segments with blue and red extremities.
Proof. (: This puzzle appears among optimization problems.)
(2008-11-09) Connecting three dots with lines of least total length.
The three vertices of a triangle ABC can be connected using just its two shortest sides. If the angle between those two sides is 120° or larger, this is the most economical way to do so.
Otherwise, we may define a fourth point T such thatthe three angles ATB, BTC and CTA are all equal to 120°. Then, the three llines TA, TB and TC form the most economicalway to connect the three dots A, B and C.
The point T is calledthe Torricelli point of ABC. The results for the first and the second of the above cases are unifiedby defining T to be the vertex opposite the longest sidein the first case.
Likewise, four points in a plane are best connected by introducing 0, 1 or 2 new Torricelli points.
(2008-11-09) In 1999, Thomas Hales finally proved an "obvious" fact.
Nobody questionedthe celebrated Honeycomb conjecture before 1993, when Phelan and Weaire showed its3D counterpart (Kelvin's conjecture) to be false! The issue was settled in1999 with a proper proof by Thomas C. Hales : Thus, the 2D Honeycomb conjecture is now a theorem! Here it is:
In any partition of the plane into tiles of unit area,each tile cannot have a lesser perimeter than the regular hexagonof unit area. (Loosely speaking, the regular honeycomb tilingis the most economical one.)
The first proof by Thomas C. Hales (1999) was surprisingly intricate. It was crucial to get rid of theconvexity restriction which earlier authors had reluctantly imposed. (Theisoperimetric inequalityimplies that the boundaries betweenoptimal shapes are either circular arcs or straight lines. Both sides of such a boundary areconvex only in the latter case.) The reason curved boundaries are ruled outis not at all obvious; it is false in the 3D case.
(2015-01-10) Surfaces whose borders have mirror-symmetry and an odd ternary axis.
With a vertical axis, such a curve has a vertical plane of symmetry, the vertical coordinate is unchanged by rotation of 2/3 (1/3 of a turn) and its sign changes for half of that (/3 rotation).
A surface containing that line and having the same symmetry will be described byan equation of the following type, in cylindrical coordinates:
z = f ( r , sin 3 ) [f being odd for either argument ]
Let's find the functions f which make the mean curvature constant:
(2008-11-09) Kelvin's problem & counterexamples to Kelvin's conjecture.
Kelvin's problem is to partition space into unit-volume cells with leastsurface area. Simon Lhuilier(1750-1840) called this one of the most difficult problems in geometry.
Lord Kelvin (1824-1907) observed that a partition of space into Archimedeantruncated octahedra was quite economical. He used the vertices as the basis for a proper foam,with curved faces and edges engineered tosatisfy Plateau's rules. The tetragonal faces of Kelvin's cell are planar figures whosecurved edges meet at the magic angle of 109.47122°. The hexagonal walls are monkey saddles with straight diagonals. In 1887, Kelvin conjectured that his foam was the answer to the above optimization problem.
In the straight polyhedron, the angle between a square and an adjoining hexagon is slightly less than 60°. An easy way to obtain the exact value is to consider the cube formed by the sixsquare faces. Seen from the center, the middles of the hexagons are in the directionsof the corners of that cube. Therefore:
= acos ( 1/3 ) = 54.73561...°
By Plateau's rules, an optimally curved hexagonal surface alwaysmeets the plane of an adjoining tetragonal face at an angle of 60°. Therefore, the curved surface at the middle of an edge is inclined 5.26439° with respect to the plane of the original flat hexagon.
On the other hand, an edge between two hexagonal faces curves inward. (It's the outward-curving edge of a tetragon in an adjacent cell.)
Although the equal-volume foams discussed below are more economical than this one, Kelvin's foam remains best among all monohedral foams (i.e., foams whose cells are all congruent to each other).
In 1993, Denis Weaire (1942-) and his student Robert Phelan disproved Kelvin's conjecture with a structuremore economical than Kelvin's foam: The A15 Weaire-Phelanstructure is one example of a Frank-Kasper foam. It consists of two distinct types of cells, withpentagonal or hexagonal faces. One cell is a squashed dodecahedron, the other is a tetradecahedron. That basic structure occurs naturally in crystals of tungsten.
In a draft (2008-10-25) of a paper published on2009-06-02, Ruggero Gabbrielli presentsa new type of unit-cell foam (featuring some tetragonal faces) whose cost (5.303) is intermediate betweenthe Weaire-Phelan foam (5.288) and Kelvin's foam (5.306). He calls this structure "P42a". The repeating pattern consists of 14 curvedcells of 4 different types (ten tetradecahedra and four tridecahedra). It has an average of 96/7 = 13.714285... faces per cell.
The above pictures for the A15 and P42a foams wereobtained by Ruggero Gabbrielli using the 3dt visualization software. Gabbrielli has also posted several related interactive3D visualizations under Java. The 1993 optimization on A15 and Gabbrielli's recent research both used Ken Brakke'sSurface Evolver(with specific code provided by John M. Sullivan, in the latter case at least).
At this writing, the A15 foam described by Weaire and Phelan in1993 is still conjectured to be the answer to Kelvin's problem. It is about 0.3% less costly than Kelvin's original foam.
The Weaire-Phelan foam was used as the basis for the design of the celebratedWater-cube of the 2008 Olympics (Beijing National Aquatics Center) which is the largest structurecovered withETFE film.
(2018-06-17) What spherical distributions of N charges minimize potential energy?
J.J. Thomson is credited with thediscovery of the electron in 1897 (to be fair, Jean Perrin had discovered it two years earlier).
This was the first subatomic particle ever discovered. For quite a while, nobody knew exactly how neutral atoms could be made from those negativeelectrons and some unknown positive stuff. Thomson himself envisioned a naive model which became known as the Plum-pudding model (1904) whereby mutually-repulsive electrons would sit at the surface of a spherical blob of positive stuff.
Back in 1904, J.J. Thomson wondered how N like charges would be distributedon the surface of a sphere at equilibrium (minimizing potential energy, at least locally). He set it as a purely mathematical puzzle, mostly for the Coulomb potential (1/r) but possibly for other potentials as well.
Until 2010, the only known analytical solutions were the obvious configurationsfor 1, 2, 3, 4, 6 and 12 charges (forming a Platonic solid in the last three cases). It was proved only recently (2010) that the triangular dipyramid is the only solution for 5 charges.
Surprisingly enough, the Platonic solids for N = 8 or 20 are not solutions (better distributions have been found numerically). For example, for N=8, the cube is improved upon by asquare antiprism obtained from that cube by rotatingtwo opposing faces 45° from each other (about their common axis). That shape happens to be optimal for the Coulomb potential.