Alternating-direction implicit method

From Wikipedia, the free encyclopedia

In numerical linear algebra, the alternating-direction implicit (ADI) method is an iterative method used to solve Sylvester matrix equations. It is a popular method for solving the large matrix equations that arise in systems theory and control,[1] and can be formulated to construct solutions in a memory-efficient, factored form.[2][3] It is also used to numerically solve parabolic and elliptic partial differential equations, and is a classic method for modeling heat conduction and solving the diffusion equation in two or more dimensions.[4] It is an example of an operator-splitting method.[5]

The method was developed at Humble Oil in the mid-1950s by Jim Douglas Jr., Henry Rachford, and Don Peaceman.[6]

ADI for matrix equations

The method

The ADI method is a two-step iteration process that alternately updates the column and row spaces of an approximate solution to AX − XB = C. One ADI iteration consists of the following steps:[7]

1. Solve for X^{(j+1/2)}, where (A − β_{j+1}I) X^{(j+1/2)} = X^{(j)} (B − β_{j+1}I) + C.

2. Solve for X^{(j+1)}, where X^{(j+1)} (B − α_{j+1}I) = (A − α_{j+1}I) X^{(j+1/2)} − C.

The numbers (α_{j+1}, β_{j+1}) are called shift parameters, and convergence depends strongly on their choice.[8][9] To perform K iterations of ADI, an initial guess X^{(0)} is required, as well as K shift parameters {(α_j, β_j)}_{j=1}^{K}.
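As a concrete sketch, the two steps above can be implemented with dense solves in NumPy (a hypothetical minimal implementation; the function name `adi` and the dense `solve` calls are illustrative only, since in practice A and B are large and structured):

```python
import numpy as np

def adi(A, B, C, alphas, betas, X0=None):
    """Sketch of the two-step ADI iteration for AX - XB = C.

    alphas, betas: the K shift parameters (alpha_j, beta_j).
    Dense solves are used for clarity; real implementations exploit
    the sparsity or structure of A and B in each solve.
    """
    m, n = C.shape
    X = np.zeros((m, n)) if X0 is None else np.array(X0, dtype=float)
    Im, In = np.eye(m), np.eye(n)
    for alpha, beta in zip(alphas, betas):
        # Step 1: (A - beta I) X_half = X (B - beta I) + C
        X = np.linalg.solve(A - beta * Im, X @ (B - beta * In) + C)
        # Step 2: X_new (B - alpha I) = (A - alpha I) X_half - C
        # (transpose to turn the right-sided solve into a standard one)
        X = np.linalg.solve((B - alpha * In).T, ((A - alpha * Im) @ X - C).T).T
    return X
```

For diagonal test matrices with disjoint spectra, choosing the α_j equal to the eigenvalues of A makes the left error product vanish, so the iteration recovers X exactly after m steps.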

When to use ADI

If A ∈ ℂ^{m×m} and B ∈ ℂ^{n×n}, then AX − XB = C can be solved directly in O(m³ + n³) operations using the Bartels–Stewart method.[10] It is therefore only beneficial to use ADI when matrix-vector multiplication and linear solves involving A and B can be applied cheaply.

The equation AX − XB = C has a unique solution if and only if σ(A) ∩ σ(B) = ∅, where σ(M) is the spectrum of M.[1] The ADI method performs especially well when σ(A) and σ(B) are well separated and A and B are normal matrices. These assumptions are met, for example, by the Lyapunov equation AX + XA* = C when A is positive definite. Under these assumptions, near-optimal shift parameters are known for several choices of A and B.[8][9] Additionally, a priori error bounds can be computed, eliminating the need to monitor the residual error during the iteration.

The ADI method can still be applied when the above assumptions are not met. The use of suboptimal shift parameters may adversely affect convergence,[1] and convergence is also affected by the non-normality of A or B (sometimes advantageously).[11] Krylov subspace methods, such as the rational Krylov subspace method,[12] typically converge more rapidly than ADI in this setting,[1][3] which has led to the development of hybrid ADI-projection methods.[3]

Shift-parameter selection and the ADI error equation

The problem of finding good shift parameters is nontrivial. This problem can be understood by examining the ADI error equation. After K iterations, the error is given by

X − X^{(K)} = ∏_{j=1}^{K} (A − α_j I)(A − β_j I)^{−1} · (X − X^{(0)}) · ∏_{j=1}^{K} (B − β_j I)(B − α_j I)^{−1}.

Choosing X^{(0)} = 0 results in the following bound on the relative error:

‖X − X^{(K)}‖₂ / ‖X‖₂ ≤ ‖r_K(A)‖₂ ‖r_K(B)^{−1}‖₂,   r_K(M) = ∏_{j=1}^{K} (M − α_j I)(M − β_j I)^{−1},

where ‖·‖₂ is the operator norm. The ideal set of shift parameters {(α_j, β_j)}_{j=1}^{K} defines a rational function r_K that minimizes the quantity ‖r_K(A)‖₂ ‖r_K(B)^{−1}‖₂. If A and B are normal matrices with eigendecompositions A = V_A Λ_A V_A* and B = V_B Λ_B V_B*, then

‖r_K(A)‖₂ ‖r_K(B)^{−1}‖₂ = ‖r_K(Λ_A)‖₂ ‖r_K(Λ_B)^{−1}‖₂.
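The bound can be checked numerically for small diagonal (hence normal) matrices. The following is a hypothetical demo in which the spectra and the two shifts (1.5 and 3.0, with their negatives) are chosen by hand:

```python
import numpy as np

# Diagonal (normal) test matrices with well-separated spectra.
lamA = np.array([1.0, 2.0, 4.0])      # sigma(A), inside [1, 4]
lamB = -lamA                          # sigma(B), inside [-4, -1]
A, B = np.diag(lamA), np.diag(lamB)
Xtrue = np.arange(1.0, 10.0).reshape(3, 3)
C = A @ Xtrue - Xtrue @ B

# Two ADI iterations from X0 = 0, shifts placed near the two spectra.
alphas, betas = np.array([1.5, 3.0]), np.array([-1.5, -3.0])
X, I = np.zeros((3, 3)), np.eye(3)
for alpha, beta in zip(alphas, betas):
    X = np.linalg.solve(A - beta * I, X @ (B - beta * I) + C)
    X = np.linalg.solve((B - alpha * I).T, ((A - alpha * I) @ X - C).T).T

# For normal matrices the bound reduces to the eigenvalues:
# ||r_K(A)|| * ||r_K(B)^{-1}|| = max |r_K(lamA)| * max |1 / r_K(lamB)|
r = lambda z: np.prod([(z - a) / (z - b) for a, b in zip(alphas, betas)], axis=0)
bound = np.max(np.abs(r(lamA))) * np.max(np.abs(1.0 / r(lamB)))
rel_err = np.linalg.norm(X - Xtrue, 2) / np.linalg.norm(Xtrue, 2)
```

Here the bound evaluates to 0.01, and the measured relative error of the two-step iteration stays below it, as the error equation predicts.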

Near-optimal shift parameters

Near-optimal shift parameters are known in certain cases, such as when Λ_A ⊂ [a, b] and Λ_B ⊂ [c, d], where [a, b] and [c, d] are disjoint intervals on the real line.[8][9] The Lyapunov equation AX + XA* = C, for example, satisfies these assumptions when A is positive definite. In this case, the shift parameters can be expressed in closed form using elliptic integrals, and can easily be computed numerically.

More generally, if closed, disjoint sets E and F are known with Λ_A ⊂ E and Λ_B ⊂ F, the optimal shift-parameter selection problem is approximately solved by finding an extremal rational function that attains the value

Z_K(E, F) := inf_r [ sup_{z∈E} |r(z)| / inf_{z∈F} |r(z)| ],

where the infimum is taken over all rational functions of degree (K, K).[9] This approximation problem is related to several results in potential theory,[13][14] and was solved by Zolotarev in 1877 for E = [a, b] and F = −E.[15] The solution is also known when E and F are disjoint disks in the complex plane.[16]

Heuristic shift-parameter strategies

When less is known about σ(A) and σ(B), or when A or B are non-normal matrices, it may not be possible to find near-optimal shift parameters. In this setting, a variety of strategies for generating good shift parameters can be used. These include strategies based on asymptotic results in potential theory,[17] greedy approaches using the Ritz values of the matrices A, A^{−1}, B, and B^{−1},[18] and cyclic methods, in which the same small collection of shift parameters is reused until a convergence tolerance is met.[18][11] When the same shift parameter is used at every iteration, ADI is equivalent to an algorithm called Smith's method.[19]
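As one concrete illustration, the sketch below reuses logarithmically spaced shifts in the spectral interval of A for the Lyapunov equation AX + XA^T = C with A real symmetric positive definite (log-spacing is assumed here as a simple heuristic stand-in for the elliptic-integral shifts, and the dense solves are illustrative only):

```python
import numpy as np

def lyapunov_adi(A, C, K=16):
    """Heuristic sketch: solve AX + X A^T = C (A symmetric positive
    definite) by ADI on AX - XB = C with B = -A^T, using K logarithmically
    spaced shifts in the spectral interval of A.  Log-spacing is a simple
    heuristic, not the optimal elliptic-integral choice.
    """
    m = A.shape[0]
    evals = np.linalg.eigvalsh(A)                   # ascending eigenvalues
    shifts = np.geomspace(evals[0], evals[-1], K)   # alpha_j = p, beta_j = -p
    B, I = -A.T, np.eye(m)
    X = np.zeros_like(C)
    for p in shifts:
        alpha, beta = p, -p
        # alpha lies near sigma(A), beta near sigma(B) = -sigma(A)
        X = np.linalg.solve(A - beta * I, X @ (B - beta * I) + C)
        X = np.linalg.solve((B - alpha * I).T, ((A - alpha * I) @ X - C).T).T
    return X
```

Because the spectra of A and B = −A^T sit in mirrored intervals, each shift pair (p, −p) shrinks the error on both sides, and a modest number of log-spaced shifts already drives the residual near machine precision for well-conditioned A.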

Factored ADI

In many applications, A and B are very large, sparse matrices, and C can be factored as C = C₁C₂*, where C₁ ∈ ℂ^{m×r} and C₂ ∈ ℂ^{n×r}, with r small (often 1 or 2).[1] In such a setting, it may not be feasible to store the potentially dense matrix X explicitly. A variant of ADI, called factored ADI,[3][2] can be used to compute factors Z and Y such that X ≈ ZY*. The effectiveness of factored ADI depends on whether X is well approximated by a low-rank matrix, which is known to be true under various assumptions about A and B.[11][9]

ADI for parabolic equations

Historically, the ADI method was developed to solve the 2D diffusion equation on a square domain using finite differences.[4] Unlike ADI for matrix equations, ADI for parabolic equations does not require the selection of shift parameters, since the shift appearing in each iteration is determined by parameters such as the timestep, the diffusion coefficient, and the grid spacing. The connection to ADI for matrix equations can be seen by considering the action of the ADI iteration on the system at steady state.

Example: 2D diffusion equation

[Figure: stencil for the alternating-direction implicit method in finite-difference equations]

The traditional method for solving the heat conduction equation numerically is the Crank–Nicolson method. In multiple dimensions, this method leads to a complicated system of equations that is costly to solve. The advantage of the ADI method is that the equations to be solved at each step have a simpler structure and can be solved efficiently with the tridiagonal matrix algorithm.

Consider the linear diffusion equation in two dimensions,

∂u/∂t = ∂²u/∂x² + ∂²u/∂y² = u_xx + u_yy.

The implicit Crank–Nicolson method produces the following finite difference equation:

(u_{ij}^{n+1} − u_{ij}^{n}) / Δt = 1/(2(Δx)²) · (δ_x² + δ_y²)(u_{ij}^{n+1} + u_{ij}^{n}),

where:

Δx = Δy,

and δ_p² is the central second-difference operator for the p-th coordinate:

δ_p² u_{ij} = u_{ij+e_p} − 2u_{ij} + u_{ij−e_p},

with e_p = (1, 0) or (0, 1) for p = x or y respectively (and ij shorthand for the lattice point (i, j)).

After performing a stability analysis, it can be shown that this method is stable for any Δt.

A disadvantage of the Crank–Nicolson method is that the matrix in the above equation is banded with a bandwidth that is generally quite large. This makes direct solution of the system of linear equations quite costly (although efficient approximate solutions exist, for example the conjugate gradient method preconditioned with incomplete Cholesky factorization).

The idea behind the ADI method is to split the finite difference equations into two, one with the x-derivative taken implicitly and the other with the y-derivative taken implicitly:

(u_{ij}^{n+1/2} − u_{ij}^{n}) / (Δt/2) = (δ_x² u_{ij}^{n+1/2} + δ_y² u_{ij}^{n}) / Δx²
(u_{ij}^{n+1} − u_{ij}^{n+1/2}) / (Δt/2) = (δ_x² u_{ij}^{n+1/2} + δ_y² u_{ij}^{n+1}) / Δy²

The system of equations involved at each half step is symmetric and tridiagonal (banded with bandwidth 3), and is typically solved using the tridiagonal matrix algorithm.

It can be shown that this method is unconditionally stable and second order in time and space.[20] There are more refined ADI methods, such as the methods of Douglas[21] or the f-factor method,[22] which can be used for three or more dimensions.
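One time step of the two half steps above can be sketched as follows (a hypothetical minimal implementation assuming homogeneous Dirichlet boundary conditions on a uniform square grid; the array u holds interior values only, and SciPy's banded solver stands in for the tridiagonal matrix algorithm):

```python
import numpy as np
from scipy.linalg import solve_banded

def peaceman_rachford_step(u, dt, h):
    """One ADI step for u_t = u_xx + u_yy on a uniform square grid with
    zero Dirichlet boundaries; u contains the interior grid values."""
    n = u.shape[0]
    r = dt / (2 * h * h)

    # Tridiagonal matrix I - r*delta^2 in the banded storage of solve_banded.
    ab = np.zeros((3, n))
    ab[0, 1:] = -r           # superdiagonal
    ab[1, :] = 1 + 2 * r     # main diagonal
    ab[2, :-1] = -r          # subdiagonal

    def delta2(v, axis):
        """Central second difference with zero boundary values."""
        w = -2.0 * v
        if axis == 0:
            w[1:, :] += v[:-1, :]
            w[:-1, :] += v[1:, :]
        else:
            w[:, 1:] += v[:, :-1]
            w[:, :-1] += v[:, 1:]
        return w

    # Half step 1: implicit in x (axis 0), explicit in y.
    u_half = solve_banded((1, 1), ab, u + r * delta2(u, axis=1))
    # Half step 2: implicit in y, explicit in x (transpose so the solve acts on y).
    return solve_banded((1, 1), ab, (u_half + r * delta2(u_half, axis=0)).T).T
```

Because each half step couples grid points along one coordinate only, every line solve is tridiagonal and costs O(n), and timesteps well above the explicit stability limit h²/4 remain stable, consistent with the unconditional stability noted above.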

Generalizations

The usage of the ADI method as an operator-splitting scheme can be generalized. That is, we may consider general evolution equations

u̇ = F_1 u + F_2 u,

where F_1 and F_2 are (possibly nonlinear) operators defined on a Banach space.[23][24] In the diffusion example above, F_1 = ∂²/∂x² and F_2 = ∂²/∂y².
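For linear operators, one Peaceman–Rachford splitting step for this evolution equation can be sketched as follows (the matrices F1 and F2 are generic stand-ins for the two split operators; the step treats F1 implicitly in the first half step and F2 in the second):

```python
import numpy as np

def pr_splitting_step(F1, F2, u, dt):
    """One Peaceman-Rachford splitting step for du/dt = F1 u + F2 u
    with linear operators given as matrices."""
    n = len(u)
    I = np.eye(n)
    # Half step 1: F1 implicit, F2 explicit.
    u_half = np.linalg.solve(I - 0.5 * dt * F1, u + 0.5 * dt * (F2 @ u))
    # Half step 2: F2 implicit, F1 explicit.
    return np.linalg.solve(I - 0.5 * dt * F2, u_half + 0.5 * dt * (F1 @ u_half))
```

Expanding the step operator (I − ½ΔtF₂)⁻¹(I + ½ΔtF₁)(I − ½ΔtF₁)⁻¹(I + ½ΔtF₂) shows it agrees with the exact propagator exp(Δt(F₁ + F₂)) through second order in Δt, matching the accuracy stated for the diffusion scheme above.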

Fundamental ADI (FADI)

Simplification of ADI to FADI

The conventional ADI method can be simplified into the fundamental ADI (FADI) method, which retains similar operators on the left-hand sides while being operator-free on the right-hand sides. This may be regarded as the fundamental (basic) scheme of the ADI method,[25][26] with no operators left to be reduced on the right-hand sides, unlike most traditional implicit methods, which usually have operators on both sides of the equations. The FADI method leads to simpler, more concise, and more efficient update equations without degrading the accuracy of the conventional ADI method.

Relations to other implicit methods

Many classical implicit methods (those of Peaceman–Rachford, Douglas–Gunn, D'Yakonov, Beam–Warming, Crank–Nicolson, and others) may be simplified to fundamental implicit schemes with operator-free right-hand sides.[26] In their fundamental forms, the FADI method with second-order temporal accuracy is closely related to the fundamental locally one-dimensional (FLOD) method, which can be upgraded to second-order temporal accuracy, for example for the three-dimensional Maxwell's equations[27][28] in computational electromagnetics. For two- and three-dimensional heat conduction and diffusion equations, both the FADI and FLOD methods can be implemented in a simpler, more efficient, and more stable manner than their conventional counterparts.[29][30]


References
  1. Simoncini, V. (2016). "Computational Methods for Linear Matrix Equations". SIAM Review. 58 (3): 377–441. doi:10.1137/130912839.
  2. Li, Jing-Rebecca; White, Jacob (2002). "Low Rank Solution of Lyapunov Equations". SIAM Journal on Matrix Analysis and Applications. 24 (1): 260–280. doi:10.1137/s0895479801384937.
  3. Benner, Peter; Li, Ren-Cang; Truhar, Ninoslav (2009). "On the ADI method for Sylvester equations". Journal of Computational and Applied Mathematics. 233 (4): 1035–1045. doi:10.1016/j.cam.2009.08.108.
  4. Peaceman, D. W.; Rachford Jr., H. H. (1955). "The numerical solution of parabolic and elliptic differential equations". Journal of the Society for Industrial and Applied Mathematics. 3 (1): 28–41. doi:10.1137/0103003.
  5. Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P. (2007). "Section 20.3.3. Operator Splitting Methods Generally". Numerical Recipes: The Art of Scientific Computing (3rd ed.). New York: Cambridge University Press. ISBN 978-0-521-88068-8.
  6. "Prof. Jim Douglas". Purdue University. May 2, 2016. Retrieved March 25, 2025.
  7. Wachspress, Eugene L. (2008). "Trail to a Lyapunov equation solver". Computers & Mathematics with Applications. 55 (8): 1653–1659. doi:10.1016/j.camwa.2007.04.048.
  8. Lu, An; Wachspress, E. L. (1991). "Solution of Lyapunov equations by alternating direction implicit iteration". Computers & Mathematics with Applications. 21 (9): 43–58. doi:10.1016/0898-1221(91)90124-m.
  9. Beckermann, Bernhard; Townsend, Alex (2017). "On the Singular Values of Matrices with Displacement Structure". SIAM Journal on Matrix Analysis and Applications. 38 (4): 1227–1248. arXiv:1609.09494. doi:10.1137/16m1096426.
  10. Golub, G.; Van Loan, C. (1989). Matrix Computations (Fourth ed.). Baltimore: Johns Hopkins University. ISBN 1421407949.
  11. Sabino, J. (2007). Solution of Large-Scale Lyapunov Equations via the Block Modified Smith Method. PhD thesis, Rice University. hdl:1911/20641.
  12. Druskin, V.; Simoncini, V. (2011). "Adaptive rational Krylov subspaces for large-scale dynamical systems". Systems & Control Letters. 60 (8): 546–560. doi:10.1016/j.sysconle.2011.04.013.
  13. Saff, E. B.; Totik, V. (2013). Logarithmic Potentials with External Fields. Berlin: Springer Science & Business Media. ISBN 9783662033296.
  14. Gonchar, A. A. (1969). "Zolotarev problems connected with rational functions". Mathematics of the USSR-Sbornik. 7 (4): 623–635. doi:10.1070/SM1969v007n04ABEH001107.
  15. Zolotarev, D. I. (1877). "Application of elliptic functions to questions of functions deviating least and most from zero". Zap. Imp. Akad. Nauk. St. Petersburg. 30: 1–59.
  16. Starke, Gerhard (1992). "Near-circularity for the rational Zolotarev problem in the complex plane". Journal of Approximation Theory. 70 (1): 115–130. doi:10.1016/0021-9045(92)90059-w.
  17. Starke, Gerhard (1993). "Fejér–Walsh points for rational functions and their use in the ADI iterative method". Journal of Computational and Applied Mathematics. 46 (1–2): 129–141. doi:10.1016/0377-0427(93)90291-i.
  18. Penzl, Thilo (1999). "A Cyclic Low-Rank Smith Method for Large Sparse Lyapunov Equations". SIAM Journal on Scientific Computing. 21 (4): 1401–1418. doi:10.1137/s1064827598347666.
  19. Smith, R. A. (1968). "Matrix Equation XA + BX = C". SIAM Journal on Applied Mathematics. 16 (1): 198–201. doi:10.1137/0116017.
  20. Douglas, J. Jr. (1955). "On the numerical integration of u_xx + u_yy = u_t by implicit methods". Journal of the Society for Industrial and Applied Mathematics. 3: 42–65.
  21. Douglas, Jim Jr. (1962). "Alternating direction methods for three space variables". Numerische Mathematik. 4 (1): 41–63. doi:10.1007/BF01386295.
  22. Chang, M. J.; Chow, L. C.; Chang, W. S. (1991). "Improved alternating-direction implicit method for solving transient three-dimensional heat diffusion problems". Numerical Heat Transfer, Part B: Fundamentals. 19 (1): 69–84. doi:10.1080/10407799108944957.
  23. Hundsdorfer, Willem; Verwer, Jan (2003). Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations. Berlin, Heidelberg: Springer. ISBN 978-3-662-09017-6.
  24. Lions, P. L.; Mercier, B. (1979). "Splitting Algorithms for the Sum of Two Nonlinear Operators". SIAM Journal on Numerical Analysis. 16 (6): 964–979. doi:10.1137/0716071.
  25. Tan, E. L. (2007). "Efficient Algorithm for the Unconditionally Stable 3-D ADI-FDTD Method". IEEE Microwave and Wireless Components Letters. 17 (1): 7–9. doi:10.1109/LMWC.2006.887239.
  26. Tan, E. L. (2008). "Fundamental Schemes for Efficient Unconditionally Stable Implicit Finite-Difference Time-Domain Methods". IEEE Transactions on Antennas and Propagation. 56 (1): 170–177. doi:10.1109/TAP.2007.913089.
  27. Tan, E. L. (2007). "Unconditionally Stable LOD-FDTD Method for 3-D Maxwell's Equations". IEEE Microwave and Wireless Components Letters. 17 (2): 85–87. doi:10.1109/LMWC.2006.890166.
  28. Gan, T. H.; Tan, E. L. (2013). "Unconditionally Stable Fundamental LOD-FDTD Method with Second-Order Temporal Accuracy and Complying Divergence". IEEE Transactions on Antennas and Propagation. 61 (5): 2630–2638. doi:10.1109/TAP.2013.2242036.
  29. Tay, W. C.; Tan, E. L.; Heh, D. Y. (2014). "Fundamental Locally One-Dimensional Method for 3-D Thermal Simulation". IEICE Transactions on Electronics. E97-C (7): 636–644. doi:10.1587/transele.E97.C.636.
  30. Heh, D. Y.; Tan, E. L.; Tay, W. C. (2016). "Fast Alternating Direction Implicit Method for Efficient Transient Thermal Simulation of Integrated Circuits". International Journal of Numerical Modelling: Electronic Networks, Devices and Fields. 29 (1): 93–108. doi:10.1002/jnm.2049.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Alternating-direction_implicit_method&oldid=1285713311"