Galactic algorithm

From Wikipedia, the free encyclopedia

A galactic algorithm is an algorithm with record-breaking theoretical (asymptotic) performance, but which is not used due to practical constraints. Typical reasons are that the performance gains only appear for problems that are so large they never occur, or the algorithm's complexity outweighs a relatively small gain in real-world performance. Galactic algorithms were so named by Richard Lipton and Ken Regan,[1] because they will never be used on any data sets on Earth.

Possible use cases

Even if they are never used in practice, galactic algorithms may still contribute to computer science:

  • An algorithm, even if impractical, may show new techniques that eventually lead to practical algorithms. See, for example, communication channel capacity, below.
  • Available computational power may catch up to the crossover point, so that a previously impractical algorithm becomes practical. See, for example, low-density parity-check codes, below.
  • An impractical algorithm can still demonstrate that conjectured bounds can be achieved, or that proposed bounds are wrong, and hence advance the theory of algorithms (see, for example, Reingold's algorithm for connectivity in undirected graphs). As Lipton states:[1]

    This alone could be important and often is a great reason for finding such algorithms. For example, if tomorrow there were a discovery that showed there is a factoring algorithm with a huge but provably polynomial time bound, that would change our beliefs about factoring. The algorithm might never be used, but would certainly shape the future research into factoring.

    Similarly, a hypothetical algorithm for the Boolean satisfiability problem with a large but polynomial time bound, such as Θ(n^(2^100)), although unusable in practice, would settle the P versus NP problem, considered the most important open problem in computer science and one of the Millennium Prize Problems.[2][3]

Examples

Integer multiplication

An example of a galactic algorithm is the fastest known way to multiply two numbers,[4] which is based on a 1729-dimensional Fourier transform.[5] It needs O(n log n) bit operations, but as the constants hidden by the big O notation are large, it is never used in practice. However, it also shows why galactic algorithms may still be useful. The authors state: "we are hopeful that with further refinements, the algorithm might become practical for numbers with merely billions or trillions of digits."[5]

Primality testing

The AKS primality test is galactic. It is the most theoretically sound of any known algorithm that can take an arbitrary number and tell if it is prime. In particular, it is provably polynomial-time, deterministic, and unconditionally correct. All other known algorithms fall short on at least one of these criteria, but the shortcomings are minor and the calculations are much faster, so they are used instead. ECPP in practice runs much faster than AKS, but it has never been proven to be polynomial time. The Miller–Rabin test is also much faster than AKS, but produces only a probabilistic result. However, the probability of error can be driven down to arbitrarily small values (say < 10^−100), good enough for practical purposes. There is also a deterministic version of the Miller–Rabin test, which runs in polynomial time over all inputs, but its correctness depends on the generalized Riemann hypothesis (which is widely believed, but not proven). The existence of these (much) faster alternatives means AKS is not used in practice.
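As an illustration of why the probabilistic alternative is preferred in practice, here is a minimal Python sketch of the Miller–Rabin test (illustrative only; production libraries add fixed witness sets and other refinements):

```python
import random

def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test.

    Returns False if n is definitely composite, and True if n is
    probably prime; a composite slips through a single round with
    probability at most 1/4, so at most 4**(-rounds) overall.
    """
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)              # a**d mod n
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False              # a witnesses that n is composite
    return True
```

With `rounds=40` the error bound 4^−40 is far below the chance of a hardware fault, which is why such tests, rather than AKS, are used to generate cryptographic primes.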

Matrix multiplication

The first improvement over brute-force matrix multiplication (which takes O(n^3) operations) was the Strassen algorithm: a recursive algorithm that takes O(n^2.807) operations. This algorithm is not galactic and is used in practice. Further extensions of this, using sophisticated group theory, are the Coppersmith–Winograd algorithm and its slightly better successors, taking O(n^2.373) operations. These are galactic – "We nevertheless stress that such improvements are only of theoretical interest, since the huge constants involved in the complexity of fast matrix multiplication usually make these algorithms impractical."[6]
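The practical end of this spectrum, Strassen's algorithm, is short enough to sketch. The version below is a minimal Python illustration restricted to square matrices whose size is a power of two; real implementations fall back to the naive O(n^3) method below a size cutoff, where the recursion's overhead dominates:

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen(A, B):
    """Multiply n x n matrices (n a power of two) using Strassen's
    seven recursive sub-block products instead of the naive eight,
    giving O(n^2.807) arithmetic operations."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Split both matrices into quadrants.
    a11 = [r[:h] for r in A[:h]]; a12 = [r[h:] for r in A[:h]]
    a21 = [r[:h] for r in A[h:]]; a22 = [r[h:] for r in A[h:]]
    b11 = [r[:h] for r in B[:h]]; b12 = [r[h:] for r in B[:h]]
    b21 = [r[:h] for r in B[h:]]; b22 = [r[h:] for r in B[h:]]
    # The seven products.
    m1 = strassen(mat_add(a11, a22), mat_add(b11, b22))
    m2 = strassen(mat_add(a21, a22), b11)
    m3 = strassen(a11, mat_sub(b12, b22))
    m4 = strassen(a22, mat_sub(b21, b11))
    m5 = strassen(mat_add(a11, a12), b22)
    m6 = strassen(mat_sub(a21, a11), mat_add(b11, b12))
    m7 = strassen(mat_sub(a12, a22), mat_add(b21, b22))
    # Recombine into the four quadrants of the result.
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_sub(mat_add(m1, m3), m2), m6)
    return ([r1 + r2 for r1, r2 in zip(c11, c12)] +
            [r1 + r2 for r1, r2 in zip(c21, c22)])
```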

Communication channel capacity

Claude Shannon showed a simple but asymptotically optimal code that can reach the theoretical capacity of a communication channel. It requires assigning a random code word to every possible n-bit message, then decoding by finding the closest code word. If n is chosen large enough, this beats any existing code and can get arbitrarily close to the capacity of the channel. Unfortunately, any n big enough to beat existing codes is also completely impractical.[7] These codes, though never used, inspired decades of research into more practical algorithms that today can achieve rates arbitrarily close to channel capacity.[8]
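Shannon's random-coding scheme can be sketched at toy scale. The function below (names and parameters are illustrative, not from the cited sources) builds a random codebook, transmits over a binary symmetric channel, and decodes by exhaustive nearest-codeword search – the step whose 2^k cost makes the scheme impractical at useful block lengths:

```python
import random

def shannon_random_code_demo(k=4, n=24, flip_p=0.05, seed=1):
    """Toy random-coding scheme: each of the 2**k messages gets a
    random n-bit codeword; decoding picks the codeword closest in
    Hamming distance to the received word. Decoding cost grows as
    2**k, which is what makes the scheme galactic for real k."""
    rng = random.Random(seed)
    codebook = [tuple(rng.randint(0, 1) for _ in range(n))
                for _ in range(2 ** k)]

    def transmit(word):
        # Binary symmetric channel: flip each bit with probability flip_p.
        return tuple(b ^ (rng.random() < flip_p) for b in word)

    def decode(received):
        # Exhaustive nearest-codeword (minimum Hamming distance) search.
        return min(range(2 ** k),
                   key=lambda m: sum(a != b for a, b in
                                     zip(codebook[m], received)))

    msg = rng.randrange(2 ** k)
    return msg, decode(transmit(codebook[msg]))
```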

Sub-graphs

The problem of deciding whether a graph G contains H as a minor is NP-complete in general, but where H is fixed, it can be solved in polynomial time. The running time for testing whether H is a minor of G in this case is O(n^2),[9] where n is the number of vertices in G and the big O notation hides a constant that depends superexponentially on H. The constant is greater than 2↑↑(2↑↑(2↑↑(h/2))) in Knuth's up-arrow notation, where h is the number of vertices in H.[10] Even the case of h = 4 cannot be reasonably computed, as the constant is greater than 2 pentated by 4 (2↑↑↑4), that is, 2 tetrated by 65536: a power tower of 65536 twos.
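To get a sense of how fast these towers grow, the short sketch below computes base↑↑height directly: 2↑↑4 is already 65536, and 2↑↑5 has 19,729 decimal digits, while the constant above exceeds 2↑↑65536:

```python
def tetration(base, height):
    """Iterated exponentiation base↑↑height in Knuth's up-arrow
    notation: a right-associated power tower of `height` copies of
    `base` (so tetration(2, 4) == 2**2**2**2 == 65536)."""
    result = 1
    for _ in range(height):
        result = base ** result
    return result
```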

Cryptographic breaks

In cryptography jargon, a "break" is any attack faster in expectation than brute force – i.e., performing one trial decryption for each possible key. For many cryptographic systems, breaks are known, but are still practically infeasible with current technology. One example is the best attack known against 128-bit AES, which takes only 2^126 operations.[11] Despite being impractical, theoretical breaks can provide insight into vulnerability patterns, and sometimes lead to discovery of exploitable breaks.

Traveling salesman problem

For several decades, the best known approximation to the traveling salesman problem in a metric space was the very simple Christofides algorithm, which produced a path at most 50% longer than the optimum. (Many other algorithms could usually do much better, but could not provably do so.) In 2020, a newer and much more complex algorithm was discovered that can beat this by 10^−34 percent.[12] Although no one will ever switch to this algorithm for its very slight worst-case improvement, it is still considered important because "this minuscule improvement breaks through both a theoretical logjam and a psychological one".[13]

Hutter search

A single algorithm, "Hutter search", can solve any well-defined problem in an asymptotically optimal time, barring some caveats. It works by searching through all possible algorithms (by runtime), while simultaneously searching through all possible proofs (by length of proof), looking for a proof of correctness for each algorithm. Since the proof of correctness is of finite size, it "only" adds a constant and does not affect the asymptotic runtime. However, this constant is so big that the algorithm is entirely impractical.[14][15] For example, if the shortest proof of correctness of a given algorithm is 1000 bits long, the search will examine at least 2^999 other potential proofs first.

Hutter search is related to Solomonoff induction, which is a formalization of Bayesian inference. All computable theories (as implemented by programs) which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories. Again, the search over all possible explanations makes this procedure galactic.

Optimization

Simulated annealing, when used with a logarithmic cooling schedule, has been proven to find the global optimum of any optimization problem. However, such a cooling schedule results in entirely impractical runtimes, and is never used.[16] Nevertheless, knowing this ideal algorithm exists has led to practical variants that are able to find very good (though not provably optimal) solutions to complex optimization problems.[17]
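A minimal sketch of simulated annealing with the proven logarithmic schedule T(t) = c / log(1 + t) follows; the objective, neighbor function, and constant c in any particular use are illustrative choices, not taken from the cited papers:

```python
import math
import random

def simulated_annealing(cost, neighbor, x0, steps=10000, c=1.0, seed=0):
    """Simulated annealing with the logarithmic cooling schedule
    T(t) = c / log(1 + t) -- the schedule for which convergence to
    the global optimum has been proven, and which cools so slowly
    that practical codes substitute a faster (e.g. geometric) one."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    for t in range(1, steps + 1):
        T = c / math.log(1 + t)
        y = neighbor(x, rng)
        fy = cost(y)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-(fy - fx) / T), which shrinks as T cools.
        if fy <= fx or rng.random() < math.exp(-(fy - fx) / T):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest
```

For example, minimizing the illustrative objective (x − 3)^2 over the integers with a ±1 neighbor move reliably reaches the global minimum at x = 3.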

Minimum spanning trees

The expected linear time MST algorithm is able to discover the minimum spanning tree of a graph in O(m + n), where m is the number of edges and n is the number of nodes of the graph.[18] However, the constant factor that is hidden by the big O notation is huge enough to make the algorithm impractical. An implementation is publicly available,[19] and given the experimentally estimated implementation constants, it would only be faster than Borůvka's algorithm for graphs in which m + n > 9·10^151.[20]
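For comparison, the practical baseline mentioned above, Borůvka's algorithm, fits in a short sketch (this illustrative version assumes distinct edge weights, which guarantees the merges never form a cycle):

```python
def boruvka_mst(n, edges):
    """Borůvka's algorithm: O(m log n) and fast in practice -- the
    baseline the expected-linear-time algorithm only overtakes for
    graphs with m + n > ~9e151. Nodes are 0..n-1; edges is a list
    of (weight, u, v) tuples with distinct weights. Returns the
    total weight of the minimum spanning forest."""
    parent = list(range(n))

    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, components = 0, n
    while components > 1:
        # For each component, find its cheapest outgoing edge.
        cheapest = [None] * n
        for e in edges:
            _, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if cheapest[ru] is None or e < cheapest[ru]:
                cheapest[ru] = e
            if cheapest[rv] is None or e < cheapest[rv]:
                cheapest[rv] = e
        progressed = False
        for e in cheapest:            # add all cheapest edges at once
            if e is None:
                continue
            w, u, v = e
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                total += w
                components -= 1
                progressed = True
        if not progressed:            # remaining components disconnected
            break
    return total
```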

Hash tables

Researchers have found an algorithm that achieves the provably best-possible[21] asymptotic performance in terms of time-space tradeoff.[22] But it remains purely theoretical: "Despite the new hash table's unprecedented efficiency, no one is likely to try building it anytime soon. It's just too complicated to construct"[23] and "in practice, constants really matter. In the real world, a factor of 10 is a game ender."[23]

Connectivity in undirected graphs

Connectivity in undirected graphs (also known as USTCON, for Undirected Source-Target CONnectivity) is the problem of deciding if a path exists between two nodes in an undirected graph, or in other words, if they are in the same connected component. When the usage of O(N) space is allowed, polynomial time solutions such as Dijkstra's algorithm have been known and used for decades. But for many years it was unknown if this could be done deterministically in O(log N) space (class L), though it was known to be possible with randomized algorithms (class RL).

A breakthrough 2004 paper by Omer Reingold showed that USTCON is in fact in L,[24] providing an algorithm with an asymptotically better space requirement. However, the very big constant hidden by the O(log N) means that on any realistic problem it consumes significantly more memory and computation time than the well-known O(N) algorithms. Despite not being used in practice, the paper is still a landmark in theory, and has been cited more than 1000 times as of 2025.
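The O(N)-space approach that Reingold's algorithm competes against can be sketched with a plain breadth-first search (an illustrative stand-in for the polynomial-time solutions mentioned above): polynomial time, but a visited set that uses space linear in the number of nodes.

```python
from collections import deque

def connected(adj, s, t):
    """Decide USTCON by breadth-first search: polynomial time, but the
    `seen` set uses Theta(n) working space, in contrast to Reingold's
    O(log n)-space algorithm. adj maps each node to its neighbours."""
    if s == t:
        return True
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v == t:
                return True
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False
```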

Low-density parity-check codes

Low-density parity-check codes, also known as LDPC or Gallager codes, are an example of an algorithm that was galactic when first developed, but became practical as computation improved. They were originally conceived by Robert G. Gallager in his doctoral dissertation[25] at the Massachusetts Institute of Technology in 1960.[26][27] Although their performance was much better than other codes of that time, reaching the Gilbert–Varshamov bound for linear codes, the codes were largely ignored as their iterative decoding algorithm was prohibitively computationally expensive for the hardware available.[28]

Renewed interest in LDPC codes emerged following the invention of the closely related turbo codes (1993), whose similarly iterative decoding algorithm outperformed other codes used at that time. LDPC codes were subsequently rediscovered in 1996,[29] and became popular as a patent-free alternative.[30] Even though the turbo code patents have now expired, LDPC codes also have some technical advantages, and are used in many applications today.

References

  1. ^ a b Lipton, Richard J.; Regan, Kenneth W. (2013). "David Johnson: Galactic Algorithms". People, Problems, and Proofs: Essays from Gödel's Lost Letter: 2010. Heidelberg: Springer Berlin. pp. 109–112. ISBN 9783642414220.
  2. ^ Fortnow, L. (2009). "The status of the P versus NP problem" (PDF). Communications of the ACM. 52 (9): 78–86. doi:10.1145/1562164.1562186. S2CID 5969255.
  3. ^ Fortnow, Lance (2022). "Fifty Years of P Versus NP and the Possibility of the Impossible". Communications of the ACM. 65 (1): 76–85. doi:10.1145/3460351.
  4. ^ Harvey, David; van der Hoeven, Joris (March 2019). "Integer multiplication in time O(n log n)". HAL. hal-02070778.
  5. ^ a b Harvey, David (9 April 2019). "We've found a quicker way to multiply really big numbers". The Conversation. Retrieved 9 March 2023.
  6. ^ Le Gall, F. (2012). "Faster algorithms for rectangular matrix multiplication". Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2012). pp. 514–523. arXiv:1204.1111. doi:10.1109/FOCS.2012.80. ISBN 978-0-7695-4874-6. S2CID 2410545.
  7. ^ Larry Hardesty (January 19, 2010). "Explained: The Shannon limit". MIT News Office.
  8. ^ "Capacity-approaching codes (Chapter 13 of Principles of Digital Communication II)" (PDF). MIT OpenCourseWare. 2005.
  9. ^ Kawarabayashi, Ken-ichi; Kobayashi, Yusuke; Reed, Bruce (2012). "The disjoint paths problem in quadratic time". Journal of Combinatorial Theory, Series B. 102 (2): 424–435. doi:10.1016/j.jctb.2011.07.004.
  10. ^ Johnson, David S. (1987). "The NP-completeness column: An ongoing guide (edition 19)". Journal of Algorithms. 8 (2): 285–303. CiteSeerX 10.1.1.114.3864. doi:10.1016/0196-6774(87)90043-5.
  11. ^ Biaoshuai Tao; Hongjun Wu (2015). Information Security and Privacy. Lecture Notes in Computer Science. Vol. 9144. pp. 39–56. doi:10.1007/978-3-319-19962-7_3. ISBN 978-3-319-19961-0.
  12. ^ Karlin, Anna R.; Klein, Nathan; Oveis Gharan, Shayan (September 1, 2020). "A (Slightly) Improved Approximation Algorithm for Metric TSP". arXiv:2007.01409 [cs.DS].
  13. ^ Klarreich, Erica (8 October 2020). "Computer Scientists Break Traveling Salesperson Record". Quanta Magazine.
  14. ^ Hutter, Marcus (2002-06-14). "The Fastest and Shortest Algorithm for All Well-Defined Problems". arXiv:cs/0206022.
  15. ^ Gagliolo, Matteo (2007-11-20). "Universal search". Scholarpedia. 2 (11): 2575. Bibcode:2007SchpJ...2.2575G. doi:10.4249/scholarpedia.2575. ISSN 1941-6016.
  16. ^ Liang, Faming; Cheng, Yichen; Lin, Guang (2014). "Simulated stochastic approximation annealing for global optimization with a square-root cooling schedule". Journal of the American Statistical Association. 109 (506): 847–863. doi:10.1080/01621459.2013.872993. S2CID 123410795.
  17. ^ Ingber, Lester (1993). "Simulated annealing: Practice versus theory". Mathematical and Computer Modelling. 18 (11): 29–57. CiteSeerX 10.1.1.15.1046. doi:10.1016/0895-7177(93)90204-C.
  18. ^ Karger, David R.; Klein, Philip N.; Tarjan, Robert E. (1995-03-01). "A randomized linear-time algorithm to find minimum spanning trees". Journal of the ACM. 42 (2): 321–328. doi:10.1145/201019.201022. ISSN 0004-5411.
  19. ^ Thiesen, Francisco. "A C++ implementation for an Expected Linear-Time Minimum Spanning Tree Algorithm (Karger–Klein–Tarjan + Hagerup Minimum Spanning Tree Verification as a sub-routine)". GitHub. Retrieved 2022-11-19.
  20. ^ Geiman Thiesen, Francisco. "Expected Linear-Time Minimum Spanning Trees". franciscothiesen.github.io. Retrieved 2022-11-13.
  21. ^ Li, Tianxiao; Liang, Jingxun; Yu, Huacheng; Zhou, Renfei (6 November 2023). "Tight Cell-Probe Lower Bounds for Dynamic Succinct Dictionaries". 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS). IEEE. pp. 1842–1862. arXiv:2306.02253. doi:10.1109/FOCS57990.2023.00112. ISBN 979-8-3503-1894-4.
  22. ^ Bender, Michael; Farach-Colton, Martin; Kuszmaul, John; Kuszmaul, William; Liu, Mingmou (4 Nov 2021). "On the Optimal Time/Space Tradeoff for Hash Tables". arXiv:2111.00602 [cs].
  23. ^ a b Nadis, Steve (8 February 2024). "Scientists Find Optimal Balance of Data Storage and Time". Quanta Magazine. Retrieved 12 February 2025.
  24. ^ Reingold, Omer (2008). "Undirected connectivity in log-space". Journal of the ACM. 55 (4): 1–24. doi:10.1145/1391289.1391291. MR 2445014.
  25. ^ Gallager, Robert G. (1960). Low density parity check codes (PDF) (Ph.D. thesis). Massachusetts Institute of Technology.
  26. ^ Hardesty, L. (January 21, 2010). "Explained: Gallager codes". MIT News. Retrieved August 7, 2013.
  27. ^ Gallager, R. G. (January 1962). "Low density parity check codes". IRE Trans. Inf. Theory. 8 (1): 21–28. doi:10.1109/TIT.1962.1057683. hdl:1721.1/11804/32786367-MIT. S2CID 260490814.
  28. ^ Andrews, Kenneth S.; Divsalar, Dariush; Dolinar, Sam; Hamkins, Jon; Jones, Christopher R.; Pollara, Fabrizio (2007). "The development of turbo and LDPC codes for deep-space applications" (PDF). Proceedings of the IEEE. 95 (11): 2142–2156. doi:10.1109/JPROC.2007.905132. Archived from the original on 2009-06-20. Retrieved 2025-03-04.
  29. ^ MacKay, David J. C.; Neal, Radford M. (1996). "Near Shannon limit performance of low density parity check codes" (PDF). Electronics Letters. 32 (18): 1645–1646. Bibcode:1996ElL....32.1645M. doi:10.1049/el:19961141.
  30. ^ Erico Guizzo (Mar 1, 2004). "Closing in on the Perfect Code". IEEE Spectrum. Archived from the original on September 2, 2021. "Another advantage, perhaps the biggest of all, is that the LDPC patents have expired, so companies can use them without having to pay for intellectual-property rights."