In computer science, algorithmic efficiency is a property of an algorithm which relates to the amount of computational resources used by the algorithm. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process.
For maximum efficiency it is desirable to minimize resource usage. However, different resources such as time and space complexity cannot be compared directly, so which of two algorithms is considered to be more efficient often depends on which measure of efficiency is considered most important.
For example, bubble sort and timsort are both algorithms to sort a list of items from smallest to largest. Bubble sort organizes the list in time proportional to the number of elements squared (O(n²); see Big O notation), but only requires a small amount of extra memory which is constant with respect to the length of the list (O(1)). Timsort sorts the list in time linearithmic (proportional to a quantity times its logarithm) in the list's length (O(n log n)), but has a space requirement linear in the length of the list (O(n)). If large lists must be sorted at high speed for a given application, timsort is a better choice; however, if minimizing the memory footprint of the sorting is more important, bubble sort is a better choice.
The importance of efficiency with respect to time was emphasized by Ada Lovelace in 1843 as applied to Charles Babbage's mechanical analytical engine:
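As a rough illustration of this trade-off, the following Python sketch (an illustration only; the list size and implementation details are arbitrary choices) times a simple in-place bubble sort against Python's built-in sorted(), which is a Timsort implementation:

```python
# Contrasting the two trade-offs: bubble sort uses O(1) extra space but O(n^2) time,
# while sorted() (Timsort) runs in O(n log n) time but allocates O(n) extra space.
import random
import time

def bubble_sort(items):
    """Sort using O(1) extra working memory and O(n^2) comparisons."""
    a = list(items)                # copy so the caller's list is untouched
    n = len(a)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:            # already sorted: early exit
            break
    return a

data = [random.random() for _ in range(5000)]

start = time.perf_counter()
bubble_sort(data)
print(f"bubble sort: {time.perf_counter() - start:.3f} s")

start = time.perf_counter()
sorted(data)                       # Timsort: much faster, but uses O(n) extra space
print(f"timsort:     {time.perf_counter() - start:.6f} s")
```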
"In almost every computation a great variety of arrangements for the succession of the processes is possible, and various considerations must influence the selections amongst them for the purposes of a calculating engine. One essential object is to choose that arrangement which shall tend to reduce to a minimum the time necessary for completing the calculation"[1]
Early electronic computers had both limited speed and limited random access memory. Therefore, a space–time trade-off occurred. A task could use a fast algorithm using a lot of memory, or it could use a slow algorithm using little memory. The engineering trade-off was therefore to use the fastest algorithm that could fit in the available memory.
Modern computers are significantly faster than early computers and have a much larger amount of memory available (gigabytes instead of kilobytes). Nevertheless, Donald Knuth emphasized that efficiency is still an important consideration:
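The same trade-off can be illustrated in modern terms by memoization, which spends memory to avoid repeated computation. The sketch below (a Python illustration, not a historical example) contrasts a recursive Fibonacci function with and without a cache:

```python
# Space-time trade-off: caching previously computed results spends memory
# to avoid exponential recomputation of the same subproblems.
from functools import lru_cache

def fib_slow(n):
    """Tiny memory footprint, but exponential time: recomputes subproblems."""
    return n if n < 2 else fib_slow(n - 1) + fib_slow(n - 2)

@lru_cache(maxsize=None)
def fib_fast(n):
    """Linear time, but stores every intermediate result in memory."""
    return n if n < 2 else fib_fast(n - 1) + fib_fast(n - 2)

print(fib_fast(200))   # returns instantly; fib_slow(200) would be infeasible
```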
"In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal and I believe the same viewpoint should prevail in software engineering"[2]
An algorithm is considered efficient if its resource consumption, also known as computational cost, is at or below some acceptable level. Roughly speaking, 'acceptable' means: it will run in a reasonable amount of time or space on an available computer, typically as a function of the size of the input. Since the 1950s computers have seen dramatic increases in both the available computational power and in the available amount of memory, so current acceptable levels would have been unacceptable even 10 years ago. In fact, thanks to the approximate doubling of computer power every 2 years, tasks that are acceptably efficient on modern smartphones and embedded systems may have been unacceptably inefficient for industrial servers 10 years ago.
Computer manufacturers frequently bring out new models, often with higher performance. Software costs can be quite high, so in some cases the simplest and cheapest way of getting higher performance might be to just buy a faster computer, provided it is compatible with an existing computer.
There are many ways in which the resources used by an algorithm can be measured: the two most common measures are speed and memory usage; other measures could include transmission speed, temporary disk usage, long-term disk usage, power consumption, total cost of ownership, response time to external stimuli, etc. Many of these measures depend on the size of the input to the algorithm, i.e. the amount of data to be processed. They might also depend on the way in which the data is arranged; for example, some sorting algorithms perform poorly on data which is already sorted, or which is sorted in reverse order.
In practice, there are other factors which can affect the efficiency of an algorithm, such as requirements for accuracy and/or reliability. As detailed below, the way in which an algorithm is implemented can also have a significant effect on actual efficiency, though many aspects of this relate to optimization issues.
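For instance, a quicksort that always chooses the first element as its pivot degrades from roughly linearithmic to quadratic time when the input is already sorted. The sketch below (an illustrative Python example with arbitrarily chosen sizes) makes this input dependence visible:

```python
# Input arrangement matters: the same algorithm, on the same amount of data,
# can behave very differently depending on how the data is ordered.
import random
import sys
import time

sys.setrecursionlimit(10_000)              # the degenerate case recurses ~n levels deep

def quicksort_first_pivot(a):
    """Naive quicksort that always picks the first element as the pivot."""
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort_first_pivot(smaller) + [pivot] + quicksort_first_pivot(larger)

def measure(data):
    start = time.perf_counter()
    quicksort_first_pivot(data)
    return time.perf_counter() - start

n = 2_000
random_input = [random.random() for _ in range(n)]
sorted_input = sorted(random_input)        # worst case for this pivot choice

print(f"random input: {measure(random_input):.4f} s")   # roughly n log n behaviour
print(f"sorted input: {measure(sorted_input):.4f} s")   # roughly n^2 behaviour
```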
In the theoretical analysis of algorithms, the normal practice is to estimate their complexity in the asymptotic sense. The most commonly used notation to describe resource consumption or "complexity" is Donald Knuth's Big O notation, representing the complexity of an algorithm as a function of the size of the input n. Big O notation is an asymptotic measure of function complexity, where O(n) roughly means the time requirement for an algorithm is proportional to n, omitting lower-order terms that contribute less than n to the growth of the function as n grows arbitrarily large. This estimate may be misleading when n is small, but is generally sufficiently accurate when n is large as the notation is asymptotic. For example, bubble sort may be faster than merge sort when only a few items are to be sorted; however, either implementation is likely to meet performance requirements for a small list. Typically, programmers are interested in algorithms that scale efficiently to large input sizes, and merge sort is preferred over bubble sort for lists of the length encountered in most data-intensive programs.
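The following sketch (a numeric illustration only; the constant factor of 100 is an arbitrary assumption) shows why constant and lower-order terms are omitted: even a heavily handicapped O(n log n) step count eventually undercuts an O(n²) one as n grows:

```python
# Even if an O(n log n) algorithm has a 100x larger constant factor than an
# O(n^2) one, it wins once n is large enough, which is why asymptotic notation
# ignores constant and lower-order terms.
import math

def quadratic_steps(n):
    return n ** 2                      # e.g. a bubble-sort-like comparison count

def linearithmic_steps(n):
    return 100 * n * math.log2(n)      # e.g. a merge-sort-like count with a 100x constant

for n in (10, 100, 1_000, 100_000):
    q, l = quadratic_steps(n), linearithmic_steps(n)
    cheaper = "O(n^2)" if q < l else "O(n log n)"
    print(f"n={n:>7}: n^2 = {q:,.0f}   100*n*log2(n) = {l:,.0f}   cheaper: {cheaper}")
```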
Some examples of Big O notation applied to algorithms' asymptotic time complexity include:
Notation | Name | Examples
---|---|---
O(1) | constant | Finding the median from a sorted list of measurements; using a constant-size lookup table; using a suitable hash function for looking up an item.
O(log n) | logarithmic | Finding an item in a sorted array with a binary search or a balanced search tree, as well as all operations in a binomial heap.
O(n) | linear | Finding an item in an unsorted list or a malformed tree (worst case) or in an unsorted array; adding two n-bit integers by ripple carry.
O(n log n) | linearithmic, loglinear, or quasilinear | Performing a fast Fourier transform; heapsort, quicksort (best and average case), or merge sort.
O(n²) | quadratic | Multiplying two n-digit numbers by a simple algorithm; bubble sort (worst case or naive implementation), Shell sort, quicksort (worst case), selection sort or insertion sort.
O(cⁿ), c > 1 | exponential | Finding the optimal (non-approximate) solution to the travelling salesman problem using dynamic programming; determining if two logical statements are equivalent using brute-force search.
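As a concrete illustration of two of the classes above, the following Python sketch (an illustration; the data set is arbitrary) contrasts an O(n) linear search with an O(log n) binary search on the same sorted list:

```python
# Linear search inspects up to n items; binary search on sorted data halves
# the remaining range at each step, giving O(log n) comparisons.
from bisect import bisect_left

def linear_search(items, target):
    """O(n): scan until the target is found."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1

def binary_search(sorted_items, target):
    """O(log n): repeatedly halve the search interval of a sorted list."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1

data = list(range(0, 2_000_000, 2))        # one million sorted even numbers
print(linear_search(data, 1_999_998))       # ~1,000,000 comparisons
print(binary_search(data, 1_999_998))       # ~20 comparisons
```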
For new versions of software or to provide comparisons with competitive systems, benchmarks are sometimes used, which assist with gauging an algorithm's relative performance. If a new sort algorithm is produced, for example, it can be compared with its predecessors to ensure that it is at least as efficient as before with known data, taking into consideration any functional improvements. Benchmarks can be used by customers when comparing various products from alternative suppliers to estimate which product will best suit their specific requirements in terms of functionality and performance. For example, in the mainframe world certain proprietary sort products from independent software companies such as Syncsort compete with products from the major suppliers such as IBM for speed.
Some benchmarks provide opportunities for producing an analysis comparing the relative speed of various compiled and interpreted languages, for example,[3][4] and The Computer Language Benchmarks Game compares the performance of implementations of typical programming problems in several programming languages.
Even creating "do it yourself" benchmarks can demonstrate the relative performance of different programming languages, using a variety of user-specified criteria. This is quite simple, as a "Nine language performance roundup" by Christopher W. Cowell-Shah demonstrates by example.[5]
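A minimal "do it yourself" benchmark can be written with nothing more than a language's standard library. The sketch below (the task, sizes, and repetition counts are arbitrary choices) uses Python's timeit module to compare three ways of summing a list under identical conditions:

```python
# A tiny DIY micro-benchmark: time several candidate implementations of the
# same task under the same conditions, using only the standard library.
import timeit

setup = "data = list(range(1_000_000))"
candidates = {
    "for loop":         "total = 0\nfor x in data:\n    total += x",
    "sum() builtin":    "total = sum(data)",
    "sum of generator": "total = sum(x for x in data)",
}

for name, stmt in candidates.items():
    # repeat each measurement and keep the best, to reduce noise from other processes
    best = min(timeit.repeat(stmt, setup=setup, number=10, repeat=5))
    print(f"{name:18} {best:.3f} s for 10 runs")
```

Taking the minimum of several repetitions, rather than a single run, reduces the influence of other programs running on the same machine at the same time.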
Implementation issues can also have an effect on efficiency, such as the choice of programming language, or the way in which the algorithm is actually coded,[6] or the choice of a compiler for a particular language, or the compilation options used, or even the operating system being used. In many cases a language implemented by an interpreter may be much slower than a language implemented by a compiler.[3] See the articles on just-in-time compilation and interpreted languages.
There are other factors which may affect time or space issues, but which may be outside of a programmer's control; these include data alignment, data granularity, cache locality, cache coherency, garbage collection, instruction-level parallelism, multi-threading (at either a hardware or software level), simultaneous multitasking, and subroutine calls.[7]
Some processors have capabilities for vector processing, which allow a single instruction to operate on multiple operands; it may or may not be easy for a programmer or compiler to use these capabilities. Algorithms designed for sequential processing may need to be completely redesigned to make use of parallel processing, although others may be reconfigured with little effort. As parallel and distributed computing grew in importance in the late 2010s, more investment has been made into efficient high-level APIs for parallel and distributed computing systems such as CUDA, TensorFlow, Hadoop, OpenMP and MPI.
Another problem which can arise in programming is that processors compatible with the same instruction set (such as x86-64 or ARM) may implement an instruction in different ways, so that instructions which are relatively fast on some models may be relatively slow on other models. This often presents challenges to optimizing compilers, which must have extensive knowledge of the specific CPU and other hardware available on the compilation target to best optimize a program for performance. In the extreme case, a compiler may be forced to emulate instructions not supported on a compilation target platform, forcing it to generate code or link an external library call to produce a result that is otherwise incomputable on that platform, even if it is natively supported and more efficient in hardware on other platforms. This is often the case in embedded systems with respect to floating-point arithmetic, where small and low-power microcontrollers often lack hardware support for floating-point arithmetic and thus require computationally expensive software routines to produce floating-point calculations.
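The idea behind vectorization can be sketched with NumPy (a widely used third-party library that is assumed here; it is not mentioned in the article itself): a single array operation replaces an explicit element-by-element loop and lets the library dispatch to optimized, often SIMD-accelerated, routines:

```python
# One vectorized array operation versus an explicit interpreted loop over the
# same operands. Requires the third-party numpy package to be installed.
import time
import numpy as np

a = np.random.rand(2_000_000)
b = np.random.rand(2_000_000)

start = time.perf_counter()
slow = [x * y for x, y in zip(a, b)]       # element-at-a-time, interpreted loop
print(f"python loop: {time.perf_counter() - start:.3f} s")

start = time.perf_counter()
fast = a * b                                # single vectorized operation
print(f"vectorized:  {time.perf_counter() - start:.3f} s")
```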
Measures are normally expressed as a function of the size of the input.
The two most common measures are:

- Time: how long the algorithm takes to complete.
- Space: how much working memory (typically RAM) the algorithm needs.
For computers whose power is supplied by a battery (e.g. laptops and smartphones), or for very long/large calculations (e.g. supercomputers), other measures of interest are:

- Direct power consumption: the power needed to operate the computer itself.
- Indirect power consumption: the power needed for cooling, lighting, and other supporting infrastructure.
As of 2018, power consumption is growing in importance as a metric for computational tasks of all types and at all scales ranging from embedded Internet of things devices to system-on-chip devices to server farms. This trend is often referred to as green computing.
Less common measures of computational efficiency may also be relevant in some cases:

- Transmission size: the bandwidth needed when data must be transferred, which can be reduced by compression.
- External space: temporary or long-term disk (or other external memory) usage.
- Response time (latency) to external stimuli: particularly relevant for real-time applications.
- Total cost of ownership: particularly if a computer is dedicated to one particular algorithm.
Analysis of algorithms, typically using concepts like time complexity, can be used to get an estimate of the running time as a function of the size of the input data. The result is normally expressed using Big O notation. This is useful for comparing algorithms, especially when a large amount of data is to be processed. More detailed estimates are needed to compare algorithm performance when the amount of data is small, although this is likely to be of less importance. Parallel algorithms may be more difficult to analyze.
A benchmark can be used to assess the performance of an algorithm in practice. Many programming languages have an available function which provides CPU time usage. For long-running algorithms the elapsed time could also be of interest. Results should generally be averaged over several tests.
Run-based profiling can be very sensitive to hardware configuration and to the possibility of other programs or tasks running at the same time in a multi-processing and multi-programming environment.
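A minimal sketch of such a measurement in Python (illustrative only; the stand-in workload is arbitrary) uses time.process_time(), which reports CPU time rather than wall-clock time, and averages several runs:

```python
# Measure CPU time for an algorithm and average over several runs;
# process_time() excludes time spent while the process is not scheduled.
import time
import statistics

def algorithm_under_test(n=200_000):
    return sorted(range(n, 0, -1))      # stand-in for the algorithm being measured

samples = []
for _ in range(5):
    start = time.process_time()
    algorithm_under_test()
    samples.append(time.process_time() - start)

print(f"mean CPU time: {statistics.mean(samples):.4f} s "
      f"(stdev {statistics.stdev(samples):.4f} s over {len(samples)} runs)")
```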
This sort of test also depends heavily on the selection of a particular programming language, compiler, and compiler options, so algorithms being compared must all be implemented under the same conditions.
This section is concerned with the use of memory resources (registers, cache, RAM, virtual memory, secondary memory) while the algorithm is being executed. As with the time analysis above, the algorithm is analyzed, typically using space complexity analysis, to get an estimate of the run-time memory needed as a function of the size of the input data. The result is normally expressed using Big O notation.
There are up to four aspects of memory usage to consider:

- The amount of memory needed to hold the code for the algorithm.
- The amount of memory needed for the input data.
- The amount of memory needed for any output data (some algorithms, such as in-place sorting, rearrange the input and need no separate space for output).
- The amount of memory needed as working space during the calculation.
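Working-space usage can also be estimated empirically. The sketch below (an illustration; the workloads are arbitrary) uses Python's standard tracemalloc module to compare the peak memory of an approach that materializes an n-element list with one that streams values through a generator:

```python
# Estimate working memory as a function of input size: building a full list
# needs O(n) space, while a generator-based sum needs only O(1).
import tracemalloc

def peak_memory(func, *args):
    """Return the peak allocated memory, in bytes, while func runs."""
    tracemalloc.start()
    func(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

def sum_with_list(n):
    return sum([i * i for i in range(n)])      # materializes an n-element list

def sum_with_generator(n):
    return sum(i * i for i in range(n))        # constant working space

for n in (10_000, 100_000, 1_000_000):
    print(f"n={n:>9}  list: {peak_memory(sum_with_list, n):>12,} bytes"
          f"  generator: {peak_memory(sum_with_generator, n):>8,} bytes")
```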
Early electronic computers, and early home computers, had relatively small amounts of working memory. For example, the 1949 Electronic Delay Storage Automatic Calculator (EDSAC) had a maximum working memory of 1024 17-bit words, while the 1980 Sinclair ZX80 came initially with 1024 8-bit bytes of working memory. By the late 2010s, it became typical for personal computers to have between 4 and 32 GB of RAM, an increase of millions of times as much memory.
Modern computers can have relatively large amounts of memory (possibly gigabytes), so having to squeeze an algorithm into a confined amount of memory is not the kind of problem it used to be. However, the different types of memory and their relative access speeds can be significant:

- Processor registers, the fastest but smallest tier of memory.
- Cache memory, small and fast, often organized in multiple levels.
- Main memory (RAM), much larger but considerably slower than cache.
- Virtual memory backed by secondary storage, which appears to extend main memory but is orders of magnitude slower when paging occurs.
An algorithm whose memory needs will fit in cache memory will be much faster than an algorithm which fits in main memory, which in turn will be very much faster than an algorithm which has to resort to paging. Because of this, cache replacement policies are extremely important to high-performance computing, as are cache-aware programming and data alignment. To further complicate the issue, some systems have up to three levels of cache memory, with varying effective speeds. Different systems will have different amounts of these various types of memory, so the effect of algorithm memory needs can vary greatly from one system to another.
In the early days of electronic computing, if an algorithm and its data would not fit in main memory then the algorithm could not be used. Nowadays the use of virtual memory appears to provide much more memory, but at the cost of performance. Much higher speed can be obtained if an algorithm and its data fit in cache memory; in this case minimizing space will also help minimize time. This is called the principle of locality, and can be subdivided into locality of reference, spatial locality, and temporal locality. An algorithm which will not fit completely in cache memory but which exhibits locality of reference may perform reasonably well.
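The effect of locality can be sketched even in a high-level language. The following Python example (illustrative only; in a compiled language the gap is typically much larger, because interpreter overhead does not mask it) sums the same conceptual matrix in row-major and column-major order:

```python
# Spatial locality: sequential access streams through memory and is cache-friendly;
# strided access jumps by a large step on every iteration and causes more misses.
import time

N = 2_000
flat = [1] * (N * N)      # conceptually an N x N matrix stored row by row

start = time.perf_counter()
total = 0
for row in range(N):
    for col in range(N):
        total += flat[row * N + col]       # sequential access: good spatial locality
print(f"row-major:    {time.perf_counter() - start:.3f} s")

start = time.perf_counter()
total = 0
for col in range(N):
    for row in range(N):
        total += flat[row * N + col]       # strided access: poor spatial locality
print(f"column-major: {time.perf_counter() - start:.3f} s")
```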