A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of randomness determined by the random bits; thus either the running time, or the output (or both) are random variables.
There is a distinction between algorithms that use the random input so that they always terminate with the correct answer, but where the expected running time is finite (Las Vegas algorithms, for example Quicksort[1]), and algorithms which have a chance of producing an incorrect result (Monte Carlo algorithms, for example the Monte Carlo algorithm for the MFAS problem[2]) or fail to produce a result either by signaling a failure or failing to terminate. In some cases, probabilistic algorithms are the only practical means of solving a problem.[3]
In common practice, randomized algorithms are approximated using a pseudorandom number generator in place of a true source of random bits; such an implementation may deviate from the expected theoretical behavior and mathematical guarantees, which may depend on the existence of an ideal true random number generator.
As a motivating example, consider the problem of finding an ‘a’ in an array of n elements.
Input: An array of n ≥ 2 elements, in which half are ‘a’s and the other half are ‘b’s.
Output: Find an ‘a’ in the array.
We give two versions of the algorithm, one Las Vegas algorithm and one Monte Carlo algorithm.
Las Vegas algorithm:
findingA_LV(array A, n)
begin
    repeat
        Randomly select one element out of n elements.
    until 'a' is found
end
This algorithm succeeds with probability 1. The number of iterations varies and can be arbitrarily large, but the expected number of iterations is 2, since each attempt finds an ‘a’ independently with probability 1/2.
Since it is constant, the expected run time over many calls is Θ(1). (See Big Theta notation.)
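The Las Vegas version can be sketched in Python as follows (the array contents and the function name are illustrative, not part of the original presentation):

    import random

    def finding_a_lv(array):
        """Las Vegas search: sample positions until an 'a' is found.

        The returned index is always correct, but the number of samples is a
        random variable; with half the entries equal to 'a', its expectation is 2.
        """
        n = len(array)
        while True:
            i = random.randrange(n)   # choose a position uniformly at random
            if array[i] == 'a':
                return i              # correct whenever it returns

    # Example with n = 8 elements, half 'a' and half 'b'.
    print(finding_a_lv(['b', 'a', 'b', 'a', 'a', 'b', 'a', 'b']))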
Monte Carlo algorithm:
findingA_MC(array A, n, k)
begin
    i := 0
    repeat
        Randomly select one element out of n elements.
        i := i + 1
    until i = k or 'a' is found
end
If an ‘a’ is found, the algorithm succeeds, else the algorithm fails. After k iterations, the probability of finding an ‘a’ is 1 − (1/2)^k.
This algorithm does not guarantee success, but the run time is bounded. The number of iterations is always less than or equal to k. Taking k to be constant, the run time (expected and absolute) is Θ(1).
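A matching Python sketch of the Monte Carlo version (again with illustrative names):

    import random

    def finding_a_mc(array, k):
        """Monte Carlo search: give up after at most k random probes.

        The running time is O(k), but the search may fail; with half the
        entries equal to 'a', the failure probability is (1/2)**k.
        """
        n = len(array)
        for _ in range(k):
            i = random.randrange(n)   # choose a position uniformly at random
            if array[i] == 'a':
                return i              # success
        return None                   # failure after k attempts

    print(finding_a_mc(['b', 'a', 'b', 'a', 'a', 'b', 'a', 'b'], k=10))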
Randomized algorithms are particularly useful when faced with a malicious "adversary" or attacker who deliberately tries to feed a bad input to the algorithm (see worst-case complexity and competitive analysis (online algorithm)) such as in the Prisoner's dilemma. It is for this reason that randomness is ubiquitous in cryptography. In cryptographic applications, pseudo-random numbers cannot be used, since the adversary can predict them, making the algorithm effectively deterministic. Therefore, either a source of truly random numbers or a cryptographically secure pseudo-random number generator is required. Another area in which randomness is inherent is quantum computing.
In the example above, the Las Vegas algorithm always outputs the correct answer, but its running time is a random variable. The Monte Carlo algorithm (related to the Monte Carlo method for simulation) is guaranteed to complete in an amount of time that can be bounded by a function of the input size and its parameter k, but allows a small probability of error. Observe that any Las Vegas algorithm can be converted into a Monte Carlo algorithm (via Markov's inequality), by having it output an arbitrary, possibly incorrect answer if it fails to complete within a specified time. Conversely, if an efficient verification procedure exists to check whether an answer is correct, then a Monte Carlo algorithm can be converted into a Las Vegas algorithm by running the Monte Carlo algorithm repeatedly until a correct answer is obtained.
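The second conversion can be illustrated with a short Python sketch; the Monte Carlo routine and the verifier are assumed to be supplied by the caller, and the names here are purely illustrative:

    import random

    def las_vegas_from_monte_carlo(monte_carlo, is_correct, instance):
        """Repeat a Monte Carlo algorithm until its answer passes verification.

        The loop terminates with probability 1 provided the Monte Carlo
        algorithm succeeds with probability bounded away from zero, and the
        returned answer is always verified to be correct.
        """
        while True:
            answer = monte_carlo(instance)
            if answer is not None and is_correct(instance, answer):
                return answer         # only verified answers are returned

    # Toy usage: guess an index of 'a' and verify the guess.
    array = ['b', 'a', 'b', 'a']
    guess = lambda arr: random.randrange(len(arr))
    check = lambda arr, i: arr[i] == 'a'
    print(las_vegas_from_monte_carlo(guess, check, array))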
Computational complexity theory models randomized algorithms as probabilistic Turing machines. Both Las Vegas and Monte Carlo algorithms are considered, and several complexity classes are studied. The most basic randomized complexity class is RP, which is the class of decision problems for which there is an efficient (polynomial time) randomized algorithm (or probabilistic Turing machine) which recognizes NO-instances with absolute certainty and recognizes YES-instances with a probability of at least 1/2. The complement class for RP is co-RP. Problem classes having (possibly nonterminating) algorithms with polynomial time average case running time whose output is always correct are said to be in ZPP.
The class of problems for which both YES and NO-instances are allowed to be identified with some error is called BPP. This class acts as the randomized equivalent of P, i.e. BPP represents the class of efficient randomized algorithms.
Quicksort was discovered by Tony Hoare in 1959, and subsequently published in 1961.[4] In the same year, Hoare published the quickselect algorithm,[5] which finds the median element of a list in linear expected time. It remained open until 1973 whether a deterministic linear-time algorithm existed.[6]
In 1917, Henry Cabourn Pocklington introduced a randomized algorithm known as Pocklington's algorithm for efficiently finding square roots modulo prime numbers.[7] In 1970, Elwyn Berlekamp introduced a randomized algorithm for efficiently computing the roots of a polynomial over a finite field.[8] In 1977, Robert M. Solovay and Volker Strassen discovered a polynomial-time randomized primality test (i.e., determining the primality of a number). Soon afterwards, Michael O. Rabin demonstrated that Miller's 1976 primality test could also be turned into a polynomial-time randomized algorithm. At that time, no provably polynomial-time deterministic algorithms for primality testing were known.
One of the earliest randomized data structures is the hash table, which was introduced in 1953 by Hans Peter Luhn at IBM.[9] Luhn's hash table used chaining to resolve collisions and was also one of the first applications of linked lists.[9] Subsequently, in 1954, Gene Amdahl, Elaine M. McGraw, Nathaniel Rochester, and Arthur Samuel of IBM Research introduced linear probing,[9] although Andrey Ershov independently had the same idea in 1957.[9] In 1962, Donald Knuth performed the first correct analysis of linear probing,[9] although the memorandum containing his analysis was not published until much later.[10] The first published analysis was due to Konheim and Weiss in 1966.[11]
Early works on hash tables either assumed access to a fully random hash function or assumed that the keys themselves were random.[9] In 1979, Carter and Wegman introduced universal hash functions,[12] which they showed could be used to implement chained hash tables with constant expected time per operation.
Early work on randomized data structures also extended beyond hash tables. In 1970, Burton Howard Bloom introduced an approximate-membership data structure known as the Bloom filter.[13] In 1989, Raimund Seidel and Cecilia R. Aragon introduced a randomized balanced search tree known as the treap.[14] In the same year, William Pugh introduced another randomized search tree known as the skip list.[15]
Prior to the popularization of randomized algorithms in computer science, Paul Erdős popularized the use of randomized constructions as a mathematical technique for establishing the existence of mathematical objects. This technique has become known as the probabilistic method.[16] Erdős gave his first application of the probabilistic method in 1947, when he used a simple randomized construction to establish the existence of Ramsey graphs.[17] He famously used a more sophisticated randomized algorithm in 1959 to establish the existence of graphs with high girth and chromatic number.[18][16]
Quicksort is a familiar, commonly used algorithm in which randomness can be useful. Many deterministic versions of this algorithm require O(n²) time to sort n numbers for some well-defined class of degenerate inputs (such as an already sorted array), with the specific class of inputs that generate this behavior defined by the protocol for pivot selection. However, if the algorithm selects pivot elements uniformly at random, it has a provably high probability of finishing in O(n log n) time regardless of the characteristics of the input.
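A minimal Python sketch of quicksort with a uniformly random pivot (a textbook-style formulation, not any particular library's implementation):

    import random

    def randomized_quicksort(a):
        """Sort a list by partitioning around a pivot chosen uniformly at random.

        Random pivot selection gives an expected O(n log n) running time on
        every input, including already sorted arrays.
        """
        if len(a) <= 1:
            return a
        pivot = a[random.randrange(len(a))]   # pivot chosen uniformly at random
        less = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        return randomized_quicksort(less) + equal + randomized_quicksort(greater)

    print(randomized_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))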
In computational geometry, a standard technique to build a structure like a convex hull or Delaunay triangulation is to randomly permute the input points and then insert them one by one into the existing structure. The randomization ensures that the expected number of changes to the structure caused by an insertion is small, and so the expected running time of the algorithm can be bounded from above. This technique is known as randomized incremental construction.[19]
Input: A graph G(V, E)
Output: A cut partitioning the vertices into L and R, with the minimum number of edges between L and R.
Recall that the contraction of two nodes, u and v, in a (multi-)graph yields a new node u' with edges that are the union of the edges incident on either u or v, except for any edge(s) connecting u and v. Figure 1 gives an example of the contraction of vertices A and B. After contraction, the resulting graph may have parallel edges, but contains no self-loops.
Karger's[20] basic algorithm:
begin
    i = 1
    repeat
        repeat
            Take a random edge (u,v) ∈ E in G
            replace u and v with the contraction u'
        until only 2 nodes remain
        obtain the corresponding cut result Ci
        i = i + 1
    until i = m
    output the minimum cut among C1, C2, ..., Cm.
end
In each execution of the outer loop, the algorithm repeats the inner loop until only 2 nodes remain, and the corresponding cut is obtained. The run time of one execution is O(n), where n denotes the number of vertices. After m executions of the outer loop, we output the minimum cut among all the results. Figure 2 gives an example of one execution of the algorithm. After execution, we get a cut of size 3.
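The contraction step and the repeated trials can be sketched in Python as follows; the edge-list representation, the union-find helper, and the trial count are illustrative choices rather than part of Karger's original description:

    import random

    def contract_once(edges):
        """One trial: contract random edges until two super-nodes remain.

        Returns the number of edges crossing the resulting cut."""
        parent = {v: v for e in edges for v in e}    # each vertex starts as its own super-node

        def find(v):                                 # union-find representative lookup
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        remaining = len(parent)
        while remaining > 2:
            u, v = random.choice(edges)              # pick a random edge of the multigraph
            ru, rv = find(u), find(v)
            if ru != rv:                             # skip self-loops created by earlier merges
                parent[ru] = rv                      # contract: merge the two super-nodes
                remaining -= 1
        # Edges whose endpoints lie in different super-nodes cross the cut.
        return sum(1 for u, v in edges if find(u) != find(v))

    def karger_min_cut(edges, trials):
        """Repeat the contraction trial and keep the smallest cut found."""
        return min(contract_once(edges) for _ in range(trials))

    # Toy graph on 4 vertices whose minimum cut has size 2.
    edges = [('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
    print(karger_min_cut(edges, trials=20))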
Lemma 1 — Let k be the min cut size, and let C = {e1, e2, ..., ek} be the min cut. If, during iteration i, no edge e ∈ C is selected for contraction, then Ci = C.
If G is not connected, then G can be partitioned into L and R without any edge between them. So the min cut in a disconnected graph is 0. Now, assume G is connected. Let V = L ∪ R be the partition of V induced by C: C = { {u,v} ∈ E : u ∈ L, v ∈ R } (well-defined since G is connected). Consider an edge {u,v} of C. Initially, u and v are distinct vertices. As long as we do not pick an edge of C for contraction, u and v do not get merged. Thus, at the end of the algorithm, we have two compound nodes covering the entire graph, one consisting of the vertices of L and the other consisting of the vertices of R. As in Figure 2, the size of the min cut is 1, and C = {(A,B)}. If we do not select (A,B) for contraction, we obtain the min cut.
Lemma 2 — If G is a multigraph with p vertices and whose min cut has size k, then G has at least pk/2 edges.
Because the min cut is k, every vertex v must satisfy degree(v) ≥ k. Therefore, the sum of the degrees is at least pk. But it is well known that the sum of vertex degrees equals 2|E|. The lemma follows.
The probability that the algorithm succeeds is 1 minus the probability that all attempts fail. By independence, the probability that all attempts fail is Pr[C1 ≠ C] · Pr[C2 ≠ C] ⋯ Pr[Cm ≠ C] = (1 − Pr[C1 = C])(1 − Pr[C2 = C]) ⋯ (1 − Pr[Cm = C]).
By Lemma 1, the probability that Ci = C is the probability that no edge of C is selected during iteration i. Consider the inner loop and let Gj denote the graph after j edge contractions, where j ∈ {0, 1, …, n − 3}. Gj has n − j vertices. We use the chain rule of conditional probabilities. The probability that the edge chosen at iteration j is not in C, given that no edge of C has been chosen before, is 1 − k/|E(Gj)|. Note that Gj still has min cut of size k, so by Lemma 2, it still has at least (n − j)k/2 edges.
Thus, 1 − k/|E(Gj)| ≥ 1 − k/((n − j)k/2) = 1 − 2/(n − j).
So by the chain rule, the probability of finding the min cut C is Pr[Ci = C] ≥ (1 − 2/n)(1 − 2/(n − 1)) ⋯ (1 − 2/3).
Cancellation gives Pr[Ci = C] ≥ 2/(n(n − 1)). Thus the probability that the algorithm succeeds is at least 1 − (1 − 2/(n(n − 1)))^m. For m = (n(n − 1)/2) ln n, this is at least 1 − 1/n, since (1 − x)^m ≤ e^(−xm). The algorithm therefore finds the min cut with probability at least 1 − 1/n, in time O(mn) = O(n³ log n).
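The bound can be checked numerically with a few lines of Python (the value of n is arbitrary):

    import math

    n = 50
    m = math.ceil(n * (n - 1) / 2 * math.log(n))   # number of outer-loop executions
    p_single = 2 / (n * (n - 1))                   # lower bound on Pr[Ci = C]
    p_success = 1 - (1 - p_single) ** m            # lower bound on overall success
    print(p_success, 1 - 1 / n)                    # p_success is at least 1 - 1/n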
Randomness can be viewed as a resource, like space and time. Derandomization is then the process of removing randomness (or using as little of it as possible).[21][22] It is not currently known if all algorithms can be derandomized without significantly increasing their running time.[23] For instance, in computational complexity, it is unknown whether P = BPP,[23] i.e., we do not know whether we can take an arbitrary randomized algorithm that runs in polynomial time with a small error probability and derandomize it to run in polynomial time without using randomness.
There are specific methods that can be employed to derandomize particular randomized algorithms, such as the method of conditional probabilities (and its generalization, pessimistic estimators), discrepancy theory (used to derandomize geometric algorithms), the exploitation of limited independence in the random variables used by the algorithm (such as the pairwise independence used in universal hashing), and the use of expander graphs (or dispersers in general) to amplify a limited amount of initial randomness.
When the model of computation is restricted to Turing machines, it is currently an open question whether the ability to make random choices allows some problems to be solved in polynomial time that cannot be solved in polynomial time without this ability; this is the question of whether P = BPP. However, in other contexts, there are specific examples of problems where randomization yields strict improvements.