Bernardo Sulzbach’s Website

Algorithms

My algorithms repository

This is mypublic algorithms and data structures repository at GitHub. Feel free toinspect my code and propose, or submit, any changes you want.

Rope

A rope (or cord) is a data structure composed of smaller strings that is usedfor efficiently storing and manipulating a very big piece of text. Text editorsmay make use of this data structure so that insertions and deletions performedby the user can be done efficiently.

See more about it onits own page.

Euclid’s algorithm

Finds the greatest common divisor (GCD) between two integers.

Ngcd(Na,Nb){while(a%=b){swap(a,b);}returnb;}

Brian Kernighan’s algorithm

Finds how many set bits there are in an integer.

intcount_set_bits(unsignedu){intcount=0;while(u){// Will loop until u has no more set bits.u&=u-1;count++;}returncount;}

Explanation

Subtracting one from an unsigned integer toggles all bits until the rightmostset one (10111000 –00000001 =10110111).

Therefore, the result of a bitwise and operation betweenn andn – 1, isn with the rightmost set bit unset. For instance,10111000 & 10110111 =1011000.

Ifn hasx set bits, it will takex operations to maken = 0.

Brent’s Algorithm

This algorithm detects whether a singly linked list has a cycle in linear time.It uses a moving rabbit and a stationary, then teleporting (at powers of two), turtle.Both turtle and rabbit start at the first node of the list.Each iteration moves rabbit one node forward.If it is then at the same position as the stationary turtle, there is obviously a loop.If it reaches the end of the list, there is no loop.

Lambda denotes the length of the loop.It keeps track of how many steps the rabbit took since we last teleported the turtle.

Mu denotes the index of the element where the loop starts, to find it, we reset the turtle to the first element of the list and the rabbit to lambda elements after the mu.When the rabbit and the turtle meet, the turtle is at mu.

defbrent(first_node):"""        Detects a loop in a linked list in O(n) time and O(1) space complexity.            Returns a tuple of the form (mu, la), where            mu denotes the index of the list where the loop starts and            la is the length of the loop.            If this list does not have a loop, this function returns (-1, -1).        """power=1# The power of two we are working with.la=1turtle=first_noderabbit=first_node.nextwhilerabbitisnotNoneandrabbit!=turtle:ifla==power:# Get the next power of two if lambda is big enough.la=0power*=2turtle=rabbitrabbit=rabbit.nextla+=1ifrabbit==turtle:# Determine where the loop starts.mu=0turtle=rabbit=first_nodefor_inrange(la):# Let the rabbit advance λ nodes (a full cycle).rabbit=rabbit.nextwhilerabbit!=turtle:# Increment both until they meet.turtle=turtle.nextrabbit=rabbit.nextmu+=1returnmu,laelse:return-1,-1

Longest common subsequence (LCS)

Description

The longest common subsequence problem is the problem of finding the longestsubsequence common to all sequences of a set of sequences. A subsequence is anyordered subset of a sequence. Therefore, AC is a subsequence of ABC.

This is a classic problem of computer science and is solved to generate usefuldifference lists between files by version control systems.

Here I will present the intuition behind the solution of the algorithm for thecase where the sequence set has only two sequences.

Complexity

For an arbitrary number of sequences the problem is NP-hard and for twosequences the dynamic programming solution has both time and space complexity ofO(n × m).

The algorithm

The dynamic programming solution can be understood by a recursive definition:

LCS(X[:i], Y[:j]) =    0                                                   if i == 0 or j == 0    LCS(X[:i - 1], Y[:j - 1]) + 1                       if X[i] == Y[j]    max(LCS(X[:i], Y[:j - 1]), LCS(X[:i - 1], Y[:j]))   otherwise

If X_i = Y_j, the LCS is the LCS of the prefixes (up to, butnot including, the last elements) plus one (to account for the new commonelement).

Otherwise, do not account for the last element into the LCS and take the longestof the two possible candidates.

Implementation

The simplest way to implement the dynamic programming solution based on theabovementioned recursive solution uses a matrix where each element correspondsto the longest common subsequence up to that point.

Longest common substring

The longest common substring problem is quite similar to the longest commonsubsequence problem. It differs because the definition of a substring isdifferent to that of a subsequence as it does not include non-contiguoussubsequences. Therefore, AC is a subsequence of ABC, but not a substring. AB,for instance, is both a substring and a subsequence of ABC.

The recursive solution is similar to that of the LCS problem. The only change isthat instead of taking the maximum of the two possible candidates when there isa mismatch, one should use zero. This is because there can be no mismatches inthe middle of a substring.