Thoughts on computer programming and related topics.
Copyright © 2015–2025 Bernardo Sulzbach
This is mypublic algorithms and data structures repository at GitHub. Feel free toinspect my code and propose, or submit, any changes you want.
A rope (or cord) is a data structure composed of smaller strings that is usedfor efficiently storing and manipulating a very big piece of text. Text editorsmay make use of this data structure so that insertions and deletions performedby the user can be done efficiently.
See more about it onits own page.
Finds the greatest common divisor (GCD) between two integers.
Ngcd(Na,Nb){while(a%=b){swap(a,b);}returnb;}
Finds how many set bits there are in an integer.
intcount_set_bits(unsignedu){intcount=0;while(u){// Will loop until u has no more set bits.u&=u-1;count++;}returncount;}
Subtracting one from an unsigned integer toggles all bits until the rightmostset one (10111000
–00000001
=10110111
).
Therefore, the result of a bitwise and operation betweenn
andn – 1
, isn
with the rightmost set bit unset. For instance,10111000 & 10110111 =1011000
.
Ifn
hasx
set bits, it will takex
operations to maken = 0
.
This algorithm detects whether a singly linked list has a cycle in linear time.It uses a moving rabbit and a stationary, then teleporting (at powers of two), turtle.Both turtle and rabbit start at the first node of the list.Each iteration moves rabbit one node forward.If it is then at the same position as the stationary turtle, there is obviously a loop.If it reaches the end of the list, there is no loop.
Lambda denotes the length of the loop.It keeps track of how many steps the rabbit took since we last teleported the turtle.
Mu denotes the index of the element where the loop starts, to find it, we reset the turtle to the first element of the list and the rabbit to lambda elements after the mu.When the rabbit and the turtle meet, the turtle is at mu.
defbrent(first_node):""" Detects a loop in a linked list in O(n) time and O(1) space complexity. Returns a tuple of the form (mu, la), where mu denotes the index of the list where the loop starts and la is the length of the loop. If this list does not have a loop, this function returns (-1, -1). """power=1# The power of two we are working with.la=1turtle=first_noderabbit=first_node.nextwhilerabbitisnotNoneandrabbit!=turtle:ifla==power:# Get the next power of two if lambda is big enough.la=0power*=2turtle=rabbitrabbit=rabbit.nextla+=1ifrabbit==turtle:# Determine where the loop starts.mu=0turtle=rabbit=first_nodefor_inrange(la):# Let the rabbit advance λ nodes (a full cycle).rabbit=rabbit.nextwhilerabbit!=turtle:# Increment both until they meet.turtle=turtle.nextrabbit=rabbit.nextmu+=1returnmu,laelse:return-1,-1
The longest common subsequence problem is the problem of finding the longestsubsequence common to all sequences of a set of sequences. A subsequence is anyordered subset of a sequence. Therefore, AC is a subsequence of ABC.
This is a classic problem of computer science and is solved to generate usefuldifference lists between files by version control systems.
Here I will present the intuition behind the solution of the algorithm for thecase where the sequence set has only two sequences.
For an arbitrary number of sequences the problem is NP-hard and for twosequences the dynamic programming solution has both time and space complexity ofO(n × m).
The dynamic programming solution can be understood by a recursive definition:
LCS(X[:i], Y[:j]) = 0 if i == 0 or j == 0 LCS(X[:i - 1], Y[:j - 1]) + 1 if X[i] == Y[j] max(LCS(X[:i], Y[:j - 1]), LCS(X[:i - 1], Y[:j])) otherwise
If Xi = Yj, the LCS is the LCS of the prefixes (up to, butnot including, the last elements) plus one (to account for the new commonelement).
Otherwise, do not account for the last element into the LCS and take the longestof the two possible candidates.
The simplest way to implement the dynamic programming solution based on theabovementioned recursive solution uses a matrix where each element correspondsto the longest common subsequence up to that point.
The longest common substring problem is quite similar to the longest commonsubsequence problem. It differs because the definition of a substring isdifferent to that of a subsequence as it does not include non-contiguoussubsequences. Therefore, AC is a subsequence of ABC, but not a substring. AB,for instance, is both a substring and a subsequence of ABC.
The recursive solution is similar to that of the LCS problem. The only change isthat instead of taking the maximum of the two possible candidates when there isa mismatch, one should use zero. This is because there can be no mismatches inthe middle of a substring.
See thenumber theory algorithms page.
Too big to be here. Gotits own page.
This is a must-know shuffling algorithm.It has its own page.
See theinterval manipulation page.
See thesorting page.