Binary search is a blazingly-fast search algorithm that has a worst caseBig O ofΟ(log n)
, the average case isO(log n)
and the best case isO(1)
. In short it's a search algorithm that is about as fast as you can get.
Binary Search utilises divide and conquer to reduce much of the potential runtime. In fact, it is a lot likemerge sort in how it functions but of course one algorithm is used to sort data and this one is used to find data.
For the algorithm to run properly, the data must be pre-sorted and be have elements of the same or similar types held within. So if your collection contains strings and integers, it won't work, you must have elements using a consistent data type withinin your collection.
When the binary search is executed, it will take an inputcollection
and avalue
to find and if the value exists we will return the index of thatvalue
within thecollection
and if it doesn't we will return-1
out of convention to represent that thevalue
doesn't exist in thecollection
after all.
Tests
frommainimportbinarySearch;deftest_strings():assertbinarySearch(["Dave","James"],"Dave")==0;assertbinarySearch(["Dave","James"],"abc")==-1;deftest_integers():assertbinarySearch([1,2,3,4],3)==2;assertbinarySearch([1,2,3,4],5)==-1;deftest_floats():assertbinarySearch([3.14],3.14)==0;assertbinarySearch([3.141],3.14)==-1;deftest_booleans():assertbinarySearch([False,True],True)==1;assertbinarySearch([True],False)==-1;deftest_lists():assertbinarySearch([[3.14]],[3.14])==0;assertbinarySearch([[3.141]],[3.14])==-1;deftest_nested_lists():assertbinarySearch([[3.14,[1]]],[3.14,[1]])==0;assertbinarySearch([[3.141]],[3.14])==-1;deftest_sets():assertbinarySearch([set([1])],set([1]))==0;assertbinarySearch([set([1])],set([1,2]))==-1;deftest_tuples():assertbinarySearch([tuple([0,2]),tuple([1])],tuple([1]))==1;assertbinarySearch([tuple([1])],tuple([1,2]))==-1;
In python, as with many other languages, some elements can't compare with one another in a non-hacky manner, even if they are both the same type. An example of this is that types such asdict
cannot run comparisons like{'a': 1} > {'a': 1}
.
For binary search we do need to usecomparison operators like that though and so I represented only the types that are comparible with the==
,>
,>=
,<
and<=
operators that binary search utilises.
Here is a repl to run and play with the tests:
Implementation
There are a few ways to skin this particular cat. For that reason, I will show you two implementations for the Binary Search algorithm. One will be using animperative approach and another usingRecursion.
Both implementations expect uniform data and are typed for this reason. We expect data to contain consistent types in the input collection so that the search is guaranteed to be able to compare the items in the collection against one another. If we didn't do this we may get unexpected errors such as this:
TypeError: '>' not supported between instances of 'list' and 'str'
Just remember to normalise and sort your data before you begin using the implementations that will be outlined below or if you decide to build your own implementation in the future as these traits are a requirement of the algorithm.
The imperative approach
Imperative programming is a programming paradigm which describeshow a programme should run. This paradigm usesstatements to alter a programmes state and generate values.
fromtypingimportList,TypeVar;fromoperatorimporteqasdeep_equals,gtasgreater_than,neasnot_equal,leasless_than_or_equal;frommathimportfloor;T=TypeVar('T');defbinarySearch(collection:List[T],value:T)->int:start=0;end=len(collection)-1;middle=floor((start+end)/2);while(not_equal(collection[middle],value)andless_than_or_equal(start,end)):ifgreater_than(value,collection[middle]):start=middle+1;else:end=middle-1;middle=floor((start+end)/2);returnmiddleifdeep_equals(collection[middle],value)else-1;
We begin by importing some helpers from thepython standard library which has a whole swath of helper functions for us to use in our code.
A point worth noting is the use of
TypeVar
from thetyping
module to align our input types for thecollection
andvalue
.All this means is that when we recieve the collection, it should have items of consistent type
T
and thevalue
we want to search for should also be of typeT
.So if we have the
collection
of typeList[str]
for example then value must also be anstr
but instead of hard-coding this, we letTypeVar
do the work for us.
OurbinarySearch
function will recieve acollection
to search within and avalue
to look for in thatcollection
. If thevalue
is found then we return it's location (index) in thecollection
.
To find the index we setup thestart
,end
andmiddle
indexes of thecollection
. While thevalue
is not equal to the value in themiddle
of thecollection
and thestart
index is less than or equal to theend
index, we check if thevalue
is bigger than themiddle
value, if it is, we set thestart
index to begin on the next iteration from themiddle
, otherwise we set the end to be themiddle
.
We do this because if the
value
is greater than the value in themiddle
of thecollection
, thevalue
must be on the right half of the collection and vice versa for the opposite case.
Either way the condition falls we will then reset themiddle
index to be in between the newstart
andend
indexes. We continue with this process until the condition on the while loop is no longer true.
When the while loop exits, we check if themiddle
indexes value is thevalue
we are looking for. If it is, we return themiddle
index and if it isn't then thevalue
wasn't in the input collection and we return-1
to represent this case.
The recursive approach
fromtypingimportList,TypeVar,Union;fromoperatorimporteqasdeep_equals,gtasgreater_than,neasnot_equal,leasless_than_or_equal,geasgreater_than_or_equal;frommathimportfloor;T=TypeVar('T');defbinarySearch(collection:List[T],value:T,start:int=0,end:Union[int,None]=None)->int:ifdeep_equals(end,None):end=len(collection)-1;ifgreater_than_or_equal(end,start):middle=floor((start+end)/2);ifdeep_equals(collection[middle],value):returnmiddle;ifgreater_than(collection[middle],value):returnbinarySearch(collection,value,start,middle-1);returnbinarySearch(collection,value,middle+1,end);return-1;
Exactly like the imperative approach outlined above, we begin by importing some helpers from thepython standard library which has a whole swath of helper functions for us to use in our code.
A point worth noting is the use of
TypeVar
from thetyping
module to align our input types for thecollection
andvalue
.All this means is that when we recieve the collection, it should have items of consistent type
T
and thevalue
we want to search for should also be of typeT
.So if we have the
collection
of typeList[str]
for example then value must also be anstr
but instead of hard-coding this, we letTypeVar
do the work for us.
The recursive version naturally calls itself again and again and requires a couple more parameters to be provided on each call than the imperative approach does. Namely we need to know where to thestart
index andend
index are for the current recursive call of thebinarySearch
function.
Initially thestart
will be at0
which makes sense since we want to begin at the start of thecollection
and theend
initially will be set toNone
but will be set to the length of thecollection
mins one once the function first runs.
Note: We set
end
to initialise itself with a value ofNone
because in python you can't reference other parameters to use as default values of other parameters. To get around this, we set it's type toUnion[None, int]
which just means this can be of typeNone
orint
. Once the function first executes we set theend
value to the length of thecollection
mins one because if acollection
contains 5 items the last index would be 4 since lists/arrays are always 0 indexed.
If theend
index is greater than or equal to the starting index we calculate where themiddle
index of the currentcollection
is and consider 3 cases:
- If the item at the
middle
index equals the value we are looking for, return themiddle
index - If the item at the
middle
index is greater than the value we are looking for then run a recursive call tobinarySearch
to check the left half of thecollection
for thevalue
- If the item at the
middle
index is less than the value we are looking for then run a recursive call tobinarySearch
to check the right half of thecollection
for thevalue
If none of these recursive calls manage to find thevalue
in thecollection
then we return-1
by default.
Conclusions
The Binary Search algorithm is always a fantastic choice to find values within sorted and type-consistent data sets.
Personally I try to always keep data within applications as uniform as possible and thus, I find myself using this algorithm relatively often in my day to day work. I highly recommend looking into it further and seeing how it can benefit you and your work too!
I hope you learned something new today and if you have any questions, ideas or comments, feel free to comment them below!
Top comments(0)
For further actions, you may consider blocking this person and/orreporting abuse