bisect — Array bisection algorithm

Source code:Lib/bisect.py


This module provides support for maintaining a list in sorted order withouthaving to sort the list after each insertion. For long lists of items withexpensive comparison operations, this can be an improvement overlinear searches or frequent resorting.

The module is calledbisect because it uses a basic bisectionalgorithm to do its work. Unlike other bisection tools that search for aspecific value, the functions in this module are designed to locate aninsertion point. Accordingly, the functions never call an__eq__()method to determine whether a value has been found. Instead, thefunctions only call the__lt__() method and will return an insertionpoint between values in an array.

The following functions are provided:

bisect.bisect_left(a,x,lo=0,hi=len(a),*,key=None)

Locate the insertion point forx ina to maintain sorted order.The parameterslo andhi may be used to specify a subset of the listwhich should be considered; by default the entire list is used. Ifx isalready present ina, the insertion point will be before (to the left of)any existing entries. The return value is suitable for use as the firstparameter tolist.insert() assuming thata is already sorted.

The returned insertion pointip partitions the arraya into twoslices such thatall(elem<xforelemina[lo:ip]) is true for theleft slice andall(elem>=xforelemina[ip:hi]) is true for theright slice.

key specifies akey function of one argument that is used toextract a comparison key from each element in the array. To supportsearching complex records, the key function is not applied to thex value.

Ifkey isNone, the elements are compared directly andno key function is called.

Changed in version 3.10:Added thekey parameter.

bisect.bisect_right(a,x,lo=0,hi=len(a),*,key=None)
bisect.bisect(a,x,lo=0,hi=len(a),*,key=None)

Similar tobisect_left(), but returns an insertion point which comesafter (to the right of) any existing entries ofx ina.

The returned insertion pointip partitions the arraya into two slicessuch thatall(elem<=xforelemina[lo:ip]) is true for the left slice andall(elem>xforelemina[ip:hi]) is true for the right slice.

Changed in version 3.10:Added thekey parameter.

bisect.insort_left(a,x,lo=0,hi=len(a),*,key=None)

Insertx ina in sorted order.

This function first runsbisect_left() to locate an insertion point.Next, it runs theinsert() method ona to insertx at theappropriate position to maintain sort order.

To support inserting records in a table, thekey function (if any) isapplied tox for the search step but not for the insertion step.

Keep in mind that theO(logn) search is dominated by the slowO(n)insertion step.

Changed in version 3.10:Added thekey parameter.

bisect.insort_right(a,x,lo=0,hi=len(a),*,key=None)
bisect.insort(a,x,lo=0,hi=len(a),*,key=None)

Similar toinsort_left(), but insertingx ina after any existingentries ofx.

This function first runsbisect_right() to locate an insertion point.Next, it runs theinsert() method ona to insertx at theappropriate position to maintain sort order.

To support inserting records in a table, thekey function (if any) isapplied tox for the search step but not for the insertion step.

Keep in mind that theO(logn) search is dominated by the slowO(n)insertion step.

Changed in version 3.10:Added thekey parameter.

Performance Notes

When writing time sensitive code usingbisect() andinsort(), keep thesethoughts in mind:

  • Bisection is effective for searching ranges of values.For locating specific values, dictionaries are more performant.

  • Theinsort() functions areO(n) because the logarithmic search stepis dominated by the linear time insertion step.

  • The search functions are stateless and discard key function results afterthey are used. Consequently, if the search functions are used in a loop,the key function may be called again and again on the same array elements.If the key function isn’t fast, consider wrapping it withfunctools.cache() to avoid duplicate computations. Alternatively,consider searching an array of precomputed keys to locate the insertionpoint (as shown in the examples section below).

See also

  • Sorted Collections is a high performancemodule that usesbisect to managed sorted collections of data.

  • TheSortedCollection recipe usesbisect to build a full-featured collection class with straight-forward searchmethods and support for a key-function. The keys are precomputed to saveunnecessary calls to the key function during searches.

Searching Sorted Lists

The abovebisect functions are useful for finding insertion points butcan be tricky or awkward to use for common searching tasks. The following fivefunctions show how to transform them into the standard lookups for sortedlists:

defindex(a,x):'Locate the leftmost value exactly equal to x'i=bisect_left(a,x)ifi!=len(a)anda[i]==x:returniraiseValueErrordeffind_lt(a,x):'Find rightmost value less than x'i=bisect_left(a,x)ifi:returna[i-1]raiseValueErrordeffind_le(a,x):'Find rightmost value less than or equal to x'i=bisect_right(a,x)ifi:returna[i-1]raiseValueErrordeffind_gt(a,x):'Find leftmost value greater than x'i=bisect_right(a,x)ifi!=len(a):returna[i]raiseValueErrordeffind_ge(a,x):'Find leftmost item greater than or equal to x'i=bisect_left(a,x)ifi!=len(a):returna[i]raiseValueError

Examples

Thebisect() function can be useful for numeric table lookups. Thisexample usesbisect() to look up a letter grade for an exam score (say)based on a set of ordered numeric breakpoints: 90 and up is an ‘A’, 80 to 89 isa ‘B’, and so on:

>>>defgrade(score,breakpoints=[60,70,80,90],grades='FDCBA'):...i=bisect(breakpoints,score)...returngrades[i]...>>>[grade(score)forscorein[33,99,77,70,89,90,100]]['F', 'A', 'C', 'C', 'B', 'A', 'A']

Thebisect() andinsort() functions also work withlists of tuples. Thekey argument can serve to extract the field used for orderingrecords in a table:

>>>fromcollectionsimportnamedtuple>>>fromoperatorimportattrgetter>>>frombisectimportbisect,insort>>>frompprintimportpprint>>>Movie=namedtuple('Movie',('name','released','director'))>>>movies=[...Movie('Jaws',1975,'Spielberg'),...Movie('Titanic',1997,'Cameron'),...Movie('The Birds',1963,'Hitchcock'),...Movie('Aliens',1986,'Cameron')...]>>># Find the first movie released after 1960>>>by_year=attrgetter('released')>>>movies.sort(key=by_year)>>>movies[bisect(movies,1960,key=by_year)]Movie(name='The Birds', released=1963, director='Hitchcock')>>># Insert a movie while maintaining sort order>>>romance=Movie('Love Story',1970,'Hiller')>>>insort(movies,romance,key=by_year)>>>pprint(movies)[Movie(name='The Birds', released=1963, director='Hitchcock'), Movie(name='Love Story', released=1970, director='Hiller'), Movie(name='Jaws', released=1975, director='Spielberg'), Movie(name='Aliens', released=1986, director='Cameron'), Movie(name='Titanic', released=1997, director='Cameron')]

If the key function is expensive, it is possible to avoid repeated functioncalls by searching a list of precomputed keys to find the index of a record:

>>>data=[('red',5),('blue',1),('yellow',8),('black',0)]>>>data.sort(key=lambdar:r[1])# Or use operator.itemgetter(1).>>>keys=[r[1]forrindata]# Precompute a list of keys.>>>data[bisect_left(keys,0)]('black', 0)>>>data[bisect_left(keys,1)]('blue', 1)>>>data[bisect_left(keys,5)]('red', 5)>>>data[bisect_left(keys,8)]('yellow', 8)