More routines for operating on iterables, beyond itertools
These tools yield groups of items from a source iterable.
New itertools
Breakiterable into lists of lengthn:
>>>list(chunked([1,2,3,4,5,6],3))[[1, 2, 3], [4, 5, 6]]
By the default, the last yielded list will have fewer thann elementsif the length ofiterable is not divisible byn:
>>>list(chunked([1,2,3,4,5,6,7,8],3))[[1, 2, 3], [4, 5, 6], [7, 8]]
To use a fill-in value instead, see thegrouper() recipe.
If the length ofiterable is not divisible byn andstrict isTrue, thenValueError will be raised before the lastlist is yielded.
Breakiterable into sub-iterables withn elements each.ichunked() is likechunked(), but it yields iterablesinstead of lists.
If the sub-iterables are read in order, the elements ofiterablewon’t be stored in memory.If they are read out of order,itertools.tee() is used to cacheelements as necessary.
>>>fromitertoolsimportcount>>>all_chunks=ichunked(count(),4)>>>c_1,c_2,c_3=next(all_chunks),next(all_chunks),next(all_chunks)>>>list(c_2)# c_1's elements have been cached; c_3's haven't been[4, 5, 6, 7]>>>list(c_1)[0, 1, 2, 3]>>>list(c_3)[8, 9, 10, 11]
Breakiterable into lists of approximately lengthn.Items are distributed such the lengths of the lists differ by at most1 item.
>>>iterable=[1,2,3,4,5,6,7]>>>n=3>>>list(chunked_even(iterable,n))# List lengths: 3, 2, 2[[1, 2, 3], [4, 5], [6, 7]]>>>list(chunked(iterable,n))# List lengths: 3, 3, 1[[1, 2, 3], [4, 5, 6], [7]]
Yield slices of lengthn from the sequenceseq.
>>>list(sliced((1,2,3,4,5,6),3))[(1, 2, 3), (4, 5, 6)]
By the default, the last yielded slice will have fewer thann elementsif the length ofseq is not divisible byn:
>>>list(sliced((1,2,3,4,5,6,7,8),3))[(1, 2, 3), (4, 5, 6), (7, 8)]
If the length ofseq is not divisible byn andstrict isTrue, thenValueError will be raised before the lastslice is yielded.
This function will only work for iterables that support slicing.For non-sliceable iterables, seechunked().
Yield batches of items fromiterable with a combined size limited bymax_size.
>>>iterable=[b'12345',b'123',b'12345678',b'1',b'1',b'12',b'1']>>>list(constrained_batches(iterable,10))[(b'12345', b'123'), (b'12345678', b'1', b'1'), (b'12', b'1')]
If amax_count is supplied, the number of items per batch is alsolimited:
>>>iterable=[b'12345',b'123',b'12345678',b'1',b'1',b'12',b'1']>>>list(constrained_batches(iterable,10,max_count=2))[(b'12345', b'123'), (b'12345678', b'1'), (b'1', b'12'), (b'1',)]
If aget_len function is supplied, use that instead oflen() todetermine item size.
Ifstrict isTrue, raiseValueError if any single item is biggerthanmax_size. Otherwise, allow single items to exceedmax_size.
Distribute the items fromiterable amongn smaller iterables.
>>>group_1,group_2=distribute(2,[1,2,3,4,5,6])>>>list(group_1)[1, 3, 5]>>>list(group_2)[2, 4, 6]
If the length ofiterable is not evenly divisible byn, then thelength of the returned iterables will not be identical:
>>>children=distribute(3,[1,2,3,4,5,6,7])>>>[list(c)forcinchildren][[1, 4, 7], [2, 5], [3, 6]]
If the length ofiterable is smaller thann, then the last returnediterables will be empty:
>>>children=distribute(5,[1,2,3])>>>[list(c)forcinchildren][[1], [2], [3], [], []]
This function usesitertools.tee() and may require significantstorage.
If you need the order items in the smaller iterables to match theoriginal iterable, seedivide().
Divide the elements fromiterable inton parts, maintainingorder.
>>>group_1,group_2=divide(2,[1,2,3,4,5,6])>>>list(group_1)[1, 2, 3]>>>list(group_2)[4, 5, 6]
If the length ofiterable is not evenly divisible byn, then thelength of the returned iterables will not be identical:
>>>children=divide(3,[1,2,3,4,5,6,7])>>>[list(c)forcinchildren][[1, 2, 3], [4, 5], [6, 7]]
If the length of the iterable is smaller than n, then the last returnediterables will be empty:
>>>children=divide(5,[1,2,3])>>>[list(c)forcinchildren][[1], [2], [3], [], []]
This function will exhaust the iterable before returning.If order is not important, seedistribute(), which does not firstpull the iterable into memory.
Yield lists of items fromiterable, where each list is delimited byan item where callablepred returnsTrue.
>>>list(split_at('abcdcba',lambdax:x=='b'))[['a'], ['c', 'd', 'c'], ['a']]
>>>list(split_at(range(10),lambdan:n%2==1))[[0], [2], [4], [6], [8], []]
At mostmaxsplit splits are done. Ifmaxsplit is not specified or -1,then there is no limit on the number of splits:
>>>list(split_at(range(10),lambdan:n%2==1,maxsplit=2))[[0], [2], [4, 5, 6, 7, 8, 9]]
By default, the delimiting items are not included in the output.To include them, setkeep_separator toTrue.
>>>list(split_at('abcdcba',lambdax:x=='b',keep_separator=True))[['a'], ['b'], ['c', 'd', 'c'], ['b'], ['a']]
Yield lists of items fromiterable, where each list ends just beforean item for which callablepred returnsTrue:
>>>list(split_before('OneTwo',lambdas:s.isupper()))[['O', 'n', 'e'], ['T', 'w', 'o']]
>>>list(split_before(range(10),lambdan:n%3==0))[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
At mostmaxsplit splits are done. Ifmaxsplit is not specified or -1,then there is no limit on the number of splits:
>>>list(split_before(range(10),lambdan:n%3==0,maxsplit=2))[[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
Yield lists of items fromiterable, where each list ends with anitem where callablepred returnsTrue:
>>>list(split_after('one1two2',lambdas:s.isdigit()))[['o', 'n', 'e', '1'], ['t', 'w', 'o', '2']]
>>>list(split_after(range(10),lambdan:n%3==0))[[0], [1, 2, 3], [4, 5, 6], [7, 8, 9]]
At mostmaxsplit splits are done. Ifmaxsplit is not specified or -1,then there is no limit on the number of splits:
>>>list(split_after(range(10),lambdan:n%3==0,maxsplit=2))[[0], [1, 2, 3], [4, 5, 6, 7, 8, 9]]
Yield a list of sequential items fromiterable of length ‘n’ for eachinteger ‘n’ insizes.
>>>list(split_into([1,2,3,4,5,6],[1,2,3]))[[1], [2, 3], [4, 5, 6]]
If the sum ofsizes is smaller than the length ofiterable, then theremaining items ofiterable will not be returned.
>>>list(split_into([1,2,3,4,5,6],[2,3]))[[1, 2], [3, 4, 5]]
If the sum ofsizes is larger than the length ofiterable, fewer itemswill be returned in the iteration that overruns theiterable and furtherlists will be empty:
>>>list(split_into([1,2,3,4],[1,2,3,4]))[[1], [2, 3], [4], []]
When aNone object is encountered insizes, the returned list willcontain items up to the end ofiterable the same way thatitertools.slice() does:
>>>list(split_into([1,2,3,4,5,6,7,8,9,0],[2,3,None]))[[1, 2], [3, 4, 5], [6, 7, 8, 9, 0]]
split_into() can be useful for grouping a series of items where thesizes of the groups are not uniform. An example would be where in a rowfrom a table, multiple columns represent elements of the same feature(e.g. a point represented by x,y,z) but, the format is not the same forall columns.
Splititerable into pieces based on the output ofpred.pred should be a function that takes successive pairs of items andreturnsTrue if the iterable should be split in between them.
For example, to find runs of increasing numbers, split the iterable whenelementi is larger than elementi+1:
>>>list(split_when([1,2,3,3,2,5,2,4,2],lambdax,y:x>y))[[1, 2, 3, 3], [2, 5], [2, 4], [2]]
At mostmaxsplit splits are done. Ifmaxsplit is not specified or -1,then there is no limit on the number of splits:
>>>list(split_when([1,2,3,3,2,5,2,4,2],...lambdax,y:x>y,maxsplit=2))[[1, 2, 3, 3], [2, 5], [2, 4, 2]]
Wrapiterable and return an object that buckets the iterable intochild iterables based on akey function.
>>>iterable=['a1','b1','c1','a2','b2','c2','b3']>>>s=bucket(iterable,key=lambdax:x[0])# Bucket by 1st character>>>sorted(list(s))# Get the keys['a', 'b', 'c']>>>a_iterable=s['a']>>>next(a_iterable)'a1'>>>next(a_iterable)'a2'>>>list(s['b'])['b1', 'b2', 'b3']
The original iterable will be advanced and its items will be cached untilthey are used by the child iterables. This may require significant storage.
By default, attempting to select a bucket to which no items belong willexhaust the iterable and cache all values.If you specify avalidator function, selected buckets will instead bechecked against it.
>>>fromitertoolsimportcount>>>it=count(1,2)# Infinite sequence of odd numbers>>>key=lambdax:x%10# Bucket by last digit>>>validator=lambdax:xin{1,3,5,7,9}# Odd digits only>>>s=bucket(it,key=key,validator=validator)>>>2insFalse>>>list(s[2])[]
The inverse ofzip(), this function disaggregates the elementsof the zippediterable.
Thei-th iterable contains thei-th element from each elementof the zipped iterable. The first element is used to determine thelength of the remaining elements.
>>>iterable=[('a',1),('b',2),('c',3),('d',4)]>>>letters,numbers=unzip(iterable)>>>list(letters)['a', 'b', 'c', 'd']>>>list(numbers)[1, 2, 3, 4]
This is similar to usingzip(*iterable), but it avoids readingiterable into memory. Note, however, that this function usesitertools.tee() and thus may require significant storage.
Itertools recipes
Batch data into tuples of lengthn. If the number of items initerable is not divisible byn:* The last batch will be shorter ifstrict isFalse.*ValueError will be raised ifstrict isTrue.
>>>list(batched('ABCDEFG',3))[('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
On Python 3.13 and above, this is an alias foritertools.batched().
Group elements fromiterable into fixed-length groups of lengthn.
>>>list(grouper('ABCDEF',3))[('A', 'B', 'C'), ('D', 'E', 'F')]
The keyword argumentsincomplete andfillvalue control what happens foriterables whose length is not a multiple ofn.
Whenincomplete is‘fill’, the last group will contain instances offillvalue.
>>>list(grouper('ABCDEFG',3,incomplete='fill',fillvalue='x'))[('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
Whenincomplete is‘ignore’, the last group will not be emitted.
>>>list(grouper('ABCDEFG',3,incomplete='ignore',fillvalue='x'))[('A', 'B', 'C'), ('D', 'E', 'F')]
Whenincomplete is‘strict’, a subclass ofValueError will be raised.
>>>iterator=grouper('ABCDEFG',3,incomplete='strict')>>>list(iterator)Traceback (most recent call last):...UnequalIterablesError
Returns a 2-tuple of iterables derived from the input iterable.The first yields the items that havepred(item)==False.The second yields the items that havepred(item)==True.
>>>is_odd=lambdax:x%2!=0>>>iterable=range(10)>>>even_items,odd_items=partition(is_odd,iterable)>>>list(even_items),list(odd_items)([0, 2, 4, 6, 8], [1, 3, 5, 7, 9])
Ifpred is None,bool() is used.
>>>iterable=[0,1,False,True,'',' ']>>>false_items,true_items=partition(None,iterable)>>>list(false_items),list(true_items)([0, False, ''], [1, True, ' '])
These tools peek at an iterable’s values without advancing it.
New itertools
Return a 2-tuple with a list containing the firstn elements ofiterable, and an iterator with the same items asiterable.This allows you to “look ahead” at the items in the iterable withoutadvancing it.
There is one item in the list by default:
>>>iterable='abcdefg'>>>head,iterable=spy(iterable)>>>head['a']>>>list(iterable)['a', 'b', 'c', 'd', 'e', 'f', 'g']
You may use unpacking to retrieve items instead of lists:
>>>(head,),iterable=spy('abcdefg')>>>head'a'>>>(first,second),iterable=spy('abcdefg',2)>>>first'a'>>>second'b'
The number of items requested can be larger than the number of items inthe iterable:
>>>iterable=[1,2,3,4,5]>>>head,iterable=spy(iterable,10)>>>head[1, 2, 3, 4, 5]>>>list(iterable)[1, 2, 3, 4, 5]
Wrap an iterator to allow lookahead and prepending elements.
Callpeek() on the result to get the value that will be returnedbynext(). This won’t advance the iterator:
>>>p=peekable(['a','b'])>>>p.peek()'a'>>>next(p)'a'
Passpeek() a default value to return that instead of raisingStopIteration when the iterator is exhausted.
>>>p=peekable([])>>>p.peek('hi')'hi'
peekables also offer aprepend() method, which “inserts” itemsat the head of the iterable:
>>>p=peekable([1,2,3])>>>p.prepend(10,11,12)>>>next(p)10>>>p.peek()11>>>list(p)[11, 12, 1, 2, 3]
peekables can be indexed. Index 0 is the item that will be returned bynext(), index 1 is the item after that, and so on:The values up to the given index will be cached.
>>>p=peekable(['a','b','c','d'])>>>p[0]'a'>>>p[1]'b'>>>next(p)'a'
Negative indexes are supported, but be aware that they will cache theremaining items in the source iterator, which may require significantstorage.
To check whether a peekable is exhausted, check its truth value:
>>>p=peekable(['a','b'])>>>ifp:# peekable has items...list(p)['a', 'b']>>>ifnotp:# peekable is exhausted...list(p)[]
Wrap an iterator to allow for seeking backward and forward. Thisprogressively caches the items in the source iterable so they can bere-visited.
Callseek() with an index to seek to that position in the sourceiterable.
To “reset” an iterator, seek to0:
>>>fromitertoolsimportcount>>>it=seekable((str(n)fornincount()))>>>next(it),next(it),next(it)('0', '1', '2')>>>it.seek(0)>>>next(it),next(it),next(it)('0', '1', '2')
You can also seek forward:
>>>it=seekable((str(n)forninrange(20)))>>>it.seek(10)>>>next(it)'10'>>>it.seek(20)# Seeking past the end of the source isn't a problem>>>list(it)[]>>>it.seek(0)# Resetting works even after hitting the end>>>next(it)'0'
Callrelative_seek() to seek relative to the source iterator’scurrent position.
>>>it=seekable((str(n)forninrange(20)))>>>next(it),next(it),next(it)('0', '1', '2')>>>it.relative_seek(2)>>>next(it)'5'>>>it.relative_seek(-3)# Source is at '6', we move back to '3'>>>next(it)'3'>>>it.relative_seek(-3)# Source is at '4', we move back to '1'>>>next(it)'1'
Callpeek() to look ahead one item without advancing the iterator:
>>>it=seekable('1234')>>>it.peek()'1'>>>list(it)['1', '2', '3', '4']>>>it.peek(default='empty')'empty'
Before the iterator is at its end, callingbool() on it will returnTrue. After it will returnFalse:
>>>it=seekable('5678')>>>bool(it)True>>>list(it)['5', '6', '7', '8']>>>bool(it)False
You may view the contents of the cache with theelements() method.That returns aSequenceView, a view that updates automatically:
>>>it=seekable((str(n)forninrange(10)))>>>next(it),next(it),next(it)('0', '1', '2')>>>elements=it.elements()>>>elementsSequenceView(['0', '1', '2'])>>>next(it)'3'>>>elementsSequenceView(['0', '1', '2', '3'])
By default, the cache grows as the source iterable progresses, so beware ofwrapping very large or infinite iterables. Supplymaxlen to limit thesize of the cache (this of course limits how far back you can seek).
>>>fromitertoolsimportcount>>>it=seekable((str(n)fornincount()),maxlen=2)>>>next(it),next(it),next(it),next(it)('0', '1', '2', '3')>>>list(it.elements())['2', '3']>>>it.seek(0)>>>next(it),next(it),next(it),next(it)('2', '3', '4', '5')>>>next(it)'6'
These tools yield windows of items from an iterable.
New itertools
Return a sliding window of widthn over the given iterable.
>>>all_windows=windowed([1,2,3,4,5],3)>>>list(all_windows)[(1, 2, 3), (2, 3, 4), (3, 4, 5)]
When the window is larger than the iterable,fillvalue is used in placeof missing values:
>>>list(windowed([1,2,3],4))[(1, 2, 3, None)]
Each window will advance in increments ofstep:
>>>list(windowed([1,2,3,4,5,6],3,fillvalue='!',step=2))[(1, 2, 3), (3, 4, 5), (5, 6, '!')]
To slide into the iterable’s items, usechain() to add filler itemsto the left:
>>>iterable=[1,2,3,4]>>>n=3>>>padding=[None]*(n-1)>>>list(windowed(chain(padding,iterable),3))[(None, None, 1), (None, 1, 2), (1, 2, 3), (2, 3, 4)]
Yield all of the substrings ofiterable.
>>>[''.join(s)forsinsubstrings('more')]['m', 'o', 'r', 'e', 'mo', 'or', 're', 'mor', 'ore', 'more']
Note that non-string iterables can also be subdivided.
>>>list(substrings([0,1,2]))[(0,), (1,), (2,), (0, 1), (1, 2), (0, 1, 2)]
Yield all substrings and their positions inseq
The items yielded will be a tuple of the form(substr,i,j), wheresubstr==seq[i:j].
This function only works for iterables that support slicing, such asstr objects.
>>>foriteminsubstrings_indexes('more'):...print(item)('m', 0, 1)('o', 1, 2)('r', 2, 3)('e', 3, 4)('mo', 0, 2)('or', 1, 3)('re', 2, 4)('mor', 0, 3)('ore', 1, 4)('more', 0, 4)
Setreverse toTrue to yield the same items in the opposite order.
Yield tuples whose elements are offset fromiterable.The amount by which thei-th item in each tuple is offset is given bythei-th item inoffsets.
>>>list(stagger([0,1,2,3]))[(None, 0, 1), (0, 1, 2), (1, 2, 3)]>>>list(stagger(range(8),offsets=(0,2,4)))[(0, 2, 4), (1, 3, 5), (2, 4, 6), (3, 5, 7)]
By default, the sequence will end when the final element of a tuple is thelast item in the iterable. To continue until the first element of a tupleis the last item in the iterable, setlongest toTrue:
>>>list(stagger([0,1,2,3],longest=True))[(None, 0, 1), (0, 1, 2), (1, 2, 3), (2, 3, None), (3, None, None)]
By default,None will be used to replace offsets beyond the end of thesequence. Specifyfillvalue to use some other value.
Yield(beginning,middle,end) tuples, where:
Eachmiddle hasn items fromiterable
Eachbeginning has the items before the ones inmiddle
Eachend has the items after the ones inmiddle
>>>iterable=range(7)>>>n=3>>>forbeginning,middle,endinwindowed_complete(iterable,n):...print(beginning,middle,end)() (0, 1, 2) (3, 4, 5, 6)(0,) (1, 2, 3) (4, 5, 6)(0, 1) (2, 3, 4) (5, 6)(0, 1, 2) (3, 4, 5) (6,)(0, 1, 2, 3) (4, 5, 6) ()
Note thatn must be at least 0 and most equal to the length ofiterable.
This function will exhaust the iterable and may require significantstorage.
Itertools recipes
Returns an iterator of paired items, overlapping, from the original
>>>take(4,pairwise(count()))[(0, 1), (1, 2), (2, 3), (3, 4)]
On Python 3.10 and above, this is an alias foritertools.pairwise().
Return overlapping triplets fromiterable.
>>>list(triplewise('ABCDE'))[('A', 'B', 'C'), ('B', 'C', 'D'), ('C', 'D', 'E')]
Return a sliding window of widthn overiterable.
>>>list(sliding_window(range(6),4))[(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)]
Ifiterable has fewer thann items, then nothing is yielded:
>>>list(sliding_window(range(3),4))[]
For a variant with more features, seewindowed().
Return all contiguous non-empty subslices ofiterable.
>>>list(subslices('ABC'))[['A'], ['A', 'B'], ['A', 'B', 'C'], ['B'], ['B', 'C'], ['C']]
This is similar tosubstrings(), but emits items in a differentorder.
These tools yield items from an iterable, plus additional data.
New itertools
Cycle through the items fromiterable up ton times, yieldingthe number of completed cycles along with each item. Ifn is omitted theprocess repeats indefinitely.
>>>list(count_cycle('AB',3))[(0, 'A'), (0, 'B'), (1, 'A'), (1, 'B'), (2, 'A'), (2, 'B')]
Intersperse filler elemente among the items initerable, leavingn items between each filler element.
>>>list(intersperse('!',[1,2,3,4,5]))[1, '!', 2, '!', 3, '!', 4, '!', 5]
>>>list(intersperse(None,[1,2,3,4,5],n=2))[1, 2, None, 3, 4, None, 5]
Yield the elements fromiterable, followed byfillvalue, such thatat leastn items are emitted.
>>>list(padded([1,2,3],'?',5))[1, 2, 3, '?', '?']
Ifnext_multiple isTrue,fillvalue will be emitted until thenumber of items emitted is a multiple ofn:
>>>list(padded([1,2,3,4],n=3,next_multiple=True))[1, 2, 3, 4, None, None]
Ifn isNone,fillvalue will be emitted indefinitely.
To create aniterable of exactly sizen, you can truncate withislice().
>>>list(islice(padded([1,2,3],'?'),5))[1, 2, 3, '?', '?']>>>list(islice(padded([1,2,3,4,5,6,7,8],'?'),5))[1, 2, 3, 4, 5]
Yield 3-tuples of the form(is_first,is_last,item).
>>>list(mark_ends('ABC'))[(True, False, 'A'), (False, False, 'B'), (False, True, 'C')]
Use this when looping over an iterable to take special action on its firstand/or last items:
>>>iterable=['Header',100,200,'Footer']>>>total=0>>>foris_first,is_last,iteminmark_ends(iterable):...ifis_first:...continue# Skip the header...ifis_last:...continue# Skip the footer...total+=item>>>print(total)300
Repeat each element initerablen times.
>>>list(repeat_each('ABC',3))['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C']
After theiterable is exhausted, keep yielding its last element.
>>>list(islice(repeat_last(range(3)),5))[0, 1, 2, 2, 2]
If the iterable is empty, yielddefault forever:
>>>list(islice(repeat_last(range(0),42),5))[42, 42, 42, 42, 42]
Return an iterable over(bool, item) tuples where theitem isdrawn fromiterable and thebool indicates whetherthat item satisfies thepredicate or is adjacent to an item that does.
For example, to find whether items are adjacent to a3:
>>>list(adjacent(lambdax:x==3,range(6)))[(False, 0), (False, 1), (True, 2), (True, 3), (True, 4), (False, 5)]
Setdistance to change what counts as adjacent. For example, to findwhether items are two places away from a3:
>>>list(adjacent(lambdax:x==3,range(6),distance=2))[(False, 0), (True, 1), (True, 2), (True, 3), (True, 4), (True, 5)]
This is useful for contextualizing the results of a search function.For example, a code comparison tool might want to identify lines thathave changed, but also surrounding lines to give the viewer of the diffcontext.
The predicate function will only be called once for each item in theiterable.
See alsogroupby_transform(), which can be used with this functionto group ranges of items with the samebool value.
An extension ofitertools.groupby() that can apply transformationsto the grouped data.
keyfunc is a function computing a key value for each item initerable
valuefunc is a function that transforms the individual items fromiterable after grouping
reducefunc is a function that transforms each group of items
>>>iterable='aAAbBBcCC'>>>keyfunc=lambdak:k.upper()>>>valuefunc=lambdav:v.lower()>>>reducefunc=lambdag:''.join(g)>>>list(groupby_transform(iterable,keyfunc,valuefunc,reducefunc))[('A', 'aaa'), ('B', 'bbb'), ('C', 'ccc')]
Each optional argument defaults to an identity function if not specified.
groupby_transform() is useful when grouping elements of an iterableusing a separate iterable as the key. To do this,zip() the iterablesand pass akeyfunc that extracts the first element and avaluefuncthat extracts the second element:
>>>fromoperatorimportitemgetter>>>keys=[0,0,1,1,1,2,2,2,3]>>>values='abcdefghi'>>>iterable=zip(keys,values)>>>grouper=groupby_transform(iterable,itemgetter(0),itemgetter(1))>>>[(k,''.join(g))fork,gingrouper][(0, 'ab'), (1, 'cde'), (2, 'fgh'), (3, 'i')]
Note that the order of items in the iterable is significant.Only adjacent items are grouped together, so if you don’t want anyduplicate groups, you should sort the iterable by the key function.
Itertools recipes
These tools combine multiple iterables.
New itertools
Flatten an iterable with multiple levels of nesting (e.g., a list oflists of tuples) into non-iterable types.
>>>iterable=[(1,2),([3,4],[[5],[6]])]>>>list(collapse(iterable))[1, 2, 3, 4, 5, 6]
Binary and text strings are not considered iterable andwill not be collapsed.
To avoid collapsing other types, specifybase_type:
>>>iterable=['ab',('cd','ef'),['gh','ij']]>>>list(collapse(iterable,base_type=tuple))['ab', ('cd', 'ef'), 'gh', 'ij']
Specifylevels to stop flattening after a certain level:
>>>iterable=[('a',['b']),('c',['d'])]>>>list(collapse(iterable))# Fully flattened['a', 'b', 'c', 'd']>>>list(collapse(iterable,levels=1))# Only one level flattened['a', ['b'], 'c', ['d']]
Return a new iterable yielding from each iterable in turn,until the shortest is exhausted.
>>>list(interleave([1,2,3],[4,5],[6,7,8]))[1, 4, 6, 2, 5, 7]
For a version that doesn’t terminate after the shortest iterable isexhausted, seeinterleave_longest().
Return a new iterable yielding from each iterable in turn,skipping any that are exhausted.
>>>list(interleave_longest([1,2,3],[4,5],[6,7,8]))[1, 4, 6, 2, 5, 7, 3, 8]
This function produces the same output asroundrobin(), but mayperform better for some inputs (in particular when the number of iterablesis large).
Interleave multiple iterables so that their elements are evenly distributedthroughout the output sequence.
>>>iterables=[1,2,3,4,5],['a','b']>>>list(interleave_evenly(iterables))[1, 2, 'a', 3, 4, 'b', 5]
>>>iterables=[[1,2,3],[4,5],[6,7,8]]>>>list(interleave_evenly(iterables))[1, 6, 4, 2, 7, 3, 8, 5]
This function requires iterables of known length. Iterables without__len__() can be used by manually specifying lengths withlengths:
>>>fromitertoolsimportcombinations,repeat>>>iterables=[combinations(range(4),2),['a','b','c']]>>>lengths=[4*(4-1)//2,3]>>>list(interleave_evenly(iterables,lengths=lengths))[(0, 1), (0, 2), 'a', (0, 3), (1, 2), 'b', (1, 3), (2, 3), 'c']
Based on Bresenham’s algorithm.
Repeatedly select one of the inputiterables at random and yield the nextitem from it.
>>>iterables=[1,2,3],'abc',(True,False,None)>>>list(interleave_randomly(*iterables))['a', 'b', 1, 'c', True, False, None, 2, 3]
The relative order of the items in each input iterable will preserved. Note thesequences of items with this property are not equally likely to be generated.
Yields tuples containing one item from each iterator, with subsequenttuples changing a single item at a time by advancing each iterator until itis exhausted. This sequence guarantees every value in each iterable isoutput at least once without generating all possible combinations.
This may be useful, for example, when testing an expensive function.
>>>list(partial_product('AB','C','DEF'))[('A', 'C', 'D'), ('B', 'C', 'D'), ('B', 'C', 'E'), ('B', 'C', 'F')]
Return the input iterables sorted together, withkey_list as thepriority for sorting. All iterables are trimmed to the length of theshortest one.
This can be used like the sorting function in a spreadsheet. If eachiterable represents a column of data, the key list determines whichcolumns are used for sorting.
By default, all iterables are sorted using the0-th iterable:
>>>iterables=[(4,3,2,1),('a','b','c','d')]>>>sort_together(iterables)[(1, 2, 3, 4), ('d', 'c', 'b', 'a')]
Set a different key list to sort according to another iterable.Specifying multiple keys dictates how ties are broken:
>>>iterables=[(3,1,2),(0,1,0),('c','b','a')]>>>sort_together(iterables,key_list=(1,2))[(2, 3, 1), (0, 0, 1), ('a', 'c', 'b')]
To sort by a function of the elements of the iterable, pass akeyfunction. Its arguments are the elements of the iterables corresponding tothe key list:
>>>names=('a','b','c')>>>lengths=(1,2,3)>>>widths=(5,2,1)>>>defarea(length,width):...returnlength*width>>>sort_together([names,lengths,widths],key_list=(1,2),key=area)[('c', 'b', 'a'), (3, 2, 1), (1, 2, 5)]
Setreverse toTrue to sort in descending order.
>>>sort_together([(1,2,3),('c','b','a')],reverse=True)[(3, 2, 1), ('a', 'b', 'c')]
If thestrict keyword argument isTrue, thenUnequalIterablesError will be raised if any of the iterables havedifferent lengths.
Yield all arguments passed to the function in the same order in whichthey were passed. If an argument itself is iterable then iterate over itsvalues.
>>>list(value_chain(1,2,3,[4,5,6]))[1, 2, 3, 4, 5, 6]
Binary and text strings are not considered iterable and are emittedas-is:
>>>list(value_chain('12','34',['56','78']))['12', '34', '56', '78']
Pre- or postpend a single element to an iterable:
>>>list(value_chain(1,[2,3,4,5,6]))[1, 2, 3, 4, 5, 6]>>>list(value_chain([1,2,3,4,5],6))[1, 2, 3, 4, 5, 6]
Multiple levels of nesting are not flattened.
zip the inputiterables together, but offset thei-th iterableby thei-th item inoffsets.
>>>list(zip_offset('0123','abcdef',offsets=(0,1)))[('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e')]
This can be used as a lightweight alternative to SciPy or pandas to analyzedata sets in which some series have a lead or lag relationship.
By default, the sequence will end when the shortest iterable is exhausted.To continue until the longest iterable is exhausted, setlongest toTrue.
>>>list(zip_offset('0123','abcdef',offsets=(0,1),longest=True))[('0', 'b'), ('1', 'c'), ('2', 'd'), ('3', 'e'), (None, 'f')]
By default,None will be used to replace offsets beyond the end of thesequence. Specifyfillvalue to use some other value.
zip the inputiterables together but raiseUnequalIterablesError if they aren’t all the same length.
>>>it_1=range(3)>>>it_2=iter('abc')>>>list(zip_equal(it_1,it_2))[(0, 'a'), (1, 'b'), (2, 'c')]
>>>it_1=range(3)>>>it_2=iter('abcd')>>>list(zip_equal(it_1,it_2))Traceback (most recent call last):...more_itertools.more.UnequalIterablesError:Iterables have differentlengths
A version ofzip() that “broadcasts” any scalar(i.e., non-iterable) items into output tuples.
>>>iterable_1=[1,2,3]>>>iterable_2=['a','b','c']>>>scalar='_'>>>list(zip_broadcast(iterable_1,iterable_2,scalar))[(1, 'a', '_'), (2, 'b', '_'), (3, 'c', '_')]
Thescalar_types keyword argument determines what types are consideredscalar. It is set to(str,bytes) by default. Set it toNone totreat strings and byte strings as iterable:
>>>list(zip_broadcast('abc',0,'xyz',scalar_types=None))[('a', 0, 'x'), ('b', 0, 'y'), ('c', 0, 'z')]
If thestrict keyword argument isTrue, thenUnequalIterablesError will be raised if any of the iterables havedifferent lengths.
Itertools recipes
Return an iterator flattening one level of nesting in a list of lists.
>>>list(flatten([[0,1],[2,3]]))[0, 1, 2, 3]
See alsocollapse(), which can flatten multiple levels of nesting.
Visit input iterables in a cycle until each is exhausted.
>>>list(roundrobin('ABC','D','EF'))['A', 'D', 'E', 'B', 'F', 'C']
This function produces the same output asinterleave_longest(), butmay perform better for some inputs (in particular when the number ofiterables is small).
Yieldvalue, followed by the elements initerator.
>>>value='0'>>>iterator=['1','2','3']>>>list(prepend(value,iterator))['0', '1', '2', '3']
To prepend multiple values, seeitertools.chain()orvalue_chain().
These tools return summarized or aggregated data from an iterable.
New itertools
Return the number of items initerable.
For example, there are 168 prime numbers below 1,000:
>>>ilen(sieve(1000))168
Equivalent to, but faster than:
defilen(iterable):count=0for_initerable:count+=1returncount
This fully consumes the iterable, so handle with care.
Return the elements from each of the input iterables that aren’t in theother input iterables.
For example, suppose you have a set of packages, each with a set ofdependencies:
{'pkg_1':{'A','B'},'pkg_2':{'B','C'},'pkg_3':{'B','D'}}
If you remove one package, which dependencies can also be removed?
Ifpkg_1 is removed, thenA is no longer necessary - it is notassociated withpkg_2 orpkg_3. Similarly,C is only needed forpkg_2, andD is only needed forpkg_3:
>>>unique_to_each({'A','B'},{'B','C'},{'B','D'})[['A'], ['C'], ['D']]
If there are duplicates in one input iterable that aren’t in the othersthey will be duplicated in the output. Input order is preserved:
>>>unique_to_each("mississippi","missouri")[['p', 'p'], ['o', 'u', 'r']]
It is assumed that the elements of each iterable are hashable.
Return ak-length list of elements chosen (without replacement)from theiterable.
Similar torandom.sample(), but works on inputs that aren’tindexable (such as sets and dictionaries) and on inputs where thesize isn’t known in advance (such as generators).
>>>iterable=range(100)>>>sample(iterable,5)[81, 60, 96, 16, 4]
For iterables with repeated elements, you may supplycounts toindicate the repeats.
>>>iterable=['a','b']>>>counts=[3,4]# Equivalent to 'a', 'a', 'a', 'b', 'b', 'b', 'b'>>>sample(iterable,k=3,counts=counts)['a', 'a', 'b']
An iterable withweights may be given:
>>>iterable=range(100)>>>weights=(i*i+1foriinrange(100))>>>sampled=sample(iterable,5,weights=weights)[79, 67, 74, 66, 78]
Weighted selections are made without replacement.After an element is selected, it is removed from the pool and therelative weights of the other elements increase (thisdoes not match the behavior ofrandom.sample()’scountsparameter). Note thatweights may not be used withcounts.
If the length ofiterable is less thank,ValueError is raised ifstrict isTrue andall elements are returned (in shuffled order) ifstrict isFalse.
By default, theAlgorithm L reservoir samplingtechnique is used. Whenweights are provided,Algorithm A-ExpJ is used instead.
Notes on reproducibility:
The algorithms rely on inexact floating-point functions providedby the underlying math library (e.g.log,log1p, andpow).Those functions canproduce slightly different results ondifferent builds. Accordingly, selections can vary across buildseven for the same seed.
The algorithms loop over the input and make selections based onordinal position, so selections from unordered collections (such assets) won’t reproduce across sessions on the same platform using thesame seed. For example, this won’t reproduce:
>>seed(8675309)>>sample(set('abcdefghijklmnopqrstuvwxyz'),10)['c','p','e','w','s','a','j','d','n','t']
Yield groups of consecutive items usingitertools.groupby().Theordering function determines whether two items are adjacent byreturning their position.
By default, the ordering function is the identity function. This issuitable for finding runs of numbers:
>>>iterable=[1,10,11,12,20,30,31,32,33,40]>>>forgroupinconsecutive_groups(iterable):...print(list(group))[1][10, 11, 12][20][30, 31, 32, 33][40]
To find runs of adjacent letters, applyord() functionto convert letters to ordinals.
>>>iterable='abcdfgilmnop'>>>ordering=ord>>>forgroupinconsecutive_groups(iterable,ordering):...print(list(group))['a', 'b', 'c', 'd']['f', 'g']['i']['l', 'm', 'n', 'o', 'p']
Each group of consecutive items is an iterator that shares it source withiterable. When an an output group is advanced, the previous group isno longer available unless its elements are copied (e.g., into alist).
>>>iterable=[1,2,11,12,21,22]>>>saved_groups=[]>>>forgroupinconsecutive_groups(iterable):...saved_groups.append(list(group))# Copy group elements>>>saved_groups[[1, 2], [11, 12], [21, 22]]
run_length.encode() compresses an iterable with run-length encoding.It yields groups of repeated items with the count of how many times theywere repeated:
>>>uncompressed='abbcccdddd'>>>list(run_length.encode(uncompressed))[('a', 1), ('b', 2), ('c', 3), ('d', 4)]
run_length.decode() decompresses an iterable that was previouslycompressed with run-length encoding. It yields the items of thedecompressed iterable:
>>>compressed=[('a',1),('b',2),('c',3),('d',4)]>>>list(run_length.decode(compressed))['a', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'd', 'd']
Return a dictionary that maps the items initerable to categoriesdefined bykeyfunc, transforms them withvaluefunc, andthen summarizes them by category withreducefunc.
valuefunc defaults to the identity function if it is unspecified.Ifreducefunc is unspecified, no summarization takes place:
>>>keyfunc=lambdax:x.upper()>>>result=map_reduce('abbccc',keyfunc)>>>sorted(result.items())[('A', ['a']), ('B', ['b', 'b']), ('C', ['c', 'c', 'c'])]
Specifyingvaluefunc transforms the categorized items:
>>>keyfunc=lambdax:x.upper()>>>valuefunc=lambdax:1>>>result=map_reduce('abbccc',keyfunc,valuefunc)>>>sorted(result.items())[('A', [1]), ('B', [1, 1]), ('C', [1, 1, 1])]
Specifyingreducefunc summarizes the categorized items:
>>>keyfunc=lambdax:x.upper()>>>valuefunc=lambdax:1>>>reducefunc=sum>>>result=map_reduce('abbccc',keyfunc,valuefunc,reducefunc)>>>sorted(result.items())[('A', 1), ('B', 2), ('C', 3)]
You may want to filter the input iterable before applying the map/reduceprocedure:
>>>all_items=range(30)>>>items=[xforxinall_itemsif10<=x<=20]# Filter>>>keyfunc=lambdax:x%2# Evens map to 0; odds to 1>>>categories=map_reduce(items,keyfunc=keyfunc)>>>sorted(categories.items())[(0, [10, 12, 14, 16, 18, 20]), (1, [11, 13, 15, 17, 19])]>>>summaries=map_reduce(items,keyfunc=keyfunc,reducefunc=sum)>>>sorted(summaries.items())[(0, 90), (1, 75)]
Note that all items in the iterable are gathered into a list before thesummarization step, which may require significant storage.
The returned object is acollections.defaultdict with thedefault_factory set toNone, such that it behaves like a normaldictionary.
Joins multiple mappings together using their common keys.
>>>user_scores={'elliot':50,'claris':60}>>>user_times={'elliot':30,'claris':40}>>>join_mappings(score=user_scores,time=user_times){'elliot': {'score': 50, 'time': 30}, 'claris': {'score': 60, 'time': 40}}
ReturnTrue if exactlyn items in the iterable areTrueaccording to thepredicate function.
>>>exactly_n([True,True,False],2)True>>>exactly_n([True,True,False],1)False>>>exactly_n([0,1,2,3,4,5],3,lambdax:x<3)True
The iterable will be advanced untiln+1 truthy items are encountered,so avoid calling it on infinite iterables.
ReturnsTrue if the items of iterable are in sorted order, andFalse otherwise.key andreverse have the same meaning that they doin the built-insorted() function.
>>>is_sorted(['1','2','3','4','5'],key=int)True>>>is_sorted([5,4,3,1,2],reverse=True)False
Ifstrict, tests for strict sorting, that is, returnsFalse if equalelements are found:
>>>is_sorted([1,2,2])True>>>is_sorted([1,2,2],strict=True)False
The function returnsFalse after encountering the first out-of-orderitem, which means it may produce results that differ from the built-insorted() function for objects with unusual comparison dynamics(likemath.nan). If there are no out-of-order items, the iterable isexhausted.
ReturnsTrue if all the elements ofiterable are unique (no twoelements are equal).
>>>all_unique('ABCB')False
If akey function is specified, it will be used to make comparisons.
>>>all_unique('ABCb')True>>>all_unique('ABCb',str.lower)False
The function returns as soon as the first non-unique element isencountered. Iterables with a mix of hashable and unhashable items canbe used, but the function will be slower for unhashable items.
Index of the first occurrence of a minimum value in an iterable.
>>>argmin('efghabcdijkl')4>>>argmin([3,2,1,0,4,2,1,0])3
For example, look up a label corresponding to the positionof a value that minimizes a cost function:
>>>defcost(x):..."Days for a wound to heal given a subject's age."...returnx**2-20*x+150...>>>labels=['homer','marge','bart','lisa','maggie']>>>ages=[35,30,10,9,1]# Fastest healing family member>>>labels[argmin(ages,key=cost)]'bart'# Age with fastest healing>>>min(ages,key=cost)10
Index of the first occurrence of a maximum value in an iterable.
>>>argmax('abcdefghabcd')7>>>argmax([0,1,2,3,3,2,1,0])3
For example, identify the best machine learning model:
>>>models=['svm','random forest','knn','naïve bayes']>>>accuracy=[68,61,84,72]# Most accurate model>>>models[argmax(accuracy)]'knn'# Best accuracy>>>max(accuracy)84
Returns both the smallest and largest items from an iterableor from two or more arguments.
>>>minmax([3,1,5])(1, 5)
>>>minmax(4,2,6)(2, 6)
If akey function is provided, it will be used to transform the inputitems for comparison.
>>>minmax([5,30],key=str)# '30' sorts before '5'(30, 5)
If adefault value is provided, it will be returned if there are noinput items.
>>>minmax([],default=(0,0))(0, 0)
OtherwiseValueError is raised.
This function makes a single pass over the input elements and takes care tominimize the number of comparisons made during processing.
Note that unlike the builtinmax function, which always returns the firstitem with the maximum value, this function may return another item when there areties.
This function is based on therecipe byRaymond Hettinger.
ReturnTrue if all giveniterables are equal to each other,which means that they contain the same elements in the same order.
The function is useful for comparing iterables of different data typesor iterables that do not support equality checks.
>>>iequals("abc",['a','b','c'],('a','b','c'),iter("abc"))True
>>>iequals("abc","acb")False
Not to be confused withall_equal(), which checks whether allelements of iterable are equal to each other.
Itertools recipes
ReturnsTrue if all the elements are equal to each other.
>>>all_equal('aaaa')True>>>all_equal('aaab')False
A function that accepts a single argument and returns a transformed versionof each input item can be specified withkey:
>>>all_equal('AaaA',key=str.casefold)True>>>all_equal([1,2,3],key=lambdax:x<10)True
Returns the first true value in the iterable.
If no true value is found, returnsdefault
Ifpred is not None, returns the first item for whichpred(item)==True .
>>>first_true(range(10))1>>>first_true(range(10),pred=lambdax:x>5)6>>>first_true(range(10),default='missing',pred=lambdax:x>9)'missing'
These tools yield certain items from an iterable.
New itertools
An extension ofitertools.islice() that supports negative valuesforstop,start, andstep.
>>>iterator=iter('abcdefgh')>>>list(islice_extended(iterator,-4,-1))['e', 'f', 'g']
Slices with negative values require some caching ofiterable, but thisfunction takes care to minimize the amount of memory required.
For example, you can use a negative step with an infinite iterator:
>>>fromitertoolsimportcount>>>list(islice_extended(count(),110,99,-2))[110, 108, 106, 104, 102, 100]
You can also use slice notation directly:
>>>iterator=map(str,count())>>>it=islice_extended(iterator)[10:20:2]>>>list(it)['10', '12', '14', '16', '18']
Return the first item ofiterable, ordefault ifiterable isempty.
>>>first([0,1,2,3])0>>>first([],'some default')'some default'
Ifdefault is not provided and there are no items in the iterable,raiseValueError.
first() is useful when you have a generator of expensive-to-retrievevalues and want any arbitrary one. It is marginally shorter thannext(iter(iterable),default).
Return the last item ofiterable, ordefault ifiterable isempty.
>>>last([0,1,2,3])3>>>last([],'some default')'some default'
Ifdefault is not provided and there are no items in the iterable,raiseValueError.
Return the first item fromiterable, which is expected to contain onlythat item. Raise an exception ifiterable is empty or has more than oneitem.
one() is useful for ensuring that an iterable contains only one item.For example, it can be used to retrieve the result of a database querythat is expected to return a single row.
Ifiterable is empty,ValueError will be raised. You may specify adifferent exception with thetoo_short keyword:
>>>it=[]>>>one(it)Traceback (most recent call last):...ValueError:too few items in iterable (expected 1)'>>>too_short=IndexError('too few items')>>>one(it,too_short=too_short)Traceback (most recent call last):...IndexError:too few items
Similarly, ifiterable contains more than one item,ValueError willbe raised. You may specify a different exception with thetoo_longkeyword:
>>>it=['too','many']>>>one(it)Traceback (most recent call last):...ValueError:Expected exactly one item in iterable, but got 'too','many', and perhaps more.>>>too_long=RuntimeError>>>one(it,too_long=too_long)Traceback (most recent call last):...RuntimeError
Note thatone() attempts to advanceiterable twice to ensure thereis only one item. Seespy() orpeekable() to check iterablecontents less destructively.
Ifiterable has only one item, return it.If it has zero items, returndefault.If it has more than one item, raise the exception given bytoo_long,which isValueError by default.
>>>only([],default='missing')'missing'>>>only([1])1>>>only([1,2])Traceback (most recent call last):...ValueError:Expected exactly one item in iterable, but got 1, 2, and perhaps more.'>>>only([1,2],too_long=TypeError)Traceback (most recent call last):...TypeError
Note thatonly() attempts to advanceiterable twice to ensure thereis only one item. Seespy() orpeekable() to checkiterable contents less destructively.
Validate thatiterable has exactlyn items and return them ifit does. If it has fewer thann items, call functiontoo_shortwith the actual number of items. If it has more thann items, call functiontoo_long with the numbern+1.
>>>iterable=['a','b','c','d']>>>n=4>>>list(strictly_n(iterable,n))['a', 'b', 'c', 'd']
Note that the returned iterable must be consumed in order for the check tobe made.
By default,too_short andtoo_long are functions that raiseValueError.
>>>list(strictly_n('ab',3))Traceback (most recent call last):...ValueError:too few items in iterable (got 2)
>>>list(strictly_n('abc',2))Traceback (most recent call last):...ValueError:too many items in iterable (got at least 3)
You can instead supply functions that do something else.too_short will be called with the number of items initerable.too_long will be called withn + 1.
>>>deftoo_short(item_count):...raiseRuntimeError>>>it=strictly_n('abcd',6,too_short=too_short)>>>list(it)Traceback (most recent call last):...RuntimeError
>>>deftoo_long(item_count):...print('The boss is going to hear about this')>>>it=strictly_n('abcdef',4,too_long=too_long)>>>list(it)The boss is going to hear about this['a', 'b', 'c', 'd']
Yield the items fromiterable, but strip any from thebeginning and end for whichpred returnsTrue.
For example, to remove a set of items from both ends of an iterable:
>>>iterable=(None,False,None,1,2,None,3,False,None)>>>pred=lambdax:xin{None,False,''}>>>list(strip(iterable,pred))[1, 2, None, 3]
This function is analogous tostr.strip().
Yield the items fromiterable, but strip any from the beginningfor whichpred returnsTrue.
For example, to remove a set of items from the start of an iterable:
>>>iterable=(None,False,None,1,2,None,3,False,None)>>>pred=lambdax:xin{None,False,''}>>>list(lstrip(iterable,pred))[1, 2, None, 3, False, None]
This function is analogous to tostr.lstrip(), and is essentiallyan wrapper foritertools.dropwhile().
Yield the items fromiterable, but strip any from the endfor whichpred returnsTrue.
For example, to remove a set of items from the end of an iterable:
>>>iterable=(None,False,None,1,2,None,3,False,None)>>>pred=lambdax:xin{None,False,''}>>>list(rstrip(iterable,pred))[None, False, None, 1, 2, None, 3]
This function is analogous tostr.rstrip().
Yield the items fromiterable for which thevalidator function doesnot raise one of the specifiedexceptions.
validator is called for each item initerable.It should be a function that accepts one argument and raises an exceptionif that item is not valid.
>>>iterable=['1','2','three','4',None]>>>list(filter_except(int,iterable,ValueError,TypeError))['1', '2', '4']
If an exception other than one given byexceptions is raised byvalidator, it is raised like normal.
Transform each item fromiterable withfunction and yield theresult, unlessfunction raises one of the specifiedexceptions.
function is called to transform each item initerable.It should accept one argument.
>>>iterable=['1','2','three','4',None]>>>list(map_except(int,iterable,ValueError,TypeError))[1, 2, 4]
If an exception other than one given byexceptions is raised byfunction, it is raised like normal.
Applyfunc to every element ofiterable, yielding only those whichare notNone.
>>>elems=['1','a','2','b','3']>>>list(filter_map(lambdas:int(s)ifs.isnumeric()elseNone,elems))[1, 2, 3]
Yield each of the items fromiterable. If the iteration raises one ofthe specifiedexceptions, that exception will be suppressed and iterationwill stop.
>>>fromitertoolsimportchain>>>defbreaks_at_five(x):...whileTrue:...ifx>=5:...raiseRuntimeError...yieldx...x+=1>>>it_1=iter_suppress(breaks_at_five(1),RuntimeError)>>>it_2=iter_suppress(breaks_at_five(2),RuntimeError)>>>list(chain(it_1,it_2))[1, 2, 3, 4, 2, 3, 4]
Return the nth or the last item ofiterable,ordefault ifiterable is empty.
>>>nth_or_last([0,1,2,3],2)2>>>nth_or_last([0,1],2)1>>>nth_or_last([],0,'some default')'some default'
Ifdefault is not provided and there are no items in the iterable,raiseValueError.
Yield values at the specified indices.
Example:
>>>data='abcdefghijklmnopqrstuvwxyz'>>>list(extract(data,[7,4,11,11,14]))['h', 'e', 'l', 'l', 'o']
Theiterable is consumed lazily and can be infinite.Theindices are consumed immediately and must be finite.
RaisesIndexError if an index lies beyond the iterable.RaisesValueError for negative indices.
Yield the items fromiterable that haven’t been seen recently.n is the size of the lookback window.
>>>iterable=[0,1,0,2,3,0]>>>n=3>>>list(unique_in_window(iterable,n))[0, 1, 2, 3, 0]
Thekey function, if provided, will be used to determine uniqueness:
>>>list(unique_in_window('abAcda',3,key=lambdax:x.lower()))['a', 'b', 'c', 'd', 'a']
The items initerable must be hashable.
Yield duplicate elements after their first appearance.
>>>list(duplicates_everseen('mississippi'))['s', 'i', 's', 's', 'i', 'p', 'i']>>>list(duplicates_everseen('AaaBbbCccAaa',str.lower))['a', 'a', 'b', 'b', 'c', 'c', 'A', 'a', 'a']
This function is analogous tounique_everseen() and is subject tothe same performance considerations.
Yields serially-duplicate elements after their first appearance.
>>>list(duplicates_justseen('mississippi'))['s', 's', 'p']>>>list(duplicates_justseen('AaaBbbCccAaa',str.lower))['a', 'a', 'b', 'b', 'c', 'c', 'a', 'a']
This function is analogous tounique_justseen().
Classify each element in terms of its uniqueness.
For each element in the input iterable, return a 3-tuple consisting of:
The element itself
False if the element is equal to the one preceding it in the input,True otherwise (i.e. the equivalent ofunique_justseen())
False if this element has been seen anywhere in the input before,True otherwise (i.e. the equivalent ofunique_everseen())
>>>list(classify_unique('otto'))[('o', True, True), ('t', True, True), ('t', False, False), ('o', True, False)]
This function is analogous tounique_everseen() and is subject tothe same performance considerations.
Yield elements of the longest common prefix among giveniterables.
>>>''.join(longest_common_prefix(['abcd','abc','abf']))'ab'
A variant oftakewhile() that yields one additional element.
>>>list(takewhile_inclusive(lambdax:x<5,[1,4,6,4,1]))[1, 4, 6]
takewhile() would return[1,4].
Itertools recipes
Returns the nth item or a default value.
>>>l=range(10)>>>nth(l,3)3>>>nth(l,20,"zebra")'zebra'
A variant oftakewhile() that allows complete access to theremainder of the iterator.
>>>it=iter('ABCdEfGhI')>>>all_upper,remainder=before_and_after(str.isupper,it)>>>''.join(all_upper)'ABC'>>>''.join(remainder)# takewhile() would lose the 'd''dEfGhI'
Note that the first iterator must be fully consumed before the seconditerator can generate valid results.
Return firstn items of theiterable as a list.
>>>take(3,range(10))[0, 1, 2]
If there are fewer thann items in the iterable, all of them arereturned.
>>>take(10,range(3))[0, 1, 2]
Return an iterator over the lastn items ofiterable.
>>>t=tail(3,'ABCDEFG')>>>list(t)['E', 'F', 'G']
Yield unique elements, preserving order.
>>>list(unique_everseen('AAAABBBCCDAABBB'))['A', 'B', 'C', 'D']>>>list(unique_everseen('ABBCcAD',str.lower))['A', 'B', 'C', 'D']
Sequences with a mix of hashable and unhashable items can be used.The function will be slower (i.e.,O(n^2)) for unhashable items.
Remember thatlist objects are unhashable - you can use thekeyparameter to transform the list to a tuple (which is hashable) toavoid a slowdown.
>>>iterable=([1,2],[2,3],[1,2])>>>list(unique_everseen(iterable))# Slow[[1, 2], [2, 3]]>>>list(unique_everseen(iterable,key=tuple))# Faster[[1, 2], [2, 3]]
Similarly, you may want to convert unhashableset objects withkey=frozenset. Fordict objects,key=lambdax:frozenset(x.items()) can be used.
Yields elements in order, ignoring serial duplicates
>>>list(unique_justseen('AAAABBBCCDAABBB'))['A', 'B', 'C', 'D', 'A', 'B']>>>list(unique_justseen('ABBCcAD',str.lower))['A', 'B', 'C', 'A', 'D']
Yields unique elements in sorted order.
>>>list(unique([[1,2],[3,4],[1,2]]))[[1, 2], [3, 4]]
key andreverse are passed tosorted().
>>>list(unique('ABBcCAD',str.casefold))['A', 'B', 'c', 'D']>>>list(unique('ABBcCAD',str.casefold,reverse=True))['D', 'c', 'B', 'A']
The elements initerable need not be hashable, but they must becomparable for sorting to work.
These tools yield combinatorial arrangements of items from iterables.
New itertools
Yield successive distinct permutations of the elements initerable.
>>>sorted(distinct_permutations([1,0,1]))[(0, 1, 1), (1, 0, 1), (1, 1, 0)]
Equivalent to yielding fromset(permutations(iterable)), exceptduplicates are not generated and thrown away. For larger input sequencesthis is much more efficient.
Duplicate permutations arise when there are duplicated elements in theinput iterable. The number of items returned isn! / (x_1! * x_2! * … * x_n!), wheren is the total number ofitems input, and eachx_i is the count of a distinct item in the inputsequence. The functionmultinomial() computes this directly.
Ifr is given, only ther-length permutations are yielded.
>>>sorted(distinct_permutations([1,0,1],r=2))[(0, 1), (1, 0), (1, 1)]>>>sorted(distinct_permutations(range(3),r=2))[(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
iterable need not be sortable, but note that using equal (x==y)but non-identical (id(x)!=id(y)) elements may produce surprisingbehavior. For example,1 andTrue are equal but non-identical:
>>>list(distinct_permutations([1,True,'3']))[ (1, True, '3'), (1, '3', True), ('3', 1, True)]>>>list(distinct_permutations([1,2,'3']))[ (1, 2, '3'), (1, '3', 2), (2, 1, '3'), (2, '3', 1), ('3', 1, 2), ('3', 2, 1)]
Yield the distinct combinations ofr items taken fromiterable.
>>>list(distinct_combinations([0,0,1],2))[(0, 0), (0, 1)]
Equivalent toset(combinations(iterable)), except duplicates are notgenerated and thrown away. For larger input sequences this is much moreefficient.
Equivalent tolist(combinations_with_replacement(iterable,r))[index].
The subsequences with repetition ofiterable that are of lengthr canbe ordered lexicographically.nth_combination_with_replacement()computes the subsequence at sort positionindex directly, withoutcomputing the previous subsequences with replacement.
>>>nth_combination_with_replacement(range(5),3,5)(0, 1, 1)
ValueError will be raised Ifr is negative or greater than the lengthofiterable.IndexError will be raised if the givenindex is invalid.
Yield the circular shifts ofiterable.
>>>list(circular_shifts(range(4)))[(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]
Setsteps to the number of places to rotate to the left(or to the right if negative). Defaults to 1.
>>>list(circular_shifts(range(4),2))[(0, 1, 2, 3), (2, 3, 0, 1)]
>>>list(circular_shifts(range(4),-1))[(0, 1, 2, 3), (3, 0, 1, 2), (2, 3, 0, 1), (1, 2, 3, 0)]
Yield all possible order-preserving partitions ofiterable.
>>>iterable='abc'>>>forpartinpartitions(iterable):...print([''.join(p)forpinpart])['abc']['a', 'bc']['ab', 'c']['a', 'b', 'c']
This is unrelated topartition().
Yield the set partitions ofiterable intok parts. Set partitions arenot order-preserving.
>>>iterable='abc'>>>forpartinset_partitions(iterable,2):...print([''.join(p)forpinpart])['a', 'bc']['ab', 'c']['b', 'ac']
Ifk is not given, every set partition is generated.
>>>iterable='abc'>>>forpartinset_partitions(iterable):...print([''.join(p)forpinpart])['abc']['a', 'bc']['ab', 'c']['b', 'ac']['a', 'b', 'c']
ifmin_size and/ormax_size are given, the minimum and/or maximum sizeper block in partition is set.
>>>iterable='abc'>>>forpartinset_partitions(iterable,min_size=2):...print([''.join(p)forpinpart])['abc']>>>forpartinset_partitions(iterable,max_size=2):...print([''.join(p)forpinpart])['a', 'bc']['ab', 'c']['b', 'ac']['a', 'b', 'c']
Equivalent tolist(product(*args)).index(element)
The products ofargs can be ordered lexicographically.product_index() computes the first index ofelement withoutcomputing the previous products.
>>>product_index([8,2],range(10),range(5))42
ValueError will be raised if the givenelement isn’t in the productofargs.
Equivalent tolist(combinations(iterable,r)).index(element)
The subsequences ofiterable that are of lengthr can be orderedlexicographically.combination_index() computes the index of thefirstelement, without computing the previous combinations.
>>>combination_index('adf','abcdefg')10
ValueError will be raised if the givenelement isn’t one of thecombinations ofiterable.
Equivalent tolist(permutations(iterable,r)).index(element)`
The subsequences ofiterable that are of lengthr where order isimportant can be ordered lexicographically.permutation_index()computes the index of the firstelement directly, without computingthe previous permutations.
>>>permutation_index([1,3,2],range(5))19
ValueError will be raised if the givenelement isn’t one of thepermutations ofiterable.
Equivalent tolist(combinations_with_replacement(iterable,r)).index(element)
The subsequences with repetition ofiterable that are of lengthr canbe ordered lexicographically.combination_with_replacement_index()computes the index of the firstelement, without computing the previouscombinations with replacement.
>>>combination_with_replacement_index('adf','abcdefg')20
ValueError will be raised if the givenelement isn’t one of thecombinations with replacement ofiterable.
Yield successive derangements of the elements initerable.
A derangement is a permutation in which no element appears at its originalindex. In other words, a derangement is a permutation that has no fixed points.
Suppose Alice, Bob, Carol, and Dave are playing Secret Santa.The code below outputs all of the different ways to assign gift recipientssuch that nobody is assigned to himself or herself:
>>>fordinderangements(['Alice','Bob','Carol','Dave']):...print(', '.join(d))Bob, Alice, Dave, CarolBob, Carol, Dave, AliceBob, Dave, Alice, CarolCarol, Alice, Dave, BobCarol, Dave, Alice, BobCarol, Dave, Bob, AliceDave, Alice, Bob, CarolDave, Carol, Alice, BobDave, Carol, Bob, Alice
Ifr is given, only ther-length derangements are yielded.
>>>sorted(derangements(range(3),2))[(1, 0), (1, 2), (2, 0)]>>>sorted(derangements([0,2,3],2))[(2, 0), (2, 3), (3, 0)]
Elements are treated as unique based on their position, not on their value.
Consider the Secret Santa example with twodifferent people who havethesame name. Then there are two valid gift assignments even thoughit might appear that a person is assigned to themselves:
>>>names=['Alice','Bob','Bob']>>>list(derangements(names))[('Bob', 'Bob', 'Alice'), ('Bob', 'Alice', 'Bob')]
To avoid confusion, make the inputs distinct:
>>>deduped=[f'{name}{index}'forindex,nameinenumerate(names)]>>>list(derangements(deduped))[('Bob1', 'Bob2', 'Alice0'), ('Bob2', 'Alice0', 'Bob1')]
The number of derangements of a set of sizen is known as the“subfactorial of n”. For n > 0, the subfactorial is:round(math.factorial(n)/math.e).
References:
Likeitertools.product(), but return tuples in an order suchthat only one element in the generated tuple changes from one iterationto the next.
>>>list(gray_product('AB','CD'))[('A', 'C'), ('B', 'C'), ('B', 'D'), ('A', 'D')]
This function consumes all of the input iterables before producing output.If any of the input iterables have fewer than two items,ValueErroris raised.
For information on the algorithm, seethis sectionof Donald Knuth’sThe Art of Computer Programming.
A generalized outer product that applies a binary function to allpairs of items. Returns a 2D matrix withlen(xs) rows andlen(ys)columns.Also accepts*args and**kwargs that are passed tofunc.
Multiplication table:
>>>list(outer_product(mul,range(1,4),range(1,6)))[(1, 2, 3, 4, 5), (2, 4, 6, 8, 10), (3, 6, 9, 12, 15)]
Cross tabulation:
>>>xs=['A','B','A','A','B','B','A','A','B','B']>>>ys=['X','X','X','Y','Z','Z','Y','Y','Z','Z']>>>pair_counts=Counter(zip(xs,ys))>>>count_rows=lambdax,y:pair_counts[x,y]>>>list(outer_product(count_rows,sorted(set(xs)),sorted(set(ys))))[(2, 3, 0), (1, 0, 4)]
Usage with*args and**kwargs:
>>>animals=['cat','wolf','mouse']>>>list(outer_product(min,animals,animals,key=len))[('cat', 'cat', 'cat'), ('cat', 'wolf', 'wolf'), ('cat', 'wolf', 'mouse')]
Yields all possible subsets of the iterable.
>>>list(powerset_of_sets([1,2,3]))[set(), {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]>>>list(powerset_of_sets([1,1,0]))[set(), {1}, {0}, {0, 1}]
powerset_of_sets() takes care to minimize the numberof hash operations performed.
Itertools recipes
Yields all possible subsets of the iterable.
>>>list(powerset([1,2,3]))[(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
powerset() will operate on iterables that aren’tsetinstances, so repeated elements in the input will produce repeated elementsin the output.
>>>seq=[1,1,0]>>>list(powerset(seq))[(), (1,), (1,), (0,), (1, 1), (1, 0), (1, 0), (1, 1, 0)]
For a variant that efficiently yields actualset instances, seepowerset_of_sets().
Draw an item at random from each of the input iterables.
>>>random_product('abc',range(4),'XYZ')('c', 3, 'Z')
Ifrepeat is provided as a keyword argument, that many items will bedrawn from each iterable.
>>>random_product('abcd',range(4),repeat=2)('a', 2, 'd', 3)
This equivalent to taking a random selection fromitertools.product(*args,repeat=repeat).
Return a randomr length permutation of the elements initerable.
Ifr is not specified or isNone, thenr defaults to the length ofiterable.
>>>random_permutation(range(5))(3, 4, 0, 1, 2)
This equivalent to taking a random selection fromitertools.permutations(iterable,r).
Return a randomr length subsequence of the elements initerable.
>>>random_combination(range(5),3)(2, 3, 4)
This equivalent to taking a random selection fromitertools.combinations(iterable,r).
Return a randomr length subsequence of elements initerable,allowing individual elements to be repeated.
>>>random_combination_with_replacement(range(3),5)(0, 0, 1, 2, 2)
This equivalent to taking a random selection fromitertools.combinations_with_replacement(iterable,r).
Equivalent tolist(product(*args))[index].
The products ofargs can be ordered lexicographically.nth_product() computes the product at sort positionindex withoutcomputing the previous products.
>>>nth_product(8,range(2),range(2),range(2),range(2))(1, 0, 0, 0)
IndexError will be raised if the givenindex is invalid.
Equivalent tolist(permutations(iterable,r))[index]`
The subsequences ofiterable that are of lengthr where order isimportant can be ordered lexicographically.nth_permutation()computes the subsequence at sort positionindex directly, withoutcomputing the previous subsequences.
>>>nth_permutation('ghijk',2,5)('h', 'i')
ValueError will be raised Ifr is negative or greater than the lengthofiterable.IndexError will be raised if the givenindex is invalid.
Equivalent tolist(combinations(iterable,r))[index].
The subsequences ofiterable that are of lengthr can be orderedlexicographically.nth_combination() computes the subsequence atsort positionindex directly, without computing the previoussubsequences.
>>>nth_combination(range(5),3,5)(0, 3, 4)
ValueError will be raised Ifr is negative or greater than the lengthofiterable.IndexError will be raised if the givenindex is invalid.
These tools provide wrappers to smooth working with objects that produce orconsume iterables.
New itertools
Ifobj is iterable, return an iterator over its items:
>>>obj=(1,2,3)>>>list(always_iterable(obj))[1, 2, 3]
Ifobj is not iterable, return a one-item iterable containingobj:
>>>obj=1>>>list(always_iterable(obj))[1]
Ifobj isNone, return an empty iterable:
>>>obj=None>>>list(always_iterable(None))[]
By default, binary and text strings are not considered iterable:
>>>obj='foo'>>>list(always_iterable(obj))['foo']
Ifbase_type is set, objects for whichisinstance(obj,base_type)returnsTrue won’t be considered iterable.
>>>obj={'a':1}>>>list(always_iterable(obj))# Iterate over the dict's keys['a']>>>list(always_iterable(obj,base_type=dict))# Treat dicts as a unit[{'a': 1}]
Setbase_type toNone to avoid any special handling and treat objectsPython considers iterable as iterable:
>>>obj='foo'>>>list(always_iterable(obj,base_type=None))['f', 'o', 'o']
An extension ofreversed() that supports all iterables, notjust those which implement theReversible orSequence protocols.
>>>print(*always_reversible(xforxinrange(3)))2 1 0
If the iterable is already reversible, this function returns theresult ofreversed(). If the iterable is not reversible,this function will cache the remaining items in the iterable andyield them in reverse order, which may require significant storage.
Wrapiterable and keep a count of how many items have been consumed.
Theitems_seen attribute starts at0 and increments as the iterableis consumed:
>>>iterable=map(str,range(10))>>>it=countable(iterable)>>>it.items_seen0>>>next(it),next(it)('0', '1')>>>list(it)['2', '3', '4', '5', '6', '7', '8', '9']>>>it.items_seen10
Decorator that automatically advances a PEP-342-style “reverse iterator”to its first yield point so you don’t have to callnext() on itmanually.
>>>@consumer...deftally():...i=0...whileTrue:...print('Thing number%s is%s.'%(i,(yield)))...i+=1...>>>t=tally()>>>t.send('red')Thing number 0 is red.>>>t.send('fish')Thing number 1 is fish.
Without the decorator, you would have to callnext(t) beforet.send() could be used.
Wrap an iterable in awith statement, so it closes once exhausted.
For example, this will close the file when the iterator is exhausted:
upper_lines=(line.upper()forlineinwith_iter(open('foo')))
Any context manager which returns an iterable is a candidate forwith_iter.
Convert a function that uses callbacks to an iterator.
Letfunc be a function that takes acallback keyword argument.For example:
>>>deffunc(callback=None):...fori,cin[(1,'a'),(2,'b'),(3,'c')]:...ifcallback:...callback(i,c)...return4
Usewithcallback_iter(func) to get an iterator over the parametersthat are delivered to the callback.
>>>withcallback_iter(func)asit:...forargs,kwargsinit:...print(args)(1, 'a')(2, 'b')(3, 'c')
The function will be called in a background thread. Thedone propertyindicates whether it has completed execution.
>>>it.doneTrue
If it completes successfully, its return value will be availablein theresult property.
>>>it.result4
Notes:
If the function uses some keyword argument besidescallback, supplycallback_kwd.
If it finished executing, but raised an exception, accessing theresult property will raise the same exception.
If it hasn’t finished executing, accessing theresultproperty from within thewith block will raiseRuntimeError.
If it hasn’t finished executing, accessing theresult property fromoutside thewith block will raise amore_itertools.AbortThread exception.
Providewait_seconds to adjust how frequently the it is polled foroutput.
Itertools recipes
Yields results from a function repeatedly until an exception is raised.
Converts a call-until-exception interface to an iterator interface.Likeiter(func,sentinel), but uses an exception instead of a sentinelto end the loop.
>>>l=[0,1,2]>>>list(iter_except(l.pop,IndexError))[2, 1, 0]
Multiple exceptions can be specified as a stopping condition:
>>>l=[1,2,3,'...',4,5,6]>>>list(iter_except(lambda:1+l.pop(),(IndexError,TypeError)))[7, 6, 5]>>>list(iter_except(lambda:1+l.pop(),(IndexError,TypeError)))[4, 3, 2]>>>list(iter_except(lambda:1+l.pop(),(IndexError,TypeError)))[]
These tools work with most numeric data types.
New itertools
Discrete Fourier Transform.xarr is a sequence of complex numbers.Yields the components of the corresponding transformed output vector.
>>>importcmath>>>xarr=[1,2-1j,-1j,-1+2j]# time domain>>>Xarr=[2,-2-2j,-2j,4+4j]# frequency domain>>>magnitudes,phases=zip(*map(cmath.polar,Xarr))>>>all(map(cmath.isclose,dft(xarr),Xarr))True
Inputs are restricted to numeric types that can add and multiplywith a complex number. This includes int, float, complex, andFraction, but excludes Decimal.
Seeidft() for the inverse Discrete Fourier Transform.
Inverse Discrete Fourier Transform.Xarr is a sequence ofcomplex numbers. Yields the components of the correspondinginverse-transformed output vector.
>>>importcmath>>>xarr=[1,2-1j,-1j,-1+2j]# time domain>>>Xarr=[2,-2-2j,-2j,4+4j]# frequency domain>>>all(map(cmath.isclose,idft(Xarr),xarr))True
Inputs are restricted to numeric types that can add and multiplywith a complex number. This includes int, float, complex, andFraction, but excludes Decimal.
Seedft() for the Discrete Fourier Transform.
Itertools recipes
Discrete linear convolution of two iterables.Equivalent to polynomial multiplication.
For example, multiplying(x²-x-20) by(x-3)gives(x³-4x²-17x+60).
>>>list(convolve([1,-1,-20],[1,-3]))[1, -4, -17, 60]
Examples of popular kinds of kernels:
The kernel[0.25,0.25,0.25,0.25] computes a moving average.For image data, this blurs the image and reduces noise.
The kernel[1/2,0,-1/2] estimates the first derivative ofa function evaluated at evenly spaced inputs.
The kernel[1,-2,1] estimates the second derivative of afunction evaluated at evenly spaced inputs.
Convolutions are mathematically commutative; however, the inputs areevaluated differently. The signal is consumed lazily and can beinfinite. The kernel is fully consumed before the calculations begin.
Supports all numeric types: int, float, complex, Decimal, Fraction.
References:
Article:https://betterexplained.com/articles/intuitive-convolution/
Video by 3Blue1Brown:https://www.youtube.com/watch?v=KuXjwB4LzSA
Returns the dot product of the two iterables.
>>>dotproduct([10,15,12],[0.65,0.80,1.25])33.5>>>10*0.65+15*0.80+12*1.2533.5
In Python 3.12 and later, usemath.sumprod() instead.
Multiply two matrices.
>>>list(matmul([(7,5),(3,5)],[(2,5),(7,9)]))[(49, 80), (41, 60)]
The caller should ensure that the dimensions of the input matrices arecompatible with each other.
Supports all numeric types: int, float, complex, Decimal, Fraction.
Compute a polynomial’s coefficients from its roots.
>>>roots=[5,-4,3]# (x - 5) * (x + 4) * (x - 3)>>>polynomial_from_roots(roots)# x³ - 4 x² - 17 x + 60[1, -4, -17, 60]
Note that polynomial coefficients are specified in descending power order.
Supports all numeric types: int, float, complex, Decimal, Fraction.
Compute the first derivative of a polynomial.
Evaluate the derivative ofx³-4x²-17x+60:
>>>coefficients=[1,-4,-17,60]>>>derivative_coefficients=polynomial_derivative(coefficients)>>>derivative_coefficients[3, -8, -17]
Note that polynomial coefficients are specified in descending power order.
Supports all numeric types: int, float, complex, Decimal, Fraction.
Evaluate a polynomial at a specific value.
Computes with better numeric stability than Horner’s method.
Evaluatex^3-4*x^2-17*x+60 atx=2.5:
>>>coefficients=[1,-4,-17,60]>>>x=2.5>>>polynomial_eval(coefficients,x)8.125
Note that polynomial coefficients are specified in descending power order.
Supports all numeric types: int, float, complex, Decimal, Fraction.
Cumulative median of values seen so far or values in a sliding window.
Setmaxlen to a positive integer to specify the maximum sizeof the sliding window. The default ofNone is equivalent toan unbounded window.
For example:
>>>list(running_median([5.0,9.0,4.0,12.0,8.0,9.0]))[5.0, 7.0, 5.0, 7.0, 8.0, 8.5]>>>list(running_median([5.0,9.0,4.0,12.0,8.0,9.0],maxlen=3))[5.0, 7.0, 5.0, 9.0, 8.0, 9.0]
Supports numeric types such as int, float, Decimal, and Fraction,but not complex numbers which are unorderable.
On version Python 3.13 and prior, max-heaps are simulated withnegative values. The negation causes Decimal inputs to apply contextrounding, making the results slightly different than that obtainedby statistics.median().
New itertools
The tools focus on math with integers.
Return the nth prime (counting from 0).
>>>nth_prime(0)2>>>nth_prime(100)547
Ifapproximate is set to True, will return a prime closeto the nth prime. The estimation is much faster than computingan exact result.
>>>nth_prime(200_000_000,approximate=True)# Exact result is 42222347634217820427
Itertools recipes
Yield the prime factors of n.
>>>list(factor(360))[2, 2, 2, 3, 3, 5]
Finds small factors with trial division. Larger factors areeither verified as prime withis_prime or split intosmaller factors with Pollard’s rho algorithm.
ReturnTrue ifn is prime andFalse otherwise.
Basic examples:
>>>is_prime(37)True>>>is_prime(3*13)False>>>is_prime(18_446_744_073_709_551_557)True
Find the next prime over one billion:
>>>next(filter(is_prime,count(10**9)))1000000007
Generate random primes up to 200 bits and up to 60 decimal digits:
>>>fromrandomimportseed,randrange,getrandbits>>>seed(18675309)
>>>next(filter(is_prime,map(getrandbits,repeat(200))))893303929355758292373272075469392561129886005037663238028407
>>>next(filter(is_prime,map(randrange,repeat(10**60))))269638077304026462407872868003560484232362454342414618963649
This function is exact for values ofn below 10**24. For larger inputs,the probabilistic Miller-Rabin primality test has a less than 1 in 2**128chance of a false positive.
Number of distinct arrangements of a multiset.
The expressionmultinomial(3,4,2) has several equivalentinterpretations:
In the expansion of(a+b+c)⁹, the coefficient of thea³b⁴c² term is 1260.
There are 1260 distinct ways to arrange 9 balls consisting of 3 reds, 4greens, and 2 blues.
There are 1260 unique ways to place 9 distinct objects into three binswith sizes 3, 4, and 2.
Themultinomial() function computes the length ofdistinct_permutations(). For example, there are 83,160 distinctanagrams of the word “abracadabra”:
>>>frommore_itertoolsimportdistinct_permutations,ilen>>>ilen(distinct_permutations('abracadabra'))83160
This can be computed directly from the letter counts, 5a 2b 2r 1c 1d:
>>>fromcollectionsimportCounter>>>list(Counter('abracadabra').values())[5, 2, 2, 1, 1]>>>multinomial(5,2,2,1,1)83160
A binomial coefficient is a special case of multinomial where there areonly two categories. For example, the number of ways to arrange 12 ballswith 5 reds and 7 blues ismultinomial(5,7) ormath.comb(12,5).
Likewise, factorial is a special case of multinomial wherethe multiplicities are all just 1 so thatmultinomial(1,1,1,1,1,1,1)==math.factorial(7).
Yield the primes less than n.
>>>list(sieve(30))[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
Return the count of natural numbers up ton that are coprime withn.
Euler’s totient function φ(n) gives the number of totatives.Totative are integers k in the range 1 ≤ k ≤ n such that gcd(n, k) = 1.
>>>n=9>>>totient(n)6
>>>totatives=[xforxinrange(1,n)ifgcd(n,x)==1]>>>totatives[1, 2, 4, 5, 7, 8]>>>len(totatives)6
Reference:https://en.wikipedia.org/wiki/Euler%27s_totient_function
New itertools
Yield the index of each item initerable for whichpred returnsTrue.
pred defaults tobool(), which will select truthy items:
>>>list(locate([0,1,1,0,1,0,0]))[1, 2, 4]
Setpred to a custom function to, e.g., find the indexes for a particularitem.
>>>list(locate(['a','b','c','b'],lambdax:x=='b'))[1, 3]
Ifwindow_size is given, then thepred function will be called withthat many items. This enables searching for sub-sequences:
>>>iterable=[0,1,2,3,0,1,2,3,0,1,2,3]>>>pred=lambda*args:args==(1,2,3)>>>list(locate(iterable,pred=pred,window_size=3))[1, 5, 9]
Use withseekable() to find indexes and then retrieve the associateditems:
>>>fromitertoolsimportcount>>>frommore_itertoolsimportseekable>>>source=(3*n+1if(n%2)elsen//2fornincount())>>>it=seekable(source)>>>pred=lambdax:x>100>>>indexes=locate(it,pred=pred)>>>i=next(indexes)>>>it.seek(i)>>>next(it)106
Yield the index of each item initerable for whichpred returnsTrue, starting from the right and moving left.
pred defaults tobool(), which will select truthy items:
>>>list(rlocate([0,1,1,0,1,0,0]))# Truthy at 1, 2, and 4[4, 2, 1]
Setpred to a custom function to, e.g., find the indexes for a particularitem:
>>>iterator=iter('abcb')>>>pred=lambdax:x=='b'>>>list(rlocate(iterator,pred))[3, 1]
Ifwindow_size is given, then thepred function will be called withthat many items. This enables searching for sub-sequences:
>>>iterable=[0,1,2,3,0,1,2,3,0,1,2,3]>>>pred=lambda*args:args==(1,2,3)>>>list(rlocate(iterable,pred=pred,window_size=3))[9, 5, 1]
Beware, this function won’t return anything for infinite iterables.Ifiterable is reversible,rlocate will reverse it and search fromthe right. Otherwise, it will search from the left and return the resultsin reverse order.
Seelocate() to for other example applications.
Yield the items fromiterable, replacing the items for whichpredreturnsTrue with the items from the iterablesubstitutes.
>>>iterable=[1,1,0,1,1,0,1,1]>>>pred=lambdax:x==0>>>substitutes=(2,3)>>>list(replace(iterable,pred,substitutes))[1, 1, 2, 3, 1, 1, 2, 3, 1, 1]
Ifcount is given, the number of replacements will be limited:
>>>iterable=[1,1,0,1,1,0,1,1,0]>>>pred=lambdax:x==0>>>substitutes=[None]>>>list(replace(iterable,pred,substitutes,count=2))[1, 1, None, 1, 1, None, 1, 1, 0]
Usewindow_size to control the number of items passed as arguments topred. This allows for locating and replacing subsequences.
>>>iterable=[0,1,2,5,0,1,2,5]>>>window_size=3>>>pred=lambda*args:args==(0,1,2)# 3 items passed to pred>>>substitutes=[3,4]# Splice in these items>>>list(replace(iterable,pred,substitutes,window_size=window_size))[3, 4, 5, 3, 4, 5]
An extension of the built-inrange() function whose arguments canbe any orderable numeric type.
With onlystop specified,start defaults to0 andstepdefaults to1. The output items will match the type ofstop:
>>>list(numeric_range(3.5))[0.0, 1.0, 2.0, 3.0]
With onlystart andstop specified,step defaults to1. Theoutput items will match the type ofstart:
>>>fromdecimalimportDecimal>>>start=Decimal('2.1')>>>stop=Decimal('5.1')>>>list(numeric_range(start,stop))[Decimal('2.1'), Decimal('3.1'), Decimal('4.1')]
Withstart,stop, andstep specified the output items will matchthe type ofstart+step:
>>>fromfractionsimportFraction>>>start=Fraction(1,2)# Start at 1/2>>>stop=Fraction(5,2)# End at 5/2>>>step=Fraction(1,2)# Count by 1/2>>>list(numeric_range(start,stop,step))[Fraction(1, 2), Fraction(1, 1), Fraction(3, 2), Fraction(2, 1)]
Ifstep is zero,ValueError is raised. Negative steps are supported:
>>>list(numeric_range(3,-1,-1.0))[3.0, 2.0, 1.0, 0.0]
Be aware of the limitations of floating-point numbers; the representationof the yielded numbers may be surprising.
datetime.datetime objects can be used forstart andstop, ifstepis adatetime.timedelta object:
>>>importdatetime>>>start=datetime.datetime(2019,1,1)>>>stop=datetime.datetime(2019,1,3)>>>step=datetime.timedelta(days=1)>>>items=iter(numeric_range(start,stop,step))>>>next(items)datetime.datetime(2019, 1, 1, 0, 0)>>>next(items)datetime.datetime(2019, 1, 2, 0, 0)
Invokefunc on each item initerable (or on eachchunk_size groupof items) before yielding the item.
func must be a function that takes a single argument. Its return valuewill be discarded.
before andafter are optional functions that take no arguments. Theywill be executed before iteration starts and after it ends, respectively.
side_effect can be used for logging, updating progress bars, or anythingthat is not functionally “pure.”
Emitting a status message:
>>>frommore_itertoolsimportconsume>>>func=lambdaitem:print('Received{}'.format(item))>>>consume(side_effect(func,range(2)))Received 0Received 1
Operating on chunks of items:
>>>pair_sums=[]>>>func=lambdachunk:pair_sums.append(sum(chunk))>>>list(side_effect(func,[0,1,2,3,4,5],2))[0, 1, 2, 3, 4, 5]>>>list(pair_sums)[1, 5, 9]
Writing to a file-like object:
>>>fromioimportStringIO>>>frommore_itertoolsimportconsume>>>f=StringIO()>>>func=lambdax:print(x,file=f)>>>before=lambda:print(u'HEADER',file=f)>>>after=f.close>>>it=[u'a',u'b',u'c']>>>consume(side_effect(func,it,before=before,after=after))>>>f.closedTrue
Returnstart,func(start),func(func(start)), …
Produces an infinite iterator. To add a stopping condition,usetake(),takewhile, ortakewhile_inclusive():.
>>>take(10,iterate(lambdax:2*x,1))[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
>>>collatz=lambdax:3*x+1ifx%2==1elsex//2>>>list(takewhile_inclusive(lambdax:x!=1,iterate(collatz,10)))[10, 5, 16, 8, 4, 2, 1]
This function is the inverse ofitertools.accumulate(). By defaultit will compute the first difference ofiterable usingoperator.sub():
>>>fromitertoolsimportaccumulate>>>iterable=accumulate([0,1,2,3,4])# produces 0, 1, 3, 6, 10>>>list(difference(iterable))[0, 1, 2, 3, 4]
func defaults tooperator.sub(), but other functions can bespecified. They will be applied as follows:
A,B,C,D,...-->A,func(B,A),func(C,B),func(D,C),...
For example, to do progressive division:
>>>iterable=[1,2,6,24,120]>>>func=lambdax,y:x//y>>>list(difference(iterable,func))[1, 2, 3, 4, 5]
If theinitial keyword is set, the first element will be skipped whencomputing successive differences.
>>>it=[10,11,13,16]# from accumulate([1, 2, 3], initial=10)>>>list(difference(it,initial=10))[1, 2, 3]
Return a decorator version ofwrapping_func, which is a function thatmodifies an iterable.result_index is the position in that function’ssignature where the iterable goes.
This lets you use itertools on the “production end,” i.e. at functiondefinition. This can augment what the function returns without changing thefunction’s code.
For example, to produce a decorator version ofchunked():
>>>frommore_itertoolsimportchunked>>>chunker=make_decorator(chunked,result_index=0)>>>@chunker(3)...defiter_range(n):...returniter(range(n))...>>>list(iter_range(9))[[0, 1, 2], [3, 4, 5], [6, 7, 8]]
To only allow truthy items to be returned:
>>>truth_serum=make_decorator(filter,result_index=1)>>>@truth_serum(bool)...defboolean_test():...return[0,1,'',' ',False,True]...>>>list(boolean_test())[1, ' ', True]
Thepeekable() andseekable() wrappers make for practicaldecorators:
>>>frommore_itertoolsimportpeekable>>>peekable_function=make_decorator(peekable)>>>@peekable_function()...defstr_range(*args):...return(str(x)forxinrange(*args))...>>>it=str_range(1,20,2)>>>next(it),next(it),next(it)('1', '3', '5')>>>it.peek()'7'>>>next(it)'7'
Return a read-only view of the sequence objecttarget.
SequenceView objects are analogous to Python’s built-in“dictionary view” types. They provide a dynamic view of a sequence’s items,meaning that when the sequence updates, so does the view.
>>>seq=['0','1','2']>>>view=SequenceView(seq)>>>viewSequenceView(['0', '1', '2'])>>>seq.append('3')>>>viewSequenceView(['0', '1', '2', '3'])
Sequence views support indexing, slicing, and length queries. They actlike the underlying sequence, except they don’t allow assignment:
>>>view[1]'1'>>>view[1:-1]['1', '2']>>>len(view)4
Sequence views are useful as an alternative to copying, as they don’trequire (much) extra storage.
Yield items fromiterable untillimit_seconds have passed.If the time limit expires before all items have been yielded, thetimed_out parameter will be set toTrue.
>>>fromtimeimportsleep>>>defgenerator():...yield1...yield2...sleep(0.2)...yield3>>>iterable=time_limited(0.1,generator())>>>list(iterable)[1, 2]>>>iterable.timed_outTrue
Note that the time is checked before each item is yielded, and iterationstops if the time elapsed is greater thanlimit_seconds. If your timelimit is 1 second, but it takes 2 seconds to generate the first item fromthe iterable, the function will run for 2 seconds and not yield anything.As a special case, whenlimit_seconds is zero, the iterator neverreturns anything.
Evaluate each item fromiterable usingpred. If the result isequivalent toTrue, transform the item withfunc and yield it.Otherwise, transform the item withfunc_else and yield it.
pred,func, andfunc_else should each be functions that acceptone argument. By default,func_else is the identity function.
>>>frommathimportsqrt>>>iterable=list(range(-5,5))>>>iterable[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]>>>list(map_if(iterable,lambdax:x>3,lambdax:'toobig'))[-5, -4, -3, -2, -1, 0, 1, 2, 3, 'toobig']>>>list(map_if(iterable,lambdax:x>=0,...lambdax:f'{sqrt(x):.2f}',lambdax:None))[None, None, None, None, None, '0.00', '1.00', '1.41', '1.73', '2.00']
Applyfunc to every item ofiterable by dictionary unpackingthe item intofunc.
The difference betweenitertools.starmap() anddoublestarmap()parallels the distinction betweenfunc(*a) andfunc(**a).
>>>iterable=[{'a':1,'b':2},{'a':40,'b':60}]>>>list(doublestarmap(lambdaa,b:a+b,iterable))[3, 100]
TypeError will be raised iffunc’s signature doesn’t match themapping contained initerable or ifiterable does not contain mappings.
Itertools recipes
Yield the index of each place initerable thatvalue occurs,beginning with indexstart and ending before indexstop.
>>>list(iter_index('AABCADEAF','A'))[0, 1, 4, 7]>>>list(iter_index('AABCADEAF','A',1))# start index is inclusive[1, 4, 7]>>>list(iter_index('AABCADEAF','A',1,7))# stop index is not inclusive[1, 4]
The behavior for non-scalarvalues matches the built-in Python types.
>>>list(iter_index('ABCDABCD','AB'))[0, 4]>>>list(iter_index([0,1,2,3,0,1,2,3],[0,1]))[]>>>list(iter_index([[0,1],[2,3],[0,1],[2,3]],[0,1]))[0, 2]
Seelocate() for a more general means of finding the indexesassociated with particular values.
Advanceiterable byn steps. Ifn isNone, consume itentirely.
Efficiently exhausts an iterator without returning values. Defaults toconsuming the whole iterator, but an optional second argument may beprovided to limit consumption.
>>>i=(xforxinrange(10))>>>next(i)0>>>consume(i,3)>>>next(i)4>>>consume(i)>>>next(i)Traceback (most recent call last): File"<stdin>", line1, in<module>StopIteration
If the iterator has fewer items remaining than the provided limit, thewhole iterator will be consumed.
>>>i=(xforxinrange(3))>>>consume(i,5)>>>next(i)Traceback (most recent call last): File"<stdin>", line1, in<module>StopIteration
Return an iterator over the results offunc(start),func(start+1),func(start+2)…
func should be a function that accepts one integer argument.
Ifstart is not specified it defaults to 0. It will be incremented eachtime the iterator is advanced.
>>>square=lambdax:x**2>>>iterator=tabulate(square,-3)>>>take(4,iterator)[9, 4, 1, 0]
Callfunc withargs repeatedly, returning an iterable over theresults.
Iftimes is specified, the iterable will terminate after that manyrepetitions:
>>>fromoperatorimportadd>>>times=4>>>args=3,5>>>list(repeatfunc(add,times,*args))[8, 8, 8, 8]
Iftimes isNone the iterable will not terminate:
>>>fromrandomimportrandrange>>>times=None>>>args=1,11>>>take(6,repeatfunc(randrange,times,*args))[2, 4, 8, 1, 8, 4]
Change the shape of amatrix.
Ifshape is an integer, the matrix must be two dimensionaland the shape is interpreted as the desired number of columns:
>>>matrix=[(0,1),(2,3),(4,5)]>>>cols=3>>>list(reshape(matrix,cols))[(0, 1, 2), (3, 4, 5)]
Ifshape is a tuple (or other iterable), the input matrix can haveany number of dimensions. It will first be flattened and then rebuiltto the desired shape which can also be multidimensional:
>>>matrix=[(0,1),(2,3),(4,5)]# Start with a 3 x 2 matrix
>>>list(reshape(matrix,(2,3)))# Make a 2 x 3 matrix[(0, 1, 2), (3, 4, 5)]
>>>list(reshape(matrix,(6,)))# Make a vector of length six[0, 1, 2, 3, 4, 5]
>>>list(reshape(matrix,(2,1,3,1)))# Make 2 x 1 x 3 x 1 tensor[(((0,), (1,), (2,)),), (((3,), (4,), (5,)),)]
Each dimension is assumed to be uniform, either all arrays or all scalars.Flattening stops when the first value in a dimension is a scalar.Scalars are bytes, strings, and non-iterables.The reshape iterator stops when the requested shape is completeor when the input is exhausted, whichever comes first.
islice_extendedfirst()last()one()only()strictly_n()strip()lstrip()rstrip()filter_except()map_except()filter_map()iter_suppress()nth_or_last()extract()unique_in_window()duplicates_everseen()duplicates_justseen()classify_unique()longest_common_prefix()takewhile_inclusive()nth()before_and_after()take()tail()unique_everseen()unique_justseen()unique()distinct_permutations()distinct_combinations()nth_combination_with_replacement()circular_shifts()partitions()set_partitions()product_index()combination_index()permutation_index()combination_with_replacement_index()derangements()gray_product()outer_product()powerset_of_sets()powerset()random_product()random_permutation()random_combination()random_combination_with_replacement()nth_product()nth_permutation()nth_combination()