Functional Programming HOWTO

Author:

A. M. Kuchling

Release:

0.32

In this document, we'll take a tour of Python's features suitable for implementing programs in a functional style. After an introduction to the concepts of functional programming, we'll look at language features such as iterators and generators and relevant library modules such as itertools and functools.

Introduction

This section explains the basic concept of functional programming; if you're just interested in learning about Python language features, skip to the next section on iterators.

Programming languages support decomposing problems in several different ways:

  • Most programming languages are procedural: programs are lists of instructions that tell the computer what to do with the program's input. C, Pascal, and even Unix shells are procedural languages.

  • In declarative languages, you write a specification that describes the problem to be solved, and the language implementation figures out how to perform the computation efficiently. SQL is the declarative language you're most likely to be familiar with; a SQL query describes the data set you want to retrieve, and the SQL engine decides whether to scan tables or use indexes, which subclauses should be performed first, etc.

  • Object-oriented programs manipulate collections of objects. Objects have internal state and support methods that query or modify this internal state in some way. Smalltalk and Java are object-oriented languages. C++ and Python are languages that support object-oriented programming, but don't force the use of object-oriented features.

  • Functional programming decomposes a problem into a set of functions. Ideally, functions only take inputs and produce outputs, and don't have any internal state that affects the output produced for a given input. Well-known functional languages include the ML family (Standard ML, OCaml, and other variants) and Haskell.

The designers of some computer languages choose to emphasize one particular approach to programming. This often makes it difficult to write programs that use a different approach. Other languages are multi-paradigm languages that support several different approaches. Lisp, C++, and Python are multi-paradigm; you can write programs or libraries that are largely procedural, object-oriented, or functional in all of these languages. In a large program, different sections might be written using different approaches; the GUI might be object-oriented while the processing logic is procedural or functional, for example.

In a functional program, input flows through a set of functions. Each function operates on its input and produces some output. Functional style discourages functions with side effects that modify internal state or make other changes that aren't visible in the function's return value. Functions that have no side effects at all are called purely functional. Avoiding side effects means not using data structures that get updated as a program runs; every function's output must only depend on its input.

Some languages are very strict about purity and don't even have assignment statements such as a = 3 or c = a + b, but it's difficult to avoid all side effects, such as printing to the screen or writing to a disk file. Another example is a call to the print() or time.sleep() function, neither of which returns a useful value. Both are called only for their side effects of sending some text to the screen or pausing execution for a second.

Python programs written in functional style usually won't go to the extreme of avoiding all I/O or all assignments; instead, they'll provide a functional-appearing interface but will use non-functional features internally. For example, the implementation of a function will still use assignments to local variables, but won't modify global variables or have other side effects.
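To make the contrast concrete, here is a small illustration (the function names are just examples, not from the original text): the first function's result depends only on its arguments, while the second reaches outside its own scope.

# A pure function: the output depends only on the inputs.
def add(a, b):
    return a + b

# An impure function: it modifies a global variable as a side effect.
total = 0

def add_to_total(amount):
    global total
    total += amount   # side effect visible outside the function
    return total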

Functional programming can be considered the opposite of object-oriented programming. Objects are little capsules containing some internal state along with a collection of method calls that let you modify this state, and programs consist of making the right set of state changes. Functional programming wants to avoid state changes as much as possible and works with data flowing between functions. In Python you might combine the two approaches by writing functions that take and return instances representing objects in your application (e-mail messages, transactions, etc.).

Functional design may seem like an odd constraint to work under. Why should you avoid objects and side effects? There are theoretical and practical advantages to the functional style:

  • Formal provability.

  • Modularity.

  • Composability.

  • Ease of debugging and testing.

Formal provability

A theoretical benefit is that it's easier to construct a mathematical proof that a functional program is correct.

For a long time researchers have been interested in finding ways to mathematically prove programs correct. This is different from testing a program on numerous inputs and concluding that its output is usually correct, or reading a program's source code and concluding that the code looks right; the goal is instead a rigorous proof that a program produces the right result for all possible inputs.

The technique used to prove programs correct is to write down invariants, properties of the input data and of the program's variables that are always true. For each line of code, you then show that if invariants X and Y are true before the line is executed, the slightly different invariants X' and Y' are true after the line is executed. This continues until you reach the end of the program, at which point the invariants should match the desired conditions on the program's output.

Functional programming's avoidance of assignments arose because assignments are difficult to handle with this technique; assignments can break invariants that were true before the assignment without producing any new invariants that can be propagated onward.

Unfortunately, proving programs correct is largely impractical and not relevant to Python software. Even trivial programs require proofs that are several pages long; the proof of correctness for a moderately complicated program would be enormous, and few or none of the programs you use daily (the Python interpreter, your XML parser, your web browser) could be proven correct. Even if you wrote down or generated a proof, there would then be the question of verifying the proof; maybe there's an error in it, and you wrongly believe you've proved the program correct.

Modularity

A more practical benefit of functional programming is that it forces you to break apart your problem into small pieces. Programs are more modular as a result. It's easier to specify and write a small function that does one thing than a large function that performs a complicated transformation. Small functions are also easier to read and to check for errors.

Ease of debugging and testing

Testing and debugging a functional-style program is easier.

Debugging is simplified because functions are generally small and clearly specified. When a program doesn't work, each function is an interface point where you can check that the data are correct. You can look at the intermediate inputs and outputs to quickly isolate the function that's responsible for a bug.

Testing is easier because each function is a potential subject for a unit test. Functions don't depend on system state that needs to be replicated before running a test; instead you only have to synthesize the right input and then check that the output matches expectations.
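As a small sketch of what such a test might look like (the function and test names here are hypothetical, not from the original article):

def format_name(first, last):
    """A pure function: its result depends only on its arguments."""
    return last + ', ' + first

def test_format_name():
    # No system state to set up; just supply an input and check the output.
    assert format_name('Grace', 'Hopper') == 'Hopper, Grace'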

Composability

As you work on a functional-style program, you'll write a number of functions with varying inputs and outputs. Some of these functions will be unavoidably specialized to a particular application, but others will be useful in a wide variety of programs. For example, a function that takes a directory path and returns all the XML files in the directory, or a function that takes a filename and returns its contents, can be applied to many different situations.

Over time you'll form a personal library of utilities. Often you'll assemble new programs by arranging existing functions in a new configuration and writing a few functions specialized for the current task.

Iterators

I'll start by looking at a Python language feature that's an important foundation for writing functional-style programs: iterators.

An iterator is an object representing a stream of data; this object returns the data one element at a time. A Python iterator must support a method called __next__() that takes no arguments and always returns the next element of the stream. If there are no more elements in the stream, __next__() must raise the StopIteration exception. Iterators don't have to be finite, though; it's perfectly reasonable to write an iterator that produces an infinite stream of data.

The built-in iter() function takes an arbitrary object and tries to return an iterator that will return the object's contents or elements, raising TypeError if the object doesn't support iteration. Several of Python's built-in data types support iteration, the most common being lists and dictionaries. An object is called iterable if you can get an iterator for it.

You can experiment with the iteration interface manually:

>>> L = [1, 2, 3]
>>> it = iter(L)
>>> it
<...iterator object at ...>
>>> it.__next__()  # same as next(it)
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

Python expects iterable objects in several different contexts, the most important being the for statement. In the statement for X in Y, Y must be an iterator or some object for which iter() can create an iterator. These two statements are equivalent:

for i in iter(obj):
    print(i)

for i in obj:
    print(i)

Iterators can be materialized as lists or tuples by using the list() or tuple() constructor functions:

>>> L = [1, 2, 3]
>>> iterator = iter(L)
>>> t = tuple(iterator)
>>> t
(1, 2, 3)

Sequence unpacking also supports iterators: if you know an iterator will return N elements, you can unpack them into an N-tuple:

>>> L = [1, 2, 3]
>>> iterator = iter(L)
>>> a, b, c = iterator
>>> a, b, c
(1, 2, 3)

Built-in functions such as max() and min() can take a single iterator argument and will return the largest or smallest element. The "in" and "not in" operators also support iterators: X in iterator is true if X is found in the stream returned by the iterator. You'll run into obvious problems if the iterator is infinite; max(), min() will never return, and if the element X never appears in the stream, the "in" and "not in" operators won't return either.
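For instance, here's a quick illustration of using these built-ins and operators with plain iterators:

>>> max(iter([3, 7, 2]))
7
>>> 5 in iter(range(10))
True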

Note that you can only go forward in an iterator; there's no way to get the previous element, reset the iterator, or make a copy of it. Iterator objects can optionally provide these additional capabilities, but the iterator protocol only specifies the __next__() method. Functions may therefore consume all of the iterator's output, and if you need to do something different with the same stream, you'll have to create a new iterator.

Data Types That Support Iterators

We've already seen how lists and tuples support iterators. In fact, any Python sequence type, such as strings, will automatically support creation of an iterator.
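For instance, iterating over a string yields its characters one at a time:

>>> it = iter('abc')
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'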

Calling iter() on a dictionary returns an iterator that will loop over the dictionary's keys:

>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
...      'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>>> for key in m:
...     print(key, m[key])
Jan 1
Feb 2
Mar 3
Apr 4
May 5
Jun 6
Jul 7
Aug 8
Sep 9
Oct 10
Nov 11
Dec 12

Note that starting with Python 3.7, dictionary iteration order is guaranteedto be the same as the insertion order. In earlier versions, the behaviour wasunspecified and could vary between implementations.

Applying iter() to a dictionary always loops over the keys, but dictionaries have methods that return other iterators. If you want to iterate over values or key/value pairs, you can explicitly call the values() or items() methods to get an appropriate iterator.
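For example, with a small dictionary:

>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3}
>>> list(m.values())
[1, 2, 3]
>>> for key, value in m.items():
...     print(key, value)
Jan 1
Feb 2
Mar 3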

The dict() constructor can accept an iterator that returns a finite stream of (key, value) tuples:

>>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
>>> dict(iter(L))
{'Italy': 'Rome', 'France': 'Paris', 'US': 'Washington DC'}

Files also support iteration by calling the readline() method until there are no more lines in the file. This means you can read each line of a file like this:

for line in file:
    # do something for each line
    ...

Sets can take their contents from an iterable and let you iterate over the set's elements:

>>> S = {2, 3, 5, 7, 11, 13}
>>> for i in S:
...     print(i)
2
3
5
7
11
13

Generator expressions and list comprehensions

Two common operations on an iterator's output are 1) performing some operation for every element, 2) selecting a subset of elements that meet some condition. For example, given a list of strings, you might want to strip off trailing whitespace from each line or extract all the strings containing a given substring.

List comprehensions and generator expressions (short form: "listcomps" and "genexps") are a concise notation for such operations, borrowed from the functional programming language Haskell (https://www.haskell.org/). You can strip all the whitespace from a stream of strings with the following code:

>>> line_list = ['  line 1\n', 'line 2\n', '\n', '']

>>> # Generator expression -- returns iterator
>>> stripped_iter = (line.strip() for line in line_list)

>>> # List comprehension -- returns list
>>> stripped_list = [line.strip() for line in line_list]

You can select only certain elements by adding an "if" condition:

>>> stripped_list = [line.strip() for line in line_list
...                   if line != ""]

With a list comprehension, you get back a Python list; stripped_list is a list containing the resulting lines, not an iterator. Generator expressions return an iterator that computes the values as necessary, not needing to materialize all the values at once. This means that list comprehensions aren't useful if you're working with iterators that return an infinite stream or a very large amount of data. Generator expressions are preferable in these situations.

Generator expressions are surrounded by parentheses ("()") and list comprehensions are surrounded by square brackets ("[]"). Generator expressions have the form:

( expression for expr in sequence1
             if condition1
             for expr2 in sequence2
             if condition2
             for expr3 in sequence3
             ...
             if condition3
             for exprN in sequenceN
             if conditionN )

Again, for a list comprehension only the outside brackets are different (square brackets instead of parentheses).

The elements of the generated output will be the successive values of expression. The if clauses are all optional; if present, expression is only evaluated and added to the result when condition is true.

Generator expressions always have to be written inside parentheses, but the parentheses signalling a function call also count. If you want to create an iterator that will be immediately passed to a function you can write:

obj_total = sum(obj.count for obj in list_all_objects())

The for...in clauses contain the sequences to be iterated over. The sequences do not have to be the same length, because they are iterated over from left to right, not in parallel. For each element in sequence1, sequence2 is looped over from the beginning. sequence3 is then looped over for each resulting pair of elements from sequence1 and sequence2.

To put it another way, a list comprehension or generator expression isequivalent to the following Python code:

for expr1 in sequence1:
    if not (condition1):
        continue   # Skip this element

    for expr2 in sequence2:
        if not (condition2):
            continue   # Skip this element

        ...

        for exprN in sequenceN:
            if not (conditionN):
                continue   # Skip this element

            # Output the value of
            # the expression.

This means that when there are multiple for...in clauses but no if clauses, the length of the resulting output will be equal to the product of the lengths of all the sequences. If you have two lists of length 3, the output list is 9 elements long:

>>> seq1 = 'abc'
>>> seq2 = (1, 2, 3)
>>> [(x, y) for x in seq1 for y in seq2]
[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1), ('c', 2), ('c', 3)]

To avoid introducing an ambiguity into Python's grammar, if expression is creating a tuple, it must be surrounded with parentheses. The first list comprehension below is a syntax error, while the second one is correct:

# Syntax error
[x, y for x in seq1 for y in seq2]
# Correct
[(x, y) for x in seq1 for y in seq2]

Generators

Generators are a special class of functions that simplify the task of writing iterators. Regular functions compute a value and return it, but generators return an iterator that returns a stream of values.

You're doubtless familiar with how regular function calls work in Python or C. When you call a function, it gets a private namespace where its local variables are created. When the function reaches a return statement, the local variables are destroyed and the value is returned to the caller. A later call to the same function creates a new private namespace and a fresh set of local variables. But, what if the local variables weren't thrown away on exiting a function? What if you could later resume the function where it left off? This is what generators provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

>>> def generate_ints(N):
...    for i in range(N):
...        yield i

Any function containing a yield keyword is a generator function; this is detected by Python's bytecode compiler which compiles the function specially as a result.

When you call a generator function, it doesn't return a single value; instead it returns a generator object that supports the iterator protocol. On executing the yield expression, the generator outputs the value of i, similar to a return statement. The big difference between yield and a return statement is that on reaching a yield the generator's state of execution is suspended and local variables are preserved. On the next call to the generator's __next__() method, the function will resume executing.

Here's a sample usage of the generate_ints() generator:

>>> gen = generate_ints(3)
>>> gen
<generator object generate_ints at ...>
>>> next(gen)
0
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
  File "stdin", line 1, in <module>
  File "stdin", line 2, in generate_ints
StopIteration

You could equally write for i in generate_ints(5), or a, b, c = generate_ints(3).

Inside a generator function, return value causes StopIteration(value) to be raised from the __next__() method. Once this happens, or the bottom of the function is reached, the procession of values ends and the generator cannot yield any further values.
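A quick sketch of how this looks in practice (this example is illustrative, not part of the original article): the returned value is attached to the StopIteration exception.

>>> def gen():
...     yield 1
...     return 'done'
...
>>> it = gen()
>>> next(it)
1
>>> next(it)
Traceback (most recent call last):
  ...
StopIteration: done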

You could achieve the effect of generators manually by writing your own class and storing all the local variables of the generator as instance variables. For example, returning a list of integers could be done by setting self.count to 0, and having the __next__() method increment self.count and return it. However, for a moderately complicated generator, writing a corresponding class can be much messier.
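Here is a rough sketch of such a hand-written iterator class (the class name and its bound parameter are illustrative, not from the original text):

class Count:
    """Hand-written rough equivalent of iter(range(bound))."""

    def __init__(self, bound):
        self.count = 0
        self.bound = bound

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.bound:
            raise StopIteration
        value = self.count
        self.count += 1     # update the stored state by hand
        return value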

The test suite included with Python's library, Lib/test/test_generators.py, contains a number of more interesting examples. Here's one generator that implements an in-order traversal of a tree using generators recursively.

# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x

        yield t.label

        for x in inorder(t.right):
            yield x

Two other examples in test_generators.py produce solutions for the N-Queens problem (placing N queens on an NxN chess board so that no queen threatens another) and the Knight's Tour (finding a route that takes a knight to every square of an NxN chessboard without visiting any square twice).

Passing values into a generator

In Python 2.4 and earlier, generators only produced output. Once a generator's code was invoked to create an iterator, there was no way to pass any new information into the function when its execution is resumed. You could hack together this ability by making the generator look at a global variable or by passing in some mutable object that callers then modify, but these approaches are messy.

In Python 2.5 there's a simple way to pass values into a generator. yield became an expression, returning a value that can be assigned to a variable or otherwise operated on:

val = (yield i)

I recommend that you always put parentheses around a yield expression when you're doing something with the returned value, as in the above example. The parentheses aren't always necessary, but it's easier to always add them instead of having to remember when they're needed.

(PEP 342 explains the exact rules, which are that a yield-expression must always be parenthesized except when it occurs at the top-level expression on the right-hand side of an assignment. This means you can write val = yield i but have to use parentheses when there's an operation, as in val = (yield i) + 12.)

Values are sent into a generator by calling its send(value) method. This method resumes the generator's code and the yield expression returns the specified value. If the regular __next__() method is called, the yield returns None.

Here's a simple counter that increments by 1 and allows changing the value ofthe internal counter.

def counter(maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1

And here's an example of changing the counter:

>>> it = counter(10)
>>> next(it)
0
>>> next(it)
1
>>> it.send(8)
8
>>> next(it)
9
>>> next(it)
Traceback (most recent call last):
  File "t.py", line 15, in <module>
    it.next()
StopIteration

Because yield will often be returning None, you should always check for this case. Don't just use its value in expressions unless you're sure that the send() method will be the only method used to resume your generator function.

In addition to send(), there are two other methods on generators:

  • throw(value) is used to raise an exception inside the generator; the exception is raised by the yield expression where the generator's execution is paused.

  • close() raises a GeneratorExit exception inside the generator to terminate the iteration. On receiving this exception, the generator's code must either raise GeneratorExit or StopIteration; catching the exception and doing anything else is illegal and will trigger a RuntimeError. close() will also be called by Python's garbage collector when the generator is garbage-collected.

    If you need to run cleanup code when a GeneratorExit occurs, I suggest using a try: ... finally: suite instead of catching GeneratorExit, as sketched below.
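For instance, a minimal sketch of such cleanup (the generator here is illustrative):

>>> def gen():
...     try:
...         yield 1
...         yield 2
...     finally:
...         print('cleaning up')
...
>>> it = gen()
>>> next(it)
1
>>> it.close()
cleaning up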

The cumulative effect of these changes is to turn generators from one-way producers of information into both producers and consumers.

Generators also become coroutines, a more generalized form of subroutines. Subroutines are entered at one point and exited at another point (the top of the function, and a return statement), but coroutines can be entered, exited, and resumed at many different points (the yield statements).

Built-in functions

Let's look in more detail at built-in functions often used with iterators.

Two of Python's built-in functions, map() and filter(), duplicate the features of generator expressions:

map(f, iterA, iterB, ...) returns an iterator over the sequence

 f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...

>>> def upper(s):
...     return s.upper()

>>> list(map(upper, ['sentence', 'fragment']))
['SENTENCE', 'FRAGMENT']
>>> [upper(s) for s in ['sentence', 'fragment']]
['SENTENCE', 'FRAGMENT']

You can of course achieve the same effect with a list comprehension.

filter(predicate, iter) returns an iterator over all the sequence elements that meet a certain condition, and is similarly duplicated by list comprehensions. A predicate is a function that returns the truth value of some condition; for use with filter(), the predicate must take a single value.

>>> def is_even(x):
...     return (x % 2) == 0

>>> list(filter(is_even, range(10)))
[0, 2, 4, 6, 8]

This can also be written as a list comprehension:

>>> list(x for x in range(10) if is_even(x))
[0, 2, 4, 6, 8]

enumerate(iter, start=0) counts off the elements in the iterable returning 2-tuples containing the count (from start) and each element.

>>> for item in enumerate(['subject', 'verb', 'object']):
...     print(item)
(0, 'subject')
(1, 'verb')
(2, 'object')

enumerate() is often used when looping through a list and recording the indexes at which certain conditions are met:

f = open('data.txt', 'r')
for i, line in enumerate(f):
    if line.strip() == '':
        print('Blank line at line #%i' % i)

sorted(iterable, key=None, reverse=False) collects all the elements of the iterable into a list, sorts the list, and returns the sorted result. The key and reverse arguments are passed through to the constructed list's sort() method.

>>> import random
>>> # Generate 8 random numbers between [0, 10000)
>>> rand_list = random.sample(range(10000), 8)
>>> rand_list
[769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
>>> sorted(rand_list)
[769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
>>> sorted(rand_list, reverse=True)
[9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
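The key argument accepts any callable; for instance, a quick illustration sorting strings by their length:

>>> words = ['banana', 'fig', 'apple']
>>> sorted(words, key=len)
['fig', 'apple', 'banana']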

(For a more detailed discussion of sorting, see Sorting Techniques.)

The any(iter) and all(iter) built-ins look at the truth values of an iterable's contents. any() returns True if any element in the iterable is a true value, and all() returns True if all of the elements are true values:

>>> any([0, 1, 0])
True
>>> any([0, 0, 0])
False
>>> any([1, 1, 1])
True
>>> all([0, 1, 0])
False
>>> all([0, 0, 0])
False
>>> all([1, 1, 1])
True

zip(iterA, iterB, ...) takes one element from each iterable and returns them in a tuple:

zip(['a', 'b', 'c'], (1, 2, 3)) =>
  ('a', 1), ('b', 2), ('c', 3)

It doesn't construct an in-memory list and exhaust all the input iterators before returning; instead tuples are constructed and returned only if they're requested. (The technical term for this behaviour is lazy evaluation.)

This iterator is intended to be used with iterables that are all of the same length. If the iterables are of different lengths, the resulting stream will be the same length as the shortest iterable.

zip(['a', 'b'], (1, 2, 3)) =>
  ('a', 1), ('b', 2)

You should avoid doing this, though, because an element may be taken from the longer iterators and discarded. This means you can't go on to use the iterators further because you risk skipping a discarded element.

The itertools module

The itertools module contains a number of commonly used iterators as well as functions for combining several iterators. This section will introduce the module's contents by showing small examples.

The module's functions fall into a few broad classes:

  • Functions that create a new iterator based on an existing iterator.

  • Functions for treating an iterator's elements as function arguments.

  • Functions for selecting portions of an iterator's output.

  • A function for grouping an iterator's output.

Creating new iterators

itertools.count(start, step) returns an infinite stream of evenly spaced values. You can optionally supply the starting number, which defaults to 0, and the interval between numbers, which defaults to 1:

itertools.count() =>
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
itertools.count(10) =>
  10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
itertools.count(10, 5) =>
  10, 15, 20, 25, 30, 35, 40, 45, 50, 55, ...

itertools.cycle(iter) saves a copy of the contents of a provided iterable and returns a new iterator that returns its elements from first to last. The new iterator will repeat these elements infinitely.

itertools.cycle([1, 2, 3, 4, 5]) =>
  1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...

itertools.repeat(elem, [n]) returns the provided element n times, or returns the element endlessly if n is not provided.

itertools.repeat('abc') =>
  abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
itertools.repeat('abc', 5) =>
  abc, abc, abc, abc, abc

itertools.chain(iterA, iterB, ...) takes an arbitrary number of iterables as input, and returns all the elements of the first iterator, then all the elements of the second, and so on, until all of the iterables have been exhausted.

itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
  a, b, c, 1, 2, 3

itertools.islice(iter, [start], stop, [step]) returns a stream that's a slice of the iterator. With a single stop argument, it will return the first stop elements. If you supply a starting index, you'll get stop - start elements, and if you supply a value for step, elements will be skipped accordingly. Unlike Python's string and list slicing, you can't use negative values for start, stop, or step.

itertools.islice(range(10), 8) =>
  0, 1, 2, 3, 4, 5, 6, 7
itertools.islice(range(10), 2, 8) =>
  2, 3, 4, 5, 6, 7
itertools.islice(range(10), 2, 8, 2) =>
  2, 4, 6

itertools.tee(iter, [n]) replicates an iterator; it returns n independent iterators that will all return the contents of the source iterator. If you don't supply a value for n, the default is 2. Replicating iterators requires saving some of the contents of the source iterator, so this can consume significant memory if the iterator is large and one of the new iterators is consumed more than the others.

itertools.tee(itertools.count()) =>
  iterA, iterB

where iterA ->
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...

and iterB ->
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...

Calling functions on elements

The operator module contains a set of functions corresponding to Python's operators. Some examples are operator.add(a, b) (adds two values), operator.ne(a, b) (same as a != b), and operator.attrgetter('id') (returns a callable that fetches the .id attribute).

itertools.starmap(func, iter) assumes that the iterable will return a stream of tuples, and calls func using these tuples as the arguments:

itertools.starmap(os.path.join,
                  [('/bin', 'python'), ('/usr', 'bin', 'java'),
                   ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')])
=>
  /bin/python, /usr/bin/java, /usr/bin/perl, /usr/bin/ruby

Selecting elements

Another group of functions chooses a subset of an iterator's elements based on a predicate.

itertools.filterfalse(predicate, iter) is the opposite of filter(), returning all elements for which the predicate returns false:

itertools.filterfalse(is_even, itertools.count()) =>
  1, 3, 5, 7, 9, 11, 13, 15, ...

itertools.takewhile(predicate, iter) returns elements for as long as the predicate returns true. Once the predicate returns false, the iterator will signal the end of its results.

def less_than_10(x):
    return x < 10

itertools.takewhile(less_than_10, itertools.count()) =>
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9

itertools.takewhile(is_even, itertools.count()) =>
  0

itertools.dropwhile(predicate, iter) discards elements while the predicate returns true, and then returns the rest of the iterable's results.

itertools.dropwhile(less_than_10, itertools.count()) =>
  10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...

itertools.dropwhile(is_even, itertools.count()) =>
  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...

itertools.compress(data, selectors) takes two iterators and returns only those elements of data for which the corresponding element of selectors is true, stopping whenever either one is exhausted:

itertools.compress([1, 2, 3, 4, 5], [True, True, False, False, True]) =>
  1, 2, 5

Combinatoric functions

The itertools.combinations(iterable, r) returns an iterator giving all possible r-tuple combinations of the elements contained in iterable.

itertools.combinations([1, 2, 3, 4, 5], 2) =>
  (1, 2), (1, 3), (1, 4), (1, 5),
  (2, 3), (2, 4), (2, 5),
  (3, 4), (3, 5),
  (4, 5)

itertools.combinations([1, 2, 3, 4, 5], 3) =>
  (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5),
  (2, 3, 4), (2, 3, 5), (2, 4, 5),
  (3, 4, 5)

The elements within each tuple remain in the same order as iterable returned them. For example, the number 1 is always before 2, 3, 4, or 5 in the examples above. A similar function, itertools.permutations(iterable, r=None), removes this constraint on the order, returning all possible arrangements of length r:

itertools.permutations([1, 2, 3, 4, 5], 2) =>
  (1, 2), (1, 3), (1, 4), (1, 5),
  (2, 1), (2, 3), (2, 4), (2, 5),
  (3, 1), (3, 2), (3, 4), (3, 5),
  (4, 1), (4, 2), (4, 3), (4, 5),
  (5, 1), (5, 2), (5, 3), (5, 4)

itertools.permutations([1, 2, 3, 4, 5]) =>
  (1, 2, 3, 4, 5), (1, 2, 3, 5, 4), (1, 2, 4, 3, 5),
  ...
  (5, 4, 3, 2, 1)

If you don't supply a value for r the length of the iterable is used, meaning that all the elements are permuted.

Note that these functions produce all of the possible combinations by position and don't require that the contents of iterable are unique:

itertools.permutations('aba', 3) =>
  ('a', 'b', 'a'), ('a', 'a', 'b'), ('b', 'a', 'a'),
  ('b', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a')

The identical tuple ('a', 'a', 'b') occurs twice, but the two 'a' strings came from different positions.

The itertools.combinations_with_replacement(iterable, r) function relaxes a different constraint: elements can be repeated within a single tuple. Conceptually an element is selected for the first position of each tuple and then is replaced before the second element is selected.

itertools.combinations_with_replacement([1, 2, 3, 4, 5], 2) =>
  (1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
  (2, 2), (2, 3), (2, 4), (2, 5),
  (3, 3), (3, 4), (3, 5),
  (4, 4), (4, 5),
  (5, 5)

Grouping elements

The last function I'll discuss, itertools.groupby(iter, key_func=None), is the most complicated. key_func(elem) is a function that can compute a key value for each element returned by the iterable. If you don't supply a key function, the key is simply each element itself.

groupby() collects all the consecutive elements from the underlying iterable that have the same key value, and returns a stream of 2-tuples containing a key value and an iterator for the elements with that key.

city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
             ('Anchorage', 'AK'), ('Nome', 'AK'),
             ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
             ...
            ]

def get_state(city_state):
    return city_state[1]

itertools.groupby(city_list, get_state) =>
  ('AL', iterator-1),
  ('AK', iterator-2),
  ('AZ', iterator-3), ...

where
iterator-1 =>
  ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
iterator-2 =>
  ('Anchorage', 'AK'), ('Nome', 'AK')
iterator-3 =>
  ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')

groupby() assumes that the underlying iterable's contents will already be sorted based on the key. Note that the returned iterators also use the underlying iterable, so you have to consume the results of iterator-1 before requesting iterator-2 and its corresponding key.
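To see the sorting requirement in practice, here is a small sketch (the data and helper mirror the example above, shortened and not taken verbatim from the original): sort on the key first, then group.

import itertools

def get_state(city_state):
    return city_state[1]

city_list = [('Phoenix', 'AZ'), ('Decatur', 'AL'), ('Nome', 'AK'),
             ('Tucson', 'AZ'), ('Selma', 'AL')]

# groupby() only groups consecutive elements, so sort on the key first.
for state, cities in itertools.groupby(sorted(city_list, key=get_state),
                                       get_state):
    print(state, [city for city, _ in cities])

# Prints:
# AK ['Nome']
# AL ['Decatur', 'Selma']
# AZ ['Phoenix', 'Tucson']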

The functools module

The functools module contains some higher-order functions. A higher-order function takes one or more functions as input and returns a new function. The most useful tool in this module is the functools.partial() function.

For programs written in a functional style, you'll sometimes want to construct variants of existing functions that have some of the parameters filled in. Consider a Python function f(a, b, c); you may wish to create a new function g(b, c) that's equivalent to f(1, b, c); you're filling in a value for one of f()'s parameters. This is called "partial function application".

The constructor for partial() takes the arguments (function, arg1, arg2, ..., kwarg1=value1, kwarg2=value2). The resulting object is callable, so you can just call it to invoke function with the filled-in arguments.

Here's a small but realistic example:

import functools

def log(message, subsystem):
    """Write the contents of 'message' to the specified subsystem."""
    print('%s: %s' % (subsystem, message))
...

server_log = functools.partial(log, subsystem='server')
server_log('Unable to open socket')

functools.reduce(func, iter, [initial_value]) cumulatively performs an operation on all the iterable's elements and, therefore, can't be applied to infinite iterables. func must be a function that takes two elements and returns a single value. functools.reduce() takes the first two elements A and B returned by the iterator and calculates func(A, B). It then requests the third element, C, calculates func(func(A, B), C), combines this result with the fourth element returned, and continues until the iterable is exhausted. If the iterable returns no values at all, a TypeError exception is raised. If the initial value is supplied, it's used as a starting point and func(initial_value, A) is the first calculation.

>>> import operator, functools
>>> functools.reduce(operator.concat, ['A', 'BB', 'C'])
'ABBC'
>>> functools.reduce(operator.concat, [])
Traceback (most recent call last):
  ...
TypeError: reduce() of empty sequence with no initial value
>>> functools.reduce(operator.mul, [1, 2, 3], 1)
6
>>> functools.reduce(operator.mul, [], 1)
1

If you use operator.add() with functools.reduce(), you'll add up all the elements of the iterable. This case is so common that there's a special built-in called sum() to compute it:

>>> import functools, operator
>>> functools.reduce(operator.add, [1, 2, 3, 4], 0)
10
>>> sum([1, 2, 3, 4])
10
>>> sum([])
0

For many uses of functools.reduce(), though, it can be clearer to just write the obvious for loop:

import functools, operator

# Instead of:
product = functools.reduce(operator.mul, [1, 2, 3], 1)

# You can write:
product = 1
for i in [1, 2, 3]:
    product *= i

A related function is itertools.accumulate(iterable, func=operator.add). It performs the same calculation, but instead of returning only the final result, accumulate() returns an iterator that also yields each partial result:

itertools.accumulate([1, 2, 3, 4, 5]) =>
  1, 3, 6, 10, 15

itertools.accumulate([1, 2, 3, 4, 5], operator.mul) =>
  1, 2, 6, 24, 120

The operator module

The operator module was mentioned earlier. It contains a set of functions corresponding to Python's operators. These functions are often useful in functional-style code because they save you from writing trivial functions that perform a single operation.

Some of the functions in this module are:

  • Math operations: add(), sub(), mul(), floordiv(), abs(), ...

  • Logical operations: not_(), truth().

  • Bitwise operations: and_(), or_(), invert().

  • Comparisons: eq(), ne(), lt(), le(), gt(), and ge().

  • Object identity: is_(), is_not().

Consult the operator module's documentation for a complete list.
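For instance, a few quick uses of the functions listed above (these particular calls are just illustrations):

>>> import operator, functools
>>> operator.add(3, 4)
7
>>> operator.truth([])
False
>>> operator.eq(2, 2)
True
>>> functools.reduce(operator.mul, [1, 2, 3, 4])
24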

Small functions and the lambda expression

When writing functional-style programs, you'll often need little functions thatact as predicates or that combine elements in some way.

If there's a Python built-in or a module function that's suitable, you don't need to define a new function at all:

stripped_lines = [line.strip() for line in lines]
existing_files = filter(os.path.exists, file_list)

If the function you need doesn't exist, you need to write it. One way to write small functions is to use the lambda expression. lambda takes a number of parameters and an expression combining these parameters, and creates an anonymous function that returns the value of the expression:

adder = lambda x, y: x + y

print_assign = lambda name, value: name + '=' + str(value)

An alternative is to just use the def statement and define a function in the usual way:

def adder(x, y):
    return x + y

def print_assign(name, value):
    return name + '=' + str(value)

Which alternative is preferable? That's a style question; my usual course is to avoid using lambda.

One reason for my preference is that lambda is quite limited in the functions it can define. The result has to be computable as a single expression, which means you can't have multiway if...elif...else comparisons or try...except statements. If you try to do too much in a lambda statement, you'll end up with an overly complicated expression that's hard to read. Quick, what's the following code doing?

import functools
total = functools.reduce(lambda a, b: (0, a[1] + b[1]), items)[1]

You can figure it out, but it takes time to disentangle the expression to figure out what's going on. Using a short nested def statement makes things a little bit better:

import functools

def combine(a, b):
    return 0, a[1] + b[1]

total = functools.reduce(combine, items)[1]

But it would be best of all if I had simply used a for loop:

total = 0
for a, b in items:
    total += b

Or the sum() built-in and a generator expression:

total = sum(b for a, b in items)

Many uses of functools.reduce() are clearer when written as for loops.

Fredrik Lundh once suggested the following set of rules for refactoring uses of lambda:

  1. Write a lambda function.

  2. Write a comment explaining what the heck that lambda does.

  3. Study the comment for a while, and think of a name that captures the essence of the comment.

  4. Convert the lambda to a def statement, using that name.

  5. Remove the comment.

I really like these rules, but you're free to disagree about whether this lambda-free style is better.

Revision History and Acknowledgements

The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: Ian Bicking, Nick Coghlan, Nick Efford, Raymond Hettinger, Jim Jewett, Mike Krell, Leandro Lameiro, Jussi Salmela, Collin Winter, Blake Winton.

Version 0.1: posted June 30 2006.

Version 0.11: posted July 1 2006. Typo fixes.

Version 0.2: posted July 10 2006. Merged genexp and listcomp sections into one. Typo fixes.

Version 0.21: Added more references suggested on the tutor mailing list.

Version 0.30: Adds a section on the functional module written by Collin Winter; adds short section on the operator module; a few other edits.

References

General

Structure and Interpretation of Computer Programs, by Harold Abelson and Gerald Jay Sussman with Julie Sussman. The book can be found at https://mitpress.mit.edu/sicp. In this classic textbook of computer science, chapters 2 and 3 discuss the use of sequences and streams to organize the data flow inside a program. The book uses Scheme for its examples, but many of the design approaches described in these chapters are applicable to functional-style Python code.

https://www.defmacro.org/ramblings/fp.html: A general introduction to functional programming that uses Java examples and has a lengthy historical introduction.

https://en.wikipedia.org/wiki/Functional_programming: General Wikipedia entry describing functional programming.

https://en.wikipedia.org/wiki/Coroutine: Entry for coroutines.

https://en.wikipedia.org/wiki/Partial_application: Entry for the concept of partial application.

https://en.wikipedia.org/wiki/Currying: Entry for the concept of currying.

Python-specific

https://gnosis.cx/TPiP/: The first chapter of David Mertz's book Text Processing in Python discusses functional programming for text processing, in the section titled "Utilizing Higher-Order Functions in Text Processing".

Mertz also wrote a 3-part series of articles on functional programming for IBM's DeveloperWorks site; see part 1, part 2, and part 3.

Python documentation

Documentation for the itertools module.

Documentation for the functools module.

Documentation for the operator module.

PEP 289: "Generator Expressions"

PEP 342: "Coroutines via Enhanced Generators" describes the new generator features in Python 2.5.