What's New in Python 2.4

Author:

A.M. Kuchling

This article explains the new features in Python 2.4.1, released on March 30, 2005.

Python 2.4 is a medium-sized release. It doesn't introduce as many changes as the radical Python 2.2, but introduces more features than the conservative 2.3 release. The most significant new language features are function decorators and generator expressions; most other changes are to the standard library.

According to the CVS change logs, there were 481 patches applied and 502 bugs fixed between Python 2.3 and 2.4. Both figures are likely to be underestimates.

This article doesn't attempt to provide a complete specification of every single new feature, but instead provides a brief introduction to each feature. For full details, you should refer to the documentation for Python 2.4, such as the Python Library Reference and the Python Reference Manual. Often you will be referred to the PEP for a particular new feature for explanations of the implementation and design rationale.

PEP 218: Built-In Set Objects

Python 2.3 introduced the sets module. C implementations of set data types have now been added to the Python core as two new built-in types, set(iterable) and frozenset(iterable). They provide high speed operations for membership testing, for eliminating duplicates from sequences, and for mathematical operations like unions, intersections, differences, and symmetric differences.

    >>> a = set('abracadabra')              # form a set from a string
    >>> 'z' in a                            # fast membership testing
    False
    >>> a                                   # unique letters in a
    set(['a', 'r', 'b', 'c', 'd'])
    >>> ''.join(a)                          # convert back into a string
    'arbcd'
    >>> b = set('alacazam')                 # form a second set
    >>> a - b                               # letters in a but not in b
    set(['r', 'd', 'b'])
    >>> a | b                               # letters in either a or b
    set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
    >>> a & b                               # letters in both a and b
    set(['a', 'c'])
    >>> a ^ b                               # letters in a or b but not both
    set(['r', 'd', 'b', 'm', 'z', 'l'])
    >>> a.add('z')                          # add a new element
    >>> a.update('wxy')                     # add multiple new elements
    >>> a
    set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
    >>> a.remove('x')                       # take one element out
    >>> a
    set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])

The frozenset() type is an immutable version of set(). Since it is immutable and hashable, it may be used as a dictionary key or as a member of another set.
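As a rough illustration of why hashability matters, the sketch below groups words by the set of letters they use. The word list and the variable names are made up for this example; only set() and frozenset() come from the text above.

```python
# Group words by the set of letters they contain (sample data, not from the PEP).
words = ["listen", "silent", "enlist", "google"]
groups = {}
for word in words:
    key = frozenset(word)          # hashable, so it can serve as a dictionary key
    groups.setdefault(key, []).append(word)

same_letters = groups[frozenset("listen")]

# A mutable set is not hashable, so using it as a key fails.
try:
    {set("abc"): 1}
    mutable_key_ok = True
except TypeError:
    mutable_key_ok = False
```

Because a frozenset is also a valid set member, the same idea extends to sets of sets.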

The sets module remains in the standard library, and may be useful if you wish to subclass the Set or ImmutableSet classes. There are currently no plans to deprecate the module.

See also

PEP 218 - Adding a Built-In Set Object Type

Originally proposed by Greg Wilson and ultimately implemented by Raymond Hettinger.

PEP 237: Unifying Long Integers and Integers

The lengthy transition process for this PEP, begun in Python 2.2, takes another step forward in Python 2.4. In 2.3, certain integer operations that would behave differently after int/long unification triggered FutureWarning warnings and returned values limited to 32 or 64 bits (depending on your platform). In 2.4, these expressions no longer produce a warning and instead produce a different result that's usually a long integer.

The problematic expressions are primarily left shifts and lengthy hexadecimal and octal constants. For example, 2 << 32 results in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python 2.4, this expression now returns the correct answer, 8589934592.
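A minimal check of the 2.4 behaviour described above; the 40-bit hexadecimal constant is an extra value chosen for this sketch, not one taken from the PEP. The same results hold on later Python versions, where integers are fully unified.

```python
shifted = 2 << 32            # left shift past 32 bits; no truncation to 0
big_hex = 0xFFFFFFFFFF       # a 40-bit hexadecimal constant (illustrative value)
```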

See also

PEP 237 - Unifying Long Integers and Integers

Original PEP written by Moshe Zadka and GvR. The changes for 2.4 were implemented by Kalle Svensson.

PEP 289: Generator Expressions

The iterator feature introduced in Python 2.2 and the itertools module make it easier to write programs that loop through large data sets without having the entire data set in memory at one time. List comprehensions don't fit into this picture very well because they produce a Python list object containing all of the items. This unavoidably pulls all of the objects into memory, which can be a problem if your data set is very large. When trying to write a functionally styled program, it would be natural to write something like:

    links = [link for link in get_all_links() if not link.followed]
    for link in links:
        ...

instead of

    for link in get_all_links():
        if link.followed:
            continue
        ...

The first form is more concise and perhaps more readable, but if you're dealing with a large number of link objects you'd have to write the second form to avoid having all link objects in memory at the same time.

Generator expressions work similarly to list comprehensions but don't materialize the entire list; instead they create a generator that will return elements one by one. The above example could be written as:

    links = (link for link in get_all_links() if not link.followed)
    for link in links:
        ...

Generator expressions always have to be written inside parentheses, as in the above example. The parentheses signalling a function call also count, so if you want to create an iterator that will be immediately passed to a function you could write:

    print sum(obj.count for obj in list_all_objects())

Generator expressions differ from list comprehensions in various small ways. Most notably, the loop variable (obj in the above example) is not accessible outside of the generator expression. List comprehensions leave the variable assigned to its last value; future versions of Python will change this, making list comprehensions match generator expressions in this respect.
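The lazy behaviour can be observed directly. In the sketch below, numbers() is a hypothetical data source that records each item it produces, so we can see that nothing is generated until a value is requested. The next() built-in used here was added in a later Python release; in 2.4 the equivalent call is squares.next().

```python
def numbers(log):
    """Hypothetical data source that records each item it yields."""
    for n in range(5):
        log.append(n)
        yield n

produced = []
squares = (n * n for n in numbers(produced))   # nothing is computed yet
before = list(produced)                        # snapshot: still empty

first = next(squares)                          # pulls exactly one item through
after_one = list(produced)

# The parentheses of the function call are enough; no extra pair is needed.
total = sum(n * n for n in range(5))
```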

See also

PEP 289 - Generator Expressions

Proposed by Raymond Hettinger and implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.

PEP 292: Simpler String Substitutions

Some new classes in the standard library provide an alternative mechanism for substituting variables into strings; this style of substitution may be better for applications where untrained users need to edit templates.

The usual way of substituting variables by name is the % operator:

    >>> '%(page)i: %(title)s' % {'page': 2, 'title': 'The Best of Times'}
    '2: The Best of Times'

When writing the template string, it can be easy to forget the i or s after the closing parenthesis. This isn't a big problem if the template is in a Python module, because you run the code, get an "Unsupported format character" ValueError, and fix the problem. However, consider an application such as Mailman where template strings or translations are being edited by users who aren't aware of the Python language. The format string's syntax is complicated to explain to such users, and if they make a mistake, it's difficult to provide helpful feedback to them.

PEP 292 adds a Template class to the string module that uses $ to indicate a substitution:

    >>> import string
    >>> t = string.Template('$page: $title')
    >>> t.substitute({'page': 2, 'title': 'The Best of Times'})
    '2: The Best of Times'

If a key is missing from the dictionary, the substitute() method will raise a KeyError. There's also a safe_substitute() method that ignores missing keys:

    >>> t = string.Template('$page: $title')
    >>> t.safe_substitute({'page': 3})
    '3: $title'
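One way the two methods might be combined when templates are edited by users is sketched below. The render() helper is hypothetical, not part of the string module: it tries the strict substitute() first and, on a KeyError, falls back to safe_substitute() while reporting which placeholder had no value. The "except ... as" spelling requires a Python version newer than 2.4.

```python
import string

def render(template_text, values):
    """Hypothetical helper: fill in a template and report a missing key."""
    template = string.Template(template_text)
    try:
        return template.substitute(values), None
    except KeyError as exc:
        # exc.args[0] is the placeholder name that had no value.
        return template.safe_substitute(values), exc.args[0]

text, missing = render('$page: $title', {'page': 3})
complete, nothing_missing = render('$page: $title',
                                   {'page': 2, 'title': 'The Best of Times'})
```

The caller can then show the partially filled text together with a message naming the missing placeholder.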

See also

PEP 292 - Simpler String Substitutions

Written and implemented by Barry Warsaw.

PEP 318: Decorators for Functions and Methods

Python 2.2 extended Python's object model by adding static methods and class methods, but it didn't extend Python's syntax to provide any new way of defining static or class methods. Instead, you had to write a def statement in the usual way, and pass the resulting method to a staticmethod() or classmethod() function that would wrap up the function as a method of the new type. Your code would look like this:

    class C:
        def meth(cls):
            ...

        meth = classmethod(meth)   # Rebind name to wrapped-up class method

If the method was very long, it would be easy to miss or forget the classmethod() invocation after the function body.

The intention was always to add some syntax to make such definitions more readable, but at the time of 2.2's release a good syntax was not obvious. Today a good syntax still isn't obvious but users are asking for easier access to the feature; a new syntactic feature has been added to meet this need.

The new feature is called "function decorators". The name comes from the idea that classmethod(), staticmethod(), and friends are storing additional information on a function object; they're decorating functions with more details.

The notation borrows from Java and uses the '@' character as an indicator. Using the new syntax, the example above would be written:

    class C:

        @classmethod
        def meth(cls):
            ...

The @classmethod is shorthand for the meth=classmethod(meth) assignment. More generally, if you have the following:

    @A
    @B
    @C
    def f():
        ...

It's equivalent to the following pre-decorator code:

    def f(): ...
    f = A(B(C(f)))

Decorators must come on the line before a function definition, one decorator per line, and can't be on the same line as the def statement, meaning that @A def f(): ... is illegal. You can only decorate function definitions, either at the module level or inside a class; you can't decorate class definitions.
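The equivalence f = A(B(C(f))) implies that the decorator nearest the def is applied first. A small sketch makes the order visible; the three decorators here are placeholders that only record when they run.

```python
applied = []

def A(func):
    applied.append('A')
    return func

def B(func):
    applied.append('B')
    return func

def C(func):
    applied.append('C')
    return func

@A
@B
@C
def f():
    pass
```

After the definition, applied holds the labels from the innermost decorator outwards.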

A decorator is just a function that takes the function to be decorated as an argument and returns either the same function or some new object. The return value of the decorator need not be callable (though it typically is), unless further decorators will be applied to the result. It's easy to write your own decorators. The following simple example just sets an attribute on the function object:

    >>> def deco(func):
    ...    func.attr = 'decorated'
    ...    return func
    ...
    >>> @deco
    ... def f(): pass
    ...
    >>> f
    <function f at 0x402ef0d4>
    >>> f.attr
    'decorated'
    >>>

As a slightly more realistic example, the following decorator checks that the supplied argument is an integer:

    def require_int(func):
        def wrapper(arg):
            assert isinstance(arg, int)
            return func(arg)

        return wrapper

    @require_int
    def p1(arg):
        print arg

    @require_int
    def p2(arg):
        print arg*2

An example in PEP 318 contains a fancier version of this idea that lets you both specify the required type and check the returned type.

Decorator functions can take arguments. If arguments are supplied, your decorator function is called with only those arguments and must return a new decorator function; this function must take a single function and return a function, as previously described. In other words, @A @B @C(args) becomes:

    def f(): ...
    _deco = C(args)
    f = A(B(_deco(f)))

Getting this right can be slightly brain-bending, but it's not too difficult.

A small related change makes the func_name attribute of functions writable. This attribute is used to display function names in tracebacks, so decorators should change the name of any new function that's constructed and returned.
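Putting the last two points together, a decorator that takes arguments and preserves the wrapped function's name might look like the following sketch. require_type is a hypothetical name chosen for this example; it is a simplified relative of require_int above, not the version from PEP 318. The name is copied through __name__, which refers to the same value as func_name in Python 2 and is the only spelling in Python 3.

```python
def require_type(expected):
    """Hypothetical decorator factory: called first with its own arguments."""
    def decorator(func):
        def wrapper(arg):
            if not isinstance(arg, expected):
                raise TypeError('expected %s' % expected.__name__)
            return func(arg)
        # Copy the name so tracebacks show the original function, not 'wrapper'.
        wrapper.__name__ = func.__name__
        return wrapper
    return decorator

@require_type(int)
def double(arg):
    return arg * 2
```

Here require_type(int) runs first and returns decorator, which is then applied to double, matching the _deco = C(args) expansion shown above.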

See also

PEP 318 - Decorators for Functions, Methods and Classes

Written by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people wrote patches implementing function decorators, but the one that was actually checked in was patch #979728, written by Mark Russell.

https://wiki.python.org/moin/PythonDecoratorLibrary

This Wiki page contains several examples of decorators.

PEP 322: Reverse Iteration

A new built-in function, reversed(seq), takes a sequence and returns an iterator that loops over the elements of the sequence in reverse order.

    >>> for i in reversed(xrange(1,4)):
    ...    print i
    ...
    3
    2
    1

Compared to extended slicing, such as range(1,4)[::-1], reversed() is easier to read, runs faster, and uses substantially less memory.

Note that reversed() only accepts sequences, not arbitrary iterators. If you want to reverse an iterator, first convert it to a list with list().

    >>> input = open('/etc/passwd', 'r')
    >>> for line in reversed(list(input)):
    ...   print line
    ...
    root:*:0:0:System Administrator:/var/root:/bin/tcsh
      ...
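The difference between sequences and arbitrary iterators can be checked without touching the file system. The sketch below uses a small made-up list and a generator expression built from it.

```python
items = ['a', 'b', 'c']
backwards = list(reversed(items))      # a list is a sequence, so this works

gen = (ch.upper() for ch in items)     # a generator is not a sequence
try:
    reversed(gen)
    accepted = True
except TypeError:
    accepted = False

via_list = list(reversed(list(gen)))   # materialize first, then reverse
```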

See also

PEP 322 - Reverse Iteration

Written and implemented by Raymond Hettinger.

PEP 324: New subprocess Module

The standard library provides a number of ways to execute a subprocess, offering different features and different levels of complexity. os.system(command) is easy to use, but slow (it runs a shell process which executes the command) and dangerous (you have to be careful about escaping the shell's metacharacters). The popen2 module offers classes that can capture standard output and standard error from the subprocess, but the naming is confusing. The subprocess module cleans this up, providing a unified interface that offers all the features you might need.

Instead of popen2's collection of classes, subprocess contains a single class called subprocess.Popen whose constructor supports a number of different keyword arguments.

    class Popen(args, bufsize=0, executable=None,
                stdin=None, stdout=None, stderr=None,
                preexec_fn=None, close_fds=False, shell=False,
                cwd=None, env=None, universal_newlines=False,
                startupinfo=None, creationflags=0):

args is commonly a sequence of strings that will be the arguments to the program executed as the subprocess. (If the shell argument is true, args can be a string which will then be passed on to the shell for interpretation, just as os.system() does.)

stdin, stdout, and stderr specify what the subprocess's input, output, and error streams will be. You can provide a file object or a file descriptor, or you can use the constant subprocess.PIPE to create a pipe between the subprocess and the parent.

The constructor has a number of handy options:

  • close_fds requests that all file descriptors be closed before running the subprocess.

  • cwd specifies the working directory in which the subprocess will be executed (defaulting to whatever the parent's working directory is).

  • env is a dictionary specifying environment variables.

  • preexec_fn is a function that gets called before the child is started.

  • universal_newlines opens the child's input and output using Python's universal newlines feature.

Once you've created the Popen instance, you can call its wait() method to pause until the subprocess has exited, poll() to check if it's exited without pausing, or communicate(data) to send the string data to the subprocess's standard input. communicate(data) then reads any data that the subprocess has sent to its standard output or standard error, returning a tuple (stdout_data, stderr_data).
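A minimal sketch of the pipe-and-communicate pattern follows. To avoid depending on any particular external command, the child process is the Python interpreter itself (sys.executable) running a one-line script; that choice, and the byte-string literal syntax, are assumptions of this example rather than anything prescribed by the PEP.

```python
import subprocess
import sys

# Child program: a one-line script that upper-cases its standard input.
child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

proc = subprocess.Popen(
    [sys.executable, '-c', child_code],   # a sequence of strings; no shell involved
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Send data to the child's stdin, then collect everything it wrote.
stdout_data, stderr_data = proc.communicate(b'hello\n')
status = proc.returncode                  # exit status, set once the child has exited
```

communicate() waits for the child to finish, so no separate wait() call is needed here.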

call() is a shortcut that passes its arguments along to the Popen constructor, waits for the command to complete, and returns the status code of the subprocess. It can serve as a safer analog to os.system():

    sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb'])
    if sts == 0:
        # Success
        ...
    else:
        # dpkg returned an error
        ...

The command is invoked without use of the shell. If you really do want to use the shell, you can add shell=True as a keyword argument and provide a string instead of a sequence:

    sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True)

The PEP takes various examples of shell and Python code and shows how they'd be translated into Python code that uses subprocess. Reading this section of the PEP is highly recommended.

See also

PEP 324 - subprocess - New process module

Written and implemented by Peter Åstrand, with assistance from Fredrik Lundh and others.

PEP 327: Decimal Data Type

Python has always supported floating-point (FP) numbers, based on the underlying C double type, as a data type. However, while most programming languages provide a floating-point type, many people (even programmers) are unaware that floating-point numbers don't represent certain decimal fractions accurately. The new Decimal type can represent these fractions accurately, up to a user-specified precision limit.

Why is Decimal needed?

The limitations arise from the representation used for floating-point numbers. FP numbers are made up of three components:

  • The sign, which is positive or negative.

  • The mantissa, which is a single-digit binary number followed by a fractional part. For example, 1.01 in base-2 notation is 1 + 0/2 + 1/4, or 1.25 in decimal notation.

  • The exponent, which tells where the decimal point is located in the number represented.

For example, the number 1.25 has positive sign, a mantissa value of 1.01 (in binary), and an exponent of 0 (the decimal point doesn't need to be shifted). The number 5 has the same sign and mantissa, but the exponent is 2 because the mantissa is multiplied by 4 (2 to the power of the exponent 2); 1.25 * 4 equals 5.
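The three components can be inspected with float.hex(), which was added in a Python release later than 2.4 and is used here only for illustration. It prints the mantissa in hexadecimal followed by the power-of-two exponent; hexadecimal 1.4 is 1 + 4/16, that is, binary 1.01.

```python
import math

one_and_quarter = (1.25).hex()   # mantissa 0x1.4 (binary 1.01), exponent 0
five = (5.0).hex()               # same mantissa, exponent 2

# Rebuilding 5 from the shared mantissa: 1.25 * 2**2
rebuilt = math.ldexp(1.25, 2)
```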

Modern systems usually provide floating-point support that conforms to a standard called IEEE 754. C's double type is usually implemented as a 64-bit IEEE 754 number, which uses 52 bits of space for the mantissa. This means that numbers can only be specified to 52 bits of precision. If you're trying to represent numbers whose expansion repeats endlessly, the expansion is cut off after 52 bits. Unfortunately, most software needs to produce output in base 10, and common fractions in base 10 are often repeating decimals in binary. For example, 1.1 decimal is binary 1.0001100110011...; .1 = 1/16 + 1/32 + 1/256 plus an infinite number of additional terms. IEEE 754 has to chop off that infinitely repeated decimal after 52 digits, so the representation is slightly inaccurate.

Sometimes you can see this inaccuracy when the number is printed:

    >>> 1.1
    1.1000000000000001

The inaccuracy isn't always visible when you print the number because the FP-to-decimal-string conversion is provided by the C library, and most C libraries try to produce sensible output. Even if it's not displayed, however, the inaccuracy is still there and subsequent operations can magnify the error.
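A short sketch of how the error surfaces after a few operations, assuming the usual IEEE 754 double representation: adding 0.1 ten times does not give exactly 1.0.

```python
total = 0.0
for _ in range(10):
    total += 0.1                 # each addition carries a tiny representation error

is_exact = (total == 1.0)        # False with IEEE 754 doubles
error = abs(total - 1.0)         # small, but not zero
```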

For many applications this doesn't matter. If I'm plotting points and displaying them on my monitor, the difference between 1.1 and 1.1000000000000001 is too small to be visible. Reports often limit output to a certain number of decimal places, and if you round the number to two or three or even eight decimal places, the error is never apparent. However, for applications where it does matter, it's a lot of work to implement your own custom arithmetic routines.

Hence, the Decimal type was created.

The Decimal type

A new module, decimal, was added to Python's standard library. It contains two classes, Decimal and Context. Decimal instances represent numbers, and Context instances are used to wrap up various settings such as the precision and default rounding mode.

Decimal instances are immutable, like regular Python integers and FP numbers; once it's been created, you can't change the value an instance represents. Decimal instances can be created from integers or strings:

    >>> import decimal
    >>> decimal.Decimal(1972)
    Decimal("1972")
    >>> decimal.Decimal("1.1")
    Decimal("1.1")

You can also provide tuples containing the sign, the mantissa represented as a tuple of decimal digits, and the exponent:

    >>> decimal.Decimal((1, (1, 4, 7, 5), -2))
    Decimal("-14.75")

Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is negative.

Converting from floating-point numbers poses a bit of a problem: should the FP number representing 1.1 turn into the decimal number for exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced? The decision was to dodge the issue and leave such a conversion out of the API. Instead, you should convert the floating-point number into a string using the desired precision and pass the string to the Decimal constructor:

    >>> f = 1.1
    >>> decimal.Decimal(str(f))
    Decimal("1.1")
    >>> decimal.Decimal('%.12f' % f)
    Decimal("1.100000000000")
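The sketch below contrasts binary floating point with Decimal values built from strings, and revisits the tuple constructor. The price variable is a made-up input; the two-digit format is simply the precision chosen for this example.

```python
import decimal

price = 1.10                                     # hypothetical float input
as_decimal = decimal.Decimal('%.2f' % price)     # pick the precision explicitly

float_total = 0.1 + 0.2                          # binary floats: not exactly 0.3
decimal_total = decimal.Decimal('0.1') + decimal.Decimal('0.2')

# Tuple form: (sign, digits, exponent); a sign of 1 means negative.
from_tuple = decimal.Decimal((1, (1, 4, 7, 5), -2))
```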

Once you have Decimal instances, you can perform the usual mathematical operations on them. One limitation: exponentiation requires an integer exponent:

    >>> a = decimal.Decimal('35.72')
    >>> b = decimal.Decimal('1.73')
    >>> a+b
    Decimal("37.45")
    >>> a-b
    Decimal("33.99")
    >>> a*b
    Decimal("61.7956")
    >>> a/b
    Decimal("20.64739884393063583815028902")
    >>> a ** 2
    Decimal("1275.9184")
    >>> a**b
    Traceback (most recent call last):
      ...
    decimal.InvalidOperation: x ** (non-integer)

You can combine Decimal instances with integers, but not with floating-point numbers:

    >>> a + 4
    Decimal("39.72")
    >>> a + 4.5
    Traceback (most recent call last):
      ...
    TypeError: You can interact Decimal only with int, long or Decimal data types.
    >>>

Decimal numbers can be used with the math and cmath modules, but note that they'll be immediately converted to floating-point numbers before the operation is performed, resulting in a possible loss of precision and accuracy. You'll also get back a regular floating-point number and not a Decimal.

    >>> import math, cmath
    >>> d = decimal.Decimal('123456789012.345')
    >>> math.sqrt(d)
    351364.18288201344
    >>> cmath.sqrt(-d)
    351364.18288201344j

Decimal instances have a sqrt() method that returns a Decimal, but if you need other things such as trigonometric functions you'll have to implement them.

    >>> d.sqrt()
    Decimal("351364.1828820134592177245001")

The Context type

Instances of the Context class encapsulate several settings for decimal operations:

  • prec is the precision, the number of significant digits kept in results.

  • rounding specifies the rounding mode. The decimal module has constants for the various possibilities: ROUND_DOWN, ROUND_CEILING, ROUND_HALF_EVEN, and various others.

  • traps is a dictionary specifying what happens on encountering certain error conditions: either an exception is raised or a value is returned. Some examples of error conditions are division by zero, loss of precision, and overflow.

There's a thread-local default context available by calling getcontext(); you can change the properties of this context to alter the default precision, rounding, or trap handling. The following example shows the effect of changing the precision of the default context:

    >>> decimal.getcontext().prec
    28
    >>> decimal.Decimal(1) / decimal.Decimal(7)
    Decimal("0.1428571428571428571428571429")
    >>> decimal.getcontext().prec = 9
    >>> decimal.Decimal(1) / decimal.Decimal(7)
    Decimal("0.142857143")

The default action for error conditions is selectable; the module can either return a special value such as infinity or not-a-number, or exceptions can be raised:

    >>> decimal.Decimal(1) / decimal.Decimal(0)
    Traceback (most recent call last):
      ...
    decimal.DivisionByZero: x / 0
    >>> decimal.getcontext().traps[decimal.DivisionByZero] = False
    >>> decimal.Decimal(1) / decimal.Decimal(0)
    Decimal("Infinity")
    >>>
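The examples above modify the thread's default context. When that is undesirable, the same settings can be applied through a separate Context instance and its arithmetic methods, as in this sketch (the variable names are illustrative):

```python
import decimal

# A separate Context leaves the thread-local default context untouched.
ctx = decimal.Context(prec=9, rounding=decimal.ROUND_HALF_EVEN)
one_seventh = ctx.divide(decimal.Decimal(1), decimal.Decimal(7))

# With the DivisionByZero trap disabled, a special value is returned.
ctx.traps[decimal.DivisionByZero] = False
quotient = ctx.divide(decimal.Decimal(1), decimal.Decimal(0))

# With the trap enabled, the same operation raises an exception.
ctx.traps[decimal.DivisionByZero] = True
try:
    ctx.divide(decimal.Decimal(1), decimal.Decimal(0))
    raised = False
except decimal.DivisionByZero:
    raised = True
```

Note that nine digits of precision yields nine significant digits in the quotient, not nine digits after the decimal point in general.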

The Context instance also has various methods for formatting numbers such as to_eng_string() and to_sci_string().

For more information, see the documentation for the decimal module, which includes a quick-start tutorial and a reference.

See also

PEP 327 - Decimal Data Type

Written by Facundo Batista and implemented by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.

http://www.lahey.com/float.htm

The article uses Fortran code to illustrate many of the problems that floating-point inaccuracy can cause.

https://speleotrove.com/decimal/

A description of a decimal-based representation. This representation is being proposed as a standard, and underlies the new Python decimal type. Much of this material was written by Mike Cowlishaw, designer of the Rexx language.

PEP 328: Multi-line Imports

One language change is a small syntactic tweak aimed at making it easier to import many names from a module. In a from module import names statement, names is a sequence of names separated by commas. If the sequence is very long, you can either write multiple imports from the same module, or you can use backslashes to escape the line endings like this:

    from SimpleXMLRPCServer import SimpleXMLRPCServer,\
                SimpleXMLRPCRequestHandler,\
                CGIXMLRPCRequestHandler,\
                resolve_dotted_attribute

The syntactic change in Python 2.4 simply allows putting the names within parentheses. Python ignores newlines within a parenthesized expression, so the backslashes are no longer needed:

    from SimpleXMLRPCServer import (SimpleXMLRPCServer,
                                    SimpleXMLRPCRequestHandler,
                                    CGIXMLRPCRequestHandler,
                                    resolve_dotted_attribute)

The PEP also proposes that all import statements be absolute imports, with a leading . character to indicate a relative import. This part of the PEP was not implemented for Python 2.4, but was completed for Python 2.5.

See also

PEP 328 - Imports: Multi-Line and Absolute/Relative

Written by Aahz. Multi-line imports were implemented by Dima Dorfman.

PEP 331: Locale-Independent Float/String Conversions

The locale module lets Python software select various conversions and display conventions that are localized to a particular country or language. However, the module was careful to not change the numeric locale because various functions in Python's implementation required that the numeric locale remain set to the 'C' locale. Often this was because the code was using the C library's atof() function.

Not setting the numeric locale caused trouble for extensions that used third-party C libraries, however, because they wouldn't have the correct locale set. The motivating example was GTK+, whose user interface widgets weren't displaying numbers in the current locale.

The solution described in the PEP is to add three new functions to the Python API that perform ASCII-only conversions, ignoring the locale setting:

  • PyOS_ascii_strtod(str, ptr) and PyOS_ascii_atof(str, ptr) both convert a string to a C double.

  • PyOS_ascii_formatd(buffer, buf_len, format, d) converts a double to an ASCII string.

The code for these functions came from the GLib library (https://developer-old.gnome.org/glib/2.26/), whose developers kindly relicensed the relevant functions and donated them to the Python Software Foundation. The locale module can now change the numeric locale, letting extensions such as GTK+ produce the correct results.

See also

PEP 331 - Locale-Independent Float/String Conversions

Written by Christian R. Reis, and implemented by Gustavo Carneiro.

Other Language Changes

Here are all of the changes that Python 2.4 makes to the core Python language.

  • Decorators for functions and methods were added (PEP 318).

  • Built-in set() and frozenset() types were added (PEP 218). Other new built-ins include the reversed(seq) function (PEP 322).

  • Generator expressions were added (PEP 289).

  • Certain numeric expressions no longer return values restricted to 32 or 64 bits (PEP 237).

  • You can now put parentheses around the list of names in a from module import names statement (PEP 328).

  • The dict.update() method now accepts the same argument forms as the dict constructor. This includes any mapping, any iterable of key/value pairs, and keyword arguments. (Contributed by Raymond Hettinger.)

  • The string methods ljust(), rjust(), and center() now take an optional argument for specifying a fill character other than a space. (Contributed by Raymond Hettinger.)

  • Strings also gained an rsplit() method that works like the split() method but splits from the end of the string. (Contributed by Sean Reifschneider.)

    >>> 'www.python.org'.split('.', 1)
    ['www', 'python.org']
    >>> 'www.python.org'.rsplit('.', 1)
    ['www.python', 'org']

  • Three keyword parameters, cmp, key, and reverse, were added to the sort() method of lists. These parameters make some common usages of sort() simpler. All of these parameters are optional.

    For the cmp parameter, the value should be a comparison function that takes two parameters and returns -1, 0, or +1 depending on how the parameters compare. This function will then be used to sort the list. Previously this was the only parameter that could be provided to sort().

    key should be a single-parameter function that takes a list element and returns a comparison key for the element. The list is then sorted using the comparison keys. The following example sorts a list case-insensitively:

    >>> L = ['A', 'b', 'c', 'D']
    >>> L.sort()                 # Case-sensitive sort
    >>> L
    ['A', 'D', 'b', 'c']
    >>> # Using 'key' parameter to sort list
    >>> L.sort(key=lambda x: x.lower())
    >>> L
    ['A', 'b', 'c', 'D']
    >>> # Old-fashioned way
    >>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
    >>> L
    ['A', 'b', 'c', 'D']

    The last example, which uses the cmp parameter, is the old way to perform a case-insensitive sort. It works but is slower than using a key parameter. Using key calls the lower() method once for each element in the list while using cmp will call it twice for each comparison, so using key saves on invocations of the lower() method.

    For simple key functions and comparison functions, it is often possible to avoid a lambda expression by using an unbound method instead. For example, the above case-insensitive sort is best written as:

    >>> L.sort(key=str.lower)
    >>> L
    ['A', 'b', 'c', 'D']

    Finally, the reverse parameter takes a Boolean value. If the value is true, the list will be sorted into reverse order. Instead of L.sort(); L.reverse(), you can now write L.sort(reverse=True).

    The results of sorting are now guaranteed to be stable. This means that two entries with equal keys will be returned in the same order as they were input. For example, you can sort a list of people by name, and then sort the list by age, resulting in a list sorted by age where people with the same age are in name-sorted order.

    (All changes to sort() contributed by Raymond Hettinger.)

  • There is a new built-in function sorted(iterable) that works like the in-place list.sort() method but can be used in expressions. The differences are:

      • the input may be any iterable;

      • a newly formed copy is sorted, leaving the original intact; and

      • the expression returns the new sorted copy

    >>> L = [9,7,8,3,2,4,1,6,5]
    >>> [10+i for i in sorted(L)]       # usable in a list comprehension
    [11, 12, 13, 14, 15, 16, 17, 18, 19]
    >>> L                               # original is left unchanged
    [9,7,8,3,2,4,1,6,5]
    >>> sorted('Monty Python')          # any iterable may be an input
    [' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']

    >>> # List the contents of a dict sorted by key values
    >>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
    >>> for k, v in sorted(colormap.iteritems()):
    ...     print k, v
    ...
    black 4
    blue 2
    green 3
    red 1
    yellow 5

    (Contributed by Raymond Hettinger.)

  • Integer operations will no longer trigger an OverflowWarning. The OverflowWarning warning will disappear in Python 2.5.

  • The interpreter gained a new switch, -m, that takes a name, searches for the corresponding module on sys.path, and runs the module as a script. For example, you can now run the Python profiler with python -m profile. (Contributed by Nick Coghlan.)

  • The eval(expr, globals, locals) and execfile(filename, globals, locals) functions and the exec statement now accept any mapping type for the locals parameter. Previously this had to be a regular Python dictionary. (Contributed by Raymond Hettinger.)

  • The zip() built-in function and itertools.izip() now return an empty list if called with no arguments. Previously they raised a TypeError exception. This makes them more suitable for use with variable length argument lists:

    >>> def transpose(array):
    ...    return zip(*array)
    ...
    >>> transpose([(1,2,3), (4,5,6)])
    [(1, 4), (2, 5), (3, 6)]
    >>> transpose([])
    []

    (Contributed by Raymond Hettinger.)

  • Encountering a failure while importing a module no longer leaves a partially initialized module object in sys.modules. The incomplete module object left behind would fool further imports of the same module into succeeding, leading to confusing errors. (Fixed by Tim Peters.)

  • None is now a constant; code that binds a new value to the name None is now a syntax error. (Contributed by Raymond Hettinger.)
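A few of the changes listed above that have no example of their own can be sketched together: the new argument forms of dict.update(), the fill-character argument of the string justification methods, and the name-then-age use of stable sorting. All data values below are made up for illustration.

```python
# dict.update() accepts a mapping, an iterable of key/value pairs, or keywords.
settings = {}
settings.update({'host': 'localhost'})
settings.update([('port', 8080)])
settings.update(debug=True)

# ljust(), rjust(), and center() take an optional fill character.
padded = 'abc'.ljust(6, '.')
centered = 'abc'.center(7, '*')

# Stable sorting: sort by the secondary key first, then by the primary key.
people = [('Carol', 30), ('alice', 25), ('Bob', 30), ('dave', 25)]
people.sort(key=lambda person: person[0].lower())   # by name, ignoring case
people.sort(key=lambda person: person[1])           # by age; name order survives

# sorted() returns a new list; reverse=True keeps equal ages in their given order.
oldest_first = sorted(people, key=lambda person: person[1], reverse=True)
```

Sorting twice in this order relies on the stability guarantee: the second sort never reorders entries whose ages are equal.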

Optimizations

  • The inner loops for list and tuple slicing were optimized and now run about one-third faster. The inner loops for dictionaries were also optimized, resulting in performance boosts for keys(), values(), items(), iterkeys(), itervalues(), and iteritems(). (Contributed by Raymond Hettinger.)

  • The machinery for growing and shrinking lists was optimized for speed and for space efficiency. Appending and popping from lists now runs faster due to more efficient code paths and less frequent use of the underlying system realloc(). List comprehensions also benefit. list.extend() was also optimized and no longer converts its argument into a temporary list before extending the base list. (Contributed by Raymond Hettinger.)

  • list(), tuple(), map(), filter(), and zip() now run several times faster with non-sequence arguments that supply a __len__() method. (Contributed by Raymond Hettinger.)

  • The methods list.__getitem__(), dict.__getitem__(), and dict.__contains__() are now implemented as method_descriptor objects rather than wrapper_descriptor objects. This form of access doubles their performance and makes them more suitable for use as arguments to functionals: map(mydict.__getitem__, keylist). (Contributed by Raymond Hettinger.)

  • Added a new opcode, LIST_APPEND, that simplifies the generated bytecode for list comprehensions and speeds them up by about a third. (Contributed by Raymond Hettinger.)

  • The peephole bytecode optimizer has been improved to produce shorter, faster bytecode; remarkably, the resulting bytecode is more readable. (Enhanced by Raymond Hettinger.)

  • String concatenations in statements of the form s = s + "abc" and s += "abc" are now performed more efficiently in certain circumstances. This optimization won't be present in other Python implementations such as Jython, so you shouldn't rely on it; using the join() method of strings is still recommended when you want to efficiently glue a large number of strings together. (Contributed by Armin Rigo.)

The net result of the 2.4 optimizations is that Python 2.4 runs the pystone benchmark around 5% faster than Python 2.3 and 35% faster than Python 2.2. (pystone is not a particularly good benchmark, but it's the most commonly used measurement of Python's performance. Your own applications may show greater or smaller benefits from Python 2.4.)
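Two of the idioms mentioned in the list above can be sketched briefly: passing a bound dict.__getitem__ to map(), and gluing strings with join() instead of repeated concatenation. The sketch is written for a current Python 3 interpreter rather than 2.4, the names prices, keylist, and pieces are made up for illustration, and no timings are claimed.

```python
# Hypothetical data for illustration only.
prices = {'apple': 3, 'pear': 5, 'plum': 2}
keylist = ['pear', 'plum', 'apple']

# A bound method can be passed directly as the function argument to map().
# list() is needed on Python 3, where map() returns an iterator.
looked_up = list(map(prices.__getitem__, keylist))

# Collect the fragments first, then glue them together with one join() call
# rather than growing a string with repeated += operations.
pieces = []
for key in keylist:
    pieces.append('%s=%d' % (key, prices[key]))
summary = ', '.join(pieces)

print(looked_up)
print(summary)
```

The join() form does not depend on the CPython-specific concatenation optimization, so it behaves the same way on other implementations.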

New, Improved, and Deprecated Modules

As usual, Python's standard library received a number of enhancements and bug fixes. Here's a partial list of the most notable changes, sorted alphabetically by module name. Consult the Misc/NEWS file in the source tree for a more complete list of changes, or look through the CVS logs for all the details.

  • The asyncore module's loop() function now has a count parameter that lets you perform a limited number of passes through the polling loop. The default is still to loop forever.

  • The base64 module now has more complete RFC 3548 support for Base64, Base32, and Base16 encoding and decoding, including optional case folding and optional alternative alphabets. (Contributed by Barry Warsaw.)

  • The bisect module now has an underlying C implementation for improved performance. (Contributed by Dmitry Vasiliev.)

  • The CJKCodecs collection of East Asian codecs, maintained by Hye-Shik Chang, was integrated into 2.4. The new encodings are:

    • Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz

    • Chinese (ROC): big5, cp950

    • Japanese: cp932, euc-jis-2004, euc-jp, euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2, iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004, shift-jis, shift-jisx0213, shift-jis-2004

    • Korean: cp949, euc-kr, johab, iso-2022-kr

  • Some other new encodings were added: HP Roman8, ISO_8859-11, ISO_8859-16, PTCP-154, and TIS-620.

  • The UTF-8 and UTF-16 codecs now cope better with receiving partial input. Previously the StreamReader class would try to read more data, making it impossible to resume decoding from the stream. The read() method will now return as much data as it can and future calls will resume decoding where previous ones left off. (Implemented by Walter Dörwald.)

  • There is a new collections module for various specialized collection datatypes. Currently it contains just one type, deque, a double-ended queue that supports efficiently adding and removing elements from either end:

    >>> from collections import deque
    >>> d = deque('ghi')        # make a new deque with three items
    >>> d.append('j')           # add a new entry to the right side
    >>> d.appendleft('f')       # add a new entry to the left side
    >>> d                       # show the representation of the deque
    deque(['f', 'g', 'h', 'i', 'j'])
    >>> d.pop()                 # return and remove the rightmost item
    'j'
    >>> d.popleft()             # return and remove the leftmost item
    'f'
    >>> list(d)                 # list the contents of the deque
    ['g', 'h', 'i']
    >>> 'h' in d                # search the deque
    True

    Several modules, such as the Queue and threading modules, now take advantage of collections.deque for improved performance. (Contributed by Raymond Hettinger.)

  • The ConfigParser classes have been enhanced slightly. The read() method now returns a list of the files that were successfully parsed, and the set() method raises TypeError if passed a value argument that isn't a string. (Contributed by John Belmonte and David Goodger.)

  • The curses module now supports the ncurses extension use_default_colors(). On platforms where the terminal supports transparency, this makes it possible to use a transparent background. (Contributed by Jörg Lehmann.)

  • The difflib module now includes an HtmlDiff class that creates an HTML table showing a side by side comparison of two versions of a text. (Contributed by Dan Gass.)

  • The email package was updated to version 3.0, which drops various deprecated APIs and removes support for Python versions earlier than 2.3. The 3.0 version of the package uses a new incremental parser for MIME messages, available in the email.FeedParser module. The new parser doesn't require reading the entire message into memory, and doesn't raise exceptions if a message is malformed; instead it records any problems in the defects attribute of the message. (Developed by Anthony Baxter, Barry Warsaw, Thomas Wouters, and others.)

  • The heapq module has been converted to C. The resulting tenfold improvement in speed makes the module suitable for handling high volumes of data. In addition, the module has two new functions nlargest() and nsmallest() that use heaps to find the N largest or smallest values in a dataset without the expense of a full sort. (Contributed by Raymond Hettinger.)

  • The httplib module now contains constants for HTTP status codes defined in various HTTP-related RFC documents. Constants have names such as OK, CREATED, CONTINUE, and MOVED_PERMANENTLY; use pydoc to get a full list. (Contributed by Andrew Eland.)

  • The imaplib module now supports IMAP's THREAD command (contributed by Yves Dionne) and new deleteacl() and myrights() methods (contributed by Arnaud Mazin).

  • The itertools module gained a groupby(iterable[, func]) function. iterable is something that can be iterated over to return a stream of elements, and the optional func parameter is a function that takes an element and returns a key value; if omitted, the key is simply the element itself. groupby() then groups the elements into subsequences which have matching values of the key, and returns a series of 2-tuples containing the key value and an iterator over the subsequence.

    Here's an example to make this clearer. The key function simply returns whether a number is even or odd, so the result of groupby() is to return consecutive runs of odd or even numbers.

    >>> import itertools
    >>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14]
    >>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
    ...     print key_val, list(it)
    ...
    0 [2, 4, 6]
    1 [7]
    0 [8]
    1 [9, 11]
    0 [12, 14]
    >>>

    groupby() is typically used with sorted input. The logic for groupby() is similar to the Unix uniq filter, which makes it handy for eliminating, counting, or identifying duplicate elements:

    >>> word = 'abracadabra'
    >>> letters = sorted(word)   # Turn string into a sorted list of letters
    >>> letters
    ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
    >>> for k, g in itertools.groupby(letters):
    ...     print k, list(g)
    ...
    a ['a', 'a', 'a', 'a', 'a']
    b ['b', 'b']
    c ['c']
    d ['d']
    r ['r', 'r']
    >>> # List unique letters
    >>> [k for k, g in itertools.groupby(letters)]
    ['a', 'b', 'c', 'd', 'r']
    >>> # Count letter occurrences
    >>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
    [('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]

    (Contributed by Hye-Shik Chang.)

  • itertools also gained a function named tee(iterator, N) that returns N independent iterators that replicate iterator. If N is omitted, the default is 2.

    >>> L = [1, 2, 3]
    >>> i1, i2 = itertools.tee(L)
    >>> i1, i2
    (<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
    >>> list(i1)               # Run the first iterator to exhaustion
    [1, 2, 3]
    >>> list(i2)               # Run the second iterator to exhaustion
    [1, 2, 3]

    Note that tee() has to keep copies of the values returned by the iterator; in the worst case, it may need to keep all of them. This should therefore be used carefully if the leading iterator can run far ahead of the trailing iterator in a long stream of inputs. If the separation is large, then you might as well use list() instead. When the iterators track closely with one another, tee() is ideal. Possible applications include bookmarking, windowing, or lookahead iterators. (Contributed by Raymond Hettinger.)

  • A number of functions were added to the locale module, such as bind_textdomain_codeset() to specify a particular encoding and a family of l*gettext() functions that return messages in the chosen encoding. (Contributed by Gustavo Niemeyer.)

  • Some keyword arguments were added to the logging package's basicConfig() function to simplify log configuration. The default behavior is to log messages to standard error, but various keyword arguments can be specified to log to a particular file, change the logging format, or set the logging level. For example:

    import logging
    logging.basicConfig(filename='/var/log/application.log',
        level=0,  # Log all messages
        format='%(levelname)s:%(process)d:%(thread)d:%(message)s')

    Other additions to the logging package include a log(level, msg) convenience method, as well as a TimedRotatingFileHandler class that rotates its log files at a timed interval. The module already had RotatingFileHandler, which rotated logs once the file exceeded a certain size. Both classes derive from a new BaseRotatingHandler class that can be used to implement other rotating handlers.

    (Implemented by Vinay Sajip.)

  • The marshal module now shares interned strings on unpacking a data structure. This may shrink the size of certain pickle strings, but the primary effect is to make .pyc files significantly smaller. (Contributed by Martin von Löwis.)

  • The nntplib module's NNTP class gained description() and descriptions() methods to retrieve newsgroup descriptions for a single group or for a range of groups. (Contributed by Jürgen A. Erhard.)

  • Two new functions were added to the operator module, attrgetter(attr) and itemgetter(index). Both functions return callables that take a single argument and return the corresponding attribute or item; these callables make excellent data extractors when used with map() or sorted(). For example:

    >>> import operator
    >>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)]
    >>> map(operator.itemgetter(0), L)
    ['c', 'd', 'a', 'b']
    >>> map(operator.itemgetter(1), L)
    [2, 1, 4, 3]
    >>> sorted(L, key=operator.itemgetter(1))  # Sort list by second tuple item
    [('d', 1), ('c', 2), ('b', 3), ('a', 4)]

    (Contributed by Raymond Hettinger.)

  • The optparse module was updated in various ways. The module now passes its messages through gettext.gettext(), making it possible to internationalize Optik's help and error messages. Help messages for options can now include the string '%default', which will be replaced by the option's default value. (Contributed by Greg Ward.)

  • The long-term plan is to deprecate the rfc822 module in some future Python release in favor of the email package. To this end, the email.Utils.formatdate function has been changed to make it usable as a replacement for rfc822.formatdate(). You may want to write new e-mail processing code with this in mind. (Change implemented by Anthony Baxter.)

  • A new urandom(n) function was added to the os module, returning a string containing n bytes of random data. This function provides access to platform-specific sources of randomness such as /dev/urandom on Linux or the Windows CryptoAPI. (Contributed by Trevor Perrin.)

  • Another new function: os.path.lexists(path) returns true if the file specified by path exists, whether or not it's a symbolic link. This differs from the existing os.path.exists(path) function, which returns false if path is a symlink that points to a destination that doesn't exist. (Contributed by Beni Cherniavsky.)

  • A new getsid() function was added to the posix module that underlies the os module. (Contributed by J. Raynor.)

  • The poplib module now supports POP over SSL. (Contributed by Hector Urtubia.)

  • The profile module can now profile C extension functions. (Contributed by Nick Bastin.)

  • The random module has a new method called getrandbits(N) that returns a long integer N bits in length. The existing randrange() method now uses getrandbits() where appropriate, making generation of arbitrarily large random numbers more efficient. (Contributed by Raymond Hettinger.)

  • The regular expression language accepted by the re module was extended with simple conditional expressions, written as (?(group)A|B). group is either a numeric group ID or a group name defined with (?P<group>...) earlier in the expression. If the specified group matched, the regular expression pattern A will be tested against the string; if the group didn't match, the pattern B will be used instead. (Contributed by Gustavo Niemeyer.)

  • The re module is also no longer recursive, thanks to a massive amount of work by Gustavo Niemeyer. In a recursive regular expression engine, certain patterns result in a large amount of C stack space being consumed, and it was possible to overflow the stack. For example, if you matched a 30000-byte string of a characters against the expression (a|b)+, one stack frame was consumed per character. Python 2.3 tried to check for stack overflow and raise a RuntimeError exception, but certain patterns could sidestep the checking and if you were unlucky Python could segfault. Python 2.4's regular expression engine can match this pattern without problems.

  • The signal module now performs tighter error-checking on the parameters to the signal.signal() function. For example, you can't set a handler on the SIGKILL signal; previous versions of Python would quietly accept this, but 2.4 will raise a RuntimeError exception.

  • Two new functions were added to the socket module. socketpair() returns a pair of connected sockets and getservbyport(port) looks up the service name for a given port number. (Contributed by Dave Cole and Barry Warsaw.)

  • The sys.exitfunc() function has been deprecated. Code should be using the existing atexit module, which correctly handles calling multiple exit functions. Eventually sys.exitfunc() will become a purely internal interface, accessed only by atexit.

  • The tarfile module now generates GNU-format tar files by default. (Contributed by Lars Gustäbel.)

  • The threading module now has an elegantly simple way to support thread-local data. The module contains a local class whose attribute values are local to different threads.

    import threading

    data = threading.local()
    data.number = 42
    data.url = ('www.python.org', 80)

    Other threads can assign and retrieve their own values for the number and url attributes. You can subclass local to initialize attributes or to add methods. (Contributed by Jim Fulton.)

  • The timeit module now automatically disables periodic garbage collection during the timing loop. This change makes consecutive timings more comparable. (Contributed by Raymond Hettinger.)

  • The weakref module now supports a wider variety of objects including Python functions, class instances, sets, frozensets, deques, arrays, files, sockets, and regular expression pattern objects. (Contributed by Raymond Hettinger.)

  • The xmlrpclib module now supports a multi-call extension for transmitting multiple XML-RPC calls in a single HTTP operation. (Contributed by Brian Quinlan.)

  • The mpz, rotor, and xreadlines modules have been removed.
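Two of the additions in the list above come without an example: heapq.nlargest() and nsmallest(), and the conditional (?(group)A|B) syntax in re. The following is a minimal sketch of both, written for a current Python 3 interpreter rather than 2.4. The sample data, the pattern, and the helper name is_balanced_number are invented for illustration.

```python
import heapq
import re

# heapq.nlargest()/nsmallest(): pick the extremes without sorting everything.
scores = [5, 1, 9, 3, 7, 2]          # hypothetical sample data
top_three = heapq.nlargest(3, scores)
bottom_two = heapq.nsmallest(2, scores)

# Conditional pattern: group 1 is an optional opening parenthesis.
# (?(1)\)) requires a closing parenthesis only if group 1 matched;
# the "B" alternative is omitted, so nothing extra is required otherwise.
number = re.compile(r'^(\()?\d+(?(1)\))$')

def is_balanced_number(text):
    """Return True for '42' or '(42)', False for '(42' or '42)'."""
    return number.match(text) is not None

print(top_three, bottom_two)
print([is_balanced_number(s) for s in ['42', '(42)', '(42', '42)']])
```

The conditional keeps the "parenthesized or bare" rule in a single pattern instead of two alternatives that repeat the digits part.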

cookielib

The cookielib library supports client-side handling for HTTP cookies, mirroring the Cookie module's server-side cookie support. Cookies are stored in cookie jars; the library transparently stores cookies offered by the web server in the cookie jar, and fetches the cookie from the jar when connecting to the server. As in web browsers, policy objects control whether cookies are accepted or not.

In order to store cookies across sessions, two implementations of cookie jars are provided: one that stores cookies in the Netscape format so applications can use the Mozilla or Lynx cookie files, and one that stores cookies in the same format as the Perl libwww library.

urllib2 has been changed to interact with cookielib: HTTPCookieProcessor manages a cookie jar that is used when accessing URLs.
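As a rough sketch of how the pieces fit together, the code below builds a urllib2 opener backed by an in-memory cookie jar. It makes no network request. The try/except import is an assumption added so the same code also runs on Python 3, where the modules are named http.cookiejar and urllib.request. The URL in the trailing comment is a placeholder.

```python
try:
    # Python 2 module names, as used in this article.
    import cookielib
    import urllib2
except ImportError:
    # Python 3 renamed them; aliasing keeps the rest of the sketch unchanged.
    import http.cookiejar as cookielib
    import urllib.request as urllib2

# An in-memory jar; nothing is written to disk.
jar = cookielib.CookieJar()

# The opener stores cookies from responses in the jar and sends them back
# on later requests made through the same opener.
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# No request has been made yet, so the jar is still empty.
print(len(jar))

# A real fetch would look like this (placeholder URL, not executed here):
# response = opener.open('http://www.example.com/')
# for cookie in jar:
#     print(cookie.name, cookie.value)
```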

This module was contributed by John J. Lee.

doctest

The doctest module underwent considerable refactoring thanks to Edward Loper and Tim Peters. Testing can still be as simple as running doctest.testmod(), but the refactorings allow customizing the module's operation in various ways.

The new DocTestFinder class extracts the tests from a given object's docstrings:

import doctest

def f(x, y):
    """>>> f(2,2)
    4
    >>> f(3,2)
    6
    """
    return x*y

finder = doctest.DocTestFinder()

# Get list of DocTest instances
tests = finder.find(f)

The new DocTestRunner class then runs individual tests and can produce a summary of the results:

runner = doctest.DocTestRunner()
for t in tests:
    tried, failed = runner.run(t)

runner.summarize(verbose=1)

The above example produces the following output:

1 items passed all tests:
   2 tests in f
2 tests in 1 items.
2 passed and 0 failed.
Test passed.

DocTestRunner uses an instance of the OutputChecker class to compare the expected output with the actual output. This class takes a number of different flags that customize its behaviour; ambitious users can also write a completely new subclass of OutputChecker.

The default output checker provides a number of handy features. For example, with the doctest.ELLIPSIS option flag, an ellipsis (...) in the expected output matches any substring, making it easier to accommodate outputs that vary in minor ways:

def o(n):
    """>>> o(1)
    <__main__.C instance at 0x...>
    >>>
    """

Another special string, <BLANKLINE>, matches a blank line:

def p(n):
    """>>> p(1)
    <BLANKLINE>
    >>>
    """
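Both features can be exercised directly through OutputChecker.check_output(), without running a whole test. The sketch below is written for a current Python 3 interpreter; the expected and actual strings, including the memory address, are made up for illustration.

```python
import doctest

checker = doctest.OutputChecker()

# Expected ("want") and actual ("got") output; the address is invented.
want = '<__main__.C instance at 0x...>\n'
got = '<__main__.C instance at 0x402c2080>\n'

# Without ELLIPSIS the strings must match exactly; with it, '...' is a wildcard.
plain = checker.check_output(want, got, 0)
with_ellipsis = checker.check_output(want, got, doctest.ELLIPSIS)

# <BLANKLINE> in the expected output stands for an empty line in the actual output.
blank_ok = checker.check_output('first\n<BLANKLINE>\nlast\n',
                                'first\n\nlast\n', 0)

print(plain, with_ellipsis, blank_ok)
```

A custom checker would subclass OutputChecker and override check_output() with the same three arguments.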

Another new capability is producing a diff-style display of the output by specifying the doctest.REPORT_UDIFF (unified diffs), doctest.REPORT_CDIFF (context diffs), or doctest.REPORT_NDIFF (delta-style) option flags. For example:

def g(n):
    """>>> g(4)
    here
    is
    a
    lengthy
    >>>"""
    L = 'here is a rather lengthy list of words'.split()
    for word in L[:n]:
        print word

Running the above function's tests with doctest.REPORT_UDIFF specified, you get the following output:

**********************************************************************
File "t.py", line 15, in g
Failed example:
    g(4)
Differences (unified diff with -expected +actual):
    @@ -2,3 +2,3 @@
     is
     a
    -lengthy
    +rather
**********************************************************************
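One way to specify the flag is to pass it as optionflags when constructing a DocTestRunner. The sketch below, written for a current Python 3 interpreter (hence print() as a function), runs the deliberately failing test for g and collects the report in a list instead of printing it. The explicit globs argument and the chunks list are choices made for this example.

```python
import doctest

def g(n):
    """
    >>> g(4)
    here
    is
    a
    lengthy
    """
    words = 'here is a rather lengthy list of words'.split()
    for word in words[:n]:
        print(word)

# Extract the single DocTest from g's docstring; globs makes g visible to it.
tests = doctest.DocTestFinder().find(g, globs={'g': g})

# Ask for unified diffs in failure reports.
runner = doctest.DocTestRunner(verbose=False,
                               optionflags=doctest.REPORT_UDIFF)

# run() sends its report to the 'out' callable; collect the pieces.
chunks = []
result = runner.run(tests[0], out=chunks.append)
report = ''.join(chunks)

print(report)
```

Swapping in doctest.REPORT_CDIFF or doctest.REPORT_NDIFF changes only the style of the differences section.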

Build and C API Changes

Some of the changes to Python's build process and to the C API are:

  • Three new convenience macros were added for common return values from extension functions: Py_RETURN_NONE, Py_RETURN_TRUE, and Py_RETURN_FALSE. (Contributed by Brett Cannon.)

  • Another new macro, Py_CLEAR(obj), decreases the reference count of obj and sets obj to the null pointer. (Contributed by Jim Fulton.)

  • A new function, PyTuple_Pack(N, obj1, obj2, ..., objN), constructs tuples from a variable length argument list of Python objects. (Contributed by Raymond Hettinger.)

  • A new function, PyDict_Contains(d, k), implements fast dictionary lookups without masking exceptions raised during the look-up process. (Contributed by Raymond Hettinger.)

  • The Py_IS_NAN(X) macro returns 1 if its float or double argument X is a NaN. (Contributed by Tim Peters.)

  • C code can avoid unnecessary locking by using the new PyEval_ThreadsInitialized() function to tell if any thread operations have been performed. If this function returns false, no lock operations are needed. (Contributed by Nick Coghlan.)

  • A new function, PyArg_VaParseTupleAndKeywords(), is the same as PyArg_ParseTupleAndKeywords() but takes a va_list instead of a number of arguments. (Contributed by Greg Chapman.)

  • A new method flag, METH_COEXIST, allows a function defined in slots to co-exist with a PyCFunction having the same name. This can halve the access time for a method such as set.__contains__(). (Contributed by Raymond Hettinger.)

  • Python can now be built with additional profiling for the interpreter itself, intended as an aid to people developing the Python core. Providing --enable-profiling to the configure script will let you profile the interpreter with gprof, and providing the --with-tsc switch enables profiling using the Pentium's Time-Stamp-Counter register. Note that the --with-tsc switch is slightly misnamed, because the profiling feature also works on the PowerPC platform, though that processor architecture doesn't call that register "the TSC register". (Contributed by Jeremy Hylton.)

  • The tracebackobject type has been renamed to PyTracebackObject.

Port-Specific Changes

  • The Windows port now builds under MSVC++ 7.1 as well as version 6. (Contributed by Martin von Löwis.)

Porting to Python 2.4

This section lists previously described changes that may require changes to your code:

  • Left shifts and hexadecimal/octal constants that are too large no longer trigger a FutureWarning and return a value limited to 32 or 64 bits; instead they return a long integer.

  • Integer operations will no longer trigger an OverflowWarning. The OverflowWarning warning will disappear in Python 2.5.

  • The zip() built-in function and itertools.izip() now return an empty list instead of raising a TypeError exception if called with no arguments.

  • You can no longer compare the date and datetime instances provided by the datetime module. Two instances of different classes will now always be unequal, and relative comparisons (<, >) will raise a TypeError.

  • dircache.listdir() now passes exceptions to the caller instead of returning empty lists.

  • LexicalHandler.startDTD() used to receive the public and system IDs in the wrong order. This has been corrected; applications relying on the wrong order need to be fixed.

  • fcntl.ioctl() now warns if the mutate argument is omitted and relevant.

  • The tarfile module now generates GNU-format tar files by default.

  • Encountering a failure while importing a module no longer leaves a partially initialized module object in sys.modules.

  • None is now a constant; code that binds a new value to the name None is now a syntax error.

  • The signal.signal() function now raises a RuntimeError exception for certain illegal values; previously these errors would pass silently. For example, you can no longer set a handler on the SIGKILL signal.
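The date and datetime item in the list above is the one most likely to surface in ordinary application code, so here is a short sketch of the behaviour. It is written for a current Python 3 interpreter, where the same rules still hold; the variable names are illustrative only.

```python
import datetime

day = datetime.date(2005, 3, 30)
moment = datetime.datetime(2005, 3, 30, 12, 0)

# Equality between a date and a datetime is simply False ...
same = (day == moment)

# ... while an ordering comparison raises TypeError.
try:
    day < moment
    ordering_error = False
except TypeError:
    ordering_error = True

# Converting to a common type first makes the intent explicit.
earlier_or_same_day = day <= moment.date()

print(same, ordering_error, earlier_or_same_day)
```

Code that previously mixed the two types in comparisons can usually be ported by calling .date() on the datetime value, as in the last comparison.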

Acknowledgements

The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: Koray Can, Hye-Shik Chang, Michael Dyck, Raymond Hettinger, Brian Hurt, Hamish Lawson, Fredrik Lundh, Sean Reifschneider, Sadruddin Rejeb.