What's New in Python 2.4¶
- Author: A.M. Kuchling
This article explains the new features in Python 2.4.1, released on March 30, 2005.
Python 2.4 is a medium-sized release. It doesn't introduce as many changes as the radical Python 2.2, but introduces more features than the conservative 2.3 release. The most significant new language features are function decorators and generator expressions; most other changes are to the standard library.
According to the CVS change logs, there were 481 patches applied and 502 bugs fixed between Python 2.3 and 2.4. Both figures are likely to be underestimates.
This article doesn't attempt to provide a complete specification of every single new feature, but instead provides a brief introduction to each feature. For full details, you should refer to the documentation for Python 2.4, such as the Python Library Reference and the Python Reference Manual. Often you will be referred to the PEP for a particular new feature for explanations of the implementation and design rationale.
PEP 218: Built-In Set Objects¶
Python 2.3 introduced the sets module. C implementations of set data types have now been added to the Python core as two new built-in types, set(iterable) and frozenset(iterable). They provide high-speed operations for membership testing, for eliminating duplicates from sequences, and for mathematical operations like unions, intersections, differences, and symmetric differences.
>>> a = set('abracadabra')              # form a set from a string
>>> 'z' in a                            # fast membership testing
False
>>> a                                   # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> ''.join(a)                          # convert back into a string
'arbcd'
>>> b = set('alacazam')                 # form a second set
>>> a - b                               # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                               # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                               # letters in both a and b
set(['a', 'c'])
>>> a ^ b                               # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])
>>> a.add('z')                          # add a new element
>>> a.update('wxy')                     # add multiple new elements
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
>>> a.remove('x')                       # take one element out
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])
The frozenset() type is an immutable version of set(). Since it is immutable and hashable, it may be used as a dictionary key or as a member of another set.
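The hashability point can be illustrated with a minimal sketch. The point names and the distances mapping are invented for the example, and the code is written so that it also runs on current Python 3 interpreters.

```python
# Sketch only: the point names and the distances mapping are invented.
route_ab = frozenset(['A', 'B'])
route_ba = frozenset(['B', 'A'])   # same members, so equal and same hash

# A frozenset works as a dictionary key.
distances = {route_ab: 12}

# A mutable set is unhashable and is rejected as a key.
try:
    {set(['A', 'B']): 12}
    mutable_key_ok = True
except TypeError:
    mutable_key_ok = False

# frozensets can also be members of another set; duplicates collapse.
groups = set([frozenset('ab'), frozenset('ba'), frozenset('cd')])
```

Because element order does not matter for equality, looking up distances[route_ba] finds the entry stored under route_ab.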
The sets module remains in the standard library, and may be useful if you wish to subclass the Set or ImmutableSet classes. There are currently no plans to deprecate the module.
See also
- PEP 218 - Adding a Built-In Set Object Type
Originally proposed by Greg Wilson and ultimately implemented by Raymond Hettinger.
PEP 237: Unifying Long Integers and Integers¶
The lengthy transition process for this PEP, begun in Python 2.2, takes another step forward in Python 2.4. In 2.3, certain integer operations that would behave differently after int/long unification triggered FutureWarning warnings and returned values limited to 32 or 64 bits (depending on your platform). In 2.4, these expressions no longer produce a warning and instead produce a different result that's usually a long integer.
The problematic expressions are primarily left shifts and lengthy hexadecimal and octal constants. For example, 2 << 32 results in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python 2.4, this expression now returns the correct answer, 8589934592.
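The arithmetic can be checked on any interpreter with unified integers. In the sketch below, the 32-bit mask is only an emulation of the truncation described above, added for illustration; it is not what Python 2.3 literally executed.

```python
# The shift itself, as evaluated by Python 2.4 and later.
shifted = 2 << 32

# Emulation (an assumption for illustration): keep only the low 32 bits,
# which is what a 32-bit machine word would have retained.
truncated_32 = shifted & 0xFFFFFFFF
```

The full result needs 34 bits, so nothing survives in the low 32 bits, which matches the 0 reported for 2.3 on 32-bit platforms.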
See also
- PEP 237 - Unifying Long Integers and Integers
Original PEP written by Moshe Zadka and GvR. The changes for 2.4 were implemented by Kalle Svensson.
PEP 289: Generator Expressions¶
The iterator feature introduced in Python 2.2 and the itertools module make it easier to write programs that loop through large data sets without having the entire data set in memory at one time. List comprehensions don't fit into this picture very well because they produce a Python list object containing all of the items. This unavoidably pulls all of the objects into memory, which can be a problem if your data set is very large. When trying to write a functionally styled program, it would be natural to write something like:
links = [link for link in get_all_links() if not link.followed]
for link in links:
    ...
instead of
for link in get_all_links():
    if link.followed:
        continue
    ...
The first form is more concise and perhaps more readable, but if you're dealing with a large number of link objects you'd have to write the second form to avoid having all link objects in memory at the same time.
Generator expressions work similarly to list comprehensions but don't materialize the entire list; instead they create a generator that will return elements one by one. The above example could be written as:
links = (link for link in get_all_links() if not link.followed)
for link in links:
    ...
Generator expressions always have to be written inside parentheses, as in the above example. The parentheses signalling a function call also count, so if you want to create an iterator that will be immediately passed to a function you could write:
print sum(obj.count for obj in list_all_objects())
Generator expressions differ from list comprehensions in various small ways. Most notably, the loop variable (obj in the above example) is not accessible outside of the generator expression. List comprehensions leave the variable assigned to its last value; future versions of Python will change this, making list comprehensions match generator expressions in this respect.
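The lazy behaviour described above can be observed with a small sketch. get_all_items() is a hypothetical stand-in for get_all_links(), and the code uses the next() built-in of later Python versions (on 2.4 the spelling would be evens.next()).

```python
# Hypothetical stand-in for get_all_links(): yields numbers and records
# how many items have actually been produced so far.
produced = []

def get_all_items():
    for n in range(1000):
        produced.append(n)
        yield n

# Creating the generator expression does no work yet.
evens = (n for n in get_all_items() if n % 2 == 0)
produced_before = len(produced)

# Items are pulled one at a time, only as they are requested.
first = next(evens)
second = next(evens)
produced_after = len(produced)

# The parentheses of a function call are enough; no extra pair is needed.
total = sum(n * n for n in range(4))
```

Only three of the thousand source items are generated to obtain the first two even numbers; a list comprehension would have produced all of them up front.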
See also
- PEP 289 - Generator Expressions
Proposed by Raymond Hettinger and implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.
PEP 292: Simpler String Substitutions¶
Some new classes in the standard library provide an alternative mechanism for substituting variables into strings; this style of substitution may be better for applications where untrained users need to edit templates.
The usual way of substituting variables by name is the % operator:
>>> '%(page)i: %(title)s' % {'page': 2, 'title': 'The Best of Times'}
'2: The Best of Times'
When writing the template string, it can be easy to forget the i or s after the closing parenthesis. This isn't a big problem if the template is in a Python module, because you run the code, get an "Unsupported format character" ValueError, and fix the problem. However, consider an application such as Mailman where template strings or translations are being edited by users who aren't aware of the Python language. The format string's syntax is complicated to explain to such users, and if they make a mistake, it's difficult to provide helpful feedback to them.
PEP 292 adds a Template class to the string module that uses $ to indicate a substitution:
>>> import string
>>> t = string.Template('$page: $title')
>>> t.substitute({'page': 2, 'title': 'The Best of Times'})
'2: The Best of Times'
If a key is missing from the dictionary, the substitute() method will raise a KeyError. There's also a safe_substitute() method that ignores missing keys:
>>> t = string.Template('$page: $title')
>>> t.safe_substitute({'page': 3})
'3: $title'
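The difference between the two methods can be sketched side by side. The template text and field names below are invented for illustration, and the except ... as syntax assumes a Python 3 interpreter (Python 2.4 would write except KeyError, exc).

```python
import string

# The template text and field names are invented for illustration.
template = string.Template('Dear $name, your order $order_id has shipped.')

complete = template.substitute({'name': 'Ada', 'order_id': 1234})

# substitute() refuses to guess when a placeholder has no value.
try:
    template.substitute({'name': 'Ada'})
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

# safe_substitute() leaves unknown placeholders in place instead.
partial = template.safe_substitute({'name': 'Ada'})
```

An application that lets users edit templates might prefer substitute() while validating a template and safe_substitute() when rendering, but that choice depends on how much a half-filled message matters.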
See also
- PEP 292 - Simpler String Substitutions
Written and implemented by Barry Warsaw.
PEP 318: Decorators for Functions and Methods¶
Python 2.2 extended Python's object model by adding static methods and class methods, but it didn't extend Python's syntax to provide any new way of defining static or class methods. Instead, you had to write a def statement in the usual way, and pass the resulting method to a staticmethod() or classmethod() function that would wrap up the function as a method of the new type. Your code would look like this:
class C:
    def meth(cls):
        ...
    meth = classmethod(meth)   # Rebind name to wrapped-up class method
If the method was very long, it would be easy to miss or forget the classmethod() invocation after the function body.
The intention was always to add some syntax to make such definitions more readable, but at the time of 2.2's release a good syntax was not obvious. Today a good syntax still isn't obvious but users are asking for easier access to the feature; a new syntactic feature has been added to meet this need.
The new feature is called "function decorators". The name comes from the idea that classmethod(), staticmethod(), and friends are storing additional information on a function object; they're decorating functions with more details.
The notation borrows from Java and uses the '@' character as an indicator. Using the new syntax, the example above would be written:
class C:
    @classmethod
    def meth(cls):
        ...
The @classmethod is shorthand for the meth=classmethod(meth) assignment. More generally, if you have the following:
@A
@B
@C
def f():
    ...
It's equivalent to the following pre-decorator code:
def f(): ...
f = A(B(C(f)))
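The order of application is easy to get backwards, so here is a small sketch that records it. The tag() helper and the applied attribute are hypothetical names introduced only for this example.

```python
# Each hypothetical decorator appends its label to a list stored on the
# function, so the order of application can be read back afterwards.
def tag(label):
    def decorator(func):
        func.applied = getattr(func, 'applied', []) + [label]
        return func
    return decorator

A, B, C = tag('A'), tag('B'), tag('C')

@A
@B
@C
def f():
    pass

def g():
    pass
g = A(B(C(g)))   # the explicit spelling of the same thing
```

Both functions end up with the labels in the order C, B, A: the decorator closest to the def line is applied first.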
Decorators must come on the line before a function definition, one decorator per line, and can't be on the same line as the def statement, meaning that @A def f(): ... is illegal. You can only decorate function definitions, either at the module level or inside a class; you can't decorate class definitions.
A decorator is just a function that takes the function to be decorated as an argument and returns either the same function or some new object. The return value of the decorator need not be callable (though it typically is), unless further decorators will be applied to the result. It's easy to write your own decorators. The following simple example just sets an attribute on the function object:
>>> def deco(func):
...     func.attr = 'decorated'
...     return func
...
>>> @deco
... def f(): pass
...
>>> f
<function f at 0x402ef0d4>
>>> f.attr
'decorated'
>>>
As a slightly more realistic example, the following decorator checks that the supplied argument is an integer:
def require_int(func):
    def wrapper(arg):
        assert isinstance(arg, int)
        return func(arg)
    return wrapper

@require_int
def p1(arg):
    print arg

@require_int
def p2(arg):
    print arg*2
An example in PEP 318 contains a fancier version of this idea that lets you both specify the required type and check the returned type.
Decorator functions can take arguments. If arguments are supplied, your decorator function is called with only those arguments and must return a new decorator function; this function must take a single function and return a function, as previously described. In other words, @A @B @C(args) becomes:
def f(): ...
_deco = C(args)
f = A(B(_deco(f)))
Getting this right can be slightly brain-bending, but it's not too difficult.
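The two-step call sequence for a decorator with arguments may be easier to follow in a concrete sketch. require_type() is a hypothetical generalization of require_int() above, not something defined by the PEP or the standard library.

```python
# require_type is a hypothetical generalization of require_int above.
def require_type(expected):
    # Called first, with only the decorator's own arguments.
    def decorator(func):
        # Called second, with the function being decorated.
        def wrapper(arg):
            if not isinstance(arg, expected):
                raise TypeError('expected %s' % expected.__name__)
            return func(arg)
        return wrapper
    return decorator

@require_type(str)
def shout(text):
    return text.upper() + '!'

# A wrong argument type is rejected before shout() itself runs.
try:
    shout(42)
    rejected = False
except TypeError:
    rejected = True
```

Here @require_type(str) plays the role of C(args): it returns decorator, which is then applied to shout.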
A small related change makes the func_name attribute of functions writable. This attribute is used to display function names in tracebacks, so decorators should change the name of any new function that's constructed and returned.
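A minimal sketch of that advice follows. It uses the __name__ spelling, which is the name this attribute carries on Python 3, so treat the attribute name as an adaptation rather than the 2.4 spelling; the decorator names are invented.

```python
def logged(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    # Without this line the wrapper would report itself as 'wrapper'.
    wrapper.__name__ = func.__name__
    return wrapper

def unnamed(func):
    # Same wrapper, but the name is not copied over.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@logged
def compute():
    return 42

@unnamed
def compute_anon():
    return 42
```

With the name copied, the decorated function still reports itself as compute; without it, the decorated function reports itself as wrapper.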
See also
- PEP 318 - Decorators for Functions, Methods and Classes
Written by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people wrote patches implementing function decorators, but the one that was actually checked in was patch #979728, written by Mark Russell.
- https://wiki.python.org/moin/PythonDecoratorLibrary
This Wiki page contains several examples of decorators.
PEP 322: Reverse Iteration¶
A new built-in function, reversed(seq), takes a sequence and returns an iterator that loops over the elements of the sequence in reverse order.
>>> for i in reversed(xrange(1,4)):
...    print i
...
3
2
1
Compared to extended slicing, such as range(1,4)[::-1], reversed() is easier to read, runs faster, and uses substantially less memory.
Note that reversed() only accepts sequences, not arbitrary iterators. If you want to reverse an iterator, first convert it to a list with list().
>>> input = open('/etc/passwd', 'r')
>>> for line in reversed(list(input)):
...   print line
...
root:*:0:0:System Administrator:/var/root:/bin/tcsh
  ...
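The sequence-versus-iterator restriction can be checked without touching the file system. In this sketch count_up() is a hypothetical generator used in place of the open file.

```python
letters = ['a', 'b', 'c']

# reversed() returns an iterator; the original list is not modified.
backwards = list(reversed(letters))

# A generator has no length or indexing, so reversed() rejects it.
def count_up():
    yield 1
    yield 2
    yield 3

try:
    reversed(count_up())
    accepted = True
except TypeError:
    accepted = False

# Materializing it with list() first makes it reversible.
from_iterator = list(reversed(list(count_up())))
```

The list() call is where the memory cost is paid, which is why the conversion is left to the caller rather than done silently.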
See also
- PEP 322 - Reverse Iteration
Written and implemented by Raymond Hettinger.
PEP 324: New subprocess Module¶
The standard library provides a number of ways to execute a subprocess, offering different features and different levels of complexity. os.system(command) is easy to use, but slow (it runs a shell process which executes the command) and dangerous (you have to be careful about escaping the shell's metacharacters). The popen2 module offers classes that can capture standard output and standard error from the subprocess, but the naming is confusing. The subprocess module cleans this up, providing a unified interface that offers all the features you might need.
Instead of popen2's collection of classes, subprocess contains a single class called subprocess.Popen whose constructor supports a number of different keyword arguments.
class Popen(args, bufsize=0, executable=None,
            stdin=None, stdout=None, stderr=None,
            preexec_fn=None, close_fds=False, shell=False,
            cwd=None, env=None, universal_newlines=False,
            startupinfo=None, creationflags=0):
args is commonly a sequence of strings that will be the arguments to the program executed as the subprocess. (If the shell argument is true, args can be a string which will then be passed on to the shell for interpretation, just as os.system() does.)
stdin, stdout, and stderr specify what the subprocess's input, output, and error streams will be. You can provide a file object or a file descriptor, or you can use the constant subprocess.PIPE to create a pipe between the subprocess and the parent.
The constructor has a number of handy options:
- close_fds requests that all file descriptors be closed before running the subprocess.
- cwd specifies the working directory in which the subprocess will be executed (defaulting to whatever the parent's working directory is).
- env is a dictionary specifying environment variables.
- preexec_fn is a function that gets called before the child is started.
- universal_newlines opens the child's input and output using Python's universal newlines feature.
Once you've created the Popen instance, you can call its wait() method to pause until the subprocess has exited, poll() to check if it's exited without pausing, or communicate(data) to send the string data to the subprocess's standard input. communicate(data) then reads any data that the subprocess has sent to its standard output or standard error, returning a tuple (stdout_data, stderr_data).
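A short sketch of the Popen and communicate() round trip described above. To stay portable it launches a second Python interpreter through sys.executable rather than an external program; the child script is invented for the example, and the bytes literal reflects Python 3, where pipes carry bytes unless a text mode is requested.

```python
import subprocess
import sys

# The child is another Python interpreter so the sketch does not depend on
# any particular external program being installed.
child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

proc = subprocess.Popen(
    [sys.executable, '-c', child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# communicate() sends the input, reads both output streams to the end,
# and waits for the child to exit. Python 3 pipes carry bytes by default.
stdout_data, stderr_data = proc.communicate(b'hello')
status = proc.returncode
```

Because args is a list and shell is left at its default, no shell is involved and nothing in the arguments needs escaping.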
call() is a shortcut that passes its arguments along to the Popen constructor, waits for the command to complete, and returns the status code of the subprocess. It can serve as a safer analog to os.system():
sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb'])
if sts == 0:
    # Success
    ...
else:
    # dpkg returned an error
    ...
The command is invoked without use of the shell. If you really do want to use the shell, you can add shell=True as a keyword argument and provide a string instead of a sequence:
sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True)
The PEP takes various examples of shell and Python code and shows how they'd be translated into Python code that uses subprocess. Reading this section of the PEP is highly recommended.
See also
- PEP 324 - subprocess - New process module
Written and implemented by Peter Åstrand, with assistance from Fredrik Lundh and others.
PEP 327: Decimal Data Type¶
Python has always supported floating-point (FP) numbers, based on the underlying C double type, as a data type. However, while most programming languages provide a floating-point type, many people (even programmers) are unaware that floating-point numbers don't represent certain decimal fractions accurately. The new Decimal type can represent these fractions accurately, up to a user-specified precision limit.
Why is Decimal needed?¶
The limitations arise from the representation used for floating-point numbers. FP numbers are made up of three components:
- The sign, which is positive or negative.
- The mantissa, which is a single-digit binary number followed by a fractional part. For example, 1.01 in base-2 notation is 1 + 0/2 + 1/4, or 1.25 in decimal notation.
- The exponent, which tells where the decimal point is located in the number represented.
For example, the number 1.25 has positive sign, a mantissa value of 1.01 (in binary), and an exponent of 0 (the decimal point doesn't need to be shifted). The number 5 has the same sign and mantissa, but the exponent is 2 because the mantissa is multiplied by 4 (2 to the power of the exponent 2); 1.25 * 4 equals 5.
Modern systems usually provide floating-point support that conforms to a standard called IEEE 754. C's double type is usually implemented as a 64-bit IEEE 754 number, which uses 52 bits of space for the mantissa. This means that numbers can only be specified to 52 bits of precision. If you're trying to represent numbers whose expansion repeats endlessly, the expansion is cut off after 52 bits. Unfortunately, most software needs to produce output in base 10, and common fractions in base 10 are often repeating decimals in binary. For example, 1.1 decimal is binary 1.0001100110011...; .1 = 1/16 + 1/32 + 1/256 plus an infinite number of additional terms. IEEE 754 has to chop off that infinitely repeated decimal after 52 digits, so the representation is slightly inaccurate.
Sometimes you can see this inaccuracy when the number is printed:
>>> 1.1
1.1000000000000001
The inaccuracy isn't always visible when you print the number because the FP-to-decimal-string conversion is provided by the C library, and most C libraries try to produce sensible output. Even if it's not displayed, however, the inaccuracy is still there and subsequent operations can magnify the error.
For many applications this doesn't matter. If I'm plotting points and displaying them on my monitor, the difference between 1.1 and 1.1000000000000001 is too small to be visible. Reports often limit output to a certain number of decimal places, and if you round the number to two or three or even eight decimal places, the error is never apparent. However, for applications where it does matter, it's a lot of work to implement your own custom arithmetic routines.
Hence, the Decimal type was created.
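How a small representation error accumulates can be seen in a few lines. This is a sketch, assuming IEEE 754 doubles as described above; the loop is written out by hand so each addition is an ordinary float operation.

```python
from decimal import Decimal

# Add one tenth ten times, first in binary floating point...
float_total = 0.0
for _ in range(10):
    float_total += 0.1

# ...and then with Decimal values built from strings.
decimal_total = Decimal('0')
for _ in range(10):
    decimal_total += Decimal('0.1')
```

The float sum lands just short of 1.0, while the Decimal sum is exactly one. The error is tiny, but a comparison such as float_total == 1.0 fails.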
The Decimal type¶
A new module, decimal, was added to Python's standard library. It contains two classes, Decimal and Context. Decimal instances represent numbers, and Context instances are used to wrap up various settings such as the precision and default rounding mode.
Decimal instances are immutable, like regular Python integers and FP numbers; once it's been created, you can't change the value an instance represents. Decimal instances can be created from integers or strings:
>>> import decimal
>>> decimal.Decimal(1972)
Decimal("1972")
>>> decimal.Decimal("1.1")
Decimal("1.1")
You can also provide tuples containing the sign, the mantissa represented as a tuple of decimal digits, and the exponent:
>>> decimal.Decimal((1, (1, 4, 7, 5), -2))
Decimal("-14.75")
Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is negative.
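The tuple form and the sign convention can be exercised in both directions. A brief sketch, using the same digits as the example above:

```python
from decimal import Decimal

# sign=1 (negative), digits 1,4,7,5, exponent -2  ->  -1475 * 10**-2
negative = Decimal((1, (1, 4, 7, 5), -2))

# The same digits and exponent with sign=0 give the positive value.
positive = Decimal((0, (1, 4, 7, 5), -2))

# as_tuple() goes the other way and recovers the three components.
sign, digits, exponent = Decimal('-14.75').as_tuple()
```

Reading the components back with as_tuple() is a quick way to confirm which sign value was meant.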
Converting from floating-point numbers poses a bit of a problem: should the FP number representing 1.1 turn into the decimal number for exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced? The decision was to dodge the issue and leave such a conversion out of the API. Instead, you should convert the floating-point number into a string using the desired precision and pass the string to the Decimal constructor:
>>> f = 1.1
>>> decimal.Decimal(str(f))
Decimal("1.1")
>>> decimal.Decimal('%.12f' % f)
Decimal("1.100000000000")
Once you have Decimal instances, you can perform the usual mathematical operations on them. One limitation: exponentiation requires an integer exponent:
>>> a = decimal.Decimal('35.72')
>>> b = decimal.Decimal('1.73')
>>> a+b
Decimal("37.45")
>>> a-b
Decimal("33.99")
>>> a*b
Decimal("61.7956")
>>> a/b
Decimal("20.64739884393063583815028902")
>>> a ** 2
Decimal("1275.9184")
>>> a**b
Traceback (most recent call last):
  ...
decimal.InvalidOperation: x ** (non-integer)
You can combine Decimal instances with integers, but not with floating-point numbers:
>>> a + 4
Decimal("39.72")
>>> a + 4.5
Traceback (most recent call last):
  ...
TypeError: You can interact Decimal only with int, long or Decimal data types.
>>>
Decimal numbers can be used with the math and cmath modules, but note that they'll be immediately converted to floating-point numbers before the operation is performed, resulting in a possible loss of precision and accuracy. You'll also get back a regular floating-point number and not a Decimal.
>>> import math, cmath
>>> d = decimal.Decimal('123456789012.345')
>>> math.sqrt(d)
351364.18288201344
>>> cmath.sqrt(-d)
351364.18288201344j
Decimal instances have a sqrt() method that returns a Decimal, but if you need other things such as trigonometric functions you'll have to implement them.
>>> d.sqrt()
Decimal("351364.1828820134592177245001")
The Context type¶
Instances of the Context class encapsulate several settings for decimal operations:
- prec is the precision, the number of significant decimal digits kept in a result.
- rounding specifies the rounding mode. The decimal module has constants for the various possibilities: ROUND_DOWN, ROUND_CEILING, ROUND_HALF_EVEN, and various others.
- traps is a dictionary specifying what happens on encountering certain error conditions: either an exception is raised or a value is returned. Some examples of error conditions are division by zero, loss of precision, and overflow.
There's a thread-local default context available by calling getcontext(); you can change the properties of this context to alter the default precision, rounding, or trap handling. The following example shows the effect of changing the precision of the default context:
>>> decimal.getcontext().prec
28
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.1428571428571428571428571429")
>>> decimal.getcontext().prec = 9
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.142857143")
The default action for error conditions is selectable; the module can either return a special value such as infinity or not-a-number, or exceptions can be raised:
>>> decimal.Decimal(1) / decimal.Decimal(0)
Traceback (most recent call last):
  ...
decimal.DivisionByZero: x / 0
>>> decimal.getcontext().traps[decimal.DivisionByZero] = False
>>> decimal.Decimal(1) / decimal.Decimal(0)
Decimal("Infinity")
>>>
The Context instance also has various methods for formatting numbers such as to_eng_string() and to_sci_string().
For more information, see the documentation for the decimal module, which includes a quick-start tutorial and a reference.
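The precision and trap settings discussed in this section can also be tried without altering the thread's default context. The sketch below builds two private Context objects and calls their divide() method directly; the variable names are invented, and it is written against a current Python 3 decimal module.

```python
import decimal
from decimal import Decimal

# A private context with 5 significant digits and no traps enabled.
lenient = decimal.Context(prec=5, traps=[])
third = lenient.divide(Decimal(1), Decimal(3))
infinite = lenient.divide(Decimal(1), Decimal(0))

# A context that traps division by zero raises instead of returning a value.
strict = decimal.Context(prec=5, traps=[decimal.DivisionByZero])
try:
    strict.divide(Decimal(1), Decimal(0))
    raised = False
except decimal.DivisionByZero:
    raised = True
```

Keeping the settings in a separate Context avoids surprising other code in the same thread that relies on the default precision.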
See also
- PEP 327 - Decimal Data Type
Written by Facundo Batista and implemented by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.
- http://www.lahey.com/float.htm
The article uses Fortran code to illustrate many of the problems that floating-point inaccuracy can cause.
- https://speleotrove.com/decimal/
A description of a decimal-based representation. This representation is being proposed as a standard, and underlies the new Python decimal type. Much of this material was written by Mike Cowlishaw, designer of the Rexx language.
PEP 328: Multi-line Imports¶
One language change is a small syntactic tweak aimed at making it easier to import many names from a module. In a from module import names statement, names is a sequence of names separated by commas. If the sequence is very long, you can either write multiple imports from the same module, or you can use backslashes to escape the line endings like this:
from SimpleXMLRPCServer import SimpleXMLRPCServer,\
            SimpleXMLRPCRequestHandler,\
            CGIXMLRPCRequestHandler,\
            resolve_dotted_attribute
The syntactic change in Python 2.4 simply allows putting the names within parentheses. Python ignores newlines within a parenthesized expression, so the backslashes are no longer needed:
from SimpleXMLRPCServer import (SimpleXMLRPCServer,
                                SimpleXMLRPCRequestHandler,
                                CGIXMLRPCRequestHandler,
                                resolve_dotted_attribute)
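The same parenthesized form, as a sketch that runs on any interpreter from 2.4 onward. os.path is substituted here only because the SimpleXMLRPCServer module was renamed in Python 3; the path components are invented.

```python
# os.path is used only because it exists on every Python version;
# the SimpleXMLRPCServer module in the example above was renamed in Python 3.
from os.path import (
    basename,
    dirname,
    join,   # a trailing comma is allowed inside the parentheses
)

path = join('tmp', 'reports', 'summary.txt')
```

Putting one name per line keeps later additions to the import list down to a one-line diff.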
The PEP also proposes that all import statements be absolute imports, with a leading . character to indicate a relative import. This part of the PEP was not implemented for Python 2.4, but was completed for Python 2.5.
See also
- PEP 328 - Imports: Multi-Line and Absolute/Relative
Written by Aahz. Multi-line imports were implemented by Dima Dorfman.
PEP 331: Locale-Independent Float/String Conversions¶
The locale module lets Python software select various conversions and display conventions that are localized to a particular country or language. However, the module was careful to not change the numeric locale because various functions in Python's implementation required that the numeric locale remain set to the 'C' locale. Often this was because the code was using the C library's atof() function.
Not setting the numeric locale caused trouble for extensions that used third-party C libraries, however, because they wouldn't have the correct locale set. The motivating example was GTK+, whose user interface widgets weren't displaying numbers in the current locale.
The solution described in the PEP is to add three new functions to the Python API that perform ASCII-only conversions, ignoring the locale setting:
- PyOS_ascii_strtod(str, ptr) and PyOS_ascii_atof(str, ptr) both convert a string to a C double.
- PyOS_ascii_formatd(buffer, buf_len, format, d) converts a double to an ASCII string.
The code for these functions came from the GLib library (https://developer-old.gnome.org/glib/2.26/), whose developers kindly relicensed the relevant functions and donated them to the Python Software Foundation. The locale module can now change the numeric locale, letting extensions such as GTK+ produce the correct results.
See also
- PEP 331 - Locale-Independent Float/String Conversions
Written by Christian R. Reis, and implemented by Gustavo Carneiro.
Other Language Changes¶
Here are all of the changes that Python 2.4 makes to the core Python language.
- Decorators for functions and methods were added (PEP 318).
- Built-in set() and frozenset() types were added (PEP 218). Other new built-ins include the reversed(seq) function (PEP 322).
- Generator expressions were added (PEP 289).
- Certain numeric expressions no longer return values restricted to 32 or 64 bits (PEP 237).
- You can now put parentheses around the list of names in a from module import names statement (PEP 328).
- The dict.update() method now accepts the same argument forms as the dict constructor. This includes any mapping, any iterable of key/value pairs, and keyword arguments. (Contributed by Raymond Hettinger.)
- The string methods ljust(), rjust(), and center() now take an optional argument for specifying a fill character other than a space. (Contributed by Raymond Hettinger.)
- Strings also gained an rsplit() method that works like the split() method but splits from the end of the string. (Contributed by Sean Reifschneider.)
>>> 'www.python.org'.split('.', 1)
['www', 'python.org']
>>> 'www.python.org'.rsplit('.', 1)
['www.python', 'org']
- Three keyword parameters, cmp, key, and reverse, were added to the sort() method of lists. These parameters make some common usages of sort() simpler. All of these parameters are optional.
For the cmp parameter, the value should be a comparison function that takes two parameters and returns -1, 0, or +1 depending on how the parameters compare. This function will then be used to sort the list. Previously this was the only parameter that could be provided to sort().
key should be a single-parameter function that takes a list element and returns a comparison key for the element. The list is then sorted using the comparison keys. The following example sorts a list case-insensitively:
>>> L = ['A', 'b', 'c', 'D']
>>> L.sort()                 # Case-sensitive sort
>>> L
['A', 'D', 'b', 'c']
>>> # Using 'key' parameter to sort list
>>> L.sort(key=lambda x: x.lower())
>>> L
['A', 'b', 'c', 'D']
>>> # Old-fashioned way
>>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
>>> L
['A', 'b', 'c', 'D']
The last example, which uses the cmp parameter, is the old way to perform a case-insensitive sort. It works but is slower than using a key parameter. Using key calls the lower() method once for each element in the list while using cmp will call it twice for each comparison, so using key saves on invocations of the lower() method.
For simple key functions and comparison functions, it is often possible to avoid a lambda expression by using an unbound method instead. For example, the above case-insensitive sort is best written as:
>>> L.sort(key=str.lower)
>>> L
['A', 'b', 'c', 'D']
Finally, the reverse parameter takes a Boolean value. If the value is true, the list will be sorted into reverse order. Instead of L.sort(); L.reverse(), you can now write L.sort(reverse=True).
The results of sorting are now guaranteed to be stable. This means that two entries with equal keys will be returned in the same order as they were input. For example, you can sort a list of people by name, and then sort the list by age, resulting in a list sorted by age where people with the same age are in name-sorted order.
(All changes to sort() contributed by Raymond Hettinger.)
- There is a new built-in function sorted(iterable) that works like the in-place list.sort() method but can be used in expressions. The differences are:
  - the input may be any iterable;
  - a newly formed copy is sorted, leaving the original intact; and
  - the expression returns the new sorted copy
>>> L = [9,7,8,3,2,4,1,6,5]
>>> [10+i for i in sorted(L)]       # usable in a list comprehension
[11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> L                               # original is left unchanged
[9, 7, 8, 3, 2, 4, 1, 6, 5]
>>> sorted('Monty Python')          # any iterable may be an input
[' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']
>>> # List the contents of a dict sorted by key values
>>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
>>> for k, v in sorted(colormap.iteritems()):
...     print k, v
...
black 4
blue 2
green 3
red 1
yellow 5
(Contributed by Raymond Hettinger.)
- Integer operations will no longer trigger an OverflowWarning. The OverflowWarning warning will disappear in Python 2.5.
- The interpreter gained a new switch, -m, that takes a name, searches for the corresponding module on sys.path, and runs the module as a script. For example, you can now run the Python profiler with python -m profile. (Contributed by Nick Coghlan.)
- The eval(expr, globals, locals) and execfile(filename, globals, locals) functions and the exec statement now accept any mapping type for the locals parameter. Previously this had to be a regular Python dictionary. (Contributed by Raymond Hettinger.)
- The zip() built-in function and itertools.izip() now return an empty list if called with no arguments. Previously they raised a TypeError exception. This makes them more suitable for use with variable length argument lists:
>>> def transpose(array):
...    return zip(*array)
...
>>> transpose([(1,2,3), (4,5,6)])
[(1, 4), (2, 5), (3, 6)]
>>> transpose([])
[]
(Contributed by Raymond Hettinger.)
- Encountering a failure while importing a module no longer leaves a partially initialized module object in sys.modules. The incomplete module object left behind would fool further imports of the same module into succeeding, leading to confusing errors. (Fixed by Tim Peters.)
- None is now a constant; code that binds a new value to the name None is now a syntax error. (Contributed by Raymond Hettinger.)
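Several of the sorting changes in the list above (key, reverse, guaranteed stability, and sorted()) can be tried together in one sketch. The (name, age) records are invented for illustration, and the code avoids the cmp parameter so that it also runs on Python 3, where cmp was later removed.

```python
from operator import itemgetter

# (name, age) records invented for illustration.
people = [('Dana', 31), ('alex', 25), ('Chris', 31), ('bob', 25)]

# key= computes one comparison key per element; str.lower is passed directly.
names = sorted((name for name, age in people), key=str.lower)

# Stability: sort by name first, then by age. Ties on age keep name order.
by_name = sorted(people, key=lambda person: person[0].lower())
by_age_then_name = sorted(by_name, key=itemgetter(1))

# reverse=True replaces a separate reverse() call.
ages_descending = sorted(set(age for name, age in people), reverse=True)
```

The two-pass approach only works because the second sort is stable; with an unstable sort, the name order within each age group would be arbitrary.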
最佳化¶
The inner loops for list and tuple slicing were optimized and now run aboutone-third faster. The inner loops for dictionaries were also optimized,resulting in performance boosts for
keys()
,values()
,items()
,iterkeys()
,itervalues()
, anditeritems()
. (Contributed byRaymond Hettinger.)The machinery for growing and shrinking lists was optimized for speed and forspace efficiency. Appending and popping from lists now runs faster due to moreefficient code paths and less frequent use of the underlying system
realloc()
. List comprehensions also benefit.list.extend()
wasalso optimized and no longer converts its argument into a temporary list beforeextending the base list. (Contributed by Raymond Hettinger.)list()
,tuple()
,map()
,filter()
, andzip()
nowrun several times faster with non-sequence arguments that supply a__len__()
method. (Contributed by Raymond Hettinger.)The methods
list.__getitem__()
,dict.__getitem__()
, anddict.__contains__()
are now implemented asmethod_descriptor
objects rather thanwrapper_descriptor
objects. This form of accessdoubles their performance and makes them more suitable for use as arguments tofunctionals:map(mydict.__getitem__,keylist)
. (Contributed by RaymondHettinger.)Added a new opcode,
LIST_APPEND
, that simplifies the generated bytecodefor list comprehensions and speeds them up by about a third. (Contributed byRaymond Hettinger.)The peephole bytecode optimizer has been improved to produce shorter, fasterbytecode; remarkably, the resulting bytecode is more readable. (Enhanced byRaymond Hettinger.)
String concatenations in statements of the form s = s + "abc" and s += "abc" are now performed more efficiently in certain circumstances. This optimization won't be present in other Python implementations such as Jython, so you shouldn't rely on it; using the join() method of strings is still recommended when you want to efficiently glue a large number of strings together. (Contributed by Armin Rigo.)
The net result of the 2.4 optimizations is that Python 2.4 runs the pystone benchmark around 5% faster than Python 2.3 and 35% faster than Python 2.2. (pystone is not a particularly good benchmark, but it's the most commonly used measurement of Python's performance. Your own applications may show greater or smaller benefits from Python 2.4.)
New, Improved, and Deprecated Modules¶
As usual, Python's standard library received a number of enhancements and bug fixes. Here's a partial list of the most notable changes, sorted alphabetically by module name. Consult the Misc/NEWS file in the source tree for a more complete list of changes, or look through the CVS logs for all the details.
The asyncore module's loop() function now has a count parameter that lets you perform a limited number of passes through the polling loop. The default is still to loop forever.
The base64 module now has more complete RFC 3548 support for Base64, Base32, and Base16 encoding and decoding, including optional case folding and optional alternative alphabets. (Contributed by Barry Warsaw.)
The bisect module now has an underlying C implementation for improved performance. (Contributed by Dmitry Vasiliev.)
The CJKCodecs collection of East Asian codecs, maintained by Hye-Shik Chang, was integrated into 2.4. The new encodings are:
- Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz
- Chinese (ROC): big5, cp950
- Japanese: cp932, euc-jis-2004, euc-jp, euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2, iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004, shift-jis, shift-jisx0213, shift-jis-2004
- Korean: cp949, euc-kr, johab, iso-2022-kr
Some other new encodings were added: HP Roman8, ISO_8859-11, ISO_8859-16, PCTP-154, and TIS-620.
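To illustrate how the codecs listed above are used, here is a small sketch that round-trips short strings through four of them. The sample strings are chosen only for illustration; the codec names come from the list above.

```python
# Sample text for a few of the East Asian codecs named in the list above.
samples = {
    "euc-kr": u"\ud55c\uad6d\uc5b4",      # Korean word for "Korean"
    "big5": u"\u4e2d\u6587",              # "Chinese", Big5 encoding
    "shift-jis": u"\u65e5\u672c\u8a9e",   # Japanese word for "Japanese"
    "gb2312": u"\u4e2d\u6587",            # "Chinese", GB2312 encoding
}

round_trips = {}
for codec_name, text in samples.items():
    encoded = text.encode(codec_name)          # Unicode text -> encoded bytes
    round_trips[codec_name] = encoded.decode(codec_name)
```

Each entry in round_trips should equal the original text, since every sample character is representable in the codec it is paired with.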
The UTF-8 and UTF-16 codecs now cope better with receiving partial input. Previously the StreamReader class would try to read more data, making it impossible to resume decoding from the stream. The read() method will now return as much data as it can and future calls will resume decoding where previous ones left off. (Implemented by Walter Dörwald.)
There is a new collections module for various specialized collection datatypes. Currently it contains just one type, deque, a double-ended queue that supports efficiently adding and removing elements from either end:
>>> from collections import deque
>>> d = deque('ghi')        # make a new deque with three items
>>> d.append('j')           # add a new entry to the right side
>>> d.appendleft('f')       # add a new entry to the left side
>>> d                       # show the representation of the deque
deque(['f', 'g', 'h', 'i', 'j'])
>>> d.pop()                 # return and remove the rightmost item
'j'
>>> d.popleft()             # return and remove the leftmost item
'f'
>>> list(d)                 # list the contents of the deque
['g', 'h', 'i']
>>> 'h' in d                # search the deque
True
Several modules, such as the Queue and threading modules, now take advantage of collections.deque for improved performance. (Contributed by Raymond Hettinger.)
The ConfigParser classes have been enhanced slightly. The read() method now returns a list of the files that were successfully parsed, and the set() method raises TypeError if passed a value argument that isn't a string. (Contributed by John Belmonte and David Goodger.)
The curses module now supports the ncurses extension use_default_colors(). On platforms where the terminal supports transparency, this makes it possible to use a transparent background. (Contributed by Jörg Lehmann.)
The difflib module now includes an HtmlDiff class that creates an HTML table showing a side by side comparison of two versions of a text. (Contributed by Dan Gass.)
The email package was updated to version 3.0, which dropped various deprecated APIs and removed support for Python versions earlier than 2.3. The 3.0 version of the package uses a new incremental parser for MIME messages, available in the email.FeedParser module. The new parser doesn't require reading the entire message into memory, and doesn't raise exceptions if a message is malformed; instead it records any problems in the defects attribute of the message. (Developed by Anthony Baxter, Barry Warsaw, Thomas Wouters, and others.)
The heapq module has been converted to C. The resulting tenfold improvement in speed makes the module suitable for handling high volumes of data. In addition, the module has two new functions nlargest() and nsmallest() that use heaps to find the N largest or smallest values in a dataset without the expense of a full sort. (Contributed by Raymond Hettinger.)
The httplib module now contains constants for HTTP status codes defined in various HTTP-related RFC documents. Constants have names such as OK, CREATED, CONTINUE, and MOVED_PERMANENTLY; use pydoc to get a full list. (Contributed by Andrew Eland.)
The imaplib module now supports IMAP's THREAD command (contributed by Yves Dionne) and new deleteacl() and myrights() methods (contributed by Arnaud Mazin).
The itertools module gained a groupby(iterable[, func]) function. iterable is something that can be iterated over to return a stream of elements, and the optional func parameter is a function that takes an element and returns a key value; if omitted, the key is simply the element itself. groupby() then groups the elements into subsequences which have matching values of the key, and returns a series of 2-tuples containing the key value and an iterator over the subsequence.
Here's an example to make this clearer. The key function simply returns whether a number is even or odd, so the result of groupby() is to return consecutive runs of odd or even numbers.
>>> import itertools
>>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14]
>>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
...     print key_val, list(it)
...
0 [2, 4, 6]
1 [7]
0 [8]
1 [9, 11]
0 [12, 14]
>>>
groupby() is typically used with sorted input. The logic for groupby() is similar to the Unix uniq filter which makes it handy for eliminating, counting, or identifying duplicate elements:
>>> word = 'abracadabra'
>>> letters = sorted(word)   # Turn string into a sorted list of letters
>>> letters
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
>>> for k, g in itertools.groupby(letters):
...     print k, list(g)
...
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c']
d ['d']
r ['r', 'r']
>>> # List unique letters
>>> [k for k, g in itertools.groupby(letters)]
['a', 'b', 'c', 'd', 'r']
>>> # Count letter occurrences
>>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
[('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]
(Contributed by Hye-Shik Chang.)
itertools also gained a function named tee(iterator, N) that returns N independent iterators that replicate iterator. If N is omitted, the default is 2.
>>> L = [1, 2, 3]
>>> i1, i2 = itertools.tee(L)
>>> i1, i2
(<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
>>> list(i1)               # Run the first iterator to exhaustion
[1, 2, 3]
>>> list(i2)               # Run the second iterator to exhaustion
[1, 2, 3]
Note that tee() has to keep copies of the values returned by the iterator; in the worst case, it may need to keep all of them. This should therefore be used carefully if the leading iterator can run far ahead of the trailing iterator in a long stream of inputs. If the separation is large, then you might as well use list() instead. When the iterators track closely with one another, tee() is ideal. Possible applications include bookmarking, windowing, or lookahead iterators. (Contributed by Raymond Hettinger.)
A number of functions were added to the locale module, such as bind_textdomain_codeset() to specify a particular encoding and a family of l*gettext() functions that return messages in the chosen encoding. (Contributed by Gustavo Niemeyer.)
Some keyword arguments were added to the logging package's basicConfig() function to simplify log configuration. The default behavior is to log messages to standard error, but various keyword arguments can be specified to log to a particular file, change the logging format, or set the logging level. For example:
import logging
logging.basicConfig(filename='/var/log/application.log',
                    level=0,  # Log all messages
                    format='%(levelname)s:%(process)d:%(thread)d:%(message)s')
Other additions to the logging package include a log(level, msg) convenience method, as well as a TimedRotatingFileHandler class that rotates its log files at a timed interval. The module already had RotatingFileHandler, which rotated logs once the file exceeded a certain size. Both classes derive from a new BaseRotatingHandler class that can be used to implement other rotating handlers.
(Implemented by Vinay Sajip.)
The marshal module now shares interned strings on unpacking a data structure. This may shrink the size of certain pickle strings, but the primary effect is to make .pyc files significantly smaller. (Contributed by Martin von Löwis.)
The nntplib module's NNTP class gained description() and descriptions() methods to retrieve newsgroup descriptions for a single group or for a range of groups. (Contributed by Jürgen A. Erhard.)
Two new functions were added to the operator module, attrgetter(attr) and itemgetter(index). Both functions return callables that take a single argument and return the corresponding attribute or item; these callables make excellent data extractors when used with map() or sorted(). For example:
>>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)]
>>> map(operator.itemgetter(0), L)
['c', 'd', 'a', 'b']
>>> map(operator.itemgetter(1), L)
[2, 1, 4, 3]
>>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item
[('d', 1), ('c', 2), ('b', 3), ('a', 4)]
(Contributed by Raymond Hettinger.)
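The example above exercises only itemgetter(). A comparable sketch for attrgetter() follows; the Employee class and its data are hypothetical and exist only to give the extractor something to read.

```python
import operator


class Employee(object):
    """Hypothetical record type used only for this illustration."""

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary


staff = [Employee("carol", 3000), Employee("alice", 4200), Employee("bob", 3500)]

# attrgetter("name") returns a callable equivalent to: lambda obj: obj.name
get_name = operator.attrgetter("name")
names = list(map(get_name, staff))

# The same kind of callable works as a sort key.
by_salary = [e.name for e in sorted(staff, key=operator.attrgetter("salary"))]
```

Here names keeps the original order of staff, while by_salary lists the names from lowest to highest salary.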
The optparse module was updated in various ways. The module now passes its messages through gettext.gettext(), making it possible to internationalize Optik's help and error messages. Help messages for options can now include the string '%default', which will be replaced by the option's default value. (Contributed by Greg Ward.)
The long-term plan is to deprecate the rfc822 module in some future Python release in favor of the email package. To this end, the email.Utils.formatdate function has been changed to make it usable as a replacement for rfc822.formatdate(). You may want to write new e-mail processing code with this in mind. (Change implemented by Anthony Baxter.)
A new urandom(n) function was added to the os module, returning a string containing n bytes of random data. This function provides access to platform-specific sources of randomness such as /dev/urandom on Linux or the Windows CryptoAPI. (Contributed by Trevor Perrin.)
Another new function: os.path.lexists(path) returns true if the file specified by path exists, whether or not it's a symbolic link. This differs from the existing os.path.exists(path) function, which returns false if path is a symlink that points to a destination that doesn't exist. (Contributed by Beni Cherniavsky.)
A new getsid() function was added to the posix module that underlies the os module. (Contributed by J. Raynor.)
The poplib module now supports POP over SSL. (Contributed by Hector Urtubia.)
The profile module can now profile C extension functions. (Contributed by Nick Bastin.)
The random module has a new method called getrandbits(N) that returns a long integer N bits in length. The existing randrange() method now uses getrandbits() where appropriate, making generation of arbitrarily large random numbers more efficient. (Contributed by Raymond Hettinger.)
The regular expression language accepted by the re module was extended with simple conditional expressions, written as (?(group)A|B). group is either a numeric group ID or a group name defined with (?P<group>...) earlier in the expression. If the specified group matched, the regular expression pattern A will be tested against the string; if the group didn't match, the pattern B will be used instead. (Contributed by Gustavo Niemeyer.)
The re module is also no longer recursive, thanks to a massive amount of work by Gustavo Niemeyer. In a recursive regular expression engine, certain patterns result in a large amount of C stack space being consumed, and it was possible to overflow the stack. For example, if you matched a 30000-byte string of a characters against the expression (a|b)+, one stack frame was consumed per character. Python 2.3 tried to check for stack overflow and raise a RuntimeError exception, but certain patterns could sidestep the checking and if you were unlucky Python could segfault. Python 2.4's regular expression engine can match this pattern without problems.
The signal module now performs tighter error-checking on the parameters to the signal.signal() function. For example, you can't set a handler on the SIGKILL signal; previous versions of Python would quietly accept this, but 2.4 will raise a RuntimeError exception.
Two new functions were added to the socket module. socketpair() returns a pair of connected sockets and getservbyport(port) looks up the service name for a given port number. (Contributed by Dave Cole and Barry Warsaw.)
The sys.exitfunc() function has been deprecated. Code should be using the existing atexit module, which correctly handles calling multiple exit functions. Eventually sys.exitfunc() will become a purely internal interface, accessed only by atexit.
The tarfile module now generates GNU-format tar files by default. (Contributed by Lars Gustäbel.)
The threading module now has an elegantly simple way to support thread-local data. The module contains a local class whose attribute values are local to different threads.
import threading
data = threading.local()
data.number = 42
data.url = ('www.python.org', 80)
Other threads can assign and retrieve their own values for the number and url attributes. You can subclass local to initialize attributes or to add methods. (Contributed by Jim Fulton.)
The timeit module now automatically disables periodic garbage collection during the timing loop. This change makes consecutive timings more comparable. (Contributed by Raymond Hettinger.)
The weakref module now supports a wider variety of objects including Python functions, class instances, sets, frozensets, deques, arrays, files, sockets, and regular expression pattern objects. (Contributed by Raymond Hettinger.)
The xmlrpclib module now supports a multi-call extension for transmitting multiple XML-RPC calls in a single HTTP operation. (Contributed by Brian Quinlan.)
The mpz, rotor, and xreadlines modules have been removed.
cookielib¶
The cookielib library supports client-side handling for HTTP cookies, mirroring the Cookie module's server-side cookie support. Cookies are stored in cookie jars; the library transparently stores cookies offered by the web server in the cookie jar, and fetches the cookie from the jar when connecting to the server. As in web browsers, policy objects control whether cookies are accepted or not.
In order to store cookies across sessions, two implementations of cookie jars are provided: one that stores cookies in the Netscape format so applications can use the Mozilla or Lynx cookie files, and one that stores cookies in the same format as the Perl libwww library.
urllib2 has been changed to interact with cookielib: HTTPCookieProcessor manages a cookie jar that is used when accessing URLs.
This module was contributed by John J. Lee.
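A rough sketch of how a cookie jar is attached to a URL opener is shown below. On Python 3 the cookielib and urllib2 modules are available as http.cookiejar and urllib.request, so the import is written to work under either naming. The file names are placeholders, and no network request is made.

```python
try:                                   # Python 3 module names
    from http.cookiejar import CookieJar, MozillaCookieJar, LWPCookieJar
    from urllib.request import build_opener, HTTPCookieProcessor
except ImportError:                    # Python 2.4-era names used in this article
    from cookielib import CookieJar, MozillaCookieJar, LWPCookieJar
    from urllib2 import build_opener, HTTPCookieProcessor

# An in-memory jar wired into an opener; calling opener.open(url) would
# store cookies offered by the server and send them back on later requests.
jar = CookieJar()
opener = build_opener(HTTPCookieProcessor(jar))

# File-backed jars for keeping cookies across sessions. The file names are
# hypothetical; nothing is read or written until load() or save() is called.
netscape_jar = MozillaCookieJar("cookies.txt")    # Netscape/Mozilla format
lwp_jar = LWPCookieJar("cookies.lwp")             # libwww-perl format
```

The opener behaves like the default one apart from cookie handling, so it can be used wherever urllib2.urlopen() would otherwise be called.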
doctest¶
The doctest module underwent considerable refactoring thanks to Edward Loper and Tim Peters. Testing can still be as simple as running doctest.testmod(), but the refactorings allow customizing the module's operation in various ways.
The new DocTestFinder class extracts the tests from a given object's docstrings:
def f(x, y):
    """>>> f(2,2)
    4
    >>> f(3,2)
    6
    """
    return x*y
finder = doctest.DocTestFinder()
# Get list of DocTest instances
tests = finder.find(f)
The new DocTestRunner class then runs individual tests and can produce a summary of the results:
runner = doctest.DocTestRunner()
for t in tests:
    failed, tried = runner.run(t)
runner.summarize(verbose=1)
The above example produces the following output:
1 items passed all tests:
   2 tests in f
2 tests in 1 items.
2 passed and 0 failed.
Test passed.
DocTestRunner uses an instance of the OutputChecker class to compare the expected output with the actual output. This class takes a number of different flags that customize its behaviour; ambitious users can also write a completely new subclass of OutputChecker.
The default output checker provides a number of handy features. For example, with the doctest.ELLIPSIS option flag, an ellipsis (...) in the expected output matches any substring, making it easier to accommodate outputs that vary in minor ways:
def o(n):
    """>>> o(1)
    <__main__.C instance at 0x...>
    >>>
    """
Another special string, <BLANKLINE>, matches a blank line:
def p(n):
    """>>> p(1)
    <BLANKLINE>
    >>>
    """
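The two fragments above show only docstrings. A self-contained sketch that actually runs them through DocTestFinder and DocTestRunner might look like the following. The class C and the function bodies are assumptions added so the tests have something to execute, and the expected repr is loosened to match the form printed by current interpreters.

```python
import doctest


class C(object):
    """Hypothetical class whose default repr contains a memory address."""


def o(n):
    """
    >>> o(1)
    <...C object at 0x...>
    """
    return C()


def p(n):
    """
    >>> p(1)
    <BLANKLINE>
    """
    print("")


finder = doctest.DocTestFinder()
# ELLIPSIS lets "..." in the expected output match any substring.
runner = doctest.DocTestRunner(verbose=False, optionflags=doctest.ELLIPSIS)
for func in (o, p):
    # Supply the names the docstring examples need explicitly.
    for test in finder.find(func, globs={"o": o, "p": p, "C": C}):
        runner.run(test)

failed, attempted = runner.summarize(verbose=False)
```

Without the ELLIPSIS flag the first test would fail, because the address in the repr changes from run to run.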
Another new capability is producing a diff-style display of the output by specifying the doctest.REPORT_UDIFF (unified diffs), doctest.REPORT_CDIFF (context diffs), or doctest.REPORT_NDIFF (delta-style) option flags. For example:
def g(n):
    """>>> g(4)
    here
    is
    a
    lengthy
    >>>"""
    L = 'here is a rather lengthy list of words'.split()
    for word in L[:n]:
        print word
Running the above function's tests with doctest.REPORT_UDIFF specified, you get the following output:
**********************************************************************
File "t.py", line 15, in g
Failed example:
    g(4)
Differences (unified diff with -expected +actual):
    @@ -2,3 +2,3 @@
     is
     a
    -lengthy
    +rather
**********************************************************************
Build and C API Changes¶
Some of the changes to Python's build process and to the C API are:
Three new convenience macros were added for common return values from extension functions: Py_RETURN_NONE, Py_RETURN_TRUE, and Py_RETURN_FALSE. (Contributed by Brett Cannon.)
Another new macro, Py_CLEAR, decreases the reference count of obj and sets obj to the null pointer. (Contributed by Jim Fulton.)
A new function, PyTuple_Pack(N, obj1, obj2, ..., objN), constructs tuples from a variable length argument list of Python objects. (Contributed by Raymond Hettinger.)
A new function, PyDict_Contains(d, k), implements fast dictionary lookups without masking exceptions raised during the look-up process. (Contributed by Raymond Hettinger.)
The Py_IS_NAN(X) macro returns 1 if its float or double argument X is a NaN. (Contributed by Tim Peters.)
C code can avoid unnecessary locking by using the new PyEval_ThreadsInitialized() function to tell if any thread operations have been performed. If this function returns false, no lock operations are needed. (Contributed by Nick Coghlan.)
A new function, PyArg_VaParseTupleAndKeywords(), is the same as PyArg_ParseTupleAndKeywords() but takes a va_list instead of a number of arguments. (Contributed by Greg Chapman.)
A new method flag, METH_COEXIST, allows a function defined in slots to co-exist with a PyCFunction having the same name. This can halve the access time for a method such as set.__contains__(). (Contributed by Raymond Hettinger.)
Python can now be built with additional profiling for the interpreter itself, intended as an aid to people developing the Python core. Providing --enable-profiling to the configure script will let you profile the interpreter with gprof, and providing the --with-tsc switch enables profiling using the Pentium's Time-Stamp-Counter register. Note that the --with-tsc switch is slightly misnamed, because the profiling feature also works on the PowerPC platform, though that processor architecture doesn't call that register "the TSC register". (Contributed by Jeremy Hylton.)
The tracebackobject type has been renamed to PyTracebackObject.
Port-Specific Changes¶
The Windows port now builds under MSVC++ 7.1 as well as version 6. (Contributed by Martin von Löwis.)
Porting to Python 2.4¶
This section lists previously described changes that may require changes to your code:
Left shifts and hexadecimal/octal constants that are too large no longer trigger a FutureWarning and return a value limited to 32 or 64 bits; instead they return a long integer.
Integer operations will no longer trigger an OverflowWarning. The OverflowWarning warning will disappear in Python 2.5.
The zip() built-in function and itertools.izip() now return an empty list instead of raising a TypeError exception if called with no arguments.
You can no longer compare the date and datetime instances provided by the datetime module. Two instances of different classes will now always be unequal, and relative comparisons (<, >) will raise a TypeError.
dircache.listdir() now passes exceptions to the caller instead of returning empty lists.
LexicalHandler.startDTD() used to receive the public and system IDs in the wrong order. This has been corrected; applications relying on the wrong order need to be fixed.
fcntl.ioctl() now warns if the mutate argument is omitted and relevant.
The tarfile module now generates GNU-format tar files by default.
Encountering a failure while importing a module no longer leaves a partially initialized module object in sys.modules.
None is now a constant; code that binds a new value to the name None is now a syntax error.
The signal.signal() function now raises a RuntimeError exception for certain illegal values; previously these errors would pass silently. For example, you can no longer set a handler on the SIGKILL signal.
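Two of the behaviour changes in the list above are easy to check interactively. The sketch below is written for a current Python 3 interpreter, where zip() returns an iterator rather than a list, so the result is wrapped in list(); the variable names are illustrative.

```python
import datetime

# zip() with no arguments produces nothing instead of raising TypeError.
empty_zip = list(zip())

d = datetime.date(2005, 3, 30)
dt = datetime.datetime(2005, 3, 30, 0, 0)

# A date and a datetime never compare equal, even for the same calendar day.
mixed_equal = (d == dt)

# Ordering comparisons between the two classes raise TypeError.
try:
    d < dt
except TypeError:
    ordering_raises = True
else:
    ordering_raises = False
```

Code that needs to order mixed values can convert explicitly first, for example by comparing d with dt.date().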
Acknowledgements¶
The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: Koray Can, Hye-Shik Chang, Michael Dyck, Raymond Hettinger, Brian Hurt, Hamish Lawson, Fredrik Lundh, Sean Reifschneider, Sadruddin Rejeb.