JeffPaine/beautiful_idiomatic_python
Notes from Raymond Hettinger's talk at PyCon US 2013 (video, slides).
The code examples and direct quotes are all from Raymond's talk. I've reproduced them here for my own edification and in the hope that others will find them as handy as I have!
    for i in [0, 1, 2, 3, 4, 5]:
        print i**2

    for i in range(6):
        print i**2

    for i in xrange(6):
        print i**2
`xrange` creates an iterator over the range, producing the values one at a time. This approach is much more memory efficient than `range`. `xrange` was renamed to `range` in Python 3.
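As a rough Python 3 illustration of the memory point above (a sketch, not part of the talk), the names `small` and `large` are made up for this example. It shows that a `range` object stays small no matter how many values it represents.

```python
import sys

# In Python 3, range is a lazy sequence: values are produced on demand.
small = range(6)
large = range(10**9)

squares = [i ** 2 for i in small]
print(squares)

# Neither object stores its values, so the huge range is still a tiny object.
print(sys.getsizeof(small), sys.getsizeof(large))
```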
    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)):
        print colors[i]

    for color in colors:
        print color

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)-1, -1, -1):
        print colors[i]

    for color in reversed(colors):
        print color

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)):
        print i, '--->', colors[i]

    for i, color in enumerate(colors):
        print i, '--->', color
It's fast and beautiful and saves you from tracking the individual indices and incrementing them.
Whenever you find yourself manipulating indices [in a collection], you're probably doing it wrong.
    names = ['raymond', 'rachel', 'matthew']
    colors = ['red', 'green', 'blue', 'yellow']

    n = min(len(names), len(colors))
    for i in range(n):
        print names[i], '--->', colors[i]

    for name, color in zip(names, colors):
        print name, '--->', color

    from itertools import izip

    for name, color in izip(names, colors):
        print name, '--->', color
`zip` creates a new list in memory and takes more memory. `izip` is more efficient than `zip`. Note: in Python 3, `izip` was renamed to `zip` and promoted to a builtin, replacing the old `zip`.
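A small Python 3 sketch of the behaviour described above, using the same lists as the example. The variable names `pairs`, `first` and `rest` are only for illustration.

```python
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

pairs = zip(names, colors)   # a lazy iterator in Python 3; no list is built here
first = next(pairs)          # pairs are produced one at a time
rest = list(pairs)           # iteration stops at the shorter input, so 'yellow' is dropped

print(first)
print(rest)
```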
    colors = ['red', 'green', 'blue', 'yellow']

    # Forward sorted order
    for color in sorted(colors):
        print color

    # Backwards sorted order
    for color in sorted(colors, reverse=True):
        print color

    colors = ['red', 'green', 'blue', 'yellow']

    def compare_length(c1, c2):
        if len(c1) < len(c2): return -1
        if len(c1) > len(c2): return 1
        return 0

    print sorted(colors, cmp=compare_length)

    print sorted(colors, key=len)
The original is slow and unpleasant to write. Also, comparison functions are no longer available in Python 3: `sorted` no longer accepts a `cmp` argument.
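If an old comparison function has to be kept under Python 3, one option (not mentioned in the talk notes) is `functools.cmp_to_key`, which wraps it as a key function. The sketch below reuses `compare_length` from the example and checks that both approaches sort the same way.

```python
from functools import cmp_to_key

colors = ['red', 'green', 'blue', 'yellow']

def compare_length(c1, c2):
    # Old-style comparison: negative, zero or positive result
    if len(c1) < len(c2):
        return -1
    if len(c1) > len(c2):
        return 1
    return 0

by_key = sorted(colors, key=len)                         # preferred form
by_cmp = sorted(colors, key=cmp_to_key(compare_length))  # adapter for legacy code

print(by_key)
print(by_cmp)
```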
    blocks = []
    while True:
        block = f.read(32)
        if block == '':
            break
        blocks.append(block)

    from functools import partial

    blocks = []
    for block in iter(partial(f.read, 32), ''):
        blocks.append(block)
`iter` takes two arguments. The first you call over and over again and the second is a sentinel value.
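A self-contained Python 3 sketch of the two-argument form of `iter`. The in-memory `io.BytesIO` object is an assumed stand-in for a real file opened in binary mode, which is why the sentinel here is `b''` rather than `''`.

```python
import io
from functools import partial

# Stand-in for a binary file holding 70 bytes of data
f = io.BytesIO(b'x' * 70)

blocks = []
# partial(f.read, 32) is called repeatedly until it returns the sentinel b''
for block in iter(partial(f.read, 32), b''):
    blocks.append(block)

print([len(block) for block in blocks])
```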
    def find(seq, target):
        found = False
        for i, value in enumerate(seq):
            if value == target:
                found = True
                break
        if not found:
            return -1
        return i

    def find(seq, target):
        for i, value in enumerate(seq):
            if value == target:
                break
        else:
            return -1
        return i
Inside of every `for` loop is an `else`.
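To make the rule concrete: the `else` branch runs only when the loop finishes without hitting `break`, which includes the case of an empty sequence. The helper `first_negative` below is a made-up example, not something from the talk.

```python
def first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Reached only if the loop ended without a break
        result = None
    return result

print(first_negative([3, -1, -7]))
print(first_negative([3, 1, 7]))
print(first_negative([]))
```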
    d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

    for k in d:
        print k

    for k in d.keys():
        if k.startswith('r'):
            del d[k]
When should you use the second and not the first? When you're mutating the dictionary.
If you mutate something while you're iterating over it, you're living in a state of sin and deserve whatever happens to you.
`d.keys()` makes a copy of all the keys and stores them in a list. Then you can modify the dictionary. Note: in Python 3, to iterate over a dictionary while mutating it you have to explicitly write `list(d.keys())`, because `d.keys()` returns a "dictionary view" (an iterable that provides a dynamic view of the dictionary's keys). See the documentation.
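A Python 3 sketch of both cases described above. The `unsafe` dictionary and the `error` variable are illustrative names: deleting while iterating directly raises `RuntimeError`, while iterating over a copied key list works.

```python
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}
unsafe = dict(d)

# Mutating while iterating over the live dictionary fails in Python 3.
error = None
try:
    for k in unsafe:
        if k.startswith('r'):
            del unsafe[k]
except RuntimeError as exc:
    error = str(exc)

# Iterating over a snapshot of the keys makes deletion safe.
for k in list(d.keys()):
    if k.startswith('r'):
        del d[k]

print(error)
print(d)
```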
    # Not very fast, has to re-hash every key and do a lookup
    for k in d:
        print k, '--->', d[k]

    # Makes a big huge list
    for k, v in d.items():
        print k, '--->', v

    for k, v in d.iteritems():
        print k, '--->', v
`iteritems()` is better as it returns an iterator. Note: in Python 3 there is no `iteritems()`, and the behaviour of `items()` is close to what `iteritems()` had. See the documentation.
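A brief Python 3 sketch of what "close to `iteritems()`" means in practice: `items()` returns a view rather than a copied list, so it reflects later changes to the dictionary. The `'betty'` entry is added only for this illustration.

```python
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

items = d.items()        # a view object; no list of pairs is built
d['betty'] = 'yellow'    # change the dictionary after taking the view
view_sees_change = ('betty', 'yellow') in items

for k, v in d.items():
    print(k, '--->', v)
```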
    names = ['raymond', 'rachel', 'matthew']
    colors = ['red', 'green', 'blue']

    d = dict(izip(names, colors))
    # {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

For Python 3: `d = dict(zip(names, colors))`
    colors = ['red', 'green', 'red', 'blue', 'green', 'red']

    # Simple, basic way to count. A good start for beginners.
    d = {}
    for color in colors:
        if color not in d:
            d[color] = 0
        d[color] += 1

    # {'blue': 1, 'green': 2, 'red': 3}

    d = {}
    for color in colors:
        d[color] = d.get(color, 0) + 1

    import collections

    # Slightly more modern but has several caveats, better for advanced users
    # who understand the intricacies
    d = collections.defaultdict(int)
    for color in colors:
        d[color] += 1
    names = ['raymond', 'rachel', 'matthew', 'roger',
             'betty', 'melissa', 'judith', 'charlie']

    # In this example, we're grouping by name length
    d = {}
    for name in names:
        key = len(name)
        if key not in d:
            d[key] = []
        d[key].append(name)

    # {5: ['roger', 'betty'], 6: ['rachel', 'judith'], 7: ['raymond', 'matthew', 'melissa', 'charlie']}

    d = {}
    for name in names:
        key = len(name)
        d.setdefault(key, []).append(name)

    d = collections.defaultdict(list)
    for name in names:
        key = len(name)
        d[key].append(name)
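A runnable Python 3 sketch of the `defaultdict` grouping, which also shows one of the caveats hinted at earlier: merely reading a missing key creates it. The key `99` and the names `groups` and `plain` are arbitrary choices for this illustration.

```python
from collections import defaultdict

names = ['raymond', 'rachel', 'matthew', 'roger',
         'betty', 'melissa', 'judith', 'charlie']

groups = defaultdict(list)
for name in names:
    groups[len(name)].append(name)

# Caveat: looking up a missing key silently inserts an empty list.
had_key_before = 99 in groups
groups[99]
has_key_after = 99 in groups

# Converting to a plain dict restores the usual KeyError on missing keys.
plain = dict(groups)
del plain[99]
print(plain)
```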
    d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

    while d:
        key, value = d.popitem()
        print key, '-->', value
`popitem` is atomic, so you don't have to put locks around it to use it in threads.
    import argparse
    import os

    defaults = {'color': 'red', 'user': 'guest'}
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--user')
    parser.add_argument('-c', '--color')
    namespace = parser.parse_args([])
    command_line_args = {k: v for k, v in vars(namespace).items() if v}

    # The common approach below allows you to use defaults at first, then override them
    # with environment variables and then finally override them with command line arguments.
    # It copies data like crazy, unfortunately.
    d = defaults.copy()
    d.update(os.environ)
    d.update(command_line_args)

    from collections import ChainMap

    d = ChainMap(command_line_args, os.environ, defaults)
`ChainMap` was introduced in Python 3.3. Fast and beautiful.
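A minimal Python 3 sketch of the lookup order. The dictionaries `environ` and `command_line_args` are hypothetical stand-ins for `os.environ` and parsed arguments, so the example runs the same everywhere. Lookups go left to right, and nothing is copied, so later changes to an underlying mapping are visible.

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
environ = {'user': 'alice'}              # hypothetical stand-in for os.environ
command_line_args = {'color': 'blue'}    # hypothetical parsed arguments

settings = ChainMap(command_line_args, environ, defaults)

print(settings['color'])   # found in command_line_args first
print(settings['user'])    # not on the command line, so taken from environ

# The underlying dicts are referenced, not copied.
defaults['theme'] = 'dark'
print(settings['theme'])
```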
- Positional arguments and indices are nice
- Keywords and names are better
- The first way is convenient for the computer
- The second corresponds to how humans think
    twitter_search('@obama', False, 20, True)

    twitter_search('@obama', retweets=False, numtweets=20, popular=True)
The keyword version is slightly (microseconds) slower, but it is worth it for the code clarity and developer time savings.
    # Old testmod return value
    doctest.testmod()
    # (0, 4)
    # Is this good or bad? You don't know because it's not clear.

    # New testmod return value, a named tuple
    doctest.testmod()
    # TestResults(failed=0, attempted=4)
A named tuple is a subclass of tuple, so it still works like a regular tuple but is more friendly.
To make a named tuple, call the `namedtuple` factory function in the `collections` module:
    from collections import namedtuple

    TestResults = namedtuple('TestResults', ['failed', 'attempted'])
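A short usage sketch of the resulting class, in Python 3. It shows that access by name, access by index and plain tuple unpacking all work on the same object.

```python
from collections import namedtuple

TestResults = namedtuple('TestResults', ['failed', 'attempted'])

result = TestResults(failed=0, attempted=4)

print(result)             # self-describing repr
print(result.attempted)   # access by name
print(result[1])          # still indexable like a plain tuple

failed, attempted = result   # unpacking works as usual
```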
    p = 'Raymond', 'Hettinger', 0x30, 'python@example.com'

    # A common approach / habit from other languages
    fname = p[0]
    lname = p[1]
    age = p[2]
    email = p[3]

    fname, lname, age, email = p
The second approach uses tuple unpacking and is faster and more readable.
    def fibonacci(n):
        x = 0
        y = 1
        for i in range(n):
            print x
            t = y
            y = x + y
            x = t

    def fibonacci(n):
        x, y = 0, 1
        for i in range(n):
            print x
            x, y = y, x + y
Problems with the first approach:
- x and y are state, and state should be updated all at once; between the intermediate lines the state is mismatched, which is a common source of bugs
- ordering matters
- it's too low level
The second approach is more high-level, doesn't risk getting the order wrong and is fast.
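A Python 3 variant of the second approach, offered as a sketch. It collects the values in a list instead of printing them (a change made only so the result is easy to check). The simultaneous assignment evaluates the whole right-hand side before rebinding either name.

```python
def fibonacci(n):
    values = []
    x, y = 0, 1
    for _ in range(n):
        values.append(x)
        # Both new values are computed from the old x and y, then assigned together.
        x, y = y, x + y
    return values

print(fibonacci(8))
```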
    tmp_x = x + dx * t
    tmp_y = y + dy * t
    # NOTE: The "influence" function here is just an example function, what it does
    # is not important. The important part is how to manage updating multiple
    # variables at once.
    tmp_dx = influence(m, x, y, dx, dy, partial='x')
    tmp_dy = influence(m, x, y, dx, dy, partial='y')
    x = tmp_x
    y = tmp_y
    dx = tmp_dx
    dy = tmp_dy

    # NOTE: The "influence" function here is just an example function, what it does
    # is not important. The important part is how to manage updating multiple
    # variables at once.
    x, y, dx, dy = (x + dx * t,
                    y + dy * t,
                    influence(m, x, y, dx, dy, partial='x'),
                    influence(m, x, y, dx, dy, partial='y'))
- A fundamental rule of optimization
- Don’t cause data to move around unnecessarily
- It takes only a little care to get linear behavior instead of O(n**2) behavior
Basically, just don't move data around unnecessarily.
    names = ['raymond', 'rachel', 'matthew', 'roger',
             'betty', 'melissa', 'judith', 'charlie']

    s = names[0]
    for name in names[1:]:
        s += ', ' + name
    print s

    print ', '.join(names)
    names = ['raymond', 'rachel', 'matthew', 'roger',
             'betty', 'melissa', 'judith', 'charlie']

    del names[0]
    # The below are signs you're using the wrong data structure
    names.pop(0)
    names.insert(0, 'mark')

    names = collections.deque(['raymond', 'rachel', 'matthew', 'roger',
                               'betty', 'melissa', 'judith', 'charlie'])

    # More efficient with collections.deque
    del names[0]
    names.popleft()
    names.appendleft('mark')
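A compact Python 3 sketch of the `deque` operations on a shortened list. A `deque` supports appends and pops at both ends without shifting the remaining elements, which is what makes it the better fit for this access pattern.

```python
from collections import deque

names = deque(['raymond', 'rachel', 'matthew'])

removed = names.popleft()    # take from the left end without shifting the rest
names.appendleft('mark')     # add to the left end
names.append('judith')       # the right end works as with a list

print(removed)
print(list(names))
```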
- Helps separate business logic from administrative logic
- Clean, beautiful tools for factoring code and improving code reuse
- Good naming is essential.
- Remember the Spiderman rule: With great power, comes great responsibility!
    # Mixes business / administrative logic and is not reusable
    def web_lookup(url, saved={}):
        if url in saved:
            return saved[url]
        page = urllib.urlopen(url).read()
        saved[url] = page
        return page

    @cache
    def web_lookup(url):
        return urllib.urlopen(url).read()
Note: since Python 3.2 there is a decorator for this in the standard library: `functools.lru_cache`.
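A Python 3 sketch of `functools.lru_cache` in use. To avoid network access, the hypothetical `slow_lookup` function stands in for `web_lookup`, and the `calls` list exists only to show how often the real function body runs.

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=None)
def slow_lookup(key):
    calls.append(key)      # records each real (uncached) call
    return key.upper()     # placeholder for expensive work

slow_lookup('a')
slow_lookup('a')   # served from the cache; the body does not run again
slow_lookup('b')

print(calls)
print(slow_lookup.cache_info())
```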
    from decimal import Decimal, Context, getcontext, setcontext, localcontext

    # Saving the old, restoring the new
    old_context = getcontext().copy()
    getcontext().prec = 50
    print Decimal(355) / Decimal(113)
    setcontext(old_context)

    with localcontext(Context(prec=50)):
        print Decimal(355) / Decimal(113)
    f = open('data.txt')
    try:
        data = f.read()
    finally:
        f.close()

    with open('data.txt') as f:
        data = f.read()

    # Make a lock
    lock = threading.Lock()

    # Old-way to use a lock
    lock.acquire()
    try:
        print 'Critical section 1'
        print 'Critical section 2'
    finally:
        lock.release()

    # New-way to use a lock
    with lock:
        print 'Critical section 1'
        print 'Critical section 2'

    try:
        os.remove('somefile.tmp')
    except OSError:
        pass

    with ignored(OSError):
        os.remove('somefile.tmp')
`ignored` is new in Python 3.4 (documentation). Note: `ignored` is actually called `suppress` in the standard library.
To make your own `ignored` context manager in the meantime:
    from contextlib import contextmanager

    @contextmanager
    def ignored(*exceptions):
        try:
            yield
        except exceptions:
            pass
Stick that in your utils directory and you too can ignore exceptions.
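On Python 3.4 and later, the standard-library equivalent is `contextlib.suppress`. The sketch below uses a temporary directory (an assumption made so the example never touches real files) and tries to delete a file that does not exist.

```python
import os
import tempfile
from contextlib import suppress

with tempfile.TemporaryDirectory() as tmpdir:
    missing = os.path.join(tmpdir, 'somefile.tmp')

    # os.remove raises FileNotFoundError (a subclass of OSError), which is suppressed.
    with suppress(OSError):
        os.remove(missing)

    still_running = True

print(still_running)
```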
    # Temporarily redirect standard out to a file and then return it to normal
    with open('help.txt', 'w') as f:
        oldstdout = sys.stdout
        sys.stdout = f
        try:
            help(pow)
        finally:
            sys.stdout = oldstdout

    with open('help.txt', 'w') as f:
        with redirect_stdout(f):
            help(pow)
`redirect_stdout` was proposed for Python 3.4 (bug report) and is available there as `contextlib.redirect_stdout`.
To roll your own `redirect_stdout` context manager:
    import sys
    from contextlib import contextmanager

    @contextmanager
    def redirect_stdout(fileobj):
        oldstdout = sys.stdout
        sys.stdout = fileobj
        try:
            yield fileobj
        finally:
            sys.stdout = oldstdout
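For comparison, a sketch using the standard-library `contextlib.redirect_stdout` from Python 3.4 onward. An in-memory `io.StringIO` buffer is assumed here in place of `help.txt` so the captured text can be inspected directly.

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()

with redirect_stdout(buffer):
    print('captured')    # goes into the buffer, not the terminal

print('visible')         # stdout is restored after the with block

captured = buffer.getvalue()
```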
Two conflicting rules:
- Don’t put too much on one line
- Don’t break atoms of thought into subatomic particles
Raymond’s rule:
- One logical line of code equals one sentence in English
    result = []
    for i in range(10):
        s = i ** 2
        result.append(s)
    print sum(result)

    print sum(i**2 for i in xrange(10))
The first way tells you what to do; the second way tells you what you want.
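The same one-liner in Python 3, where `range` replaces `xrange` and `print` is a function. The generator expression feeds values to `sum` one at a time, so no intermediate list is built.

```python
# Sum of squares of 0..9, computed lazily
total = sum(i ** 2 for i in range(10))
print(total)
```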