    Transforming Code into Beautiful, Idiomatic Python

    Notes from Raymond Hettinger's talk at PyCon US 2013 (video, slides).

    The code examples and direct quotes are all from Raymond's talk. I've reproduced them here for my own edification and the hopes that others will find them as handy as I have!

    Looping over a range of numbers

    for i in [0, 1, 2, 3, 4, 5]:
        print i**2

    for i in range(6):
        print i**2

    Better

    for i in xrange(6):
        print i**2

    xrange creates an iterator over the range, producing the values one at a time. This approach is much more memory efficient than range. xrange was renamed to range in Python 3.
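
    The laziness described above can be checked directly. This is a minimal sketch, assuming Python 3, where range behaves the way xrange did; the variable names are illustrative only.

```python
# Python 3: range is a lazy sequence that stores start/stop/step, not the values.
small = range(6)
huge = range(10**9)          # cheap to create: nothing is materialized

squares = [i ** 2 for i in small]

# Length and indexing are computed arithmetically, without building a list.
count = len(huge)
last = huge[-1]
```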

    Looping over a collection

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)):
        print colors[i]

    Better

    for color in colors:
        print color

    Looping backwards

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors) - 1, -1, -1):
        print colors[i]

    Better

    for color in reversed(colors):
        print color

    Looping over a collection and indices

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)):
        print i, '--->', colors[i]

    Better

    for i, color in enumerate(colors):
        print i, '--->', color

    It's fast and beautiful and saves you from tracking the individual indices and incrementing them.

    Whenever you find yourself manipulating indices [in a collection], you're probably doing it wrong.
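
    One detail the notes do not mention: enumerate also accepts a start value, which avoids manual index arithmetic for 1-based numbering. A small sketch, assuming Python 3 (the numbered list is illustrative only):

```python
colors = ['red', 'green', 'blue', 'yellow']

# enumerate(..., start=1) yields (1, 'red'), (2, 'green'), ...
numbered = ['%d ---> %s' % (i, color) for i, color in enumerate(colors, start=1)]
```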

    Looping over two collections

    names = ['raymond', 'rachel', 'matthew']
    colors = ['red', 'green', 'blue', 'yellow']

    n = min(len(names), len(colors))
    for i in range(n):
        print names[i], '--->', colors[i]

    for name, color in zip(names, colors):
        print name, '--->', color

    Better

    from itertools import izip

    for name, color in izip(names, colors):
        print name, '--->', color

    zip creates a new list in memory and takes more memory. izip is more efficient than zip. Note: in Python 3 izip was renamed to zip and promoted to a builtin, replacing the old zip.
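
    To see the lazy behaviour in Python 3, where the builtin zip plays the role of izip, a sketch along these lines may help. It also shows that pairing stops at the shorter input.

```python
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

pairs = zip(names, colors)   # Python 3: a lazy iterator, no list is built here
first = next(pairs)          # pairs are produced on demand
rest = list(pairs)           # 'yellow' has no partner, so it is dropped
```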

    Looping in sorted order

    colors = ['red', 'green', 'blue', 'yellow']

    # Forward sorted order
    for color in sorted(colors):
        print color

    # Backwards sorted order
    for color in sorted(colors, reverse=True):
        print color

    Custom Sort Order

    colors = ['red', 'green', 'blue', 'yellow']

    def compare_length(c1, c2):
        if len(c1) < len(c2): return -1
        if len(c1) > len(c2): return 1
        return 0

    print sorted(colors, cmp=compare_length)

    Better

    print sorted(colors, key=len)

    The original is slow and unpleasant to write. Also, comparison functions are no longer supported by sorted in Python 3: the cmp argument was removed.
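
    If an old comparison function has to be kept, the standard library's functools.cmp_to_key can adapt it to the key protocol. A sketch assuming Python 3, reusing the compare_length function from above:

```python
from functools import cmp_to_key

colors = ['red', 'green', 'blue', 'yellow']

def compare_length(c1, c2):
    # Old-style comparison: negative, zero, or positive.
    if len(c1) < len(c2): return -1
    if len(c1) > len(c2): return 1
    return 0

by_key = sorted(colors, key=len)                          # preferred
by_cmp = sorted(colors, key=cmp_to_key(compare_length))   # adapter for legacy code
```

    Both calls give the same ordering; the key=len version is shorter and calls len once per element.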

    Call a function until a sentinel value

    blocks = []
    while True:
        block = f.read(32)
        if block == '':
            break
        blocks.append(block)

    Better

    from functools import partial

    blocks = []
    for block in iter(partial(f.read, 32), ''):
        blocks.append(block)

    iter takes two arguments. The first is a callable that is called over and over again, and the second is a sentinel value that ends the loop when the callable returns it.
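
    A self-contained way to try the two-argument form of iter, assuming Python 3 and using an in-memory io.StringIO as a stand-in for the open file f (the 70-character payload is made up for the example):

```python
from functools import partial
from io import StringIO

f = StringIO('a' * 70)   # stand-in for an open text file (hypothetical data)

blocks = []
# f.read(32) is called repeatedly until it returns the sentinel ''.
for block in iter(partial(f.read, 32), ''):
    blocks.append(block)

sizes = [len(b) for b in blocks]
```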

    Distinguishing multiple exit points in loops

    def find(seq, target):
        found = False
        for i, value in enumerate(seq):
            if value == target:
                found = True
                break
        if not found:
            return -1
        return i

    Better

    def find(seq, target):
        for i, value in enumerate(seq):
            if value == target:
                break
        else:
            return -1
        return i

    Inside of every for loop is an else.
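
    The else branch runs only when the loop finishes without hitting break, which includes the case of an empty sequence. A runnable sketch of the same find function (the sample inputs are illustrative):

```python
def find(seq, target):
    for i, value in enumerate(seq):
        if value == target:
            break
    else:
        # Reached only if the loop was never broken out of.
        return -1
    return i

hit = find(['a', 'b', 'c'], 'b')
miss = find(['a', 'b', 'c'], 'z')
empty = find([], 'a')
```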

    Looping over dictionary keys

    d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

    for k in d:
        print k

    for k in d.keys():
        if k.startswith('r'):
            del d[k]

    When should you use the second and not the first? When you're mutating the dictionary.

    If you mutate something while you're iterating over it, you're living in a state of sin and deserve what ever happens to you.

    d.keys() makes a copy of all the keys and stores them in a list. Then you can modify the dictionary. Note: in Python 3, to mutate a dictionary while iterating over its keys you have to explicitly write list(d.keys()), because d.keys() returns a "dictionary view" (an iterable that provides a dynamic view of the dictionary's keys). See the documentation.
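
    A sketch of both outcomes in Python 3: iterating over a snapshot of the keys works, while deleting from the live dictionary during iteration raises RuntimeError. The second dictionary e is made up for the demonstration.

```python
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

# list(d) snapshots the keys, so deleting inside the loop is safe.
for k in list(d):
    if k.startswith('r'):
        del d[k]

# Deleting while iterating over the live dictionary fails on the next step.
e = {'rachel': 'green', 'raymond': 'red'}
try:
    for k in e:
        del e[k]
    raised = False
except RuntimeError:
    raised = True
```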

    Looping over dictionary keys and values

    # Not very fast, has to re-hash every key and do a lookup
    for k in d:
        print k, '--->', d[k]

    # Makes a big huge list
    for k, v in d.items():
        print k, '--->', v

    Better

    for k, v in d.iteritems():
        print k, '--->', v

    iteritems() is better as it returns an iterator. Note: in Python 3 there is no iteritems(), and the behaviour of items() is close to what iteritems() had. See the documentation.

    Construct a dictionary from pairs

    from itertools import izip

    names = ['raymond', 'rachel', 'matthew']
    colors = ['red', 'green', 'blue']

    d = dict(izip(names, colors))
    # {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

    For Python 3: d = dict(zip(names, colors))

    Counting with dictionaries

    colors = ['red', 'green', 'red', 'blue', 'green', 'red']

    # Simple, basic way to count. A good start for beginners.
    d = {}
    for color in colors:
        if color not in d:
            d[color] = 0
        d[color] += 1

    # {'blue': 1, 'green': 2, 'red': 3}

    Better

    d = {}
    for color in colors:
        d[color] = d.get(color, 0) + 1

    # or, with defaultdict from the collections module
    d = defaultdict(int)
    for color in colors:
        d[color] += 1
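
    The notes do not cover it, but the standard library also offers collections.Counter for exactly this job. A short sketch, assuming Python 3:

```python
from collections import Counter

colors = ['red', 'green', 'red', 'blue', 'green', 'red']

counts = Counter(colors)          # counts each distinct item
top = counts.most_common(1)       # the most frequent item with its count
as_dict = dict(counts)
```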

    Grouping with dictionaries -- Part I and II

    names = ['raymond', 'rachel', 'matthew', 'roger', 'betty', 'melissa', 'judith', 'charlie']

    # In this example, we're grouping by name length
    d = {}
    for name in names:
        key = len(name)
        if key not in d:
            d[key] = []
        d[key].append(name)

    # {5: ['roger', 'betty'], 6: ['rachel', 'judith'], 7: ['raymond', 'matthew', 'melissa', 'charlie']}

    d = {}
    for name in names:
        key = len(name)
        d.setdefault(key, []).append(name)

    Better

    from collections import defaultdict

    d = defaultdict(list)
    for name in names:
        key = len(name)
        d[key].append(name)

    Is a dictionary popitem() atomic?

    d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

    while d:
        key, value = d.popitem()
        print key, '-->', value

    popitem is atomic so you don't have to put locks around it to use it in threads.

    Linking dictionaries

    import argparse
    import os

    defaults = {'color': 'red', 'user': 'guest'}
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--user')
    parser.add_argument('-c', '--color')
    namespace = parser.parse_args([])
    command_line_args = {k: v for k, v in vars(namespace).items() if v}

    # The common approach below allows you to use defaults at first, then override them
    # with environment variables and then finally override them with command line arguments.
    # It copies data like crazy, unfortunately.
    d = defaults.copy()
    d.update(os.environ)
    d.update(command_line_args)

    Better

    d = ChainMap(command_line_args, os.environ, defaults)

    ChainMap was introduced in Python 3.3 (collections.ChainMap). Fast and beautiful.
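
    A sketch of the lookup order, assuming Python 3.3 or later. The environ and command_line_args dictionaries are hand-written stand-ins for os.environ and the parsed arguments, so the example does not depend on the real environment.

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
environ = {'user': 'alice'}              # stand-in for os.environ (hypothetical)
command_line_args = {'color': 'blue'}    # stand-in for parsed arguments (hypothetical)

# Lookups search the mappings left to right and return the first hit.
d = ChainMap(command_line_args, environ, defaults)
color, user = d['color'], d['user']

# No data is copied: a later change to an underlying dict is visible through d.
defaults['theme'] = 'dark'
theme = d['theme']
```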

    Improving Clarity

    • Positional arguments and indices are nice
    • Keywords and names are better
    • The first way is convenient for the computer
    • The second corresponds to how humans think

    Clarify function calls with keyword arguments

    twitter_search('@obama', False, 20, True)

    Better

    twitter_search('@obama', retweets=False, numtweets=20, popular=True)

    This is slightly (microseconds) slower, but it is worth it for the code clarity and developer time savings.

    Clarify multiple return values with named tuples

    # Old testmod return value
    doctest.testmod()
    # (0, 4)
    # Is this good or bad? You don't know because it's not clear.

    Better

    # New testmod return value, a namedtuple
    doctest.testmod()
    # TestResults(failed=0, attempted=4)

    A namedtuple is a subclass of tuple, so it still works like a regular tuple but is more friendly.

    To make a namedtuple:

    from collections import namedtuple

    TestResults = namedtuple('TestResults', ['failed', 'attempted'])
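
    A short usage sketch, assuming Python 3, showing that the result reads well by name while still behaving like a plain tuple:

```python
from collections import namedtuple

TestResults = namedtuple('TestResults', ['failed', 'attempted'])
result = TestResults(failed=0, attempted=4)

by_name = result.failed            # access by field name
by_index = result[1]               # still indexable like a tuple
failed, attempted = result         # still unpacks like a tuple
summary = repr(result)
```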

    Unpacking sequences

    p = 'Raymond', 'Hettinger', 0x30, 'python@example.com'

    # A common approach / habit from other languages
    fname = p[0]
    lname = p[1]
    age = p[2]
    email = p[3]

    Better

    fname, lname, age, email = p

    The second approach uses tuple unpacking and is faster and more readable.

    Updating multiple state variables

    def fibonacci(n):
        x = 0
        y = 1
        for i in range(n):
            print x
            t = y
            y = x + y
            x = t

    Better

    def fibonacci(n):
        x, y = 0, 1
        for i in range(n):
            print x
            x, y = y, x + y

    Problems with first approach

    • x and y are state, and state should be updated all at once; otherwise, between lines the state is mismatched, which is a common source of bugs
    • ordering matters
    • it's too low level

    The second approach is more high-level, doesn't risk getting the order wrong and is fast.
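
    The reason the order cannot go wrong is that the whole right-hand side is evaluated before either name is rebound. A sketch assuming Python 3; unlike the version above it collects the values in a list (an adaptation for easy checking, not the talk's code):

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers as a list."""
    values = []
    x, y = 0, 1
    for _ in range(n):
        values.append(x)
        # (y, x + y) is built from the old x and y, then both names are rebound.
        x, y = y, x + y
    return values

first_eight = fibonacci(8)
```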

    Simultaneous state updates

    tmp_x = x + dx * t
    tmp_y = y + dy * t
    tmp_dx = influence(m, x, y, dx, dy, partial='x')
    tmp_dy = influence(m, x, y, dx, dy, partial='y')
    x = tmp_x
    y = tmp_y
    dx = tmp_dx
    dy = tmp_dy

    Better

    x, y, dx, dy = (x + dx * t,
                    y + dy * t,
                    influence(m, x, y, dx, dy, partial='x'),
                    influence(m, x, y, dx, dy, partial='y'))

    Efficiency

    • A fundamental rule of optimization
    • Don’t cause data to move around unnecessarily
    • It takes only a little care to avoid O(n**2) behavior instead of linear behavior

    Basically, just don't move data around unnecessarily.

    Concatenating strings

    names = ['raymond', 'rachel', 'matthew', 'roger', 'betty', 'melissa', 'judith', 'charlie']

    s = names[0]
    for name in names[1:]:
        s += ', ' + name
    print s

    Better

    print ', '.join(names)

    Updating sequences

    names = ['raymond', 'rachel', 'matthew', 'roger', 'betty', 'melissa', 'judith', 'charlie']

    del names[0]
    # The below are signs you're using the wrong data structure
    names.pop(0)
    names.insert(0, 'mark')

    Better

    from collections import deque

    names = deque(['raymond', 'rachel', 'matthew', 'roger', 'betty', 'melissa', 'judith', 'charlie'])

    # More efficient with deque
    del names[0]
    names.popleft()
    names.appendleft('mark')
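
    A small runnable sketch of the deque operations, assuming Python 3 and a shortened list of names. Operations at the left end of a deque do not shift the remaining elements, which is what makes them cheap compared with list.pop(0) and list.insert(0, ...).

```python
from collections import deque

names = deque(['raymond', 'rachel', 'matthew'])

first = names.popleft()       # removes from the left end
names.appendleft('mark')      # inserts at the left end
remaining = list(names)
```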

    Decorators and Context Managers

    • Helps separate business logic from administrative logic
    • Clean, beautiful tools for factoring code and improving code reuse
    • Good naming is essential.
    • Remember the Spiderman rule: With great power, comes great responsibility!

    Using decorators to factor-out administrative logic

    # Mixes business / administrative logic and is not reusable
    def web_lookup(url, saved={}):
        if url in saved:
            return saved[url]
        page = urllib.urlopen(url).read()
        saved[url] = page
        return page

    Better

    @cache
    def web_lookup(url):
        return urllib.urlopen(url).read()

    Note: since Python 3.2 there is a decorator for this in the standard library: functools.lru_cache.
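
    These notes use the cache decorator without defining it. One possible minimal version is sketched below, assuming Python 3 and hashable positional arguments only. The web_lookup body here is a stand-in that records calls instead of touching the network.

```python
from functools import wraps

def cache(func):
    """Memoize func on its positional arguments (which must be hashable)."""
    saved = {}

    @wraps(func)
    def newfunc(*args):
        if args in saved:
            return saved[args]
        result = func(*args)
        saved[args] = result
        return result
    return newfunc

calls = []   # records every real (uncached) call, for demonstration only

@cache
def web_lookup(url):
    # Stand-in for a real fetch; a real version would download the page.
    calls.append(url)
    return 'page for ' + url

first = web_lookup('http://example.com')
second = web_lookup('http://example.com')   # served from the cache
```

    The caching logic now lives in one reusable place, and web_lookup contains only the business logic.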

    Factor-out temporary contexts

    # Saving the old, restoring the new
    old_context = getcontext().copy()
    getcontext().prec = 50
    print Decimal(355) / Decimal(113)
    setcontext(old_context)

    Better

    with localcontext(Context(prec=50)):
        print Decimal(355) / Decimal(113)
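
    A sketch, assuming Python 3, that checks the precision is raised only inside the with block and that the surrounding context is left untouched:

```python
from decimal import Decimal, Context, getcontext, localcontext

with localcontext(Context(prec=50)):
    precise = Decimal(355) / Decimal(113)    # computed with 50 significant digits

outside_prec = getcontext().prec             # the outer context was not modified
default = Decimal(355) / Decimal(113)        # computed with the outer precision

precise_digits = len(precise.as_tuple().digits)
default_digits = len(default.as_tuple().digits)
```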

    How to open and close files

    f = open('data.txt')
    try:
        data = f.read()
    finally:
        f.close()

    Better

    with open('data.txt') as f:
        data = f.read()

    How to use locks

    # Make a lock
    lock = threading.Lock()

    # Old-way to use a lock
    lock.acquire()
    try:
        print 'Critical section 1'
        print 'Critical section 2'
    finally:
        lock.release()

    Better

    # New-way to use a lock
    with lock:
        print 'Critical section 1'
        print 'Critical section 2'

    Factor-out temporary contexts

    try:
        os.remove('somefile.tmp')
    except OSError:
        pass

    Better

    with ignored(OSError):
        os.remove('somefile.tmp')

    ignored is new in Python 3.4 (documentation). Note: ignored is actually called suppress in the standard library (contextlib.suppress).

    To make your own ignored context manager in the meantime:

    from contextlib import contextmanager

    @contextmanager
    def ignored(*exceptions):
        try:
            yield
        except exceptions:
            pass

    Stick that in your utils directory and you too can ignore exceptions.
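
    On Python 3.4 and later the standard library version can be used directly. A sketch with contextlib.suppress; the file name is a made-up path in the temp directory that is not expected to exist:

```python
import os
import tempfile
from contextlib import suppress

# Hypothetical path, chosen so that the file is very unlikely to exist.
missing = os.path.join(tempfile.gettempdir(), 'no-such-file-for-suppress-demo.tmp')

# os.remove raises FileNotFoundError (a subclass of OSError) for a missing file;
# suppress swallows it and execution continues after the with block.
with suppress(OSError):
    os.remove(missing)

still_running = True
```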

    Factor-out temporary contexts

    # Temporarily redirect standard out to a file and then return it to normal
    with open('help.txt', 'w') as f:
        oldstdout = sys.stdout
        sys.stdout = f
        try:
            help(pow)
        finally:
            sys.stdout = oldstdout

    Better

    with open('help.txt', 'w') as f:
        with redirect_stdout(f):
            help(pow)

    redirect_stdout was proposed for Python 3.4 (bug report) and is available there as contextlib.redirect_stdout.

    To roll your own redirect_stdout context manager:

    import sys
    from contextlib import contextmanager

    @contextmanager
    def redirect_stdout(fileobj):
        oldstdout = sys.stdout
        sys.stdout = fileobj
        try:
            yield fileobj
        finally:
            sys.stdout = oldstdout
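
    With Python 3.4 or later, the standard library's contextlib.redirect_stdout does the same job. A sketch that captures output in an in-memory io.StringIO buffer, used here as a stand-in for the help.txt file:

```python
import io
import sys
from contextlib import redirect_stdout

original = sys.stdout
buffer = io.StringIO()            # in-memory stand-in for a real file

with redirect_stdout(buffer):
    print('captured')             # goes to the buffer, not the terminal

restored = sys.stdout is original # stdout is put back on exit
text = buffer.getvalue()
```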

    Concise Expressive One-Liners

    Two conflicting rules:

    • Don’t put too much on one line
    • Don’t break atoms of thought into subatomic particles

    Raymond’s rule:

    • One logical line of code equals one sentence in English

    List Comprehensions and Generator Expressions

    result = []
    for i in range(10):
        s = i ** 2
        result.append(s)
    print sum(result)

    Better

    print sum(i**2 for i in xrange(10))

    The first way tells you what to do; the second way tells you what you want.
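
    Both forms side by side as a Python 3 sketch. The generator expression feeds sum one value at a time, so no intermediate list is built.

```python
# Loop version: builds a list, then sums it.
result = []
for i in range(10):
    result.append(i ** 2)
loop_total = sum(result)

# Generator expression: states what is wanted in one logical line.
total = sum(i ** 2 for i in range(10))
```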
