Goes beyond PEP8 to discuss what makes Python code feel great. A Strunk & White for Python.


amontalenti/elements-of-python-style

This document goes beyond PEP8 to cover the core of what I think of as great Python style. It is opinionated, but not too opinionated. It goes beyond mere issues of syntax and module layout, and into areas of paradigm, organization, and architecture. I hope it can be a kind of condensed "Strunk & White" for Python code.


Follow Most PEP8 Guidelines

... but, be flexible on naming and line length.

PEP8 covers lots of mundane stuff like whitespace, line breaks between functions/classes/methods, imports, and warning against use of deprecated functionality. Pretty much everything in there is good.

The best tool to enforce these rules, while also helping you catch silly Python syntax errors, is flake8.

PEP8 is meant as a set of guidelines, not rules to be strictly, or religiously, followed. Make sure to read the section of PEP8 that is titled: "A Foolish Consistency is the Hobgoblin of Little Minds." Also see Raymond Hettinger's excellent talk, "Beyond PEP8", for more on this.

The only rules that seem to cause a disproportionate amount of controversy are those around line length and naming. These can be easily tweaked.

Flexibility on Line Length

If the strict 79-character line length rule in flake8 bothers you, feel free to ignore or adjust that rule. It's probably still a good rule-of-thumb -- like a "rule" that says English sentences should have 50 or fewer words, or that paragraphs should have fewer than 10 sentences. See the flake8 config documentation, in particular the max-line-length config option. Note also that a # noqa comment can often be added to a line to have a flake8 check ignored, but please use these sparingly.

90%+ of your lines should be 79 characters or fewer, though, for the simple reason that "Flat is better than nested". If you find a function where all the lines are longer than this, something else is wrong, and you should look at your code rather than at your flake8 settings.

Consistent Naming

On naming, following some simple rules can prevent a whole lot of team-wide grief.

Preferred Naming Rules

Many of these were adapted from the Pocoo team.

  • Class names: CamelCase, and capitalize acronyms: HTTPWriter, not HttpWriter.
  • Variable names: lower_with_underscores.
  • Method and function names: lower_with_underscores.
  • Modules: lower_with_underscores.py. (But, prefer names that don't need underscores!)
  • Constants: UPPER_WITH_UNDERSCORES.
  • Precompiled regular expressions: name_re.
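Here is a minimal sketch that puts several of these naming rules side by side. All of the names in it (MAX_TITLE_LENGTH, whitespace_re, HTTPWriter, write_title) are made up for illustration and do not come from any real library.

```python
import re

# Constant: UPPER_WITH_UNDERSCORES.
MAX_TITLE_LENGTH = 30

# Precompiled regular expression: name_re.
whitespace_re = re.compile(r"\s+")


class HTTPWriter(object):
    """Class name in CamelCase, with the acronym fully capitalized."""

    def write_title(self, raw_title):
        """Method and variable names use lower_with_underscores."""
        clean_title = whitespace_re.sub(" ", raw_title).strip()
        return clean_title[:MAX_TITLE_LENGTH]
```

Reading the block top to bottom, the casing alone tells you which labels are global constants, which are types, and which are local values.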

You should generally follow these rules, unless you are mirroring some other tool's naming convention, like a database schema or message format.

You can also choose to use CamelCase for things that are class-like but not quite classes -- the main benefit of CamelCase is calling attention to something as a "global noun", rather than a local label or a verb. Notice that Python names True, False, and None use CamelCase even though they are not classes.

Avoid Name Adornments

... like _prefix or suffix_. Functions and methods can have a _prefix notation to indicate "private", but this should be used sparingly and only for APIs that are expected to be widely used, and where the _private indicator assists with information hiding.

PEP8 suggests using a trailing underscore to avoid aliasing a built-in, e.g.

    sum_ = sum(some_long_list)
    print(sum_)

This is OK in a pinch, but it might be better to just choose a different name.

You should rarely use __mangled double-underscore prefixes for class/instance/method labels, which have special name mangling behavior -- it's rarely necessary. Never create your own names using __dunder__ adornments unless you are implementing a Python standard protocol, like __len__; this is a namespace specifically reserved for Python's internal protocols and shouldn't be co-opted for your own stuff.

Avoid One-Character Names

There are some one-character label names that are common and acceptable.

With lambda, using x for single-argument functions is OK. For example:

    encode = lambda x: x.encode("utf-8", "ignore")

With tuple unpacking, using _ as a throwaway label is also OK. For example:

    _, url, urlref = data

This basically means, "ignore the first element."

Similar to lambda, inside list/dict/set comprehensions, generator expressions, or very short (1-2 line) for loops, a single-char iteration label can be used. This is also typically x, e.g.

    sum(x for x in items if x > 0)

to sum all positive integers in the sequence items.

It is also very common to use i as shorthand for "index", and commonly with the enumerate built-in. For example:

    for i, item in enumerate(items):
        print("%4s: %s" % (i, item))

Outside of these cases, you should rarely, perhaps never, use single-character label/argument/method names. This is because it just makes it impossible to grep for stuff.

Use self and similar conventions

You should:

  • always name a method's first argument self
  • always name @classmethod's first argument cls
  • always use *args and **kwargs for variable argument lists
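A small sketch of the three conventions together. The Counter class and its methods are hypothetical, written only to show where each name goes.

```python
class Counter(object):
    """Tiny counter used only to illustrate the conventions."""

    def __init__(self, start=0):
        # Instance methods take `self` first.
        self.value = start

    @classmethod
    def from_items(cls, items):
        # Class methods take `cls` first, so subclasses get their own type.
        return cls(start=len(items))

    def add(self, *args, **kwargs):
        # Variable positional and keyword arguments use *args and **kwargs.
        step = kwargs.get("step", 1)
        self.value += step * len(args)
        return self.value
```

Because from_items calls cls(...) rather than Counter(...), a subclass that inherits it gets an instance of the subclass back.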

Nitpicks That Aren't Worth It

There's nothing to gain from not following these rules, so you should just follow them.

Always inherit from object and use new-style classes

    # bad
    class JSONWriter:
        pass

    # good
    class JSONWriter(object):
        pass

In Python 2, it's important to follow this rule. In Python 3, all classes implicitly inherit from object and this rule isn't necessary any longer.

Don't repeat instance labels in the class

    # bad
    class JSONWriter(object):
        handler = None
        def __init__(self, handler):
            self.handler = handler

    # good
    class JSONWriter(object):
        def __init__(self, handler):
            self.handler = handler

Prefer list/dict/set comprehensions over map/filter.

    # bad
    map(truncate, filter(lambda x: len(x) > 30, items))

    # good
    [truncate(x) for x in items if len(x) > 30]

Though you should prefer comprehensions for most of the simple cases, there are occasions where map() or filter() will be more readable, so use your judgment.

Use parens (...) for continuations

    # bad
    from itertools import groupby, chain, \
                          izip, islice

    # good
    from itertools import (groupby, chain,
                           izip, islice)

Use parens (...) for fluent APIs

    # bad
    response = Search(using=client) \
               .filter("term", cat="search") \
               .query("match", title="python")

    # good
    response = (Search(using=client)
                .filter("term", cat="search")
                .query("match", title="python"))

Use implicit continuations in function calls

    # bad -- simply unnecessary backslash
    return set((key.lower(), val.lower()) \
               for key, val in mapping.iteritems())

    # good
    return set((key.lower(), val.lower())
               for key, val in mapping.iteritems())

Use isinstance(obj, cls), not type(obj) == cls

This is because isinstance covers way more cases, including sub-classes and ABCs. Also, rarely use isinstance at all, since you should usually be doing duck typing, instead!
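To make the difference concrete, here is a short sketch assuming Python 3, where the ABCs live in collections.abc. The two helper functions are hypothetical names used only for the comparison.

```python
from collections import OrderedDict
from collections.abc import Mapping


def is_dict_exact(obj):
    # Exact type comparison: rejects subclasses of dict.
    return type(obj) == dict


def is_mapping(obj):
    # isinstance accepts subclasses and anything registered with the ABC.
    return isinstance(obj, Mapping)
```

An OrderedDict is a dict subclass, so the exact-type check turns it away while the isinstance check accepts it.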

Use with for files and locks

The with statement subtly handles file closing and lock releasing even in the case of exceptions being raised. So:

    # bad
    somefile = open("somefile.txt", "w")
    somefile.write("sometext")
    return

    # good
    with open("somefile.txt", "w") as somefile:
        somefile.write("sometext")
    return

Use is when comparing to None

The None value is a singleton, but when you're checking for None, you rarely want to actually call __eq__ on the LHS argument. So:

    # bad
    if item == None:
        continue

    # good
    if item is None:
        continue

Not only is the good form faster, it's also more correct. It's no more concise to use==, so just remember this rule!
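The "more correct" point can be seen with a deliberately contrived sketch. AlwaysEqual is a hypothetical class whose __eq__ returns True for everything, which is enough to fool an equality check against None.

```python
class AlwaysEqual(object):
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


item = AlwaysEqual()

equals_none = (item == None)  # noqa: E711 -- calls AlwaysEqual.__eq__
is_none = (item is None)      # identity check, __eq__ is never called
```

Here equals_none ends up True even though item is clearly not None, while is_none is False.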

Avoid sys.path hacks

It can be tempting to do sys.path.insert(0, "../") and similar to control Python's import approach, but you should avoid these like the plague.

Python has a somewhat-complex, but very comprehensible, approach to module path resolution. You can adjust how Python loads modules via PYTHONPATH or via tricks like setup.py develop. You can also run Python using -m to good effect, e.g. python -m mypkg.mymodule rather than python mypkg/mymodule.py. You should not rely upon the current working directory that you run python out of for your code to work properly. David Beazley saves the day once more with his PDF slides which are worth a skim, "Modules and Packages: Live and Let Die!"

Rarely create your own exception types

... and when you must, don't make too many.

    # bad
    class ArgumentError(Exception):
        pass
    ...
    raise ArgumentError(url)

    # good
    raise ValueError("bad value for url: %s" % url)

Note that Python includes a rich set of built-in exception classes. Leverage these appropriately, and you should "customize" them simply by instantiating them with string messages that describe the specific error condition you hit. It is most common to raise ValueError (bad argument), LookupError (bad key), or AssertionError (via the assert statement) in user code.

A good rule of thumb for whether you should create your own exception type is to figure out whether a caller should catch it every time they call your function. If so, you probably should make your own type. But this is relatively rare. A good example of an exception type that clearly had to exist is tornado.web.HTTPError. But notice how Tornado did not go overboard: there is one exception class for all HTTP errors raised by the framework or user code.
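As a rough sketch of the "one exception class" idea, loosely modeled on the shape described above but not on Tornado's actual code: ServiceError and fetch are hypothetical names, and the status code is just an illustrative payload.

```python
class ServiceError(Exception):
    """Hypothetical single exception type for a whole service layer."""

    def __init__(self, status_code, message):
        super(ServiceError, self).__init__(message)
        self.status_code = status_code


def fetch(url):
    # Hypothetical function: bad arguments use a built-in exception...
    if not url.startswith("http"):
        raise ValueError("bad value for url: %s" % url)
    # ...while the one failure every caller must handle uses the custom type.
    raise ServiceError(503, "service unavailable: %s" % url)
```

Callers only ever need one except ServiceError clause, and they can branch on status_code if the distinction matters to them.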

Short docstrings are proper one-line sentences

    # bad
    def reverse_sort(items):
        """
        sort items in reverse order
        """

    # good
    def reverse_sort(items):
        """Sort items in reverse order."""

Keep the triple quotes """ on the same line, capitalize the first letter, and include a period. Four lines become two, the __doc__ attribute doesn't have crufty newlines, and the pedants are pleased!

It's done by the stdlib and most open source projects. It's supported out-of-the-box by Sphinx. Just do it! The Python requests module uses these to extremely good effect. See the requests.api module, for example.

Strip trailing whitespace

This is perhaps the ultimate nitpick, but if you don't do it, it will drive people crazy. There is no shortage of tools that will do this for you in your text editor automatically; here's a link to the one I use for vim.

Writing Good Docstrings

Here's a quick reference to using Sphinx-style reST in your function docstrings:

    def get(url, qsargs=None, timeout=5.0):
        """Send an HTTP GET request.

        :param url: URL for the new request.
        :type url: str
        :param qsargs: Converted to query string arguments.
        :type qsargs: dict
        :param timeout: In seconds.
        :rtype: mymodule.Response
        """
        return request('get', url, qsargs=qsargs, timeout=timeout)

Don't document for the sake of documenting. The way to think about this is:

    good_names + explicit_defaults > verbose_docs + type_specs

That is, in the example above, there is no need to say timeout is a float, because the default value is 5.0, which is clearly a float. It is useful to indicate in the documentation that the semantic meaning is "seconds", thus 5.0 means 5 seconds. Meanwhile, the caller has no clue what qsargs should be, so we give a hint with the type annotation, and the caller also has no clue what to expect back from the function, so an rtype annotation is appropriate.

One last point. Guido once said that his key insight for Python is that, "code is read much more often than it is written." Well, a corollary of this is that some documentation helps, but too much documentation hurts.

You should basically only document functions you expect to be widely re-used. If you document every function in an internal module, you'll just end up with a less maintainable module, since the documentation needs to be refactored when the code is refactored. Don't "cargo cult" your docstrings and definitely don't auto-generate them with tooling!

Paradigms and Patterns

Functions vs classes

You should usually prefer functions to classes. Functions and modules are the basic units of code re-use in Python, and they are the most flexible form. Classes are an "upgrade path" for certain Python facilities, such as implementing containers, proxies, descriptors, type systems, and more. But usually, functions are a better option.

Some might like the code organization benefits of grouping related functions together into classes. But this is a mistake. You should group related functions together into modules.

Though sometimes classes can act as a helpful "mini namespace" (e.g. with @staticmethod), more often a group of methods should be contributing to the internal operation of an object, rather than merely being a behavior grouping.

It's always better to have a lib.time module for time-related functions than to have a TimeHelper class with a bunch of methods you are forced to subclass in order to use! Classes proliferate other classes, which proliferates complexity and decreases readability.

Generators and iterators

Generators and iterators are Python's most powerful features -- you should master the iterator protocol, the yield keyword, and generator expressions.

Not only are generators important for any function that needs to be called over a large stream of data, but they also have the effect of simplifying code by making it easy for you to write your own iterators. Refactoring code to generators often simplifies it while making it work in more scenarios.

Luciano Ramalho, author of "Fluent Python", has a 30-minute presentation, "Iterators & Generators: the Python Way", which gives an excellent, fast-paced overview. David Beazley, author of "Python Essential Reference" and "Python Cookbook", has a mind-bending three-hour video tutorial entitled "Generators: The Final Frontier" that is a satisfying exposition of generator use cases. Mastering this topic is worth it because it applies everywhere.
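For a feel of how little code a custom iterator needs, here is a minimal sketch. The read_records function and the "key=value" line format are assumptions made up for this example.

```python
def read_records(lines):
    """Yield (key, value) pairs lazily from an iterable of "key=value" lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        yield key, value


lines = ["# settings", "host=example.com", "", "port=8080"]

# Generator expression: nothing is computed until something consumes it.
keys = (key for key, _ in read_records(lines))
```

Because read_records accepts any iterable of strings, the same function works on a list in a test and on an open file handle in production, without loading the whole file into memory.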

Declarative vs imperative

You should prefer declarative to imperative programming. This is code that says what you want to do, rather than code that describes how to do it. Python's functional programming guide includes some good details and examples of how to use this style effectively.

You should use lightweight data structures like list, dict, tuple, and set to your advantage. It's always better to lay out your data, and then write some code to transform it, than to build up data by repeatedly calling mutating functions/methods.

An example of this is the common list comprehension refactoring:

    # bad
    filtered = []
    for x in items:
        if x.endswith(".py"):
            filtered.append(x)
    return filtered

This should be rewritten as:

    # good
    return [x for x in items if x.endswith(".py")]

But another good example is rewriting an if/elif/else chain as a dict lookup.
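A minimal sketch of that if/elif/else refactoring follows. The status codes and labels are arbitrary illustration data, and both function names are hypothetical.

```python
def status_label_chain(code):
    # Imperative: each case is a separate branch.
    if code == 200:
        return "ok"
    elif code == 404:
        return "not found"
    elif code == 500:
        return "server error"
    else:
        return "unknown"


# Declarative: lay out the data first...
STATUS_LABELS = {200: "ok", 404: "not found", 500: "server error"}


def status_label(code):
    # ...then a single lookup replaces the whole chain.
    return STATUS_LABELS.get(code, "unknown")
```

Adding a new case to the second version means adding one entry to the dict, with no new control flow to read or test.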

Prefer "pure" functions and generators

This is a concept that we can borrow from the functional programming community. These kinds of functions and generators are alternatively described as "side-effect free", "referentially transparent", or as having "immutable inputs/outputs".

As a simple example, you should avoid code like this:

    # bad
    def dedupe(items):
        """Remove dupes in-place, return items and # of dupes."""
        seen = set()
        dupe_positions = []
        for i, item in enumerate(items):
            if item in seen:
                dupe_positions.append(i)
            else:
                seen.add(item)
        num_dupes = len(dupe_positions)
        for idx in reversed(dupe_positions):
            items.pop(idx)
        return items, num_dupes

This same function can be written as follows:

    # good
    def dedupe(items):
        """Return deduped items and # of dupes."""
        deduped = set(items)
        num_dupes = len(items) - len(deduped)
        return deduped, num_dupes

This is a somewhat shocking example. In addition to making this function pure, we also made it much, much shorter. It's not only shorter: it's better. Its purity means assert dedupe(items) == dedupe(items) always holds true for the "good" version. In the "bad" version, num_dupes will always be 0 on the second call, which can lead to subtle bugs when using the function.

This also illustrates imperative vs declarative style: the function now reads like a description of what we need, rather than a set of instructions to build up what we need.

Prefer simple argument and return types

Functions should operate on data, rather than on custom objects, wherever possible. Prefer simple argument types like dict, set, tuple, list, int, float, and bool. Upgrade from there to standard library types like datetime, timedelta, array, Decimal, and Future. Only upgrade to your own custom types when absolutely necessary.

As a good rule of thumb for whether your function is simple enough, ask yourself whether its arguments and return values could always be JSON-serializable. It turns out, this rule of thumb matters more than you might think: JSON-serializability is often a prerequisite to make the functions usable in parallel computing contexts. But, for the purpose of this document, the main benefits are: readability, testability, and overall function simplicity.
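One quick way to apply the rule of thumb is to round-trip a function's output through the json module. The summarize function below is a hypothetical example that takes and returns plain dicts.

```python
import json


def summarize(word_counts):
    """Take a plain dict, return a plain dict: trivially JSON-serializable."""
    total = sum(word_counts.values())
    return {"total": total, "distinct": len(word_counts)}


summary = summarize({"spam": 3, "eggs": 1})

# Round-tripping through JSON leaves the value unchanged.
round_tripped = json.loads(json.dumps(summary))
```

If json.dumps raised a TypeError here, that would be a hint that a custom type had leaked into the function's interface.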

Avoid "traditional" OOP

In "traditional OOP languages" like Java and C++, code re-use is achieved through class hierarchies and polymorphism, or so those languages claim. In Python, though we have the ability to subclass and to do class-based polymorphism, in practice, these capabilities are used rarely in idiomatic Python programs.

It's more common to achieve re-use through modules and functions, and it's more common to achieve dynamic dispatch through duck typing. If you find yourself using super classes as a form of code re-use, stop what you're doing and reconsider. If you find yourself using lots of polymorphism, consider whether one of Python's dunder protocols or duck typing strategies might apply better.

See also the excellent Python talk, "Stop Writing Classes", by a Python core contributor. In it, the presenter suggests that if you have built a class with a single method that is named like a class (e.g. Runnable.run()), then what you've done is modeled a function as a class, and you should just stop. Since in Python, functions are "first-class", there is no reason to do this!
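The pattern is easy to spot once you have seen it. Below is a hypothetical before/after sketch; Greeter and greet are invented names and are not taken from the talk.

```python
class Greeter(object):
    """A function modeled as a class: one method besides __init__."""

    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        return "%s, %s!" % (self.greeting, name)


def greet(name, greeting="Hello"):
    """The same behavior as a plain function."""
    return "%s, %s!" % (greeting, name)
```

The function version has nothing to instantiate, and it can still be passed around as a value wherever a callable is expected.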

Mixins are sometimes OK

One way to do class-based re-use without going overboard on type hierarchies is to use Mixins. Don't overuse these, though. "Flat is better than nested" applies to type hierarchies, too, so you should avoid introducing needless required layers of hierarchy just to decompose behavior.

Mixins are not actually a Python language feature, but are possible thanks to its support for multiple inheritance. You can create base classes that "inject" functionality into your subclass without forming an "important" part of a type hierarchy, simply by listing that base class as the first entry in the bases list. An example:

    class APIHandler(AuthMixin, RequestHandler):
        """Handle HTTP/JSON requests with security."""

The order matters, so may as well remember the rule: bases forms a hierarchy bottom-to-top. One readability benefit here is that everything you need to know about this class is contained in the class definition itself: "it mixes in auth behavior and is a specialized Tornado RequestHandler."
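To see the ordering rule in action, here is a self-contained sketch. The RequestHandler below is a stand-in written for this example, not Tornado's real class, and the prepare method and its return values are assumptions chosen to make the call order visible.

```python
class RequestHandler(object):
    """Stand-in for a framework base class (not Tornado's real one)."""

    def prepare(self):
        return ["handler"]


class AuthMixin(object):
    """Hypothetical mixin that runs an auth step before the base class."""

    def prepare(self):
        return ["auth"] + super(AuthMixin, self).prepare()


class APIHandler(AuthMixin, RequestHandler):
    """Handle HTTP/JSON requests with security."""


steps = APIHandler().prepare()
mro_names = [cls.__name__ for cls in APIHandler.__mro__]
```

Because AuthMixin is listed first, its prepare runs first and then delegates via super() to RequestHandler, which is exactly the order the method resolution order in mro_names spells out.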

Be careful with frameworks

Python has a slew of frameworks for web, databases, and more. One of the joys of the language is that it's easy to create your own frameworks. When using an open source framework, you should be careful not to couple your "core code" too closely to the framework itself.

When considering building your own framework for your code, you should err on the side of caution. The standard library has a lot of stuff built-in, PyPI has even more, and usually, YAGNI applies.

Respect metaprogramming

Python supports "metaprogramming" via a number of features, including decorators, context managers, descriptors, import hooks, metaclasses and AST transformations.

You should feel comfortable using and understanding these features -- they are a core part of the language and are fully supported by it. But you should realize that when you use these features, you are opening yourself up to complex failure scenarios. Thus, treat the creation of metaprogramming facilities for your code similarly to the decision to "build your own framework". They amount to the same thing. When and if you do it, make the facilities into their own modules and document them well!

Don't be afraid of "dunder" methods

Many people conflate Python's metaprogramming facilities with its support for "double-underscore" or "dunder" methods, such as __getattr__.

As described in the blog post, "Python double-under, double-wonder", there is nothing "special" about dunders. They are nothing more than a lightweight namespace the Python core developers picked for all of Python's internal protocols. After all, __init__ is a dunder, and there's nothing magic about it.

It's true that some dunders can create more confusing results than others -- for example, it's probably not a good idea to overload operators without good reason. But many of them, such as __repr__, __str__, __len__, and __call__ are really full parts of the language you should be leveraging in idiomatic Python code. Don't shy away!
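Here is a short sketch of a class that opts in to three of these protocols. Playlist is a hypothetical example, and making the instance callable is an arbitrary choice to show __call__.

```python
class Playlist(object):
    """Hypothetical container that opts in to a few core protocols."""

    def __init__(self, name, tracks):
        self.name = name
        self.tracks = list(tracks)

    def __repr__(self):
        return "Playlist(%r, %r)" % (self.name, self.tracks)

    def __len__(self):
        return len(self.tracks)

    def __call__(self, index):
        # Calling the instance looks up a track by position.
        return self.tracks[index]


playlist = Playlist("road trip", ["intro", "outro"])
```

With these in place, len(playlist), repr(playlist), and playlist(0) all work through the ordinary built-ins and call syntax, with no bespoke method names to learn.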

A Little Zen for Your Code Style

Barry Warsaw, one of the core Python developers, once said that it frustrated him that "The Zen of Python" (PEP 20) is used as a style guide for Python code, since it was originally written as a poem about Python's internal design. That is, the design of the language and language implementation itself. One can acknowledge that, but a few of the lines from PEP 20 serve as pretty good guidelines for idiomatic Python code, so we'll just go with it.

Beautiful is better than ugly

This one is subjective, but what it usually amounts to is this: will the person who inherits this code from you be impressed or disappointed? What if that person is you, three years later?

Explicit is better than implicit

Sometimes in the name of refactoring out repetition in our code, we also get a little bit abstract with it. It should be possible to translate the code into plain English and basically understand what's going on. There shouldn't be an excessive amount of "magic".

Flat is better than nested

This one is really easy to understand. The best functions have no nesting, neither by loops nor if statements. Second best is one level of nesting. Two or more levels of nesting, and you should probably start refactoring to smaller functions.

Also, don't be afraid to refactor a nested if statement into a multi-part boolean conditional. For example:

    # bad
    if response:
        if response.get("data"):
            return len(response["data"])

is better written as:

    # good
    if response and response.get("data"):
        return len(response["data"])

Readability counts

Don't be afraid to add line-comments with #. Don't go overboard on these or over-document, but a little explanation, line-by-line, often helps a whole lot. Don't be afraid to pick a slightly longer name because it's more descriptive. No one wins any points for shortening "response" to "rsp". Use doctest-style examples to illustrate edge cases in docstrings. Keep it simple!

Errors should never pass silently

The biggest offender here is the bare except: pass clause. Never use these. Suppressing all exceptions is simply dangerous. Scope your exception handling to single lines of code, and always scope your except handler to a specific type. Also, get comfortable with the logging module and log.exception(...).
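A minimal sketch of that advice: the try block wraps a single line, the handler names a specific exception type, and the failure is logged with its traceback. The parse_payload function is a hypothetical example, and returning None on failure is just one possible policy.

```python
import json
import logging

log = logging.getLogger(__name__)


def parse_payload(raw):
    """Return the decoded payload, or None if it is not valid JSON."""
    try:
        # Only the line that can fail lives inside the try block.
        payload = json.loads(raw)
    except ValueError:
        # Specific exception type; the traceback is logged, not swallowed.
        log.exception("could not decode payload: %r", raw)
        return None
    return payload
```

Anything other than a decoding problem, such as a TypeError from passing the wrong kind of object, still propagates to the caller instead of being silently hidden.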

If the implementation is hard to explain, it's a bad idea

This is a general software engineering principle -- but applies very well to Python code. Most Python functions and objects can have an easy-to-explain implementation. If it's hard to explain, it's probably a bad idea. Usually you can make a hard-to-explain function easier-to-explain via "divide and conquer" -- split it into several functions.

Testing is one honking great idea

OK, we took liberty on this one -- in "The Zen of Python", it's actually "namespaces" that's the honking great idea.

But seriously: beautiful code without tests is simply worse than even the ugliest tested code. At least the ugly code can be refactored to be beautiful, but the beautiful code can't be refactored to be verifiably correct, at least not without writing the tests! So, write tests! Please!

Six of One, Half a Dozen of the Other

This is a section for arguments we'd rather not settle. Don't rewrite other people's code because of this stuff. Feel free to use these forms interchangeably.

str.format vs overloaded format %

str.format is more robust, yet % with "%s %s" printf-style strings is more concise. Both will be around forever.

Remember to use unicode strings for your format pattern, if you need to preserve unicode:

    u"%s %s" % (dt.datetime.utcnow().isoformat(), line)

If you do end up using %, you should consider the "%(name)s" syntax which allows you to use a dictionary rather than a tuple, e.g.

    u"%(time)s %(line)s" % {"time": dt.datetime.utcnow().isoformat(), "line": line}

Also, don't re-invent the wheel. One thing str.format does unequivocally better is support various formatting modes, such as humanized numbers and percentages. Use them.
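A few of those formatting modes, as a quick sketch. The variable names and values are arbitrary; the format specs themselves are standard str.format syntax.

```python
# Thousands separators for humanized numbers.
visitors = "{:,}".format(1234567)

# Percentages with one decimal place.
bounce_rate = "{:.1%}".format(0.256)

# Named fields, comparable to the "%(name)s" syntax.
line = "{name}: {count:>5}".format(name="hits", count=42)
```

The percent mode multiplies by 100 and appends the sign for you, which is the kind of detail that is easy to get subtly wrong by hand.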

But use whichever one you please. We choose not to care.

if item vs if item is not None

This is unrelated to the earlier rule on == vs is for None. In this case, we are actually taking advantage of Python's "truthiness rules" to our benefit in if item, e.g. as a shorthand for "item is not None or empty string."

Truthiness is a tad complicated in Python and certainly the latter is safer against some classes of bugs. The former, however, is very common in much Python code, and it's shorter. We choose not to care.
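For reference, this small sketch shows where the two checks agree and where they part ways. The describe helper is hypothetical and exists only to tabulate the results.

```python
def describe(item):
    """Return which of the two checks accept the given value."""
    return {
        "truthy": bool(item),             # what `if item` tests
        "not_none": item is not None,     # what `if item is not None` tests
    }


# The two checks disagree on "empty" values such as 0, "", and [].
results = {repr(value): describe(value) for value in (None, 0, "", [], "x")}
```

The class of bugs mentioned above usually comes from the middle cases: a legitimate 0 or empty list being treated as "missing" by a bare if item.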

Implicit multi-line strings vs triple-quote """

Python's compiler will automatically join multiple quoted strings together into a single string during the parse phase if it finds nothing in between them, e.g.

    msg = ("Hello, wayward traveler!\n"
           "What shall we do today?\n"
           "=>")
    print(msg)

This is roughly equivalent to:

    msg = """Hello, wayward traveler!
    What shall we do today?
    =>"""
    print(msg)

In the former's case, you keep the indentation clean, but need the ugly newline characters. In the latter case, you don't need the newlines, but break indentation. We choose not to care.

Using raise with classes vs instances

It turns out Python lets you pass either an exception class or an exception instance to the raise statement. For example, these two lines are roughly equivalent:

    raise ValueError
    raise ValueError()

Essentially, Python turns the first line into the second automatically. You should probably prefer the second form, if for no other reason than to actually provide a useful argument, like a helpful message about why the ValueError occurred. But these two lines are equivalent and you shouldn't rewrite one style into the other just because. We choose not to care.

Standard Tools and Project Structure

We've made some choices on "best-of-breed" tools for things, as well as the very minimal starting structure for a proper Python project.

The Standard Library

  • import datetime as dt: always import datetime this way
  • dt.datetime.utcnow(): preferred to .now(), which does local time
  • import json: the standard for data interchange
  • from collections import namedtuple: use for lightweight data types
  • from collections import defaultdict: use for counting/grouping
  • from collections import deque: a fast double-ended queue
  • from itertools import groupby, chain: for declarative style
  • from functools import wraps: use for writing well-behaved decorators
  • argparse: for "robust" CLI tool building
  • fileinput: to create quick UNIX pipe-friendly tools
  • log = logging.getLogger(__name__): good enough for logging
  • from __future__ import absolute_import: fixes import aliasing
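As a rough illustration of how a few of these standard library pieces fit together, here is a short sketch. Visit, logged, and total_by_url are hypothetical names, and the call counter on the decorator is just one simple thing a decorator might do.

```python
from collections import defaultdict, namedtuple
from functools import wraps

# namedtuple: a lightweight, immutable record type.
Visit = namedtuple("Visit", ["url", "seconds"])


def logged(func):
    """Decorator that records calls; wraps keeps func's name and docstring."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@logged
def total_by_url(visits):
    """Group visit durations by URL."""
    totals = defaultdict(int)  # missing keys start at 0
    for visit in visits:
        totals[visit.url] += visit.seconds
    return dict(totals)


visits = [Visit("/a", 3), Visit("/b", 5), Visit("/a", 4)]
```

Without wraps, the decorated function would report its name as "wrapper" and lose its docstring, which makes tracebacks and generated docs harder to read.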

Common Third-Party Libraries

  • python-dateutil for datetime parsing and calendars
  • pytz for timezone handling
  • tldextract for better URL handling
  • msgpack-python for a more compact encoding than JSON
  • futures for Future/pool concurrency primitives
  • docopt for quick throwaway CLI tools
  • py.test for unit tests, along with mock and hypothesis

Local Development Project Skeleton

For all Python packages and libraries:

  • no __init__.py in root folder: give your package a folder name!
  • mypackage/__init__.py preferred to src/mypackage/__init__.py
  • mypackage/lib/__init__.py preferred to lib/__init__.py
  • mypackage/settings.py preferred to settings.py
  • README.rst describes the repo for a newcomer; use reST
  • setup.py for simple facilities like setup.py develop
  • requirements.txt describes package dependencies for pip
  • dev-requirements.txt additional dependencies for tests/local
  • Makefile for simple (!!!) build/lint/test/run steps

Also, always pin your requirements.

Some Inspiration

The following links may give you some inspiration about the core of writing Python code with great style and taste.

Go forth and be Pythonic!

    $ python
    >>> import antigravity

Contributors

  • Andrew Montalenti (twitter): original author
  • Vincent Driessen (twitter): edits and suggestions
  • William Feng (github): translation to zh-cn

Like good Python style? Then perhaps you'll enjoy this style guide author's past blog posts on Python.
