The Python `logging` package is a lightweight but extensible package for keeping better track of what your own code does. Using it gives you much more flexibility than just littering your code with superfluous `print()` calls.
However, Python's `logging` package can be complicated in certain spots. Handlers, loggers, levels, namespaces, filters: it's not easy to keep track of all of these pieces and how they interact.
One way to tie up the loose ends in your understanding of `logging` is to peek under the hood to its CPython source code. The Python code behind `logging` is concise and modular, and reading through it can help you get that aha moment.
This article is meant to complement the logging HOWTO document as well as Logging in Python, which is a walkthrough on how to use the package.
By the end of this article, you'll be familiar with the following:

- `logging` levels and how they work
- Thread safety versus process safety in `logging`
- `logging` from an OOP perspective
- Best practices and design patterns for using `logging`
For the most part, we'll go line by line down the core module in Python's `logging` package in order to build a picture of how it's laid out.
Because the `logging` source code is central to this article, you can assume that any code block or link is based on a specific commit in the Python 3.7 CPython repository, namely commit d730719. You can find the `logging` package itself in the `Lib/` directory within the CPython source.
Within the `logging` package, most of the heavy lifting occurs within `logging/__init__.py`, which is the file you'll spend the most time on here:

```
cpython/
│
├── Lib/
│   ├── logging/
│   │   ├── __init__.py
│   │   ├── config.py
│   │   └── handlers.py
│   ├── ...
├── Modules/
├── Include/
...
```
With that, let’s jump in.
Before we get to the heavyweight classes, the top hundred lines or so of `__init__.py` introduce a few subtle but important concepts.
## Levels Are Just `int`s!

Objects like `logging.INFO` or `logging.DEBUG` can seem a bit opaque. What are these variables internally, and how are they defined?
In fact, the uppercase constants from Python's `logging` are just integers, forming an enum-like collection of numerical levels:

```python
CRITICAL = 50
FATAL = CRITICAL
ERROR = 40
WARNING = 30
WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0
```
Why not just use the strings `"INFO"` or `"DEBUG"`? Levels are `int` constants to allow for the simple, unambiguous comparison of one level with another. They are given names as well to lend them semantic meaning. Saying that a message has a severity of 50 may not be immediately clear, but saying that it has a level of `CRITICAL` lets you know that you've got a flashing red light somewhere in your program.
Now, technically, you can pass just the `str` form of a level in some places, such as `logger.setLevel("DEBUG")`. Internally, this will call `_checkLevel()`, which ultimately does a `dict` lookup for the corresponding `int`:

```python
_nameToLevel = {
    'CRITICAL': CRITICAL,
    'FATAL': FATAL,
    'ERROR': ERROR,
    'WARN': WARNING,
    'WARNING': WARNING,
    'INFO': INFO,
    'DEBUG': DEBUG,
    'NOTSET': NOTSET,
}

def _checkLevel(level):
    if isinstance(level, int):
        rv = level
    elif str(level) == level:
        if level not in _nameToLevel:
            raise ValueError("Unknown level: %r" % level)
        rv = _nameToLevel[level]
    else:
        raise TypeError("Level not an integer or a valid string: %r" % level)
    return rv
```
Which should you prefer? I'm not too opinionated on this, but it's notable that the `logging` docs consistently use the form `logging.DEBUG` rather than `"DEBUG"` or `10`. Also, passing the `str` form isn't an option in Python 2, and some `logging` methods, such as `logger.isEnabledFor()`, will accept only an `int`, not its `str` cousin.
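For instance, because the levels are plain integers, you can compare them directly, and `logging.getLevelName()` maps a number back to its name. Here's a quick REPL check:

```python
>>> import logging
>>> logging.DEBUG < logging.INFO < logging.CRITICAL
True
>>> logging.getLevelName(30)
'WARNING'
```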
A few lines down, you'll find the following short code block, which is sneakily critical to the whole package:

```python
import threading

_lock = threading.RLock()

def _acquireLock():
    if _lock:
        _lock.acquire()

def _releaseLock():
    if _lock:
        _lock.release()
```
The `_lock` object is a reentrant lock that sits in the global namespace of the `logging/__init__.py` module. It makes pretty much every object and operation in the entire `logging` package thread-safe, enabling threads to do read and write operations without the threat of a race condition. You can see in the module source code that `_acquireLock()` and `_releaseLock()` are ubiquitous throughout the module and its classes.
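As a minimal sketch of what that thread safety buys you, several threads can safely funnel records through the same handler. The `worker` function and format string here are just for illustration:

```python
import logging
import threading

logging.basicConfig(level=logging.INFO, format="%(threadName)s: %(message)s")

def worker(n):
    # Each call winds through the module's lock-guarded internals.
    logging.info("message %d from a worker", n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```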
There's something not accounted for here, though: what about process safety? The short answer is that the `logging` module is not process safe. This isn't inherently a fault of `logging`: generally, two processes can't write to the same file without a lot of proactive effort on behalf of the programmer first.
This means that you'll want to be careful before using classes such as `logging.FileHandler` with multiprocessing involved. If two processes want to read from and write to the same underlying file concurrently, then you can run into a nasty bug halfway through a long-running routine.

If you want to get around this limitation, there's a thorough recipe in the official Logging Cookbook. Because this entails a decent amount of setup, one alternative is to have each process log to a separate file based on its process ID, which you can grab with `os.getpid()`.
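Here's a rough sketch of that per-process alternative. The `/tmp` path and filename pattern are illustrative assumptions, not something prescribed by the docs:

```python
import logging
import os

# One log file per process, keyed on the process ID.
pid_handler = logging.FileHandler(f"/tmp/worker-{os.getpid()}.log")
logger = logging.getLogger(__name__)
logger.addHandler(pid_handler)
logger.warning("logged from process %d", os.getpid())
```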
Now that we've covered some preliminary setup code, let's take a high-level look at how `logging` is laid out. The `logging` package uses a healthy dose of OOP and inheritance. Here's a partial look at the method resolution order (MRO) for some of the most important classes in the package:

```
object
│
├── LogRecord
├── Filterer
│   ├── Logger
│   │   └── RootLogger
│   └── Handler
│       ├── StreamHandler
│       └── NullHandler
├── Filter
└── Manager
```
The tree diagram above doesn’t cover all of the classes in the module, just those that are most worth highlighting.
Note: You can use the dunder attribute `logging.StreamHandler.__mro__` to see the chain of inheritance. A definitive guide to the MRO can be found in the Python 2 docs, though it is applicable to Python 3 as well.
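For example, here's the chain of inheritance for `StreamHandler`, matching the tree above:

```python
>>> import logging
>>> logging.StreamHandler.__mro__
(<class 'logging.StreamHandler'>, <class 'logging.Handler'>,
 <class 'logging.Filterer'>, <class 'object'>)
```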
This litany of classes is typically one source of confusion because there's a lot going on, and it's all jargon-heavy. `Filter` versus `Filterer`? `Logger` versus `Handler`? It can be challenging to keep track of everything, much less visualize how it fits together. A picture is worth a thousand words, so here's a diagram of a scenario where one logger with two handlers attached to it writes a log message with level `logging.INFO`:
In Python code, everything above would look like this:
```python
import logging
import sys

logger = logging.getLogger("pylog")
logger.setLevel(logging.DEBUG)

h1 = logging.FileHandler(filename="/tmp/records.log")
h1.setLevel(logging.INFO)
h2 = logging.StreamHandler(sys.stderr)
h2.setLevel(logging.ERROR)

logger.addHandler(h1)
logger.addHandler(h2)
logger.info("testing %d.. %d.. %d..", 1, 2, 3)
```
There's a more detailed map of this flow in the Logging HOWTO. What's shown above is a simplified scenario.

Your code defines just one `Logger` instance, `logger`, along with two `Handler` instances, `h1` and `h2`.
When you call `logger.info("testing %d.. %d.. %d..", 1, 2, 3)`, the `logger` object serves as a filter because it also has a `level` associated with it. Only if the message level is severe enough will the logger do anything with the message. Because the logger has level `DEBUG`, and the message carries a higher `INFO` level, it gets the go-ahead to move on.
Internally, `logger` calls `logger.makeRecord()` to put the message string `"testing %d.. %d.. %d.."` and its arguments `(1, 2, 3)` into a bona fide class instance of a `LogRecord`, which is just a container for the message and its metadata.
The `logger` object looks around for its handlers (instances of `Handler`), which may be tied directly to `logger` itself or to its parents (a concept that we'll touch on later). In this example, it finds two handlers:

1. One with level `INFO` that dumps log data to a file at `/tmp/records.log`
2. One that writes to `sys.stderr`, but only if the incoming message is at level `ERROR` or higher

At this point, there's another round of tests that kicks in. Because the `LogRecord` and its message carry only level `INFO`, the record gets written to Handler 1 (the green arrow), but not to Handler 2's `stderr` stream (the red arrow). For handlers, writing the `LogRecord` to their stream is called emitting it, which is captured in their `.emit()` method.
Next, let’s further dissect everything from above.
## The `LogRecord` Class

What is a `LogRecord`? When you log a message, an instance of the `LogRecord` class is the object you send to be logged. It's created for you by a `Logger` instance and encapsulates all the pertinent info about that event. Internally, it's little more than a wrapper around a `dict` that contains attributes for the record. A `Logger` instance sends a `LogRecord` instance to zero or more `Handler` instances.
The `LogRecord` contains metadata such as a name, the creation time as a Unix timestamp, the message itself, and information about which function made the logging call.
Here's a peek into the metadata that it carries with it, which you can introspect by stepping through a `logging.error()` call with the `pdb` module:

```python
>>> import logging
>>> import pdb

>>> def f(x):
...     logging.error("bad vibes")
...     return x / 0
...

>>> pdb.run("f(1)")
```
After stepping through some higher-level functions, you end up at line 1517:

```
(Pdb) l
1514                     exc_info = (type(exc_info), exc_info, exc_info.__traceback__)
1515                 elif not isinstance(exc_info, tuple):
1516                     exc_info = sys.exc_info()
1517             record = self.makeRecord(self.name, level, fn, lno, msg, args,
1518                                      exc_info, func, extra, sinfo)
1519 ->          self.handle(record)
1520
1521         def handle(self, record):
1522             """
1523             Call the handlers for the specified record.
1524
(Pdb) from pprint import pprint
(Pdb) pprint(vars(record))
{'args': (),
 'created': 1550671851.660067,
 'exc_info': None,
 'exc_text': None,
 'filename': '<stdin>',
 'funcName': 'f',
 'levelname': 'ERROR',
 'levelno': 40,
 'lineno': 2,
 'module': '<stdin>',
 'msecs': 660.067081451416,
 'msg': 'bad vibes',
 'name': 'root',
 'pathname': '<stdin>',
 'process': 2360,
 'processName': 'MainProcess',
 'relativeCreated': 295145.5490589142,
 'stack_info': None,
 'thread': 4372293056,
 'threadName': 'MainThread'}
```
A `LogRecord`, internally, contains a trove of metadata that gets used in one way or another.

You'll rarely need to deal with a `LogRecord` directly, since the `Logger` and `Handler` do this for you. It's still worthwhile to know what information is wrapped up in a `LogRecord`, because this is where all that useful info, like the timestamp, comes from when you see logged messages.
Note: Below the `LogRecord` class, you'll also find the `setLogRecordFactory()`, `getLogRecordFactory()`, and `makeLogRecord()` factory functions. You won't need these unless you want to use a custom class instead of `LogRecord` to encapsulate log messages and their metadata.
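If you're curious, `makeLogRecord()` lets you build a bare-bones record straight from a `dict` of attributes:

```python
>>> import logging
>>> record = logging.makeLogRecord({"msg": "hello", "levelno": logging.INFO})
>>> record.getMessage()
'hello'
```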
## The `Logger` and `Handler` Classes

The `Logger` and `Handler` classes are both central to how `logging` works, and they interact with each other frequently. A `Logger`, a `Handler`, and a `LogRecord` each have a `.level` associated with them.
The `Logger` takes the `LogRecord` and passes it off to the `Handler`, but only if the effective level of the `LogRecord` is equal to or higher than that of the `Logger`. The same goes for the `LogRecord` versus `Handler` test. This is called level-based filtering, which `Logger` and `Handler` implement in slightly different ways.
In other words, there's an (at least) two-step test applied before the message that you log gets to go anywhere. In order to be fully passed from logger to handler and then logged to the end stream (which could be `sys.stdout`, a file, or an email via SMTP), a `LogRecord` must have a level at least as high as both the logger and the handler.
PEP 282 describes how this works:

> Each `Logger` object keeps track of a log level (or threshold) that it is interested in, and discards log requests below that level. (Source)
So where does this level-based filtering actually occur for both `Logger` and `Handler`?
For the `Logger` class, it's a reasonable first assumption that the logger would compare its `.level` attribute to the level of the `LogRecord` and be done there. However, it's slightly more involved than that.
Level-based filtering for loggers occurs in `.isEnabledFor()`, which in turn calls `.getEffectiveLevel()`. Always use `logger.getEffectiveLevel()` rather than just consulting `logger.level`. The reason has to do with the organization of `Logger` objects in a hierarchical namespace. (You'll see more on this later.)
By default, a `Logger` instance has a level of `0` (`NOTSET`). However, loggers also have parent loggers, one of which is the root logger, which functions as the parent of all other loggers. A `Logger` will walk upwards in its hierarchy and get its effective level vis-à-vis its parent (which ultimately may be `root` if no other parents are found).
Here's where this happens in the `Logger` class:

```python
class Logger(Filterer):
    # ...
    def getEffectiveLevel(self):
        logger = self
        while logger:
            if logger.level:
                return logger.level
            logger = logger.parent
        return NOTSET

    def isEnabledFor(self, level):
        try:
            return self._cache[level]
        except KeyError:
            _acquireLock()
            if self.manager.disable >= level:
                is_enabled = self._cache[level] = False
            else:
                is_enabled = self._cache[level] = level >= self.getEffectiveLevel()
            _releaseLock()
            return is_enabled
```
Correspondingly, here's an example that calls the source code you see above:

```python
>>> import logging
>>> logger = logging.getLogger("app")
>>> logger.level  # No!
0
>>> logger.getEffectiveLevel()
30
>>> logger.parent
<RootLogger root (WARNING)>
>>> logger.parent.level
30
```
Here's the takeaway: don't rely on `.level`. If you haven't explicitly set a level on your `logger` object, and you're depending on `.level` for some reason, then your logging setup will likely behave differently than you expect.
What about `Handler`? For handlers, the level-to-level comparison is simpler, though it actually happens in `.callHandlers()` from the `Logger` class:

```python
class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
```
For a given `LogRecord` instance (named `record` in the source code above), a logger checks with each of its registered handlers and does a quick check on the `.level` attribute of that `Handler` instance. Only if the `.levelno` of the `LogRecord` is greater than or equal to the handler's level does the record get passed on. A docstring in `logging` refers to this as "conditionally emit[ting] the specified logging record."
The most important attribute for a `Handler` subclass instance is its `.stream` attribute. This is the final destination where logs get written, and it can be pretty much any file-like object. Here's an example with `io.StringIO`, which is an in-memory stream (buffer) for text I/O.
First, set up a `Logger` instance with a level of `DEBUG`. You'll see that, by default, it has no direct handlers:

```python
>>> import io
>>> import logging

>>> logger = logging.getLogger("abc")
>>> logger.setLevel(logging.DEBUG)
>>> print(logger.handlers)
[]
```
Next, you can subclass `logging.StreamHandler` to make the `.flush()` call a no-op. We would want to flush `sys.stderr` or `sys.stdout`, but not the in-memory buffer in this case:

```python
class IOHandler(logging.StreamHandler):
    def flush(self):
        pass  # No-op
```
Now, declare the buffer object itself and tie it in as the `.stream` for your custom handler with a level of `INFO`, and then tie that handler into the logger:

```python
>>> stream = io.StringIO()
>>> h = IOHandler(stream)
>>> h.setLevel(logging.INFO)
>>> logger.addHandler(h)

>>> logger.debug("extraneous info")
>>> logger.warning("you've been warned")
>>> logger.critical("SOS")

>>> try:
...     print(stream.getvalue())
... finally:
...     stream.close()
...
you've been warned
SOS
```
This last chunk is another illustration of level-based filtering.

Three messages with levels `DEBUG`, `WARNING`, and `CRITICAL` are passed through the chain. At first, it may look as if they don't go anywhere, but two of them do. All three of them make it out of the gates from `logger` (which has level `DEBUG`).

However, only two of them get emitted by the handler because it has a higher level of `INFO`, which exceeds `DEBUG`. Finally, you get the entire contents of the buffer as a `str` and close the buffer to explicitly free up system resources.
## The `Filter` and `Filterer` Classes

Above, we asked the question, "Where does level-based filtering happen?" In answering this question, it's easy to get distracted by the `Filter` and `Filterer` classes. Paradoxically, level-based filtering for `Logger` and `Handler` instances occurs without the help of either the `Filter` or `Filterer` classes.
`Filter` and `Filterer` are designed to let you add additional function-based filters on top of the level-based filtering that is done by default. I like to think of it as à la carte filtering.

`Filterer` is the base class for `Logger` and `Handler` because both of these classes are eligible for receiving additional custom filters that you specify. You add instances of `Filter` to them with `logger.addFilter()` or `handler.addFilter()`, which is what `self.filters` refers to in the following method:

```python
class Filterer(object):
    # ...
    def filter(self, record):
        rv = True
        for f in self.filters:
            if hasattr(f, 'filter'):
                result = f.filter(record)
            else:
                result = f(record)
            if not result:
                rv = False
                break
        return rv
```
Given a `record` (which is a `LogRecord` instance), `.filter()` returns `True` or `False` depending on whether that record gets the okay from this class's filters.
Here, in turn, is `.handle()` for the `Logger` and `Handler` classes:

```python
class Logger(Filterer):
    # ...
    def handle(self, record):
        if (not self.disabled) and self.filter(record):
            self.callHandlers(record)

# ...

class Handler(Filterer):
    # ...
    def handle(self, record):
        rv = self.filter(record)
        if rv:
            self.acquire()
            try:
                self.emit(record)
            finally:
                self.release()
        return rv
```
Neither `Logger` nor `Handler` comes with any additional filters by default, but here's a quick example of how you could add one:

```python
>>> import logging

>>> logger = logging.getLogger("rp")
>>> logger.setLevel(logging.INFO)
>>> logger.addHandler(logging.StreamHandler())
>>> logger.filters  # Initially empty
[]

>>> class ShortMsgFilter(logging.Filter):
...     """Only allow records that contain long messages (> 25 chars)."""
...     def filter(self, record):
...         msg = record.msg
...         if isinstance(msg, str):
...             return len(msg) > 25
...         return False
...

>>> logger.addFilter(ShortMsgFilter())
>>> logger.filters
[<__main__.ShortMsgFilter object at 0x10c28b208>]

>>> logger.info("Reeeeaaaaallllllly long message")  # Length: 31
Reeeeaaaaallllllly long message
>>> logger.info("Done")  # Length: <25, no output
```
Above, you define a class `ShortMsgFilter` and override its `.filter()`. In `.addFilter()`, you could also just pass a callable, such as a function, a lambda, or a class that defines `.__call__()`.
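Here's what the callable route can look like. This sketch leans on the `hasattr(f, 'filter')` branch that you saw in `Filterer.filter()` above:

```python
>>> import logging
>>> logger = logging.getLogger("callable-demo")
>>> logger.addHandler(logging.StreamHandler())
>>> logger.addFilter(lambda record: "secret" not in record.getMessage())
>>> logger.warning("this one passes")
this one passes
>>> logger.warning("this one is secret")  # Filtered out: no output
```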
## The `Manager` Class

There's one more behind-the-scenes actor of `logging` that's worth touching on: the `Manager` class. What matters most is not the `Manager` class itself but a single instance of it that acts as a container for the growing hierarchy of loggers that are defined across packages. You'll see in the next section how a single instance of this class is central to gluing the module together and allowing its parts to talk to each other.
When it comes to `Logger` instances, one stands out. It's called the root logger:

```python
class RootLogger(Logger):
    def __init__(self, level):
        Logger.__init__(self, "root", level)

# ...

root = RootLogger(WARNING)
Logger.root = root
Logger.manager = Manager(Logger.root)
```
The last three lines of this code block are one of the ingenious tricks employed by the `logging` package. Here are a few points:
- The root logger is just a no-frills Python object with the identifier `root`. It has a level of `logging.WARNING` and a `.name` of `"root"`. As far as the class `RootLogger` is concerned, this unique name is all that's special about it.
- The `root` object in turn becomes a class attribute for the `Logger` class. This means that all instances of `Logger`, and the `Logger` class itself, have a `.root` attribute that is the root logger. This is another example of a singleton-like pattern being enforced in the `logging` package.
- A `Manager` instance is set as the `.manager` class attribute for `Logger`. This eventually comes into play in `logging.getLogger("name")`. The `.manager` does all the facilitation of searching for existing loggers with the name `"name"` and creating them if they don't exist.
Everything is a child of `root` in the logger namespace, and I mean everything. That includes loggers that you specify yourself as well as those from third-party libraries that you import.
Remember earlier how the `.getEffectiveLevel()` for our `logger` instances was 30 (`WARNING`) even though we had not explicitly set it? That's because the root logger sits at the top of the hierarchy, and its level is a fallback if any nested loggers have a null level of `NOTSET`:

```python
>>> root = logging.getLogger()  # Or getLogger("")
>>> root
<RootLogger root (WARNING)>
>>> root.parent is None
True
>>> root.root is root  # Self-referential
True
>>> root is logging.root
True
>>> root.getEffectiveLevel()
30
```
The same logic applies to the search for a logger’s handlers. The search is effectively a reverse-order search up the tree of a logger’s parents.
The logger hierarchy may seem neat in theory, but how beneficial is it in practice?
Let's take a break from exploring the `logging` code and foray into writing our own mini-application: one that takes advantage of the logger hierarchy in a way that reduces boilerplate code and keeps things scalable as the project's codebase grows.
Here’s the project structure:
```
project/
│
└── project/
    ├── __init__.py
    ├── utils.py
    └── base.py
```
Don't worry about the application's main functions in `utils.py` and `base.py`. What we're paying more attention to here is the interaction of the `logging` objects between the modules in `project/`.
In this case, say that you want to design a multipronged logging setup:

- Each module gets a `logger` with multiple handlers.
- Some of the handlers are shared between different `logger` instances in different modules. These handlers only care about level-based filtering, not the module where the log record emanated from. There's a handler for `DEBUG` messages, one for `INFO`, one for `WARNING`, and so on.
- Each `logger` is also tied to one additional handler that receives `LogRecord` instances only from that lone `logger`. You can call this a module-based file handler.
Visually, what we’re shooting for would look something like this:
The two turquoise objects are instances of `Logger`, established with `logging.getLogger(__name__)` for each module in a package. Everything else is a `Handler` instance.
The thinking behind this design is that it’s neatly compartmentalized. You can conveniently look at messages coming from a single logger, or look at messages of a certain level and above coming from any logger or module.
The properties of the logger hierarchy make it suitable for setting up this multipronged logger-handler layout. What does that mean? Here's a concise explanation from the Django documentation:

> Why is the hierarchy important? Well, because loggers can be set to propagate their logging calls to their parents. In this way, you can define a single set of handlers at the root of a logger tree, and capture all logging calls in the subtree of loggers. A logging handler defined in the `project` namespace will catch all logging messages issued on the `project.interesting` and `project.interesting.stuff` loggers. (Source)
The term propagate refers to how a logger keeps walking up its chain of parents looking for handlers. The `.propagate` attribute is `True` for a `Logger` instance by default:

```python
>>> logger = logging.getLogger(__name__)
>>> logger.propagate
True
```
In `.callHandlers()`, if `propagate` is `True`, each successive parent gets reassigned to the local variable `c` until the hierarchy is exhausted:

```python
class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
            if not c.propagate:
                c = None
            else:
                c = c.parent
```
Here's what this means: because the `__name__` dunder variable within a package's `__init__.py` module is just the name of the package, a logger there becomes a parent to any loggers present in other modules in the same package.
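You can verify this parent-child wiring in a REPL (the logger names here are just placeholders):

```python
>>> import logging
>>> parent = logging.getLogger("project")
>>> child = logging.getLogger("project.utils")
>>> child.parent is parent
True
```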
Here are the resulting `.name` attributes from assigning to `logger` with `logging.getLogger(__name__)`:

| Module | `.name` Attribute |
|---|---|
| `project/__init__.py` | `'project'` |
| `project/utils.py` | `'project.utils'` |
| `project/base.py` | `'project.base'` |
Because the `'project.utils'` and `'project.base'` loggers are children of `'project'`, they will latch onto not only their own direct handlers but also whatever handlers are attached to `'project'`.
Let's build out the modules. First comes `__init__.py`:

```python
# __init__.py
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

levels = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
for level in levels:
    handler = logging.FileHandler(f"/tmp/level-{level.lower()}.log")
    handler.setLevel(getattr(logging, level))
    logger.addHandler(handler)

def add_module_handler(logger, level=logging.DEBUG):
    handler = logging.FileHandler(
        f"/tmp/module-{logger.name.replace('.', '-')}.log"
    )
    handler.setLevel(level)
    logger.addHandler(handler)
```
This module is imported when the `project` package is imported. You add a handler for each level from `DEBUG` through `CRITICAL`, then attach it to a single logger at the top of the hierarchy.
You also define a utility function that adds one more `FileHandler` to a logger, where the `filename` of the handler corresponds to the module name where the logger is defined. (This assumes the logger is defined with `__name__`.)
You can then add some minimal boilerplate logger setup in `base.py` and `utils.py`. Notice that you only need to add one additional handler with `add_module_handler()` from `__init__.py`. You don't need to worry about the level-oriented handlers because they are already added to their parent logger named `'project'`:

```python
# base.py
import logging

from project import add_module_handler

logger = logging.getLogger(__name__)
add_module_handler(logger)

def func1():
    logger.debug("debug called from base.func1()")
    logger.critical("critical called from base.func1()")
```
Here's `utils.py`:

```python
# utils.py
import logging

from project import add_module_handler

logger = logging.getLogger(__name__)
add_module_handler(logger)

def func2():
    logger.debug("debug called from utils.func2()")
    logger.critical("critical called from utils.func2()")
```
Let’s see how all of this works together from a fresh Python session:
```python
>>> from pprint import pprint
>>> import project
>>> from project import base, utils

>>> project.logger
<Logger project (DEBUG)>
>>> base.logger, utils.logger
(<Logger project.base (DEBUG)>, <Logger project.utils (DEBUG)>)
>>> base.logger.handlers
[<FileHandler /tmp/module-project-base.log (DEBUG)>]
>>> pprint(base.logger.parent.handlers)
[<FileHandler /tmp/level-debug.log (DEBUG)>,
 <FileHandler /tmp/level-info.log (INFO)>,
 <FileHandler /tmp/level-warning.log (WARNING)>,
 <FileHandler /tmp/level-error.log (ERROR)>,
 <FileHandler /tmp/level-critical.log (CRITICAL)>]
>>> base.func1()
>>> utils.func2()
```
You’ll see in the resulting log files that our filtration system works as intended. Module-oriented handlers direct one logger to a specific file, while level-oriented handlers direct multiple loggers to a different file:
```console
$ cat /tmp/level-debug.log
debug called from base.func1()
critical called from base.func1()
debug called from utils.func2()
critical called from utils.func2()

$ cat /tmp/level-critical.log
critical called from base.func1()
critical called from utils.func2()

$ cat /tmp/module-project-base.log
debug called from base.func1()
critical called from base.func1()

$ cat /tmp/module-project-utils.log
debug called from utils.func2()
critical called from utils.func2()
```
A drawback worth mentioning is that this design introduces a lot of redundancy. One `LogRecord` instance may go to no fewer than six files. That's also a non-negligible amount of file I/O that may add up in a performance-critical application.
Now that you've seen a practical example, let's switch gears and delve into a possible source of confusion in `logging`.
There are two common situations with `logging` where it's easy to get tripped up, and each has a reason or two commonly associated with it.
The first scenario: you logged a message that seemingly went nowhere, and you're not sure why.
Don't forget that the effective level of a logger for which you don't otherwise set a custom level is `WARNING`, because a logger will walk up its hierarchy until it finds the root logger with its own `WARNING` level:

```python
>>> import logging
>>> logger = logging.getLogger("xyz")
>>> logger.debug("mind numbing info here")
>>> logger.critical("storm is coming")
storm is coming
```
Because of this default, the `.debug()` call goes nowhere.
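One straightforward fix is to configure a level (and a handler) explicitly, for instance with `logging.basicConfig()`:

```python
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> logger = logging.getLogger("xyz")
>>> logger.debug("mind numbing info here")
DEBUG:xyz:mind numbing info here
```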
The second scenario: instead of being suppressed, a log message appeared in a place that you didn't expect it to.
When you defined your `logger` above, you didn't add any handlers to it. So why is it writing to the console?
The reason for this is that `logging` sneakily uses a handler called `lastResort` that writes to `sys.stderr` if no other handlers are found:

```python
class _StderrHandler(StreamHandler):
    # ...
    @property
    def stream(self):
        return sys.stderr

_defaultLastResort = _StderrHandler(WARNING)
lastResort = _defaultLastResort
```
This kicks in when a logger goes to find its handlers:
```python
class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
            if not c.propagate:
                c = None
            else:
                c = c.parent
        if (found == 0):
            if lastResort:
                if record.levelno >= lastResort.level:
                    lastResort.handle(record)
```
If the logger gives up on its search for handlers (both its own direct handlers and those of its parent loggers), then it picks up the `lastResort` handler and uses that.
There's one more subtle detail worth knowing about. This section has largely talked about instance methods (methods that a class defines) rather than the module-level functions of the `logging` package that carry the same names.

If you use the functions, such as `logging.info()` rather than `logger.info()`, then something slightly different happens internally. The function calls `logging.basicConfig()`, which adds a `StreamHandler` that writes to `sys.stderr`. In the end, the behavior is virtually the same:

```python
>>> import logging
>>> root = logging.getLogger("")
>>> root.handlers
[]
>>> root.hasHandlers()
False

>>> logging.basicConfig()
>>> root.handlers
[<StreamHandler <stderr> (NOTSET)>]
>>> root.hasHandlers()
True
```
It's time to switch gears and take a closer look at how messages themselves are joined with their data. While it's been supplanted by `str.format()` and f-strings, you've probably used Python's percent-style formatting to do something like this:

>>>print("To iterate is%s, to recurse%s"%("human","divine"))To iterate is human, to recurse divine
As a result, you may be tempted to do the same thing in a `logging` call:

```python
>>> # Bad!  Check out a more efficient alternative below.
>>> logging.warning("To iterate is %s, to recurse %s" % ("human", "divine"))
WARNING:root:To iterate is human, to recurse divine
```
This uses the entire format string and its arguments as the `msg` argument to `logging.warning()`.
Here's the recommended alternative, straight from the `logging` docs:

```python
>>> # Better: formatting doesn't occur until it really needs to.
>>> logging.warning("To iterate is %s, to recurse %s", "human", "divine")
WARNING:root:To iterate is human, to recurse divine
```
It looks a little weird, right? This seems to defy the conventions of how percent-style string formatting works, but it's a more efficient function call because the format string gets formatted lazily rather than greedily. Here's what that means.
The method signature for `Logger.warning()` looks like this:

```python
def warning(self, msg, *args, **kwargs)
```
The same applies to the other methods, such as `.debug()`. When you call `warning("To iterate is %s, to recurse %s", "human", "divine")`, both `"human"` and `"divine"` get caught as `*args`, and within the scope of the method's body, `args` is equal to `("human", "divine")`.
Contrast this to the first call above:
logging.warning("To iterate is%s, to recurse%s"%("human","divine"))
In this form, everything in the parentheses gets immediately merged into `"To iterate is human, to recurse divine"` and passed as `msg`, while `args` is an empty tuple.
Why does this matter? Repeated logging calls can degrade runtime performance slightly, but the `logging` package does its very best to control that and keep it in check. By not merging the format string with its arguments right away, `logging` delays the string formatting until the `LogRecord` is requested by a `Handler`.
This happens in `LogRecord.getMessage()`, so only after `logging` deems that the `LogRecord` will actually be passed to a handler does it become its fully merged self.
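You can see this two-stage life in a `LogRecord` that you build by hand. The record holds `msg` and `args` separately until `.getMessage()` merges them:

```python
>>> import logging
>>> record = logging.LogRecord(
...     name="demo", level=logging.INFO, pathname="<stdin>", lineno=1,
...     msg="To iterate is %s, to recurse %s", args=("human", "divine"),
...     exc_info=None,
... )
>>> record.msg
'To iterate is %s, to recurse %s'
>>> record.getMessage()
'To iterate is human, to recurse divine'
```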
All that is to say that the `logging` package makes some very fine-tuned performance optimizations in the right places. This may seem like minutiae, but if you're making the same `logging.debug()` call a million times inside a loop, and the `args` are expensive to convert to strings, then the lazy nature of how `logging` does string formatting can make a difference.
Before doing any merging of `msg` and `args`, a `Logger` instance will check its `.isEnabledFor()` to see if that merging should be done in the first place.
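You can poke at that gate yourself. With the default `WARNING` threshold in effect, a `DEBUG` call returns early:

```python
>>> import logging
>>> logger = logging.getLogger("perf")
>>> logger.isEnabledFor(logging.DEBUG)  # Effective level is WARNING
False
>>> # This call is close to free: no LogRecord is built, no %-merge occurs
>>> logger.debug("huge object: %s", object())
```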
Towards the bottom of `logging/__init__.py` sit the module-level functions that are advertised up front in the public API of `logging`. You already saw the `Logger` methods such as `.debug()`, `.info()`, and `.warning()`. The top-level functions are wrappers around the corresponding methods of the same name, but they have two important features:
1. They always call their corresponding method from the root logger, `root`.
2. Before calling the root logger's method, they call `logging.basicConfig()` with no arguments if `root` doesn't have any handlers. As you saw earlier, it's this call that sets a `sys.stderr` handler for the root logger.
For illustration, here's `logging.error()`:

```python
def error(msg, *args, **kwargs):
    if len(root.handlers) == 0:
        basicConfig()
    root.error(msg, *args, **kwargs)
```
You'll find the same pattern for `logging.debug()`, `logging.info()`, and the others as well. Tracing the chain of commands is interesting. Eventually, you'll end up at the same place, which is where the internal `Logger._log()` is called.
The calls to `debug()`, `info()`, `warning()`, and the other level-based functions all route here. `._log()` primarily has two purposes:
1. Call `self.makeRecord()`: Make a `LogRecord` instance from the `msg` and other arguments you pass to it.
2. Call `self.handle()`: This determines what actually gets done with the record. Where does it get sent? Does it make it there or get filtered out?
Here’s that entire process in one diagram:
You can also trace the call stack with `pdb`:

```
>>> import logging
>>> import pdb
>>> pdb.run('logging.warning("%s-%s", "uh", "oh")')
> <string>(1)<module>()
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1971)warning()
-> def warning(msg, *args, **kwargs):
(Pdb) s
> lib/python3.7/logging/__init__.py(1977)warning()
-> if len(root.handlers) == 0:
(Pdb) unt
> lib/python3.7/logging/__init__.py(1978)warning()
-> basicConfig()
(Pdb) unt
> lib/python3.7/logging/__init__.py(1979)warning()
-> root.warning(msg, *args, **kwargs)
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1385)warning()
-> def warning(self, msg, *args, **kwargs):
(Pdb) l
1380             logger.info("Houston, we have a %s", "interesting problem", exc_info=1)
1381             """
1382             if self.isEnabledFor(INFO):
1383                 self._log(INFO, msg, args, **kwargs)
1384
1385 ->      def warning(self, msg, *args, **kwargs):
1386             """
1387             Log 'msg % args' with severity 'WARNING'.
1388
1389             To pass exception information, use the keyword argument exc_info with
1390             a true value, e.g.
(Pdb) s
> lib/python3.7/logging/__init__.py(1394)warning()
-> if self.isEnabledFor(WARNING):
(Pdb) unt
> lib/python3.7/logging/__init__.py(1395)warning()
-> self._log(WARNING, msg, args, **kwargs)
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1496)_log()
-> def _log(self, level, msg, args, exc_info=None, extra=None, stack_info=False):
(Pdb) s
> lib/python3.7/logging/__init__.py(1501)_log()
-> sinfo = None
(Pdb) unt 1517
> lib/python3.7/logging/__init__.py(1517)_log()
-> record = self.makeRecord(self.name, level, fn, lno, msg, args,
(Pdb) s
> lib/python3.7/logging/__init__.py(1518)_log()
-> exc_info, func, extra, sinfo)
(Pdb) s
--Call--
> lib/python3.7/logging/__init__.py(1481)makeRecord()
-> def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
(Pdb) p name
'root'
(Pdb) p level
30
(Pdb) p msg
'%s-%s'
(Pdb) p args
('uh', 'oh')
(Pdb) up
> lib/python3.7/logging/__init__.py(1518)_log()
-> exc_info, func, extra, sinfo)
(Pdb) unt
> lib/python3.7/logging/__init__.py(1519)_log()
-> self.handle(record)
(Pdb) n
WARNING:root:uh-oh
```
## What Does `getLogger()` Really Do?

Also hiding in this section of the source code is the top-level `getLogger()`, which wraps `Logger.manager.getLogger()`:

```python
def getLogger(name=None):
    if name:
        return Logger.manager.getLogger(name)
    else:
        return root
```
This is the entry point for enforcing the singleton logger design:
- If you specify a `name`, then the underlying `.getLogger()` does a `dict` lookup on the string `name`. What this comes down to is a lookup in the `loggerDict` of `logging.Manager`. This is a dictionary of all registered loggers, including the intermediate `PlaceHolder` instances that are generated when you reference a logger far down in the hierarchy before referencing its parents.
- Otherwise, `root` is returned. There is only one `root`: the instance of `RootLogger` discussed above.
This feature is what lies behind a trick that can let you peek into all of the registered loggers:
```python
>>> import logging
>>> logging.Logger.manager.loggerDict
{}

>>> from pprint import pprint
>>> import asyncio
>>> pprint(logging.Logger.manager.loggerDict)
{'asyncio': <Logger asyncio (WARNING)>,
 'concurrent': <logging.PlaceHolder object at 0x10d153710>,
 'concurrent.futures': <Logger concurrent.futures (WARNING)>}
```
Whoa, hold on a minute. What's happening here? It looks like something changed internally in the `logging` package as a result of an import of another library, and that's exactly what happened.
Firstly, recall that `Logger.manager` is a class attribute, where an instance of `Manager` is tacked onto the `Logger` class. The `manager` is designed to track and manage all of the singleton instances of `Logger`. These are housed in `.loggerDict`.
Now, when you initially import `logging`, this dictionary is empty. But after you import `asyncio`, the same dictionary gets populated with three loggers. This is an example of one module setting the attributes of another module in place. Sure enough, inside of `asyncio/log.py`, you'll find the following:

```python
import logging

logger = logging.getLogger(__package__)  # "asyncio"
```
The key-value pair is set in `Manager.getLogger()` so that the `manager` can oversee the entire namespace of loggers. This means that the object `asyncio.log.logger` gets registered in the logger dictionary that belongs to the `logging` package. Something similar happens in the `concurrent.futures` package as well, which is imported by `asyncio`.
You can see the power of the singleton design in an equivalence test:
```python
>>> obj1 = logging.getLogger("asyncio")
>>> obj2 = logging.Logger.manager.loggerDict["asyncio"]
>>> obj1 is obj2
True
```
This comparison illustrates (glossing over a few details) what `getLogger()` ultimately does.
## What About `NullHandler`?

That brings us to the final hundred or so lines in the `logging/__init__.py` source, where `NullHandler` is defined. Here's the definition in all its glory:

```python
class NullHandler(Handler):
    def handle(self, record):
        pass

    def emit(self, record):
        pass

    def createLock(self):
        self.lock = None
```
The `NullHandler` is all about the distinctions between logging in a library versus an application. Let's see what that means.
A library is an extensible, generalizable Python package that is intended for other users to install and set up. It's built by a developer with the express purpose of being distributed to users. Examples include popular open-source projects like NumPy, `dateutil`, and `cryptography`.
An application (or app, or program) is designed for a more specific purpose and a much smaller set of users (possibly just one user). It's a program or set of programs highly tailored by the user to do a limited set of things. An example of an application is a Django app that sits behind a web page. Applications commonly use (`import`) libraries and the tools they contain.
When it comes to logging, there are different best practices for a library versus an application. That's where `NullHandler` fits in. It's basically a do-nothing stub class.
If you're writing a Python library, you really need to do this one minimalist piece of setup in your package's uppermost `__init__.py`:

```python
# Place this in your library's uppermost `__init__.py`
# Nothing else!

import logging

logging.getLogger(__name__).addHandler(logging.NullHandler())
```
This serves two critical purposes.
Firstly, a library logger that is declared with `logger = logging.getLogger(__name__)` (without any further configuration) will log to `sys.stderr` by default, even if that's not what the end user wants. This could be described as an opt-out approach, where the end user of the library has to go in and disable logging to their console if they don't want it.
Common wisdom says to use an opt-in approach instead: don't emit any log messages by default, and let the end users of the library determine if they want to further configure the library's loggers and add handlers to them. Here's that philosophy worded more bluntly by the author of the `logging` package, Vinay Sajip:

> A third party library which uses `logging` should not spew logging output by default which may not be wanted by a developer/user of an application which uses it. (Source)
This leaves it up to the library user, not the library developer, to incrementally call methods such as `logger.addHandler()` or `logger.setLevel()`.
The second reason that `NullHandler` exists is more archaic. In Python 2.7 and earlier, trying to log a `LogRecord` from a logger that had no handler set would raise a warning. Adding the no-op class `NullHandler` averts this.
Here's what specifically happens in the line `logging.getLogger(__name__).addHandler(logging.NullHandler())` from above:
1. Python gets (creates) the `Logger` instance with the same name as your package. If you're designing the `calculus` package, within `__init__.py`, then `__name__` will be equal to `'calculus'`.
2. A `NullHandler` instance gets attached to this logger. That means that Python will not default to using the `lastResort` handler.
3. Any logger created in any of the other `.py` modules of the package will be a child of this logger in the logger hierarchy. Because this handler also belongs to them, they won't need to use the `lastResort` handler and won't default to logging to standard error (`stderr`).
As a quick example, let’s say your library has the following structure:
```
calculus/
│
├── __init__.py
└── integration.py
```
In `integration.py`, as the library developer you are free to do the following:

```python
# calculus/integration.py
import logging

logger = logging.getLogger(__name__)

def func(x):
    logger.warning("Look!")
    # Do stuff
    return None
```
Now, a user comes along and installs your library from PyPI via `pip install calculus`. They use `from calculus.integration import func` in some application code. This user is free to manipulate and configure the `logger` object from the library like any other Python object, to their heart's content.
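For instance, the application author could opt in with just a few lines. This sketch builds on the hypothetical `calculus` package above:

```python
# Application code, not library code
import logging

from calculus.integration import func

# Opt in: the app, not the library, decides where log output goes
lib_logger = logging.getLogger("calculus")
lib_logger.addHandler(logging.StreamHandler())
lib_logger.setLevel(logging.INFO)

func(42)  # The "Look!" warning now shows up on stderr
```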
One thing that you may be wary of is the danger of exceptions that stem from your calls to `logging`. If you have a `logging.error()` call that's designed to give you some more verbose debugging information, but that call itself for some reason raises an exception, that would be the height of irony, right?
Cleverly, if the `logging` package encounters an exception that has to do with logging itself, then it will print the traceback but not raise the exception itself.
Here's an example that deals with a common typo: passing two arguments to a format string that is only expecting one argument. The important distinction is that what you see below is not an exception being raised, but rather a prettified printed traceback of the internal exception, which itself was suppressed:

```python
>>> logging.critical("This %s has too many arguments", "msg", "other")
--- Logging error ---
Traceback (most recent call last):
  File "lib/python3.7/logging/__init__.py", line 1034, in emit
    msg = self.format(record)
  File "lib/python3.7/logging/__init__.py", line 880, in format
    return fmt.format(record)
  File "lib/python3.7/logging/__init__.py", line 619, in format
    record.message = record.getMessage()
  File "lib/python3.7/logging/__init__.py", line 380, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "<stdin>", line 1, in <module>
Message: 'This %s has too many arguments'
Arguments: ('msg', 'other')
```
This lets your program gracefully carry on with its actual program flow. The rationale is that you wouldn't want an uncaught exception stemming from a `logging` call itself to stop a program dead in its tracks.
Tracebacks can be messy, but this one is informative and relatively straightforward. What enables the suppression of exceptions related to `logging` is `Handler.handleError()`. When the handler calls `.emit()`, which is the method where it attempts to log the record, it falls back to `.handleError()` if something goes awry. Here's the implementation of `.emit()` for the `StreamHandler` class:

```python
def emit(self, record):
    try:
        msg = self.format(record)
        stream = self.stream
        stream.write(msg + self.terminator)
        self.flush()
    except Exception:
        self.handleError(record)
```
Any exception related to the formatting and writing gets caught rather than raised, and `handleError()` gracefully writes the traceback to `sys.stderr`.
Speaking of exceptions and their tracebacks, what about cases where your program encounters them but should log the exception and keep chugging along in its execution?
Let’s walk through a couple of ways to do this.
Here’s a contrived example of a lottery simulator using code that isn’t Pythonic on purpose. You’re developing an online lottery game where users can wager on their lucky number:
```python
import random

class Lottery(object):
    def __init__(self, n):
        self.n = n

    def make_tickets(self):
        for i in range(self.n):
            yield i

    def draw(self):
        pool = self.make_tickets()
        random.shuffle(pool)
        return next(pool)
```
Behind the frontend application sits the critical code below. You want to make sure that you keep track of any errors caused by the site that may make a user lose their money. The first (suboptimal) way is to use `logging.error()` and log the `str` form of the exception instance itself:

```python
try:
    lucky_number = int(input("Enter your ticket number: "))
    drawn = Lottery(n=20).draw()
    if lucky_number == drawn:
        print("Winner chicken dinner!")
except Exception as e:
    # NOTE: See below for a better way to do this.
    logging.error("Could not draw ticket: %s", e)
```
This will only get you the actual exception message, rather than the traceback. You check the logs on your website’s server and find this cryptic message:
```
ERROR:root:Could not draw ticket: object of type 'generator' has no len()
```
Hmm. As the application developer, you’ve got a serious problem, and a user got ripped off as a result. But maybe this exception message itself isn’t very informative. Wouldn’t it be nice to see the lineage of the traceback that led to this exception?
The proper solution is to use `logging.exception()`, which logs a message with level `ERROR` and also displays the exception traceback. Replace the final two lines above with these:

```python
except Exception:
    logging.exception("Could not draw ticket")
```
Now you get a better indication of what’s going on:
```
ERROR:root:Could not draw ticket
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "<stdin>", line 9, in draw
  File "lib/python3.7/random.py", line 275, in shuffle
    for i in reversed(range(1, len(x))):
TypeError: object of type 'generator' has no len()
```
Using `exception()` saves you from having to reference the exception yourself, because `logging` pulls it in with `sys.exc_info()`.
This makes it clearer that the problem stems from `random.shuffle()`, which needs to know the length of the object it's shuffling. Because our `Lottery` class passes a generator to `shuffle()`, it gets held up and raises before the pool can be shuffled, much less a winning ticket generated.
In large, full-blown applications, you'll find `logging.exception()` to be even more useful when deep, multi-library tracebacks are involved, and you can't step into them with a live debugger like `pdb`.
The code for `logging.Logger.exception()`, and hence `logging.exception()`, is just a single line:

```python
def exception(self, msg, *args, exc_info=True, **kwargs):
    self.error(msg, *args, exc_info=exc_info, **kwargs)
```
That is, `logging.exception()` just calls `logging.error()` with `exc_info=True`, which is otherwise `False` by default. If you want to log an exception traceback but at a level different from `logging.ERROR`, just call that function or method with `exc_info=True`.
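For example, here's a traceback logged at `WARNING` severity rather than `ERROR`:

```python
>>> import logging
>>> try:
...     1 / 0
... except ZeroDivisionError:
...     logging.warning("watch out", exc_info=True)
...
WARNING:root:watch out
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
ZeroDivisionError: division by zero
```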
Keep in mind that `exception()` should only be called in the context of an exception handler, inside of an `except` block:

```python
for i in data:
    try:
        result = my_longwinded_nested_function(i)
    except ValueError:
        # We are in the context of exception handler now.
        # If it's unclear exactly *why* we couldn't process
        # `i`, then log the traceback and move on rather than
        # ditching completely.
        logger.exception("Could not process %s", i)
        continue
```
Use this pattern sparingly rather than as a means to suppress any exception. It can be most helpful when you’re debugging a long function call stack where you’re otherwise seeing an ambiguous, unclear, and hard-to-track error.
Pat yourself on the back, because you've just walked through almost 2,000 lines of dense source code. You're now better equipped to deal with the `logging` package!

Keep in mind that this tutorial has been far from exhaustive in covering all of the classes found in the `logging` package. There's even more machinery that glues everything together. If you'd like to learn more, then you can look into the `Formatter` classes and the separate modules `logging/config.py` and `logging/handlers.py`.