Async IO is a concurrent programming design that has received dedicated support in Python, evolving rapidly from Python 3.4 through 3.7, and probably beyond.
You may be thinking with dread, “Concurrency, parallelism, threading, multiprocessing. That’s a lot to grasp already. Where does async IO fit in?”
This tutorial is built to help you answer that question, giving you a firmer grasp of Python’s approach to async IO.
Here’s what you’ll cover:
- Asynchronous IO (async IO): a language-agnostic paradigm (model) that has implementations across a host of programming languages
- async/await: two new Python keywords that are used to define coroutines
- asyncio: the Python package that provides a foundation and API for running and managing coroutines
Coroutines (specialized generator functions) are the heart of async IO in Python, and we’ll dive into them later on.
Note: In this article, I use the term async IO to denote the language-agnostic design of asynchronous IO, while asyncio refers to the Python package.
Before you get started, you’ll need to make sure you’re set up to use asyncio and other libraries found in this tutorial.
You’ll need Python 3.7 or above to follow this article in its entirety, as well as the aiohttp and aiofiles packages:
    $ python3.7 -m venv ./py37async
    $ source ./py37async/bin/activate  # Windows: .\py37async\Scripts\activate.bat
    $ pip install --upgrade pip aiohttp aiofiles  # Optional: aiodns
For help with installing Python 3.7 and setting up a virtual environment, check out Python 3 Installation & Setup Guide or Virtual Environments Primer.
With that, let’s jump in.
Async IO is a bit lesser known than its tried-and-true cousins, multiprocessing and threading. This section will give you a fuller picture of what async IO is and how it fits into its surrounding landscape.
Concurrency and parallelism are expansive subjects that are not easy to wade into. While this article focuses on async IO and its implementation in Python, it’s worth taking a minute to compare async IO to its counterparts in order to have context about how async IO fits into the larger, sometimes dizzying puzzle.
Parallelism consists of performing multiple operations at the same time. Multiprocessing is a means to effect parallelism, and it entails spreading tasks over a computer’s central processing units (CPUs, or cores). Multiprocessing is well-suited for CPU-bound tasks: tightly bound for loops and mathematical computations usually fall into this category.
Concurrency is a slightly broader term than parallelism. It suggests that multiple tasks have the ability to run in an overlapping manner. (There’s a saying that concurrency does not imply parallelism.)
Threading is a concurrent execution model whereby multiple threads take turns executing tasks. One process can contain multiple threads. Python has a complicated relationship with threading thanks to its GIL, but that’s beyond the scope of this article.
What’s important to know about threading is that it’s better for IO-bound tasks. While a CPU-bound task is characterized by the computer’s cores continually working hard from start to finish, an IO-bound job is dominated by a lot of waiting on input/output to complete.
To recap the above, concurrency encompasses both multiprocessing (ideal for CPU-bound tasks) and threading (suited for IO-bound tasks). Multiprocessing is a form of parallelism, with parallelism being a specific type (subset) of concurrency. The Python standard library has offered longstanding support for both of these through its multiprocessing, threading, and concurrent.futures packages.
Now it’s time to bring a new member to the mix. Over the last few years, a separate design has been more comprehensively built into CPython: asynchronous IO, enabled through the standard library’s asyncio package and the new async and await language keywords. To be clear, async IO is not a newly invented concept, and it has existed or is being built into other languages and runtime environments, such as Go, C#, or Scala.
The asyncio package is billed by the Python documentation as a library to write concurrent code. However, async IO is not threading, nor is it multiprocessing. It is not built on top of either of these.
In fact, async IO is a single-threaded, single-process design: it uses cooperative multitasking, a term that you’ll flesh out by the end of this tutorial. It has been said in other words that async IO gives a feeling of concurrency despite using a single thread in a single process. Coroutines (a central feature of async IO) can be scheduled concurrently, but they are not inherently concurrent.
To reiterate, async IO is a style of concurrent programming, but it is not parallelism. It’s more closely aligned with threading than with multiprocessing but is very much distinct from both of these and is a standalone member in concurrency’s bag of tricks.
That leaves one more term. What does it mean for something to be asynchronous? This isn’t a rigorous definition, but for our purposes here, I can think of two properties:
Here’s a diagram to put it all together. The white terms represent concepts, and the green terms represent ways in which they are implemented or effected:
I’ll stop there on the comparisons between concurrent programming models. This tutorial is focused on the subcomponent that is async IO, how to use it, and the APIs that have sprung up around it. For a thorough exploration of threading versus multiprocessing versus async IO, pause here and check out Jim Anderson’s overview of concurrency in Python. Jim is way funnier than me and has sat in more meetings than me, to boot.
Async IO may at first seem counterintuitive and paradoxical. How does something that facilitates concurrent code use a single thread and a single CPU core? I’ve never been very good at conjuring up examples, so I’d like to paraphrase one from Miguel Grinberg’s 2017 PyCon talk, which explains everything quite beautifully:
Chess master Judit Polgár hosts a chess exhibition in which she plays multiple amateur players. She has two ways of conducting the exhibition: synchronously and asynchronously.
Assumptions:
- 24 opponents
- Judit makes each chess move in 5 seconds
- Opponents each take 55 seconds to make a move
- Games average 30 pair-moves (60 moves total)
Synchronous version: Judit plays one game at a time, never two at the same time, until the game is complete. Each game takes (55 + 5) * 30 == 1800 seconds, or 30 minutes. The entire exhibition takes 24 * 30 == 720 minutes, or 12 hours.
Asynchronous version: Judit moves from table to table, making one move at each table. She leaves the table and lets the opponent make their next move during the wait time. One move on all 24 games takes Judit 24 * 5 == 120 seconds, or 2 minutes. The entire exhibition is now cut down to 120 * 30 == 3600 seconds, or just 1 hour. (Source)
There is only one Judit Polgár, who has only two hands and makes only one move at a time by herself. But playing asynchronously cuts the exhibition time down from 12 hours to one. So, cooperative multitasking is a fancy way of saying that a program’s event loop (more on that later) communicates with multiple tasks to let each take turns running at the optimal time.
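The arithmetic behind the two exhibition formats is easy to check in a few lines of Python. This is only a sketch of the figures quoted above, and the constant and variable names are my own. It also spells out one assumption that the example makes silently: each opponent finishes thinking before Judit completes her lap around the room.

```python
# Figures from the chess exhibition example above.
OPPONENTS = 24
JUDIT_MOVE_SECONDS = 5
OPPONENT_MOVE_SECONDS = 55
PAIR_MOVES_PER_GAME = 30

# Synchronous: finish one whole game before starting the next.
seconds_per_game = (OPPONENT_MOVE_SECONDS + JUDIT_MOVE_SECONDS) * PAIR_MOVES_PER_GAME
sync_total_seconds = seconds_per_game * OPPONENTS

# Asynchronous: one lap around the room is one move on every board.
seconds_per_lap = OPPONENTS * JUDIT_MOVE_SECONDS
async_total_seconds = seconds_per_lap * PAIR_MOVES_PER_GAME

# Each opponent needs 55 seconds, which is less than the 120-second lap,
# so every board is ready again by the time Judit returns to it.
assert OPPONENT_MOVE_SECONDS <= seconds_per_lap

print(f"Synchronous:  {sync_total_seconds / 3600:.0f} hours")
print(f"Asynchronous: {async_total_seconds / 3600:.0f} hour")
```

If the opponents were slower than Judit’s lap, she would end up waiting at the tables, and the asynchronous total would be bounded by the opponents’ thinking time instead.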
Async IO takes long waiting periods in which functions would otherwise be blocking and allows other functions to run during that downtime. (A function that blocks effectively forbids others from running from the time that it starts until the time that it returns.)
I’ve heard it said, “Use async IO when you can; use threading when you must.” The truth is that building durable multithreaded code can be hard and error-prone. Async IO avoids some of the potential speedbumps that you might otherwise encounter with a threaded design.
But that’s not to say that async IO in Python is easy. Be warned: when you venture a bit below the surface level, async programming can be difficult too! Python’s async model is built around concepts such as callbacks, events, transports, protocols, and futures—just the terminology can be intimidating. The fact that its API has been changing continually makes it no easier.
Luckily, asyncio has matured to a point where most of its features are no longer provisional, while its documentation has received a huge overhaul and some quality resources on the subject are starting to emerge as well.
The asyncio Package and async/await
Now that you have some background on async IO as a design, let’s explore Python’s implementation. Python’s asyncio package (introduced in Python 3.4) and its two keywords, async and await, serve different purposes but come together to help you declare, build, execute, and manage asynchronous code.
The async/await Syntax and Native Coroutines
A Word of Caution: Be careful what you read out there on the Internet. Python’s async IO API has evolved rapidly from Python 3.4 to Python 3.7. Some old patterns are no longer used, and some things that were at first disallowed are now allowed through new introductions.
At the heart of async IO are coroutines. A coroutine is a specialized version of a Python generator function. Let’s start with a baseline definition and then build off of it as you progress here: a coroutine is a function that can suspend its execution before reaching return, and it can indirectly pass control to another coroutine for some time.
Later, you’ll dive a lot deeper into how exactly the traditional generator is repurposed into a coroutine. For now, the easiest way to pick up how coroutines work is to start making some.
Let’s take the immersive approach and write some async IO code. This short program is the Hello World of async IO but goes a long way towards illustrating its core functionality:
    #!/usr/bin/env python3
    # countasync.py

    import asyncio

    async def count():
        print("One")
        await asyncio.sleep(1)
        print("Two")

    async def main():
        await asyncio.gather(count(), count(), count())

    if __name__ == "__main__":
        import time
        s = time.perf_counter()
        asyncio.run(main())
        elapsed = time.perf_counter() - s
        print(f"{__file__} executed in {elapsed:0.2f} seconds.")
When you execute this file, take note of what looks different than if you were to define the functions with just def and time.sleep():
    $ python3 countasync.py
    One
    One
    One
    Two
    Two
    Two
    countasync.py executed in 1.01 seconds.
The order of this output is the heart of async IO. Talking to each of the calls to count() is a single event loop, or coordinator. When each task reaches await asyncio.sleep(1), the function yells up to the event loop and gives control back to it, saying, “I’m going to be sleeping for 1 second. Go ahead and let something else meaningful be done in the meantime.”
Contrast this to the synchronous version:
    #!/usr/bin/env python3
    # countsync.py

    import time

    def count():
        print("One")
        time.sleep(1)
        print("Two")

    def main():
        for _ in range(3):
            count()

    if __name__ == "__main__":
        s = time.perf_counter()
        main()
        elapsed = time.perf_counter() - s
        print(f"{__file__} executed in {elapsed:0.2f} seconds.")
When executed, there is a slight but critical change in order and execution time:
    $ python3 countsync.py
    One
    Two
    One
    Two
    One
    Two
    countsync.py executed in 3.01 seconds.
While using time.sleep() and asyncio.sleep() may seem banal, they are used as stand-ins for any time-intensive processes that involve wait time. (The most mundane thing you can wait on is a sleep() call that does basically nothing.) That is, time.sleep() can represent any time-consuming blocking function call, while asyncio.sleep() is used to stand in for a non-blocking call (but one that also takes some time to complete).
As you’ll see in the next section, the benefit of awaiting something, including asyncio.sleep(), is that the surrounding function can temporarily cede control to another function that’s more readily able to do something immediately. In contrast, time.sleep() or any other blocking call is incompatible with asynchronous Python code, because it will stop everything in its tracks for the duration of the sleep time.
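To see why a blocking call is incompatible with asynchronous code, here is a small sketch that gathers three copies of a coroutine twice: once with time.sleep() inside the coroutine, and once with asyncio.sleep(). The helper names blocking_nap(), cooperative_nap(), and timed() are made up for this illustration, and the sleeps are shortened to a tenth of a second.

```python
import asyncio
import time


async def blocking_nap(seconds: float) -> None:
    # time.sleep() never yields to the event loop, so nothing else can run.
    time.sleep(seconds)


async def cooperative_nap(seconds: float) -> None:
    # asyncio.sleep() suspends this coroutine and lets others proceed.
    await asyncio.sleep(seconds)


async def timed(nap, seconds: float, copies: int) -> float:
    """Run `copies` naps through gather() and return the elapsed wall time."""
    start = time.perf_counter()
    await asyncio.gather(*(nap(seconds) for _ in range(copies)))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_nap, 0.1, 3))
cooperative_elapsed = asyncio.run(timed(cooperative_nap, 0.1, 3))
print(f"time.sleep():    {blocking_elapsed:0.2f} seconds")
print(f"asyncio.sleep(): {cooperative_elapsed:0.2f} seconds")
```

Even though both versions go through asyncio.gather(), the blocking version should take roughly three times as long, because each time.sleep() call holds on to the single thread until it returns.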
At this point, a more formal definition of async, await, and the coroutine functions that they create is in order. This section is a little dense, but getting a hold of async/await is instrumental, so come back to this if you need to:
- The syntax async def introduces either a native coroutine or an asynchronous generator. The expressions async with and async for are also valid, and you’ll see them later on.
- The keyword await passes function control back to the event loop. (It suspends the execution of the surrounding coroutine.) If Python encounters an await f() expression in the scope of g(), this is how await tells the event loop, “Suspend execution of g() until whatever I’m waiting on (the result of f()) is returned. In the meantime, go let something else run.”
In code, that second bullet point looks roughly like this:
    async def g():
        # Pause here and come back to g() when f() is ready
        r = await f()
        return r
There’s also a strict set of rules around when and how you can and cannot use async/await. These can be handy whether you are still picking up the syntax or already have exposure to using async/await:
- A function that you introduce with async def is a coroutine. It may use await, return, or yield, but all of these are optional. Declaring async def noop(): pass is valid:
  - Using await and/or return creates a coroutine function. To call a coroutine function, you must await it to get its results.
  - It is less common (and only recently legal in Python) to use yield in an async def block. This creates an asynchronous generator, which you iterate over with async for. Forget about async generators for the time being and focus on getting down the syntax for coroutine functions, which use await and/or return.
  - Anything defined with async def may not use yield from, which will raise a SyntaxError.
- Just like it’s a SyntaxError to use yield outside of a def function, it is a SyntaxError to use await outside of an async def coroutine. You can only use await in the body of coroutines.
Here are some terse examples meant to summarize the above few rules:
    async def f(x):
        y = await z(x)  # OK - `await` and `return` allowed in coroutines
        return y

    async def g(x):
        yield x  # OK - this is an async generator

    async def m(x):
        yield from gen(x)  # No - SyntaxError

    def m(x):
        y = await z(x)  # Still no - SyntaxError (no `async def` here)
        return y
Finally, when you use await f(), it’s required that f() be an object that is awaitable. Well, that’s not very helpful, is it? For now, just know that an awaitable object is either (1) another coroutine or (2) an object defining an .__await__() dunder method that returns an iterator. If you’re writing a program, for the large majority of purposes, you should only need to worry about case #1.
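For the curious, case #2 can be sketched in a few lines. The Countdown class below is a hypothetical example of mine, not something from asyncio: it is an ordinary class whose .__await__() method hands back the iterator of an inner coroutine, which is enough to make its instances usable with await.

```python
import asyncio


class Countdown:
    """Hypothetical awaitable: not a coroutine, but it defines __await__()."""

    def __init__(self, start: int) -> None:
        self.start = start

    def __await__(self):
        # Delegate to a coroutine object and return its iterator.
        return self._run().__await__()

    async def _run(self) -> str:
        for _ in range(self.start):
            await asyncio.sleep(0)  # Yield to the event loop on each tick
        return f"counted down from {self.start}"


async def main() -> str:
    # `await` works here because Countdown defines __await__()
    return await Countdown(3)


message = asyncio.run(main())
print(message)
```

In everyday code you will rarely need this. Awaiting another coroutine (case #1) covers the large majority of programs.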
That brings us to one more technical distinction that you may see pop up: an older way of marking a function as a coroutine is to decorate a normal def function with @asyncio.coroutine. The result is a generator-based coroutine. This construction has been outdated since the async/await syntax was put in place in Python 3.5.
These two coroutines are essentially equivalent (both are awaitable), but the first is generator-based, while the second is a native coroutine:
    import asyncio

    @asyncio.coroutine
    def py34_coro():
        """Generator-based coroutine, older syntax"""
        yield from stuff()

    async def py35_coro():
        """Native coroutine, modern syntax"""
        await stuff()
If you’re writing any code yourself, prefer native coroutines for the sake of being explicit rather than implicit. Generator-based coroutines were slated for removal in Python 3.10, and the @asyncio.coroutine decorator was ultimately removed in Python 3.11.
Towards the latter half of this tutorial, we’ll touch on generator-based coroutines for explanation’s sake only. The reason that async/await were introduced is to make coroutines a standalone feature of Python that can be easily differentiated from a normal generator function, thus reducing ambiguity.
Don’t get bogged down in generator-based coroutines, which have been deliberately outdated by async/await. They have their own small set of rules (for instance, await cannot be used in a generator-based coroutine) that are largely irrelevant if you stick to the async/await syntax.
Without further ado, let’s take on a few more involved examples.
Here’s one example of how async IO cuts down on wait time: given a coroutine makerandom() that keeps producing random integers in the range [0, 10] until one of them exceeds a threshold, you want to let multiple calls of this coroutine run without waiting for each other to complete in succession. You can largely follow the patterns from the two scripts above, with slight changes:
    #!/usr/bin/env python3
    # rand.py

    import asyncio
    import random

    # ANSI colors
    c = (
        "\033[0m",   # End of color
        "\033[36m",  # Cyan
        "\033[91m",  # Red
        "\033[35m",  # Magenta
    )

    async def makerandom(idx: int, threshold: int = 6) -> int:
        print(c[idx + 1] + f"Initiated makerandom({idx}).")
        i = random.randint(0, 10)
        while i <= threshold:
            print(c[idx + 1] + f"makerandom({idx}) == {i} too low; retrying.")
            await asyncio.sleep(idx + 1)
            i = random.randint(0, 10)
        print(c[idx + 1] + f"---> Finished: makerandom({idx}) == {i}" + c[0])
        return i

    async def main():
        res = await asyncio.gather(*(makerandom(i, 10 - i - 1) for i in range(3)))
        return res

    if __name__ == "__main__":
        random.seed(444)
        r1, r2, r3 = asyncio.run(main())
        print()
        print(f"r1: {r1}, r2: {r2}, r3: {r3}")
The colorized output says a lot more than I can and gives you a sense for how this script is carried out:
This program uses one main coroutine, makerandom(), and runs it concurrently across 3 different inputs. Most programs will contain small, modular coroutines and one wrapper function that serves to chain each of the smaller coroutines together. main() is then used to gather tasks (futures) by mapping the central coroutine across some iterable or pool.
In this miniature example, the pool is range(3). In a fuller example presented later, it is a set of URLs that need to be requested, parsed, and processed concurrently, and main() encapsulates that entire routine for each URL.
While “making random integers” (which is CPU-bound more than anything) is maybe not the greatest choice as a candidate for asyncio, it’s the presence of asyncio.sleep() in the example that is designed to mimic an IO-bound process where there is uncertain wait time involved. For example, the asyncio.sleep() call might represent sending and receiving not-so-random integers between two clients in a message application.
Async IO comes with its own set of possible script designs, which you’ll get introduced to in this section.
A key feature of coroutines is that they can be chained together. (Remember, a coroutine object is awaitable, so another coroutine can await it.) This allows you to break programs into smaller, manageable, recyclable coroutines:
    #!/usr/bin/env python3
    # chained.py

    import asyncio
    import random
    import time

    async def part1(n: int) -> str:
        i = random.randint(0, 10)
        print(f"part1({n}) sleeping for {i} seconds.")
        await asyncio.sleep(i)
        result = f"result{n}-1"
        print(f"Returning part1({n}) == {result}.")
        return result

    async def part2(n: int, arg: str) -> str:
        i = random.randint(0, 10)
        print(f"part2{n, arg} sleeping for {i} seconds.")
        await asyncio.sleep(i)
        result = f"result{n}-2 derived from {arg}"
        print(f"Returning part2{n, arg} == {result}.")
        return result

    async def chain(n: int) -> None:
        start = time.perf_counter()
        p1 = await part1(n)
        p2 = await part2(n, p1)
        end = time.perf_counter() - start
        print(f"-->Chained result{n} => {p2} (took {end:0.2f} seconds).")

    async def main(*args):
        await asyncio.gather(*(chain(n) for n in args))

    if __name__ == "__main__":
        import sys
        random.seed(444)
        args = [1, 2, 3] if len(sys.argv) == 1 else map(int, sys.argv[1:])
        start = time.perf_counter()
        asyncio.run(main(*args))
        end = time.perf_counter() - start
        print(f"Program finished in {end:0.2f} seconds.")
Pay careful attention to the output, where part1() sleeps for a variable amount of time, and part2() begins working with the results as they become available:
    $ python3 chained.py 9 6 3
    part1(9) sleeping for 4 seconds.
    part1(6) sleeping for 4 seconds.
    part1(3) sleeping for 0 seconds.
    Returning part1(3) == result3-1.
    part2(3, 'result3-1') sleeping for 4 seconds.
    Returning part1(9) == result9-1.
    part2(9, 'result9-1') sleeping for 7 seconds.
    Returning part1(6) == result6-1.
    part2(6, 'result6-1') sleeping for 4 seconds.
    Returning part2(3, 'result3-1') == result3-2 derived from result3-1.
    -->Chained result3 => result3-2 derived from result3-1 (took 4.00 seconds).
    Returning part2(6, 'result6-1') == result6-2 derived from result6-1.
    -->Chained result6 => result6-2 derived from result6-1 (took 8.01 seconds).
    Returning part2(9, 'result9-1') == result9-2 derived from result9-1.
    -->Chained result9 => result9-2 derived from result9-1 (took 11.01 seconds).
    Program finished in 11.01 seconds.
In this setup, the runtime of main() will be equal to the maximum runtime of the tasks that it gathers together and schedules.
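That claim about the maximum runtime can be checked with a stripped-down sketch. The nap() coroutine and the three delays are placeholders that I picked for illustration. The point is only that the total comes out close to the longest delay rather than to the sum of all three.

```python
import asyncio
import time


async def nap(seconds: float) -> float:
    await asyncio.sleep(seconds)
    return seconds


async def main() -> tuple:
    delays = (0.1, 0.2, 0.3)
    start = time.perf_counter()
    # gather() schedules all three naps at once and waits for the slowest
    results = await asyncio.gather(*(nap(d) for d in delays))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(f"gather() returned {results} in {elapsed:0.2f} seconds")
```

Note that asyncio.gather() returns results in the order the awaitables were passed in, not in the order they finished.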
The asyncio package provides queue classes that are designed to be similar to classes of the queue module. In our examples so far, we haven’t really had a need for a queue structure. In chained.py, each task (future) is composed of a set of coroutines that explicitly await each other and pass through a single input per chain.
There is an alternative structure that can also work with async IO: a number of producers, which are not associated with each other, add items to a queue. Each producer may add multiple items to the queue at staggered, random, unannounced times. A group of consumers pull items from the queue as they show up, greedily and without waiting for any other signal.
In this design, there is no chaining of any individual consumer to a producer. The consumers don’t know the number of producers, or even the cumulative number of items that will be added to the queue, in advance.
It takes an individual producer or consumer a variable amount of time to put and extract items from the queue, respectively. The queue serves as a throughput that can communicate with the producers and consumers without them talking to each other directly.
Note: While queues are often used in threaded programs because of the thread-safety of queue.Queue(), you shouldn’t need to concern yourself with thread safety when it comes to async IO. (The exception is when you’re combining the two, but that isn’t done in this tutorial.)
One use-case for queues (as is the case here) is for the queue to act as a transmitter for producers and consumers that aren’t otherwise directly chained or associated with each other.
The synchronous version of this program would look pretty dismal: a group of blocking producers serially add items to the queue, one producer at a time. Only after all producers are done can the queue be processed, by one consumer at a time processing item-by-item. There is a ton of latency in this design. Items may sit idly in the queue rather than be picked up and processed immediately.
An asynchronous version, asyncq.py, is below. The challenging part of this workflow is that there needs to be a signal to the consumers that production is done. Otherwise, await q.get() will hang indefinitely, because the queue will have been fully processed, but consumers won’t have any idea that production is complete.
(Big thanks to a StackOverflow user for helping to straighten out main(): the key is to await q.join(), which blocks until all items in the queue have been received and processed, and then to cancel the consumer tasks, which would otherwise hang up and wait endlessly for additional queue items to appear.)
Here is the full script:
    #!/usr/bin/env python3
    # asyncq.py

    import asyncio
    import itertools as it
    import os
    import random
    import time

    async def makeitem(size: int = 5) -> str:
        return os.urandom(size).hex()

    async def randsleep(caller=None) -> None:
        i = random.randint(0, 10)
        if caller:
            print(f"{caller} sleeping for {i} seconds.")
        await asyncio.sleep(i)

    async def produce(name: int, q: asyncio.Queue) -> None:
        n = random.randint(0, 10)
        for _ in it.repeat(None, n):  # Synchronous loop for each single producer
            await randsleep(caller=f"Producer {name}")
            i = await makeitem()
            t = time.perf_counter()
            await q.put((i, t))
            print(f"Producer {name} added <{i}> to queue.")

    async def consume(name: int, q: asyncio.Queue) -> None:
        while True:
            await randsleep(caller=f"Consumer {name}")
            i, t = await q.get()
            now = time.perf_counter()
            print(f"Consumer {name} got element <{i}>"
                  f" in {now-t:0.5f} seconds.")
            q.task_done()

    async def main(nprod: int, ncon: int):
        q = asyncio.Queue()
        producers = [asyncio.create_task(produce(n, q)) for n in range(nprod)]
        consumers = [asyncio.create_task(consume(n, q)) for n in range(ncon)]
        await asyncio.gather(*producers)
        await q.join()  # Implicitly awaits consumers, too
        for c in consumers:
            c.cancel()

    if __name__ == "__main__":
        import argparse
        random.seed(444)
        parser = argparse.ArgumentParser()
        parser.add_argument("-p", "--nprod", type=int, default=5)
        parser.add_argument("-c", "--ncon", type=int, default=10)
        ns = parser.parse_args()
        start = time.perf_counter()
        asyncio.run(main(**ns.__dict__))
        elapsed = time.perf_counter() - start
        print(f"Program completed in {elapsed:0.5f} seconds.")
The first two coroutines are helpers: makeitem() returns a random hex string, and randsleep() sleeps for a random whole number of seconds. A producer puts anywhere from 0 to 10 items into the queue, since n is drawn with random.randint(0, 10). Each item is a tuple of (i, t) where i is a random string and t is the time at which the producer attempts to put the tuple into the queue.
When a consumer pulls an item out, it simply calculates the elapsed time that the item sat in the queue using the timestamp that the item was put in with.
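The shutdown sequence in main() is the subtle part, so here it is in isolation. This is a reduced sketch under my own assumptions rather than a replacement for asyncq.py: the producer() and consumer() helpers are simplified, the items are fixed strings instead of random ones, and the consumers record what they saw in a plain list.

```python
import asyncio


async def producer(q: asyncio.Queue, items: list) -> None:
    for item in items:
        await asyncio.sleep(0.01)  # Stand-in for real IO
        await q.put(item)


async def consumer(q: asyncio.Queue, seen: list) -> None:
    while True:  # Runs until cancelled
        item = await q.get()
        seen.append(item)
        q.task_done()  # Lets q.join() know this item is finished


async def main() -> list:
    q = asyncio.Queue()
    seen = []
    producers = [
        asyncio.create_task(producer(q, [f"p{n}-{i}" for i in range(3)]))
        for n in range(2)
    ]
    consumers = [asyncio.create_task(consumer(q, seen)) for _ in range(3)]
    await asyncio.gather(*producers)  # 1. Wait until production is over
    await q.join()                    # 2. Wait until every item is processed
    for c in consumers:               # 3. Consumers are idle now; stop them
        c.cancel()
    # Collect the cancelled tasks so none are left pending at exit
    await asyncio.gather(*consumers, return_exceptions=True)
    return seen


seen = asyncio.run(main())
print(sorted(seen))
```

The order of the three steps matters. Cancelling the consumers before q.join() returns could drop items that are still sitting in the queue, and skipping the cancellation would leave each consumer parked forever on q.get().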
Keep in mind that asyncio.sleep() is used to mimic some other, more complex coroutine that would eat up time and block all other execution if it were a regular blocking function.
Here is a test run with two producers and five consumers:
    $ python3 asyncq.py -p 2 -c 5
    Producer 0 sleeping for 3 seconds.
    Producer 1 sleeping for 3 seconds.
    Consumer 0 sleeping for 4 seconds.
    Consumer 1 sleeping for 3 seconds.
    Consumer 2 sleeping for 3 seconds.
    Consumer 3 sleeping for 5 seconds.
    Consumer 4 sleeping for 4 seconds.
    Producer 0 added <377b1e8f82> to queue.
    Producer 0 sleeping for 5 seconds.
    Producer 1 added <413b8802f8> to queue.
    Consumer 1 got element <377b1e8f82> in 0.00013 seconds.
    Consumer 1 sleeping for 3 seconds.
    Consumer 2 got element <413b8802f8> in 0.00009 seconds.
    Consumer 2 sleeping for 4 seconds.
    Producer 0 added <06c055b3ab> to queue.
    Producer 0 sleeping for 1 seconds.
    Consumer 0 got element <06c055b3ab> in 0.00021 seconds.
    Consumer 0 sleeping for 4 seconds.
    Producer 0 added <17a8613276> to queue.
    Consumer 4 got element <17a8613276> in 0.00022 seconds.
    Consumer 4 sleeping for 5 seconds.
    Program completed in 9.00954 seconds.
In this case, the items process in fractions of a second. A delay can be due to two reasons:
With regards to the second reason, luckily, it is perfectly normal to scale to hundreds or thousands of consumers. You should have no problem with python3 asyncq.py -p 5 -c 100. The point here is that, theoretically, you could have different users on different systems controlling the management of producers and consumers, with the queue serving as the central throughput.
So far, you’ve been thrown right into the fire and seen three related examples of asyncio calling coroutines defined with async and await. If you’re not completely following or just want to get deeper into the mechanics of how modern coroutines came to be in Python, you’ll start from square one with the next section.
Earlier, you saw an example of the old-style generator-based coroutines, which have been outdated by more explicit native coroutines. The example is worth re-showing with a small tweak:
    import asyncio

    @asyncio.coroutine
    def py34_coro():
        """Generator-based coroutine"""
        # No need to build these yourself, but be aware of what they are
        s = yield from stuff()
        return s

    async def py35_coro():
        """Native coroutine, modern syntax"""
        s = await stuff()
        return s

    async def stuff():
        return 0x10, 0x20, 0x30
As an experiment, what happens if you call py34_coro() or py35_coro() on its own, without await, or without any calls to asyncio.run() or other asyncio “porcelain” functions? Calling a coroutine in isolation returns a coroutine object:
    >>> py35_coro()
    <coroutine object py35_coro at 0x10126dcc8>
This isn’t very interesting on its surface. The result of calling a coroutine on its own is an awaitable coroutine object.
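As a quick sketch of that distinction, the snippet below calls a coroutine function without awaiting it, confirms that the result is a coroutine object, and only then hands it to asyncio.run() to actually execute the body. It reuses the stuff() coroutine from above, and the variable names are mine.

```python
import asyncio
import inspect


async def stuff():
    return 0x10, 0x20, 0x30


coro = stuff()                       # Nothing has run yet
is_coro = inspect.iscoroutine(coro)  # True: it's a coroutine object
result = asyncio.run(coro)           # Now the body actually executes
print(is_coro, result)
```

If you create a coroutine object and never await or run it, Python emits a RuntimeWarning saying that the coroutine was never awaited, which is a handy hint that an await is missing somewhere.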
Time for a quiz: what other feature of Python looks like this? (What feature of Python doesn’t actually “do much” when it’s called on its own?)
Hopefully you’re thinking of generators as an answer to this question, because coroutines are enhanced generators under the hood. The behavior is similar in this regard:
    >>> def gen():
    ...     yield 0x10, 0x20, 0x30
    ...
    >>> g = gen()
    >>> g  # Nothing much happens - need to iterate with `.__next__()`
    <generator object gen at 0x1012705e8>
    >>> next(g)
    (16, 32, 48)
Generator functions are, as it so happens, the foundation of async IO (regardless of whether you declare coroutines with async def rather than the older @asyncio.coroutine wrapper). Technically, await is more closely analogous to yield from than it is to yield. (But remember that, in the simple case of plain iteration, yield from x() behaves like the loop for i in x(): yield i.)
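The comparison between yield from and a plain for loop holds for simple iteration, but yield from does a little more: it also captures the delegated generator’s return value. Here is a small sketch with made-up generator names that shows both the overlap and the difference.

```python
def inner():
    yield 1
    yield 2
    return "done"  # Becomes the value of the `yield from` expression


def with_loop():
    # Plain iteration: the return value of inner() is silently dropped
    for i in inner():
        yield i


def with_yield_from():
    # Delegation: same yielded values, plus access to the return value
    outcome = yield from inner()
    yield outcome


looped = list(with_loop())
delegated = list(with_yield_from())
print(looped, delegated)
```

That ability to hand a result back to the delegating caller is part of what makes yield from, and by extension await, suitable for waiting on another coroutine’s result.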
One critical feature of generators as it pertains to async IO is that they can effectively be stopped and restarted at will. For example, you can break out of iterating over a generator object and then resume iteration on the remaining values later. When a generator function reaches yield, it yields that value, but then it sits idle until it is told to yield its subsequent value.
This can be fleshed out through an example:
    >>> from itertools import cycle
    >>> def endless():
    ...     """Yields 9, 8, 7, 6, 9, 8, 7, 6, ... forever"""
    ...     yield from cycle((9, 8, 7, 6))
    >>> e = endless()
    >>> total = 0
    >>> for i in e:
    ...     if total < 100:
    ...         print(i, end=" ")
    ...         total += i
    ...     else:
    ...         print()
    ...         # Pause execution. We can resume later.
    ...         break
    9 8 7 6 9 8 7 6 9 8 7 6 9 8
    >>> # Resume
    >>> next(e), next(e), next(e)
    (6, 9, 8)
The await keyword behaves similarly, marking a break point at which the coroutine suspends itself and lets other coroutines work. “Suspended,” in this case, means a coroutine that has temporarily ceded control but not totally exited or finished. Keep in mind that yield, and by extension yield from and await, mark a break point in a generator’s execution.
This is the fundamental difference between functions and generators. A function is all-or-nothing. Once it starts, it won’t stop until it hits a return, then pushes that value to the caller (the function that calls it). A generator, on the other hand, pauses each time it hits a yield and goes no further. Not only can it push this value to the calling stack, but it can keep a hold of its local variables when you resume it by calling next() on it.
There’s a second and lesser-known feature of generators that also matters. You can send a value into a generator as well through its .send() method. This allows generators (and coroutines) to call (await) each other without blocking. I won’t get any further into the nuts and bolts of this feature, because it matters mainly for the implementation of coroutines behind the scenes, but you shouldn’t ever really need to use it directly yourself.
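Even if you never call it directly, it helps to see .send() once. The running_total() generator below is a toy example of my own: each value passed to .send() becomes the result of the paused yield expression inside the generator.

```python
def running_total():
    """Generator that receives numbers via .send() and yields the sum so far."""
    total = 0
    while True:
        received = yield total  # Pauses here; .send(x) resumes with x
        total += received


gen = running_total()
first = next(gen)        # Prime the generator: run to the first `yield`
after_5 = gen.send(5)    # Resume with 5, get the new total back
after_7 = gen.send(7)    # Resume with 7, get the new total back
print(first, after_5, after_7)
```

The generator has to be advanced to its first yield (here with next()) before a value can be sent in, since there is no paused yield expression to receive it until then.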
If you’re interested in exploring more, you can start at PEP 342, where coroutines were formally introduced. Brett Cannon’s How the Heck Does Async-Await Work in Python is also a good read, as is the PYMOTW writeup on asyncio. Lastly, there’s David Beazley’s Curious Course on Coroutines and Concurrency, which dives deep into the mechanism by which coroutines run.
Let’s try to condense all of the above articles into a few sentences: there is a particularly unconventional mechanism by which these coroutines actually get run. A coroutine is advanced by calling its .send() method, and when it finishes, its result is delivered as an attribute of the StopIteration exception that this call raises. There’s some more wonky detail to all of this, but it probably won’t help you use this part of the language in practice, so let’s move on for now.
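To make that mechanism concrete, here is a sketch that drives a native coroutine by hand, without an event loop. This is for illustration only and is not something you would do in real code; the coroutine answer() is made up.

```python
async def answer():
    # No await inside, so a single .send(None) runs it to completion.
    return 42


coro = answer()
try:
    coro.send(None)              # start the coroutine, as an event loop would
except StopIteration as exc:
    result = exc.value           # the return value rides on the exception
print(result)
```

An event loop does essentially this for you, over and over, for every coroutine it manages.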
To tie things together, here are some key points on the topic of coroutines as generators:
Coroutines are repurposed generators that take advantage of the peculiarities of generator methods.
Old generator-based coroutines use yield from to wait for a coroutine result. Modern Python syntax in native coroutines simply replaces yield from with await as the means of waiting on a coroutine result. The await is analogous to yield from, and it often helps to think of it as such.
The use of await is a signal that marks a break point. It lets a coroutine temporarily suspend execution and permits the program to come back to it later.
async for and Async Generators + Comprehensions
Along with plain async/await, Python also enables async for to iterate over an asynchronous iterator. The purpose of an asynchronous iterator is for it to be able to call asynchronous code at each stage when it is iterated over.
A natural extension of this concept is an asynchronous generator. Recall that you can use await, return, or yield in a native coroutine. Using yield within a coroutine became possible in Python 3.6 (via PEP 525), which introduced asynchronous generators with the purpose of allowing await and yield to be used in the same coroutine function body:
>>> async def mygen(u: int = 10):
...     """Yield powers of 2."""
...     i = 0
...     while i < u:
...         yield 2 ** i
...         i += 1
...         await asyncio.sleep(0.1)
Last but not least, Python enables asynchronous comprehension with async for. Like its synchronous cousin, this is largely syntactic sugar:
>>> async def main():
...     # This does *not* introduce concurrent execution
...     # It is meant to show syntax only
...     g = [i async for i in mygen()]
...     f = [j async for j in mygen() if not (j // 3 % 5)]
...     return g, f
...
>>> g, f = asyncio.run(main())
>>> g
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
>>> f
[1, 2, 16, 32, 256, 512]
This is a crucial distinction: neither asynchronous generators nor comprehensions make the iteration concurrent. All that they do is provide the look-and-feel of their synchronous counterparts, but with the ability for the loop in question to give up control to the event loop for some other coroutine to run.
In other words, asynchronous iterators and asynchronous generators are not designed to concurrently map some function over a sequence or iterator. They’re merely designed to let the enclosing coroutine allow other tasks to take their turn. The async for and async with statements are only needed to the extent that using plain for or with would “break” the nature of await in the coroutine. This distinction between asynchronicity and concurrency is a key one to grasp.
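Here is a small sketch of that distinction. The names countdown() and heartbeat() are invented for the example. The items from the async generator still arrive strictly one after another, but a second task gets a chance to run whenever the generator awaits.

```python
import asyncio

log = []


async def countdown(n):
    """Async generator: awaits between items, so other tasks can run."""
    while n > 0:
        yield n
        n -= 1
        await asyncio.sleep(0.01)


async def heartbeat():
    """A second task that only progresses when the loop regains control."""
    for _ in range(3):
        log.append("beat")
        await asyncio.sleep(0.01)


async def main():
    hb = asyncio.create_task(heartbeat())
    async for i in countdown(3):   # items are still produced in order
        log.append(i)
    await hb


asyncio.run(main())
print(log)
```

The numbers in log always appear as 3, 2, 1. Where the "beat" entries land between them depends on scheduling, which is exactly the point: async for gives up control, but it does not parallelize the iteration itself.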
asyncio.run()
You can think of an event loop as something like a while True loop that monitors coroutines, taking feedback on what’s idle, and looking around for things that can be executed in the meantime. It is able to wake up an idle coroutine when whatever that coroutine is waiting on becomes available.
Thus far, the entire management of the event loop has been implicitly handled by one function call:
asyncio.run(main())  # Python 3.7+
asyncio.run(), introduced in Python 3.7, is responsible for getting the event loop, running tasks until they are marked as complete, and then closing the event loop.
There’s a more long-winded way of managing the asyncio event loop, with get_event_loop(). The typical pattern looks like this:
loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(main())
finally:
    loop.close()
You’ll probably see asyncio.get_event_loop() floating around in older examples, but unless you have a specific need to fine-tune control over the event loop management, asyncio.run() should be sufficient for most programs.
If you do need to interact with the event loop within a Python program, loop is a good-old-fashioned Python object that supports introspection with loop.is_running() and loop.is_closed(). You can manipulate it if you need to get more fine-tuned control, such as in scheduling a callback by passing the loop as an argument.
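As a rough sketch of that kind of introspection, the snippet below grabs the running loop from inside a coroutine, checks its state, and schedules a plain callback with loop.call_soon(). The events list is just a scratchpad for the example.

```python
import asyncio

events = []


async def main():
    loop = asyncio.get_running_loop()          # Python 3.7+
    events.append(("running", loop.is_running()))
    events.append(("closed", loop.is_closed()))
    # Schedule an ordinary (non-async) callback on the loop.
    loop.call_soon(events.append, ("callback", True))
    await asyncio.sleep(0)                     # yield once so the callback runs
    return loop


loop = asyncio.run(main())
events.append(("closed after run", loop.is_closed()))
print(events)
```

While main() is executing, the loop reports that it is running and not closed. Once asyncio.run() returns, the same loop object reports that it has been closed.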
What is more crucial is understanding a bit beneath the surface about the mechanics of the event loop. Here are a few points worth stressing about the event loop.
#1: Coroutines don’t do much on their own until they are tied to the event loop.
You saw this point before in the explanation on generators, but it’s worth restating. If you have a main coroutine that awaits others, simply calling it in isolation has little effect:
>>> import asyncio

>>> async def main():
...     print("Hello ...")
...     await asyncio.sleep(1)
...     print("World!")

>>> routine = main()
>>> routine
<coroutine object main at 0x1027a6150>
Remember to use asyncio.run() to actually force execution by scheduling the main() coroutine (future object) for execution on the event loop:
>>> asyncio.run(routine)
Hello ...
World!
(Other coroutines can be executed with await. It is typical to wrap just main() in asyncio.run(), and chained coroutines with await will be called from there.)
#2: By default, an async IO event loop runs in a single thread and on a single CPU core. Usually, running one single-threaded event loop in one CPU core is more than sufficient. It is also possible to run event loops across multiple cores. Check out this talk by John Reese for more, and be warned that your laptop may spontaneously combust.
#3: Event loops are pluggable. That is, you could, if you really wanted, write your own event loop implementation and have it run tasks just the same. This is wonderfully demonstrated in the uvloop package, which is an implementation of the event loop in Cython.
That is what is meant by the term “pluggable event loop”: you can use any working implementation of an event loop, unrelated to the structure of the coroutines themselves. The asyncio package itself ships with two different event loop implementations, with the default being based on the selectors module. (The second implementation is built for Windows only.)
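Point #2 is easy to check for yourself. In this sketch (the coroutine which_thread() is made up for the purpose), five coroutines run concurrently and each reports the thread it ran on:

```python
import asyncio
import threading


async def which_thread(delay):
    await asyncio.sleep(delay)
    return threading.get_ident()     # ID of the thread running this coroutine


async def main():
    return await asyncio.gather(*(which_thread(0.01) for _ in range(5)))


thread_ids = asyncio.run(main())
same_thread = set(thread_ids) == {threading.get_ident()}
print(same_thread)
```

All five report the same thread, the one that called asyncio.run(). The concurrency comes from coroutines taking turns, not from extra threads.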
You’ve made it this far, and now it’s time for the fun and painless part. In this section, you’ll build a web-scraping URL collector, areq.py, using aiohttp, a blazingly fast async HTTP client/server framework. (We just need the client part.) Such a tool could be used to map connections between a cluster of sites, with the links forming a directed graph.
Note: You may be wondering why Python’s requests package isn’t compatible with async IO. requests is built on top of urllib3, which in turn uses Python’s http and socket modules.
By default, socket operations are blocking. This means that Python won’t like await requests.get(url) because .get() is not awaitable. In contrast, almost everything in aiohttp is an awaitable coroutine, such as session.request() and response.text(). requests is a great package otherwise, but you’re doing yourself a disservice by using it in asynchronous code.
The high-level program structure will look like this:
Read a sequence of URLs from a local file, urls.txt.
Send GET requests for the URLs and decode the resulting content. If this fails, stop there for a URL.
Search for the URLs within href tags in the HTML of the responses.
Write the results to foundurls.txt.
Do all of the above as asynchronously and concurrently as possible. (Use aiohttp for the requests, and aiofiles for the file-appends. These are two primary examples of IO that are well-suited for the async IO model.)
Here are the contents of urls.txt. It’s not huge, and contains mostly highly trafficked sites:
$ cat urls.txt
https://regex101.com/
https://docs.python.org/3/this-url-will-404.html
https://www.nytimes.com/guides/
https://www.mediamatters.org/
https://1.1.1.1/
https://www.politico.com/tipsheets/morning-money
https://www.bloomberg.com/markets/economics
https://www.ietf.org/rfc/rfc2616.txt
The second URL in the list should return a 404 response, which you’ll need to handle gracefully. If you’re running an expanded version of this program, you’ll probably need to deal with much hairier problems than this, such as server disconnections and endless redirects.
The requests themselves should be made using a single session, to take advantage of reuse of the session’s internal connection pool.
Let’s take a look at the full program. We’ll walk through things step-by-step after:
#!/usr/bin/env python3
# areq.py

"""Asynchronously get links embedded in multiple pages' HTML."""

import asyncio
import logging
import re
import sys
from typing import IO
import urllib.error
import urllib.parse

import aiofiles
import aiohttp
from aiohttp import ClientSession

logging.basicConfig(
    format="%(asctime)s %(levelname)s:%(name)s: %(message)s",
    level=logging.DEBUG,
    datefmt="%H:%M:%S",
    stream=sys.stderr,
)
logger = logging.getLogger("areq")
logging.getLogger("chardet.charsetprober").disabled = True

HREF_RE = re.compile(r'href="(.*?)"')

async def fetch_html(url: str, session: ClientSession, **kwargs) -> str:
    """GET request wrapper to fetch page HTML.

    kwargs are passed to `session.request()`.
    """

    resp = await session.request(method="GET", url=url, **kwargs)
    resp.raise_for_status()
    logger.info("Got response [%s] for URL: %s", resp.status, url)
    html = await resp.text()
    return html

async def parse(url: str, session: ClientSession, **kwargs) -> set:
    """Find HREFs in the HTML of `url`."""
    found = set()
    try:
        html = await fetch_html(url=url, session=session, **kwargs)
    except (
        aiohttp.ClientError,
        aiohttp.http_exceptions.HttpProcessingError,
    ) as e:
        logger.error(
            "aiohttp exception for %s [%s]: %s",
            url,
            getattr(e, "status", None),
            getattr(e, "message", None),
        )
        return found
    except Exception as e:
        logger.exception(
            "Non-aiohttp exception occurred: %s", getattr(e, "__dict__", {})
        )
        return found
    else:
        for link in HREF_RE.findall(html):
            try:
                abslink = urllib.parse.urljoin(url, link)
            except (urllib.error.URLError, ValueError):
                logger.exception("Error parsing URL: %s", link)
                pass
            else:
                found.add(abslink)
        logger.info("Found %d links for %s", len(found), url)
        return found

async def write_one(file: IO, url: str, **kwargs) -> None:
    """Write the found HREFs from `url` to `file`."""
    res = await parse(url=url, **kwargs)
    if not res:
        return None
    async with aiofiles.open(file, "a") as f:
        for p in res:
            await f.write(f"{url}\t{p}\n")
        logger.info("Wrote results for source URL: %s", url)

async def bulk_crawl_and_write(file: IO, urls: set, **kwargs) -> None:
    """Crawl & write concurrently to `file` for multiple `urls`."""
    async with ClientSession() as session:
        tasks = []
        for url in urls:
            tasks.append(
                write_one(file=file, url=url, session=session, **kwargs)
            )
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    import pathlib
    import sys

    assert sys.version_info >= (3, 7), "Script requires Python 3.7+."
    here = pathlib.Path(__file__).parent

    with open(here.joinpath("urls.txt")) as infile:
        urls = set(map(str.strip, infile))

    outpath = here.joinpath("foundurls.txt")
    with open(outpath, "w") as outfile:
        outfile.write("source_url\tparsed_url\n")

    asyncio.run(bulk_crawl_and_write(file=outpath, urls=urls))
This script is longer than our initial toy programs, so let’s break it down.
The constant HREF_RE is a regular expression to extract what we’re ultimately searching for, href tags within HTML:
>>> HREF_RE.search('Go to <a href="https://realpython.com/">Real Python</a>')
<re.Match object; span=(9, 39), match='href="https://realpython.com/"'>
The coroutine fetch_html() is a wrapper around a GET request to make the request and decode the resulting page HTML. It makes the request, awaits the response, and raises right away if the response carries an error status (400 or higher):
resp = await session.request(method="GET", url=url, **kwargs)
resp.raise_for_status()
If the status is okay, fetch_html() returns the page HTML (a str). Notably, there is no exception handling done in this function. The logic is to propagate that exception to the caller and let it be handled there:
html = await resp.text()
We await session.request() and resp.text() because they’re awaitable coroutines. The request/response cycle would otherwise be the long-tailed, time-hogging portion of the application, but with async IO, fetch_html() lets the event loop work on other readily available jobs such as parsing and writing URLs that have already been fetched.
Next in the chain of coroutines comes parse(), which waits on fetch_html() for a given URL, and then extracts all of the href tags from that page’s HTML, making sure that each is valid and formatting it as an absolute path.
Admittedly, the second portion of parse() is blocking, but it consists of a quick regex match and ensuring that the links discovered are made into absolute paths.
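To see what that synchronous portion does in isolation, here is a small sketch that applies the same regex and urllib.parse.urljoin() to a hand-written HTML snippet. The page URL and the HTML string are invented for the example.

```python
import re
from urllib.parse import urljoin

HREF_RE = re.compile(r'href="(.*?)"')   # same pattern as in areq.py

# Hypothetical source page and markup, for illustration only.
page_url = "https://example.com/docs/index.html"
html = (
    '<a href="/about">About</a> '
    '<a href="guide.html">Guide</a> '
    '<a href="https://other.example/x">External</a>'
)

# Relative links are resolved against the page URL; absolute ones pass through.
links = {urljoin(page_url, link) for link in HREF_RE.findall(html)}
print(sorted(links))
```

A root-relative link such as /about resolves against the host, a bare guide.html resolves against the page’s directory, and an already absolute URL is left unchanged.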
In this specific case, this synchronous code should be quick and inconspicuous. But just remember that any line within a given coroutine will block other coroutines unless that line uses yield, await, or return. If the parsing was a more intensive process, you might want to consider running this portion in its own process with loop.run_in_executor().
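A minimal sketch of handing the parsing step to an executor might look like the following. Note one deliberate simplification: passing None uses the loop’s default thread pool rather than a separate process, which keeps the example self-contained. To get a separate process, you would pass a concurrent.futures.ProcessPoolExecutor instance instead. The function extract_links() is a stand-in, not part of areq.py.

```python
import asyncio
import re

HREF_RE = re.compile(r'href="(.*?)"')


def extract_links(html):
    """Plain synchronous function: safe to hand off to an executor."""
    return HREF_RE.findall(html)


async def main():
    loop = asyncio.get_running_loop()
    html = '<a href="/a">A</a><a href="/b">B</a>'
    # None selects the loop's default executor (a thread pool).
    # Swap in a ProcessPoolExecutor for CPU-heavy parsing.
    return await loop.run_in_executor(None, extract_links, html)


found = asyncio.run(main())
print(found)
```

While the executor works on extract_links(), the event loop stays free to run other coroutines, because the coroutine only waits on the resulting future.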
Next, the coroutine write_one() takes a file object and a single URL, and waits on parse() to return a set of the parsed URLs, writing each to the file asynchronously along with its source URL through use of aiofiles, a package for async file IO.
Lastly, bulk_crawl_and_write() serves as the main entry point into the script’s chain of coroutines. It uses a single session, and a task is created for each URL that is ultimately read from urls.txt.
Here are a few additional points that deserve mention:
The default ClientSession has an adapter with a maximum of 100 open connections. To change that, pass an instance of aiohttp.TCPConnector to ClientSession. You can also specify limits on a per-host basis.
You can specify max timeouts for both the session as a whole and for individual requests.
This script also uses async with, which works with an asynchronous context manager. I haven’t devoted a whole section to this concept because the transition from synchronous to asynchronous context managers is fairly straightforward. The latter has to define .__aenter__() and .__aexit__() rather than .__enter__() and .__exit__(). As you might expect, async with can only be used inside a coroutine function declared with async def.
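For a feel of how little changes, here is a toy asynchronous context manager. The Connection class is hypothetical, and asyncio.sleep(0) stands in for real asynchronous setup and teardown work.

```python
import asyncio


class Connection:
    """Toy resource with asynchronous setup and teardown."""

    def __init__(self):
        self.events = []

    async def __aenter__(self):
        await asyncio.sleep(0)          # stand-in for an async handshake
        self.events.append("opened")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)          # stand-in for an async shutdown
        self.events.append("closed")
        return False                    # do not swallow exceptions


async def main():
    conn = Connection()
    async with conn as c:
        c.events.append("in use")
    return conn.events


events = asyncio.run(main())
print(events)
```

The shape is the same as a regular context manager. The only difference is that entering and leaving the block can each await something.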
If you’d like to explore a bit more, the companion files for this tutorial up at GitHub have comments and docstrings attached as well.
Here’s the execution in all of its glory, as areq.py gets, parses, and saves results for the 8 URLs in urls.txt in under a second:
$ python3 areq.py
21:33:22 DEBUG:asyncio: Using selector: KqueueSelector
21:33:22 INFO:areq: Got response [200] for URL: https://www.mediamatters.org/
21:33:22 INFO:areq: Found 115 links for https://www.mediamatters.org/
21:33:22 INFO:areq: Got response [200] for URL: https://www.nytimes.com/guides/
21:33:22 INFO:areq: Got response [200] for URL: https://www.politico.com/tipsheets/morning-money
21:33:22 INFO:areq: Got response [200] for URL: https://www.ietf.org/rfc/rfc2616.txt
21:33:22 ERROR:areq: aiohttp exception for https://docs.python.org/3/this-url-will-404.html [404]: Not Found
21:33:22 INFO:areq: Found 120 links for https://www.nytimes.com/guides/
21:33:22 INFO:areq: Found 143 links for https://www.politico.com/tipsheets/morning-money
21:33:22 INFO:areq: Wrote results for source URL: https://www.mediamatters.org/
21:33:22 INFO:areq: Found 0 links for https://www.ietf.org/rfc/rfc2616.txt
21:33:22 INFO:areq: Got response [200] for URL: https://1.1.1.1/
21:33:22 INFO:areq: Wrote results for source URL: https://www.nytimes.com/guides/
21:33:22 INFO:areq: Wrote results for source URL: https://www.politico.com/tipsheets/morning-money
21:33:22 INFO:areq: Got response [200] for URL: https://www.bloomberg.com/markets/economics
21:33:22 INFO:areq: Found 3 links for https://www.bloomberg.com/markets/economics
21:33:22 INFO:areq: Wrote results for source URL: https://www.bloomberg.com/markets/economics
21:33:23 INFO:areq: Found 36 links for https://1.1.1.1/
21:33:23 INFO:areq: Got response [200] for URL: https://regex101.com/
21:33:23 INFO:areq: Found 23 links for https://regex101.com/
21:33:23 INFO:areq: Wrote results for source URL: https://regex101.com/
21:33:23 INFO:areq: Wrote results for source URL: https://1.1.1.1/
That’s not too shabby! As a sanity check, you can check the line-count on the output. In my case, it’s 626, though keep in mind this may fluctuate:
$ wc -l foundurls.txt
     626 foundurls.txt

$ head -n 3 foundurls.txt
source_url parsed_url
https://www.bloomberg.com/markets/economics https://www.bloomberg.com/feedback
https://www.bloomberg.com/markets/economics https://www.bloomberg.com/notices/tos
Next Steps: If you’d like to up the ante, make this webcrawler recursive. You can use aioredis to keep track of which URLs have been crawled within the tree to avoid requesting them twice, and connect links with Python’s networkx library.
Remember to be nice. Sending 1000 concurrent requests to a small, unsuspecting website is bad, bad, bad. There are ways to limit how many concurrent requests you’re making in one batch, such as using the semaphore objects of asyncio or using a pattern like this one. If you don’t heed this warning, you may get a massive batch of TimeoutError exceptions and only end up hurting your own program.
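One way to sketch the semaphore approach is shown below. The fetch() coroutine is a placeholder that sleeps instead of making a real request, and MAX_CONCURRENT is an arbitrary limit chosen for the example.

```python
import asyncio

MAX_CONCURRENT = 3     # arbitrary cap, chosen for illustration
active = 0             # requests currently "in flight"
peak = 0               # highest value `active` ever reached


async def fetch(i, sem):
    global active, peak
    async with sem:                   # at most MAX_CONCURRENT holders at once
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)     # stand-in for a network request
        active -= 1
    return i


async def main():
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(*(fetch(i, sem) for i in range(10)))


results = asyncio.run(main())
print(results, peak)
```

All ten jobs are scheduled up front, but no more than three are ever inside the async with block at the same time. The rest wait their turn at the semaphore.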
Now that you’ve seen a healthy dose of code, let’s step back for a minute and consider when async IO is an ideal option and how you can make the comparison to arrive at that conclusion or otherwise choose a different model of concurrency.
This tutorial is no place for an extended treatise on async IO versus threading versus multiprocessing. However, it’s useful to have an idea of when async IO is probably the best candidate of the three.
The battle over async IO versus multiprocessing is not really a battle at all. In fact, they can be used in concert. If you have multiple, fairly uniform CPU-bound tasks (a great example is a grid search in libraries such as scikit-learn or keras), multiprocessing should be an obvious choice.
Simply putting async before every function is a bad idea if all of the functions use blocking calls. (This can actually slow down your code.) But as mentioned previously, there are places where async IO and multiprocessing can live in harmony.
The contest between async IO and threading is a little bit more direct. I mentioned in the introduction that “threading is hard.” The full story is that, even in cases where threading seems easy to implement, it can still lead to infamous impossible-to-trace bugs due to race conditions and memory usage, among other things.
Threading also tends to scale less elegantly than async IO, because threads are a system resource with a finite availability. Creating thousands of threads will fail on many machines, and I don’t recommend trying it in the first place. Creating thousands of async IO tasks is completely feasible.
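As a quick, informal check of that last claim, the sketch below creates ten thousand tasks that each sleep briefly. The coroutine tiny_job() is a placeholder for real IO-bound work.

```python
import asyncio


async def tiny_job(i):
    await asyncio.sleep(0.01)     # stand-in for waiting on IO
    return i


async def main(n):
    tasks = [asyncio.create_task(tiny_job(i)) for i in range(n)]
    return await asyncio.gather(*tasks)


results = asyncio.run(main(10_000))
total = sum(results)
print(len(results), total)
```

Each task is just a small Python object managed by the event loop, so this finishes quickly on an ordinary machine.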
Async IO shines when you have multiple IO-bound tasks where the tasks would otherwise be dominated by blocking IO-bound wait time, such as:
Network IO, whether your program is the server or the client side
Serverless designs, such as a peer-to-peer, multi-user network like a group chatroom
Read/write operations where you want to mimic a “fire-and-forget” style but worry less about holding a lock on whatever you’re reading and writing to
The biggest reason not to use it is that await only supports a specific set of objects that define a specific set of methods. If you want to do async read operations with a certain DBMS, you’ll need to find not just a Python wrapper for that DBMS, but one that supports the async/await syntax. Coroutines that contain synchronous calls block other coroutines and tasks from running.
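The cost of a synchronous call inside a coroutine is easy to measure. In this sketch, time.sleep() plays the role of a blocking library call and asyncio.sleep() the role of an awaitable one. Both coroutine names are invented for the comparison.

```python
import asyncio
import time


async def blocking_job():
    time.sleep(0.1)              # synchronous: the event loop is stuck here


async def cooperative_job():
    await asyncio.sleep(0.1)     # asynchronous: control returns to the loop


async def timed(job):
    start = time.perf_counter()
    await asyncio.gather(job(), job(), job())
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_job))
cooperative_elapsed = asyncio.run(timed(cooperative_job))
print(f"blocking: {blocking_elapsed:.2f}s, cooperative: {cooperative_elapsed:.2f}s")
```

The three blocking jobs run back to back, so their wait times add up. The three cooperative jobs overlap their waits and finish in roughly the time of one.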
For a shortlist of libraries that work with async/await, see the list at the end of this tutorial.
This tutorial focuses on async IO, the async/await syntax, and using asyncio for event-loop management and specifying tasks. asyncio certainly isn’t the only async IO library out there. This observation from Nathaniel J. Smith says a lot:
[In] a few years, asyncio might find itself relegated to becoming one of those stdlib libraries that savvy developers avoid, like urllib2. …
What I’m arguing, in effect, is that asyncio is a victim of its own success: when it was designed, it used the best approach possible; but since then, work inspired by asyncio – like the addition of async/await – has shifted the landscape so that we can do even better, and now asyncio is hamstrung by its earlier commitments. (Source)
To that end, a few big-name alternatives that do what asyncio does, albeit with different APIs and different approaches, are curio and trio. Personally, I think that if you’re building a moderately sized, straightforward program, just using asyncio is plenty sufficient and understandable, and lets you avoid adding yet another large dependency outside of Python’s standard library.
But by all means, check out curio and trio, and you might find that they get the same thing done in a way that’s more intuitive for you as the user. Many of the package-agnostic concepts presented here should permeate to alternative async IO packages as well.
In these next few sections, you’ll cover some miscellaneous parts of asyncio and async/await that haven’t fit neatly into the tutorial thus far, but are still important for building and understanding a full program.
asyncio Functions
In addition to asyncio.run(), you’ve seen a few other package-level functions such as asyncio.create_task() and asyncio.gather().
You can use create_task() to schedule the execution of a coroutine object, followed by asyncio.run():
>>> import asyncio

>>> async def coro(seq) -> list:
...     """'IO' wait time is proportional to the max element."""
...     await asyncio.sleep(max(seq))
...     return list(reversed(seq))
...
>>> async def main():
...     # This is a bit redundant in the case of one task
...     # We could use `await coro([3, 2, 1])` on its own
...     t = asyncio.create_task(coro([3, 2, 1]))  # Python 3.7+
...     await t
...     print(f't: type {type(t)}')
...     print(f't done: {t.done()}')
...
>>> t = asyncio.run(main())
t: type <class '_asyncio.Task'>
t done: True
There’s a subtlety to this pattern: if you don’t await t within main(), main() may signal that it is complete before t has finished. Because asyncio.run(main()) calls loop.run_until_complete(main()), the event loop is only concerned (without await t present) that main() is done, not that the tasks that get created within main() are done. Without await t, the loop’s other tasks will be cancelled, possibly before they are completed. If you need to get a list of currently pending tasks, you can use asyncio.all_tasks(). (Older examples use asyncio.Task.all_tasks(), which was deprecated in Python 3.7 and removed in 3.9.)
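You can watch this cancellation happen with a short sketch. Here main() creates a task, deliberately does not await it, and returns it so that its state can be inspected afterwards. The coroutine slow() is a placeholder.

```python
import asyncio


async def slow():
    await asyncio.sleep(10)       # would take 10 seconds if allowed to finish
    return "finished"


async def main():
    task = asyncio.create_task(slow())
    await asyncio.sleep(0)        # let the task start, but do not wait for it
    return task                   # main() returns while `task` is still pending


task = asyncio.run(main())
print(task.cancelled())
```

The script returns almost immediately rather than after ten seconds, because asyncio.run() cancels whatever is still pending once main() is done.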
Note: asyncio.create_task() was introduced in Python 3.7. In Python 3.6 or lower, use asyncio.ensure_future() in place of create_task().
Separately, there’s asyncio.gather(). While it doesn’t do anything tremendously special, gather() is meant to neatly put a collection of coroutines (futures) into a single future. As a result, it returns a single future object, and, if you await asyncio.gather() and specify multiple tasks or coroutines, you’re waiting for all of them to be completed. (This somewhat parallels queue.join() from our earlier example.) The result of gather() will be a list of the results across the inputs:
>>> import time
>>> async def main():
...     t = asyncio.create_task(coro([3, 2, 1]))
...     t2 = asyncio.create_task(coro([10, 5, 0]))  # Python 3.7+
...     print('Start:', time.strftime('%X'))
...     a = await asyncio.gather(t, t2)
...     print('End:', time.strftime('%X'))  # Should be 10 seconds
...     print(f'Both tasks done: {all((t.done(), t2.done()))}')
...     return a
...
>>> a = asyncio.run(main())
Start: 16:20:11
End: 16:20:21
Both tasks done: True
>>> a
[[1, 2, 3], [0, 5, 10]]
You probably noticed that gather() waits on the entire result set of the Futures or coroutines that you pass it. Alternatively, you can loop over asyncio.as_completed() to get results as tasks are completed, in the order of completion. The function returns an iterator of awaitables, and awaiting each one gives you the result of the next task to finish. Below, the result of coro([3, 2, 1]) will be available before coro([10, 5, 0]) is complete, which is not the case with gather():
>>> async def main():
...     t = asyncio.create_task(coro([3, 2, 1]))
...     t2 = asyncio.create_task(coro([10, 5, 0]))
...     print('Start:', time.strftime('%X'))
...     for res in asyncio.as_completed((t, t2)):
...         compl = await res
...         print(f'res: {compl} completed at {time.strftime("%X")}')
...     print('End:', time.strftime('%X'))
...     print(f'Both tasks done: {all((t.done(), t2.done()))}')
...
>>> a = asyncio.run(main())
Start: 09:49:07
res: [1, 2, 3] completed at 09:49:10
res: [0, 5, 10] completed at 09:49:17
End: 09:49:17
Both tasks done: True
Lastly, you may also see asyncio.ensure_future(). You should rarely need it, because it’s a lower-level plumbing API and largely replaced by create_task(), which was introduced later.
await
While they behave somewhat similarly, the await keyword has significantly higher precedence than yield. This means that, because it is more tightly bound, there are a number of instances where you’d need parentheses in a yield from statement that are not required in an analogous await statement. For more information, see examples of await expressions from PEP 492.
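A short sketch of the difference, with made-up helper names: await forty_one() + 1 is parsed as (await forty_one()) + 1, while the generator version needs explicit parentheses to get the same grouping.

```python
import asyncio


async def forty_one():
    return 41


async def with_await():
    # Parsed as (await forty_one()) + 1, so no parentheses are needed.
    return await forty_one() + 1


def sub():
    yield "paused"
    return 41


def with_yield_from():
    # Without the parentheses this would parse as `yield from (sub() + 1)`
    # and fail at runtime, because a generator cannot be added to an int.
    result = (yield from sub()) + 1
    return result


def drive(gen):
    """Exhaust a generator and return its return value."""
    while True:
        try:
            next(gen)
        except StopIteration as exc:
            return exc.value


await_result = asyncio.run(with_await())
yield_result = drive(with_yield_from())
print(await_result, yield_result)
```

Both versions compute the same value. The await form just reads more naturally because of its tighter binding.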
You’re now equipped to use async/await and the libraries built off of it. Here’s a recap of what you’ve covered:
Asynchronous IO as a language-agnostic model and a way to effect concurrency by letting coroutines indirectly communicate with each other
The specifics of Python’s new async and await keywords, used to mark and define coroutines
asyncio, the Python package that provides the API to run and manage coroutines
Async IO in Python has evolved swiftly, and it can be hard to keep track of what came when. Here’s a list of Python minor-version changes and introductions related to asyncio:
3.3: The yield from expression allows for generator delegation.
3.4: asyncio was introduced in the Python standard library with provisional API status.
3.5: async and await became a part of the Python grammar, used to signify and wait on coroutines. They were not yet reserved keywords. (You could still define functions or variables named async and await.)
3.6: Asynchronous generators and asynchronous comprehensions were introduced. The API of asyncio was declared stable rather than provisional.
3.7: async and await became reserved keywords. (They cannot be used as identifiers.) They are intended to replace the asyncio.coroutine() decorator. asyncio.run() was introduced to the asyncio package, among a bunch of other features.
If you want to be safe (and be able to use asyncio.run()), go with Python 3.7 or above to get the full set of features.
Here’s a curated list of additional resources:
asyncio
packagesourceasyncio
yield from
async
/await
Worldasyncio
(4 posts)asyncio.semaphore
inasync
-await
functionA few PythonWhat’s New sections explain the motivation behind language changes in more detail:
yield from
and PEP 380)From David Beazley:
YouTube talks:
async
/await
From aio-libs:
aiohttp: Asynchronous HTTP client/server framework
aioredis: Async IO Redis support
aiopg: Async IO PostgreSQL support
aiomcache: Async IO memcached client
aiokafka: Async IO Kafka client
aiozmq: Async IO ZeroMQ support
aiojobs: Jobs scheduler for managing background tasks
async_lru: Simple LRU cache for async IO
From magicstack:
From other hosts:
trio: Friendlier asyncio intended to showcase a radically simpler design
aiofiles: Async file IO
asks: Async requests-like http library
asyncio-redis: Async IO Redis support
aioprocessing: Integrates multiprocessing module with asyncio
umongo: Async IO MongoDB client
unsync: Unsynchronize asyncio
aiostream: Like itertools, but async
About Brad Solomon
Brad is a software engineer and a member of the Real Python Tutorial Team.