A Conceptual Overview of asyncio¶
This HOWTO article seeks to help you build a sturdy mental model of how asyncio fundamentally works, helping you understand the how and why behind the recommended patterns.
You might be curious about some key asyncio concepts. You'll be comfortably able to answer these questions by the end of this article:
What's happening behind the scenes when an object is awaited?
How does asyncio differentiate between a task which doesn't need CPU time (such as a network request or file read) as opposed to a task that does (such as computing n-factorial)?
How to write an asynchronous variant of an operation, such as an async sleep or database request.
See also
The guide that inspired this HOWTO article, by Alexander Nordin.
This in-depth YouTube tutorial series on asyncio created by Python core team member, Łukasz Langa.
500 Lines or Less: A Web Crawler With asyncio Coroutines by A. Jesse Jiryu Davis and Guido van Rossum.
A conceptual overview part 1: the high-level¶
In part 1, we'll cover the main, high-level building blocks of asyncio: the event loop, coroutine functions, coroutine objects, tasks and await.
Event Loop¶
Everything in asyncio happens relative to the event loop. It's the star of the show. It's like an orchestra conductor. It's behind the scenes managing resources. Some power is explicitly granted to it, but a lot of its ability to get things done comes from the respect and cooperation of its worker bees.
In more technical terms, the event loop contains a collection of jobs to be run. Some jobs are added directly by you, and some indirectly by asyncio. The event loop takes a job from its backlog of work and invokes it (or "gives it control"), similar to calling a function, and then that job runs. Once it pauses or completes, it returns control to the event loop. The event loop will then select another job from its pool and invoke it. You can roughly think of the collection of jobs as a queue: jobs are added and then processed one at a time, generally (but not always) in order. This process repeats indefinitely with the event loop cycling endlessly onwards. If there are no more jobs pending execution, the event loop is smart enough to rest and avoid needlessly wasting CPU cycles, and will come back when there's more work to be done.
Effective execution relies on jobs sharing well and cooperating; a greedy job could hog control and leave the other jobs to starve, rendering the overall event loop approach rather useless.
    import asyncio

    # This creates an event loop and indefinitely cycles through
    # its collection of jobs.
    event_loop = asyncio.new_event_loop()
    event_loop.run_forever()
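To make the "collection of jobs" idea concrete, here is a minimal sketch that hands the event loop a few plain callbacks with loop.call_soon() and lets it process them in order. The job() helper and the order list are hypothetical names used only for illustration; the last scheduled job stops the loop so that run_forever() returns.

```python
import asyncio

order = []

def job(name):
    # A trivial job: record that it ran, then return control to the loop.
    order.append(name)

loop = asyncio.new_event_loop()

# Jobs scheduled with call_soon() are invoked in the order they were added.
loop.call_soon(job, "first")
loop.call_soon(job, "second")
# Schedule one more job that asks the loop to stop cycling.
loop.call_soon(loop.stop)

loop.run_forever()
loop.close()

print(order)
```

Without the loop.stop job, run_forever() would keep waiting for more work, as described above.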
Asynchronous functions and coroutines¶
This is a basic, boring Python function:
    def hello_printer():
        print(
            "Hi, I am a lowly, simple printer, though I have all I "
            "need in life --\nfresh paper and my dearly beloved octopus "
            "partner in crime."
        )
Calling a regular function invokes its logic or body:
    >>> hello_printer()
    Hi, I am a lowly, simple printer, though I have all I need in life --
    fresh paper and my dearly beloved octopus partner in crime.
The async def, as opposed to just a plain def, makes this an asynchronous function (or "coroutine function"). Calling it creates and returns a coroutine object.
    async def loudmouth_penguin(magic_number: int):
        print(
            "I am a super special talking penguin. Far cooler than that printer. "
            f"By the way, my lucky number is: {magic_number}."
        )
Calling the async function, loudmouth_penguin, does not execute the print statement; instead, it creates a coroutine object:
    >>> loudmouth_penguin(magic_number=3)
    <coroutine object loudmouth_penguin at 0x104ed2740>
The terms "coroutine function" and "coroutine object" are often conflated as coroutine. That can be confusing! In this article, coroutine specifically refers to a coroutine object, or more precisely, an instance of types.CoroutineType (native coroutine). Note that coroutines can also exist as instances of collections.abc.Coroutine -- a distinction that matters for type checking.
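The distinction between a coroutine function and a coroutine object can be checked at runtime. The sketch below uses the standard inspect, types and collections.abc modules; the variable names are illustrative only.

```python
import collections.abc
import inspect
import types

async def loudmouth_penguin(magic_number: int):
    print(f"My lucky number is: {magic_number}.")

# The function itself is a coroutine function...
is_coroutine_function = inspect.iscoroutinefunction(loudmouth_penguin)

# ...and calling it produces a (native) coroutine object.
coroutine = loudmouth_penguin(magic_number=3)
is_native_coroutine = isinstance(coroutine, types.CoroutineType)
is_abstract_coroutine = isinstance(coroutine, collections.abc.Coroutine)

# Close the never-started coroutine to avoid a "never awaited" warning.
coroutine.close()

print(is_coroutine_function, is_native_coroutine, is_abstract_coroutine)
```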
A coroutine represents the function's body or logic. A coroutine has to be explicitly started; again, merely creating the coroutine does not start it. Notably, the coroutine can be paused and resumed at various points within the function's body. That pausing and resuming ability is what allows for asynchronous behavior!
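As a small illustration of "creating is not starting", the following sketch records when the body actually runs. The greet() function and the log list are hypothetical; asyncio.run() is used here simply as one convenient way to start the coroutine.

```python
import asyncio

log = []

async def greet():
    log.append("body ran")
    return "done"

# Creating the coroutine object does not execute any of the body.
coroutine = greet()
ran_before_start = list(log)

# Handing the coroutine to asyncio.run() starts it and runs it to completion.
result = asyncio.run(coroutine)

print(ran_before_start, log, result)
```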
Coroutines and coroutine functions were built by leveraging the functionality of generators and generator functions. Recall, a generator function is a function that yields, like this one:
    def get_random_number():
        # This would be a bad random number generator!
        print("Hi")
        yield 1
        print("Hello")
        yield 7
        print("Howdy")
        yield 4
        ...
Similar to a coroutine function, calling a generator function does not run it. Instead, it creates a generator object:
    >>> get_random_number()
    <generator object get_random_number at 0x1048671c0>
You can proceed to the next yield of a generator by using the built-in function next(). In other words, the generator runs, then pauses. For example:
    >>> generator = get_random_number()
    >>> next(generator)
    Hi
    1
    >>> next(generator)
    Hello
    7
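Generators can also receive values and return a final result, which is the machinery coroutines reuse. The sketch below is a plain generator (the countdown() name is hypothetical) showing next(), send(), and the StopIteration raised when the generator finishes.

```python
def countdown():
    # The value passed to send() becomes the result of this yield expression.
    received = yield "ready"
    yield f"got {received}"
    return "finished"

generator = countdown()

first = next(generator)            # runs up to the first yield
second = generator.send("hello")   # resumes and runs up to the second yield

try:
    next(generator)                # resumes; the generator returns
except StopIteration as exc:
    # The return value travels on the exception's value attribute.
    final = exc.value

print(first, second, final)
```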
Tasks¶
Roughly speaking, tasks are coroutines (not coroutine functions) tied to an event loop. A task also maintains a list of callback functions whose importance will become clear in a moment when we discuss await. The recommended way to create tasks is via asyncio.create_task().
Creating a task automatically schedules it for execution (by adding a callback to run it in the event loop's to-do list, that is, collection of jobs).
Since there's only one event loop (in each thread), asyncio takes care of associating the task with the event loop for you. As such, there's no need to specify the event loop.
    coroutine = loudmouth_penguin(magic_number=5)
    # This creates a Task object and schedules its execution via the event loop.
    task = asyncio.create_task(coroutine)
Earlier, we manually created the event loop and set it to run forever. In practice, it's recommended to use (and common to see) asyncio.run(), which takes care of managing the event loop and ensuring the provided coroutine finishes before advancing. For example, many async programs follow this setup:
    import asyncio

    async def main():
        # Perform all sorts of wacky, wild asynchronous things...
        ...

    if __name__ == "__main__":
        asyncio.run(main())
        # The program will not reach the following print statement until the
        # coroutine main() finishes.
        print("coroutine main() is done!")
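Putting the two ideas together: asyncio.create_task() needs a running event loop, so in a standalone script the call has to happen inside a coroutine that asyncio.run() is driving. The sketch below is one minimal arrangement; this loudmouth_penguin() returns its message rather than printing it, purely so the result can be inspected.

```python
import asyncio

async def loudmouth_penguin(magic_number: int) -> str:
    return f"My lucky number is: {magic_number}."

async def main() -> str:
    # We're inside a running event loop here, so create_task() can
    # associate the new task with it automatically.
    task = asyncio.create_task(loudmouth_penguin(magic_number=5))
    return await task

# asyncio.run() creates the loop, runs main() to completion, and
# hands back main()'s return value.
message = asyncio.run(main())
print(message)
```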
It's important to be aware that the task itself is not added to the event loop, only a callback to the task is. This matters if the task object you created is garbage collected before it's called by the event loop. For example, consider this program:
     1  async def hello():
     2      print("hello!")
     3
     4  async def main():
     5      asyncio.create_task(hello())
     6      # Other asynchronous instructions which run for a while
     7      # and cede control to the event loop...
     8      ...
     9
    10  asyncio.run(main())
Because there's no reference to the task object created on line 5, it might be garbage collected before the event loop invokes it. Later instructions in the coroutine main() hand control back to the event loop so it can invoke other jobs. When the event loop eventually tries to run the task, it might fail and discover the task object does not exist! This can also happen even if a coroutine keeps a reference to a task but completes before that task finishes. When the coroutine exits, local variables go out of scope and may be subject to garbage collection. In practice, asyncio and Python's garbage collector work pretty hard to ensure this sort of thing doesn't happen. But that's no reason to be reckless!
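One common way to avoid the problem is to keep a strong reference to each task until it finishes. The sketch below stores tasks in a module-level set and removes each one with a done callback; the names background_tasks and results are illustrative, and the short asyncio.sleep() stands in for "other asynchronous instructions".

```python
import asyncio

background_tasks = set()
results = []

async def hello(n):
    results.append(n)

async def main():
    for n in range(3):
        task = asyncio.create_task(hello(n))
        # Hold a strong reference so the task can't be garbage collected
        # before the event loop gets around to running it.
        background_tasks.add(task)
        # Drop the reference once the task is done.
        task.add_done_callback(background_tasks.discard)
    # Cede control for a moment so the background tasks can run.
    await asyncio.sleep(0.01)

asyncio.run(main())
print(results, len(background_tasks))
```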
await¶
await is a Python keyword that's commonly used in one of two different ways:
    await task
    await coroutine
In a crucial way, the behavior of await depends on the type of object being awaited.
Awaiting a task will cede control from the current task or coroutine to the event loop. In the process of relinquishing control, a few important things happen. We'll use the following code example to illustrate:
    async def plant_a_tree():
        dig_the_hole_task = asyncio.create_task(dig_the_hole())
        await dig_the_hole_task

        # Other instructions associated with planting a tree.
        ...
In this example, imagine the event loop has passed control to the start of the coroutine plant_a_tree(). As seen above, the coroutine creates a task and then awaits it. The await dig_the_hole_task instruction adds a callback (which will resume plant_a_tree()) to the dig_the_hole_task object's list of callbacks. And then, the instruction cedes control to the event loop. Some time later, the event loop will pass control to dig_the_hole_task and the task will finish whatever it needs to do. Once the task finishes, it will add its various callbacks to the event loop, in this case, a call to resume plant_a_tree().
Generally speaking, when the awaited task finishes (dig_the_hole_task), the original task or coroutine (plant_a_tree()) is added back to the event loop's to-do list to be resumed.
This is a basic, yet reliable mental model. In practice, the control handoffs are slightly more complex, but not by much. In part 2, we'll walk through the details that make this possible.
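The handoff described above can be observed by recording events as they happen. This is a runnable sketch of the plant_a_tree() example; the dig_the_hole() body and the events list are hypothetical additions for illustration.

```python
import asyncio

events = []

async def dig_the_hole():
    events.append("digging the hole")

async def plant_a_tree():
    events.append("plant_a_tree started")
    dig_the_hole_task = asyncio.create_task(dig_the_hole())
    # The task is scheduled, but it has not been given control yet.
    events.append("task created, not yet run")
    # Awaiting the task cedes control to the event loop, which runs the task
    # and later resumes plant_a_tree() via the task's callback.
    await dig_the_hole_task
    events.append("plant_a_tree resumed")

asyncio.run(plant_a_tree())
print(events)
```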
Unlike tasks, awaiting a coroutine does not hand control back to the event loop! Wrapping a coroutine in a task first, then awaiting that would cede control. The behavior of await coroutine is effectively the same as invoking a regular, synchronous Python function. Consider this program:
    import asyncio

    async def coro_a():
        print("I am coro_a(). Hi!")

    async def coro_b():
        print("I am coro_b(). I sure hope no one hogs the event loop...")

    async def main():
        task_b = asyncio.create_task(coro_b())
        num_repeats = 3
        for _ in range(num_repeats):
            await coro_a()
        await task_b

    asyncio.run(main())
The first statement in the coroutine main() creates task_b and schedules it for execution via the event loop. Then, coro_a() is repeatedly awaited. Control never cedes to the event loop, which is why we see the output of all three coro_a() invocations before coro_b()'s output:
    I am coro_a(). Hi!
    I am coro_a(). Hi!
    I am coro_a(). Hi!
    I am coro_b(). I sure hope no one hogs the event loop...
If we change await coro_a() to await asyncio.create_task(coro_a()), the behavior changes. The coroutine main() cedes control to the event loop with that statement. The event loop then proceeds through its backlog of work, calling task_b and then the task which wraps coro_a() before resuming the coroutine main().
    I am coro_b(). I sure hope no one hogs the event loop...
    I am coro_a(). Hi!
    I am coro_a(). Hi!
    I am coro_a(). Hi!
This behavior of await coroutine can trip a lot of people up! That example highlights how using only await coroutine could unintentionally hog control from other tasks and effectively stall the event loop. asyncio.run() can help you detect such occurrences via the debug=True flag, which enables debug mode. Among other things, it will log any coroutines that monopolize execution for 100ms or longer.
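Here is a rough sketch of what that detection can look like. It attaches a small list-collecting handler (ListHandler, a hypothetical helper) to the "asyncio" logger and runs a coroutine that blocks with time.sleep(). Filtering on the word "took" is an assumption about the wording of the warning, which is an implementation detail and may differ between Python versions.

```python
import asyncio
import logging
import time

class ListHandler(logging.Handler):
    """Collects formatted log messages in a list (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

handler = ListHandler()
logging.getLogger("asyncio").addHandler(handler)

async def hog():
    # A blocking call: nothing else can run on the event loop meanwhile.
    time.sleep(0.2)

# Debug mode makes the event loop warn about steps that run too long.
asyncio.run(hog(), debug=True)

# Assumption: the slow-step warning mentions how long the step "took".
slow_warnings = [message for message in handler.messages if "took" in message]
print(slow_warnings)
```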
The design intentionally trades off some conceptual clarity around usage of await for improved performance. Each time a task is awaited, control needs to be passed all the way up the call stack to the event loop. That might sound minor, but in a large program with many awaits and a deep call stack that overhead can add up to a meaningful performance drag.
A conceptual overview part 2: the nuts and bolts¶
Part 2 goes into detail on the mechanisms asyncio uses to manage control flow. This is where the magic happens. You'll come away from this section knowing what await does behind the scenes and how to make your own asynchronous operators.
The inner workings of coroutines¶
asyncio leverages four components to pass around control.
coroutine.send(arg) is the method used to start or resume a coroutine. If the coroutine was paused and is now being resumed, the argument arg will be sent in as the return value of the yield statement which originally paused it. If the coroutine is being used for the first time (as opposed to being resumed), arg must be None.
     1  class Rock:
     2      def __await__(self):
     3          value_sent_in = yield 7
     4          print(f"Rock.__await__ resuming with value: {value_sent_in}.")
     5          return value_sent_in
     6
     7  async def main():
     8      print("Beginning coroutine main().")
     9      rock = Rock()
    10      print("Awaiting rock...")
    11      value_from_rock = await rock
    12      print(f"Coroutine received value: {value_from_rock} from rock.")
    13      return 23
    14
    15  coroutine = main()
    16  intermediate_result = coroutine.send(None)
    17  print(f"Coroutine paused and returned intermediate value: {intermediate_result}.")
    18
    19  print(f"Resuming coroutine and sending in value: 42.")
    20  try:
    21      coroutine.send(42)
    22  except StopIteration as e:
    23      returned_value = e.value
    24  print(f"Coroutine main() finished and provided value: {returned_value}.")
yield, like usual, pauses execution and returns control to the caller. In the example above, the yield on line 3 is triggered by ... = await rock on line 11. More broadly speaking, await calls the __await__() method of the given object. await also does one more very special thing: it propagates (or "passes along") any yields it receives up the call chain. In this case, that's back to ... = coroutine.send(None) on line 16.
The coroutine is resumed via the coroutine.send(42) call on line 21. The coroutine picks back up from where it yielded (or paused) on line 3 and executes the remaining statements in its body. When a coroutine finishes, it raises a StopIteration exception with the return value attached in the value attribute.
That snippet produces this output:
    Beginning coroutine main().
    Awaiting rock...
    Coroutine paused and returned intermediate value: 7.
    Resuming coroutine and sending in value: 42.
    Rock.__await__ resuming with value: 42.
    Coroutine received value: 42 from rock.
    Coroutine main() finished and provided value: 23.
It's worth pausing for a moment here and making sure you followed the various ways that control flow and values were passed. A lot of important ideas were covered and it's worth ensuring your understanding is firm.
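To see the "passes along" behavior across more than one level, the sketch below nests two coroutines around a Rock-like awaitable. The inner() and outer() names are hypothetical; the yield inside Rock.__await__ travels through both await expressions to whoever called send() on the outermost coroutine, and the value sent back in travels the same path in reverse.

```python
class Rock:
    def __await__(self):
        value_sent_in = yield 7
        return value_sent_in

async def inner():
    return await Rock()

async def outer():
    return await inner()

coroutine = outer()

# The yield in Rock.__await__ propagates up through inner() and outer().
yielded = coroutine.send(None)

try:
    # The value sent in is delivered to the paused yield inside Rock.
    coroutine.send("pebble")
except StopIteration as exc:
    returned = exc.value

print(yielded, returned)
```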
The only way to yield (or effectively cede control) from a coroutine is to await an object that yields in its __await__ method. That might sound odd to you. You might be thinking:
1. What about a yield directly within the coroutine function? The coroutine function becomes an async generator function, a different beast entirely.
2. What about a yield from within the coroutine function to a (plain) generator? That causes the error: SyntaxError: yield from not allowed in a coroutine. This was intentionally designed for the sake of simplicity -- mandating only one way of using coroutines. Initially yield was barred as well, but was re-accepted to allow for async generators. Despite that, yield from and await effectively do the same thing.
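The similarity is easiest to see with plain generators. The following sketch (with hypothetical inner() and outer() generator functions) uses yield from to delegate to a sub-generator: yields pass up to the caller, sent values pass down, and the sub-generator's return value becomes the result of the yield from expression, much like await does for coroutines.

```python
def inner():
    received = yield "paused in inner"
    return received * 2

def outer():
    # Delegates to inner(): its yields propagate to outer()'s caller,
    # and its return value becomes the value of this expression.
    result = yield from inner()
    return result + 1

generator = outer()
first = generator.send(None)

try:
    generator.send(20)
except StopIteration as exc:
    final = exc.value

print(first, final)
```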
Futures¶
A future is an object meant to represent a computation's status and result. The term is a nod to the idea of something still to come or not yet happened, and the object is a way to keep an eye on that something.
A future has a few important attributes. One is its state, which can be either "pending", "cancelled" or "done". Another is its result, which is set when the state transitions to done. Unlike a coroutine, a future does not represent the actual computation to be done; instead, it represents the status and result of that computation, kind of like a status light (red, yellow or green) or indicator.
asyncio.Task subclasses asyncio.Future in order to gain these various capabilities. The prior section said tasks store a list of callbacks, which wasn't entirely accurate. It's actually the Future class that implements this logic, which Task inherits.
Futures may also be used directly (not via tasks). Tasks mark themselves as done when their coroutine is complete. Futures are much more versatile and will be marked as done when you say so. In this way, they're the flexible interface for you to make your own conditions for waiting and resuming.
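As a minimal sketch of using a future directly, the code below creates one with loop.create_future(), asks the event loop to mark it done a little later via loop.call_later(), and awaits it in the meantime. The 0.05 second delay and the "ready" result are arbitrary choices for illustration.

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    states = [future.done()]  # still pending at this point

    # Nothing marks the future as done until we say so; here we ask the
    # event loop to do it on our behalf after a short delay.
    loop.call_later(0.05, future.set_result, "ready")

    # Awaiting the future cedes control until its result has been set.
    value = await future
    states.append(future.done())
    return value, states

value, states = asyncio.run(main())
print(value, states)
```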
A homemade asyncio.sleep¶
We'll go through an example of how you could leverage a future to create your own variant of asynchronous sleep (async_sleep) which mimics asyncio.sleep().
This snippet registers a few tasks with the event loop and then awaits a coroutine wrapped in a task: async_sleep(3). We want that task to finish only after three seconds have elapsed, but without preventing other tasks from running.
    import asyncio
    import datetime
    import time

    async def other_work():
        print("I like work. Work work.")

    async def main():
        # Add a few other tasks to the event loop, so there's something
        # to do while asynchronously sleeping.
        work_tasks = [
            asyncio.create_task(other_work()),
            asyncio.create_task(other_work()),
            asyncio.create_task(other_work()),
        ]
        print(
            "Beginning asynchronous sleep at time: "
            f"{datetime.datetime.now().strftime('%H:%M:%S')}."
        )
        await asyncio.create_task(async_sleep(3))
        print(
            "Done asynchronous sleep at time: "
            f"{datetime.datetime.now().strftime('%H:%M:%S')}."
        )
        # asyncio.gather effectively awaits each task in the collection.
        await asyncio.gather(*work_tasks)
Below, we use a future to enable custom control over when that task will be marked as done. If future.set_result() (the method responsible for marking that future as done) is never called, then this task will never finish. We've also enlisted the help of another task, which we'll see in a moment, that will monitor how much time has elapsed and, accordingly, call future.set_result().
    async def async_sleep(seconds: float):
        future = asyncio.Future()
        time_to_wake = time.time() + seconds
        # Add the watcher-task to the event loop.
        watcher_task = asyncio.create_task(_sleep_watcher(future, time_to_wake))
        # Block until the future is marked as done.
        await future
Below, we'll use a rather bare object, YieldToEventLoop(), to yield from its __await__ method in order to cede control to the event loop. This is effectively the same as calling asyncio.sleep(0), but this approach offers more clarity, not to mention it's somewhat cheating to use asyncio.sleep when showcasing how to implement it!
As usual, the event loop cycles through its tasks, giving them control and receiving control back when they pause or finish. The watcher_task, which runs the coroutine _sleep_watcher(...), will be invoked once per full cycle of the event loop. On each resumption, it'll check the time and if not enough has elapsed, then it'll pause once again and hand control back to the event loop. Eventually, enough time will have elapsed, and _sleep_watcher(...) will mark the future as done, and then itself finish too by breaking out of the infinite while loop. Given this helper task is only invoked once per cycle of the event loop, you'd be correct to note that this asynchronous sleep will sleep at least three seconds, rather than exactly three seconds. Note this is also true of asyncio.sleep.
    class YieldToEventLoop:
        def __await__(self):
            yield

    async def _sleep_watcher(future, time_to_wake):
        while True:
            if time.time() >= time_to_wake:
                # This marks the future as done.
                future.set_result(None)
                break
            else:
                await YieldToEventLoop()
Here is the full program's output:
    $ python custom-async-sleep.py
    Beginning asynchronous sleep at time: 14:52:22.
    I like work. Work work.
    I like work. Work work.
    I like work. Work work.
    Done asynchronous sleep at time: 14:52:25.
You might feel this implementation of asynchronous sleep was unnecessarily convoluted. And, well, it was. The example was meant to showcase the versatility of futures with a simple example that could be mimicked for more complex needs. For reference, you could implement it without futures, like so:
    async def simpler_async_sleep(seconds):
        time_to_wake = time.time() + seconds
        while True:
            if time.time() >= time_to_wake:
                return
            else:
                await YieldToEventLoop()
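Both variants above wake up on every cycle of the event loop to check the clock. As a point of comparison, here is a sketch of a third variant (timer_based_sleep, a hypothetical name) that instead asks the event loop to set a future's result after a delay using loop.call_later(), so nothing has to poll. This is an illustration under that assumption, not a drop-in replacement for asyncio.sleep().

```python
import asyncio
import time

async def timer_based_sleep(seconds: float) -> None:
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # Ask the event loop to mark the future as done once the delay elapses.
    handle = loop.call_later(seconds, future.set_result, None)
    try:
        await future
    finally:
        # If we were cancelled while waiting, make sure the timer doesn't
        # fire later and try to set a result on a cancelled future.
        handle.cancel()

async def main() -> float:
    start = time.monotonic()
    await timer_based_sleep(0.1)
    return time.monotonic() - start

elapsed = asyncio.run(main())
print(f"Slept for roughly {elapsed:.2f} seconds.")
```

The trade-off is that the waiting logic now lives in the event loop's timer handling rather than in your own code, which is less instructive but avoids resuming a watcher on every cycle.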
But, that's all for now. Hopefully you're ready to more confidently dive into some async programming or check out advanced topics in the rest of the documentation.