Description
Bug report
Bug description:
PR incoming! It's a 10 second fix.
TLDR
BaseSelectorEventLoop._accept_connection incorrectly returns early from its for _ in range(backlog) loop when accept(2) fails with ECONNABORTED (raised in Python as ConnectionAbortedError), whereas it should continue. This was introduced in #27906 by this commit, which, whilst great, had a slight oversight in not separating ConnectionAbortedError from BlockingIOError and InterruptedError when putting them inside a loop ;) Ironically, the commit was introduced to give a more contiguous timeslot for accepting sockets in an event loop, and with the fix to this issue it will be even more contiguous on OpenBSD: the loop continues past the aborted connections instead of the event loop having to re-poll the server socket and call _accept_connection again. All is good! :D
A brief explanation / reproduction of ECONNABORTED from accept(2), for AF_INET on OpenBSD
It's worth writing this up as there is not much documentation online about ECONNABORTED's occurrences from accept(2), and I have been intermittently in pursuit of this errno for over 2 years!
Some OS kernels, including OpenBSD and Linux (tested and confirmed), continue queueing connections that were aborted before accept(2) is called. However, the behaviour of accept's return value differs between OpenBSD and Linux!
Suppose the following sequence of TCP packets occurs when a client connects to a server, with the client's kernel and the server's kernel communicating over TCP/IP, and this happens before the server's userspace program calls accept on its listening socket:
>SYN, <SYNACK, >ACK, >RST, i.e. a standard TCP 3WHS followed by the client sending an RST.
- On OpenBSD, when the server's userspace program calls accept on the listening socket it receives -1, with errno == ECONNABORTED.
- On Linux, when the server's userspace program calls accept on the listening socket it receives a valid file descriptor with no errno set, i.e. everything appears fine. But of course when trying to send on the socket, the failure surfaces either as errno EPIPE or as a SIGPIPE signal.
One can test this with the following script

    #!/usr/bin/env python3
    import socket
    import time
    import struct

    ADDR = ("127.0.0.1", 3156)

    def connect_disconnect_client(*, enable_rst: bool):
        client = socket.socket()
        if enable_rst:
            # send an RST when we call close()
            client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
        client.connect(ADDR)
        client.close()
        time.sleep(0.1)  # let the FIN/RST reach the kernel's TCP/IP machinery

    def main() -> None:
        server_server = socket.socket()
        server_server.bind(ADDR)
        server_server.listen(64)
        connect_disconnect_client(enable_rst=True)
        connect_disconnect_client(enable_rst=False)
        connect_disconnect_client(enable_rst=False)
        connect_disconnect_client(enable_rst=True)
        connect_disconnect_client(enable_rst=False)
        for _ in range(5):
            try:
                server_client, server_client_addr = server_server.accept()
                print("Okay")
            except ConnectionAbortedError as e:
                print(f"{e.strerror}")

    if __name__ == "__main__":
        main()

On Linux the output is

    Okay
    Okay
    Okay
    Okay
    Okay

On OpenBSD the output is

    Software caused connection abort
    Okay
    Okay
    Software caused connection abort
    Okay

Observe that both kernels kept the aborted connections queued. I used OpenBSD 7.4 on Instant Workstation to test this.
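The RST in the script above comes from setting SO_LINGER with a zero timeout before close(). As a minimal sketch of what that option carries, the snippet below sets it and reads it back. The helper names and LINGER_FORMAT are hypothetical, and the "ii" layout assumes struct linger is two C ints, as on Linux and the BSDs (Windows uses a different layout).

```python
import socket
import struct

# Assumption: struct linger { int l_onoff; int l_linger; } (Linux, BSDs).
LINGER_FORMAT = "ii"


def enable_abortive_close(sock: socket.socket) -> None:
    """Hypothetical helper: linger on with a 0 s timeout, so close() aborts."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(LINGER_FORMAT, 1, 0))


def read_linger(sock: socket.socket) -> tuple:
    """Hypothetical helper: return (l_onoff, l_linger) as stored by the kernel."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FORMAT))
    return struct.unpack(LINGER_FORMAT, raw)


sock = socket.socket()
enable_abortive_close(sock)
l_onoff, l_linger = read_linger(sock)
sock.close()

# l_onoff is non-zero when lingering is enabled; l_linger is the timeout in seconds.
print(bool(l_onoff), l_linger)
```

With lingering enabled and a timeout of zero, close() on a connected TCP socket discards unsent data and resets the connection instead of performing the usual FIN handshake, which is what the enable_rst=True clients rely on.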
BaseSelectorEventLoop._accept_connection's fix
To demonstrate asyncio's issue, we create the following test script to connect five clients to a base_events.Server being served in a selector_events.BaseSelectorEventLoop. Two of the clients are going to be naughty and send an RST to abort their connection before it is accepted into userspace. We monkey patch in a print() statement just to let us know when BaseSelectorEventLoop._accept_connection is called. Ideally this should be once, since the server's default backlog of 100 is sufficient, but as we will see, OpenBSD's raising of ConnectionAbortedError changes this:

    #!/usr/bin/env python3
    import socket
    import asyncio
    import time
    import struct

    ADDR = ("127.0.0.1", 31415)

    def connect_disconnect_client(*, enable_rst: bool):
        client = socket.socket()
        if enable_rst:
            # send an RST when we call close()
            client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
        client.connect(ADDR)
        client.close()
        time.sleep(0.1)  # let the FIN/RST reach the kernel's TCP/IP machinery

    async def handler(reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
        try:
            print("connected handler")
        finally:
            writer.close()

    # monkey patch in a print() statement just for debugging sake
    import asyncio.selector_events
    _accept_connection_old = asyncio.selector_events.BaseSelectorEventLoop._accept_connection

    def _accept_connection_new(*args, **kwargs):
        print("_accept_connection called")
        return _accept_connection_old(*args, **kwargs)

    asyncio.selector_events.BaseSelectorEventLoop._accept_connection = _accept_connection_new

    async def amain() -> None:
        server = await asyncio.start_server(handler, *ADDR)
        connect_disconnect_client(enable_rst=True)
        connect_disconnect_client(enable_rst=False)
        connect_disconnect_client(enable_rst=False)
        connect_disconnect_client(enable_rst=True)
        connect_disconnect_client(enable_rst=False)
        await server.start_serving()  # listen(3)
        await server.serve_forever()

    def main() -> None:
        asyncio.run(amain())

    if __name__ == "__main__":
        main()

On Linux the output is

    _accept_connection called
    connected handler
    connected handler
    connected handler
    connected handler
    connected handler

On OpenBSD the output is

    _accept_connection called
    _accept_connection called
    _accept_connection called
    connected handler
    connected handler
    connected handler

The first _accept_connection returns immediately because of client 1's ECONNABORTED. The second _accept_connection brings in clients 2 and 3, then returns because of client 4's ECONNABORTED. The third _accept_connection brings in client 5 and returns once the accept queue is empty.
With the PR patch incoming the OpenBSD behaviour / output is corrected to

    _accept_connection called
    connected handler
    connected handler
    connected handler

All connections are accepted in one single stroke of _accept_connection.
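The call counts above can be reproduced without OpenBSD using a toy model. The sketch below is not the CPython source: FakeListener, accept_batch_old, accept_batch_new and drain are hypothetical names, and the two loops only mirror the exception handling described in this report (one except clause for all three exceptions versus a separate continue for ConnectionAbortedError).

```python
from collections import deque

ABORTED = "aborted"  # hypothetical marker for a connection reset before accept()


class FakeListener:
    """Hypothetical stand-in for a non-blocking listening socket on OpenBSD."""

    def __init__(self, queued):
        self.queue = deque(queued)

    def accept(self):
        if not self.queue:
            raise BlockingIOError  # accept queue is empty
        item = self.queue.popleft()
        if item == ABORTED:
            raise ConnectionAbortedError  # ECONNABORTED
        return item


def accept_batch_old(sock, accepted, backlog=100):
    """Old behaviour: any of the three exceptions ends the batch."""
    for _ in range(backlog):
        try:
            accepted.append(sock.accept())
        except (BlockingIOError, InterruptedError, ConnectionAbortedError):
            return


def accept_batch_new(sock, accepted, backlog=100):
    """Proposed behaviour: skip aborted connections, stop only when drained."""
    for _ in range(backlog):
        try:
            accepted.append(sock.accept())
        except ConnectionAbortedError:
            continue
        except (BlockingIOError, InterruptedError):
            return


def drain(accept_batch, queued):
    """Run one batch per simulated readiness event; return (calls, accepted)."""
    sock, accepted, calls = FakeListener(queued), [], 0
    while sock.queue:  # the selector would report the listener as readable
        calls += 1
        accept_batch(sock, accepted)
    return calls, accepted


CLIENTS = [ABORTED, "client 2", "client 3", ABORTED, "client 5"]
print(drain(accept_batch_old, CLIENTS))  # 3 calls for clients 2, 3 and 5
print(drain(accept_batch_new, CLIENTS))  # 1 call for the same three clients
```

In this model the old loop needs three batches because each aborted connection ends a batch, while the new loop drains the whole queue in one, matching the three versus one "_accept_connection called" lines above.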
The Odyssey for ECONNABORTED on Linux
This is just a personal addendum for the record.
I use Linux and I like collecting all the signal(7)s and errno(3)s; it reminds me in a way of Lego Star Wars, it's nice to have a complete collection. Part of Python's exception hierarchy is

    ConnectionError
    ├── BrokenPipeError
    ├── ConnectionAbortedError
    ├── ConnectionRefusedError
    └── ConnectionResetError

In the past two years of me doing socket programming on Linux, for AF_INET and AF_UNIX I have easily been able to produce ConnectionRefusedError, ConnectionResetError, and BrokenPipeError, but I have still never been able to produce ConnectionAbortedError with accept(). Looking at the Linux kernel's source code for net/socket.c and net/ipv4/ implementing sockets and TCP/IP, I can only conclude that ECONNABORTED could possibly occur as a race condition between ops->accept() and ops->getname(), where there is a nanosecond when the socket is not protected by a spinlock.
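For reference, each ConnectionError subclass in the hierarchy above corresponds to an errno value, and Python picks the subclass when an OSError is constructed with that errno. A minimal sketch follows; CODES, exception_name_for and MAPPING are illustrative names, not part of any API.

```python
import errno
import os

# One representative errno per ConnectionError subclass.
CODES = (errno.EPIPE, errno.ECONNABORTED, errno.ECONNREFUSED, errno.ECONNRESET)


def exception_name_for(code: int) -> str:
    """Return the name of the OSError subclass Python selects for this errno."""
    return type(OSError(code, os.strerror(code))).__name__


MAPPING = {errno.errorcode[code]: exception_name_for(code) for code in CODES}

for errno_name, exc_name in MAPPING.items():
    print(f"{errno_name} -> {exc_name}")
```

This is why catching ConnectionAbortedError in Python is equivalent to checking for errno == ECONNABORTED after a failed accept(2) in C.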
I've tried various TCP situations including TCP_FASTOPEN, TCP_NODELAY, O_NONBLOCK connect()s, combined with SO_LINGER, trying to create the most disgusting TCP handshakes, all to no avail. SYN, SYNACK, RST gets dropped and does not get accept()ed.
So to any similarly eclectically minded programmers out there who wish to know for the record how to get accept(2) to produce ECONNABORTED: just try the scripts above on OpenBSD and save your time lol!
This one's for you, OpenBSD friends, thanks for OpenSSH!
CPython versions tested on:
CPython main branch
Operating systems tested on:
Other