-
Notifications
You must be signed in to change notification settings - Fork 544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
_SSLProtocolTransport.close()
cannot close the connection until timeout or EOF
#471
Comments
MagicStack/uvloop#471 aio-libs/aiohttp#6762 Signed-off-by: Rongrong <[email protected]>
I'd done some deep-dive and found that the bug is a TLS/SSL-relevant bug. Non-TLS connections can be closed even if - await fetch('https://sgp-ping.vultr.com/vultr.com.1000MB.bin')
+ await fetch('http://sgp-ping.vultr.com/vultr.com.1000MB.bin') |
I also analyzed the web traffic with Wireshark. With |
Another deep-dive: async def fetch(url):
async with aiohttp.ClientSession() as session:
async with session.get(url) as response:
logging.debug(response)
logging.debug(response.content)
+ await asyncio.sleep(5) # at this point, only a little part of the response is read, then pends
+ return # after the connection is "closed", uvloop will resume reading the response |
A deeper dive shows the direct cause of the bug. async def reproduce(wait):
await fetch('https://sgp-ping.vultr.com/vultr.com.1000MB.bin')
- await asyncio.sleep(wait) # now you will see that the file is still being downloaded
+ await asyncio.sleep(35) # the download will last for exactly 30s, then abort
Lines 42 to 51 in 3e71ddc
Lines 559 to 574 in 3e71ddc
We can easily verify that the low-level connection is closed 30 seconds after uvloop/uvloop/includes/consts.pxi Line 24 in 3e71ddc
Now that Lines 600 to 618 in 3e71ddc
Here's a simple hotfix to stop class AiohttpUvloopTransportHotfix(contextlib.AbstractAsyncContextManager):
def __init__(self, response: aiohttp.ClientResponse):
self.response = response
async def __aexit__(self, exc_type, exc_val, exc_tb):
if UVLOOP_ENABLED:
self.response.connection.transport.abort()
async def fetch(url):
async with aiohttp.ClientSession() as session:
async with session.get(url) as response:
async with AiohttpUvloopTransportHotfix(response):
... |
i have the same problem with httpS server using uvloop, but with asyncio it works as expected. |
Hi, @rojamit. I don't really so much familiar with aio servers. Could you please figure out some code to reproduce that? |
@Rongronggg9 right, its a https reverse proxy implemented with asyncio and httptools. problem occurs only with uvloop, and looks like 1:1 as yours, but on the server-side about code:
and... traffic still persist for about 10-30 seconds AFTER close() without uvloop - no problems like this notice: no new data_received calls after close() occurs, but traffic and CPU load |
_SSLProtocolTransport.close()
cannot close the connection until timeout or EOF
@Rongronggg9 hmm, is uvloop abandoned?... |
Hopefully not. Historically speaking, some of the recent versions were released right after a long-time radio silence. If someone figures out a PR to fix this, it could make the process faster. I am not really familiar with C-binding libraries so I can do little. |
Sorry for the late reply! With a quick look through the details (thanks @Rongronggg9 for the debugging!), this is probably because the remote peer didn't reply a proper TLS More specifically, |
Thanks for your explanation. I am still confused though.
If you replace the URL with
"If the transport has a buffer for outgoing data, buffered data will be flushed asynchronously. No more data will be received." But the reality is the opposite... The read operation resumed right after calling
|
@fantix same issue for the server-side: clients continue sending data after uvloop TLS server close()'s connection - no new protocol.data_received calls from uvloop, but traffic and CPU usage persists for about 30 sec. No issues with native asyncio loop instead. Looks like it's only related for requests with large payload? |
It is very regrettable that v0.17.0 is still buggy... |
Right, 0.17 didn't change this. This would probably be similar in CPython 3.11 too. I'll look into this later this weekend. |
Thanks! I just did more debugging and I found that
|
I've also tested the vanilla asyncio in CPython 3.11.0rc1/rc2 since it has "borrowed"1 the SSL implementation from But a fatal error is logged: DEBUG:asyncio:<asyncio.sslproto.SSLProtocol object at 0x7fdacf7879d0>: Fatal error on transport
Traceback (most recent call last):
File "/usr/lib/python3.11/asyncio/sslproto.py", line 646, in _do_shutdown
self._sslobj.unwrap()
File "/usr/lib/python3.11/ssl.py", line 983, in unwrap
return self._sslobj.shutdown()
^^^^^^^^^^^^^^^^^^^^^^^
ssl.SSLError: [SSL: APPLICATION_DATA_AFTER_CLOSE_NOTIFY] application data after close notify (_ssl.c:2672) I consider the fatal error is pretty OK: Footnotes |
So here locates the buggy code: cdef _do_flush(self, object context=None):
try:
self._do_read_into_void(context)
self._do_write()
self._process_outgoing()
self._control_ssl_reading()
except Exception as ex:
self._on_shutdown_complete(ex)
else:
if not self._get_write_buffer_size():
self._set_state(SHUTDOWN)
self._do_shutdown(context)
cdef _do_shutdown(self, object context=None):
try:
# we must skip all application data (if any) before unwrap
self._do_read_into_void(context) # <--- I am so sad because I am buggy
try:
self._sslobj.unwrap()
except ssl_SSLAgainErrors as exc:
self._process_outgoing()
else:
self._process_outgoing()
if not self._get_write_buffer_size():
self._on_shutdown_complete(None)
except Exception as ex:
self._on_shutdown_complete(ex) Being commented out, DEBUG:asyncio:<uvloop.loop.SSLProtocol object at 0x7f9cbfbcf5e0>: Error occurred during shutdown
Traceback (most recent call last):
File "uvloop/sslproto.pyx", line 624, in uvloop.loop.SSLProtocol._do_shutdown
File "/usr/lib/python3.11/ssl.py", line 983, in unwrap
return self._sslobj.shutdown()
^^^^^^^^^^^^^^^^^^^^^^^
ssl.SSLError: [SSL: APPLICATION_DATA_AFTER_CLOSE_NOTIFY] application data after close notify (_ssl.c:2672) You may also inspect the source code of both uvloop and asyncio yourself. Vanilla asyncio in CPython 3.11 indeed does NOT read before unwrap in the shutdown stage. def _do_flush(self):
self._do_read()
self._set_state(SSLProtocolState.SHUTDOWN)
self._do_shutdown()
def _do_shutdown(self):
try:
if not self._eof_received:
self._sslobj.unwrap()
except SSLAgainErrors:
self._process_outgoing()
except ssl.SSLError as exc:
self._on_shutdown_complete(exc)
else:
self._process_outgoing()
self._call_eof_received()
self._on_shutdown_complete(None) https://github.com/python/cpython/blob/3.11/Lib/asyncio/sslproto.py#L643 |
After reading more refs, I finally realized that there do be such a probability that the "weird" behavior is caused by misbehaved servers (my case) or clients (#471 (comment)). Indeed, according to the spec (e.g. https://www.rfc-editor.org/rfc/rfc5246), waiting for So there are three solutions:
The violation of spec is so widespread that until now no other one realized that it is caused by a violation. My preference is 1. |
Thanks for the very detailed reply and sorry for the delay! I think we should do (1) for TLS 1.2 and lower to "conform with widespread implementation practice", and cut down the Python 3.11 behaves differently because #385 wasn't merged in, I'll probably create a PR to do #385 there, together with the proposed fix here. |
See MagicStack#471 for details.
At first, I consider it is OK. After doing some tests, I was so sad to find that Cloudflare still violates the spec even in TLS 1.3... It means that the situation of the violation does not become much better in the world of TLS 1.3.
For servers, they maintain a connection pool and recycle connections immediately (which relies on client-side one-way shutdowns), not willing to "waste" any bandwidth to reply to I think we should reconsider Option 3 ("add an option to switch between the two behavior"), both for uvloop and CPython asyncio. If already known that the @Dreamsorcerer @asvetlov @webknjaz Sorry for disturbing you. What do you think about the issue? |
I'm no expert with TLS, so I'm just relying on your investigations. :P My only question if we add a switch, who changes the switch? If it's the user, when would they know which option to use? |
That could be a question to be discussed. What's your preference for that? My preference is that everyone can change it, but HTTP libraries should have two-way shutdowns turned off by default. Option 3 is just my personal advice, not a result of the discussion. You may have other choices :) Even if both CPython asyncio and uvloop adopted #507, libraries on top of them could still call In fact, I am more concerned about aiohttp maintainers' opinion about the question that "comfort to the spec or save users' bandwidth?" At least for HTTP/1.0 or later, doing a two-way shutdown does not show additional advantages according to my research, but wastes a lot of bandwidth if the other side only supports one-way shutdowns. Most users do not expect such behavior. |
Actually I got a new idea. We can implement the
If the status quo (of CPython, both <= 3.10 and == 3.11) is already taking this risk of truncation, I think it is okay to keep it that way. Users who do care about this risk could just |
Proposed shutdown flow: So basically user could now 1) call
|
@fantix Wow, your flow graph is so clear. I second your proposal! There is a slight question, however. How can a user "return different values in |
哈哈,哪里哪里🤓谢谢 You may notice 2 |
Understood. Then I have no more questions and thanks for your explanation :) |
MagicStack/uvloop#471 aio-libs/aiohttp#6762 Signed-off-by: Rongrong <[email protected]>
PYTHONASYNCIODEBUG
in env?: yesDesciption
If
uvloop
is enabled, as long as EOF is not reached, even if the response, connection, and session are all closed, the server from which aiohttp requested resources, will still keep sending packets to the aiohttp-client for some time. However, the bug is only reproducible with uvloop enabled.To reproduce
Log and screenshot
The bug has also been reported to
aiohttp
: aio-libs/aiohttp#6762The text was updated successfully, but these errors were encountered: