[RFC, not for merge] Net: Websocket: introduce non-blocking receive frame API #4709
First of all, I'd like to emphasize that this is a Request For Comments PR: I would love to hear your opinion on the changes as a whole, to make sure this is an approach the community approves of.
Changes description:
Currently, the receive-frame path of WebSocketImpl expects a single websocket frame to be received in a blocking fashion: until the whole frame has been received, the calling context is blocked.
This is a problem for clients with poor network connections: a client with high packet loss, or one that sends only part of a WSS frame, can lock up the processing thread for a very long time.
This change implements a new set of coroutines that mimic the behavior of the previous implementation, but use an internal buffer to cache a partially received frame and return a frame only once its assembly is complete.
This frees the calling thread from any hassle with bad connections and enables a truly asynchronous implementation of socket processing.
Real life use case / why I made the changes:
The reason I made these changes in the first place is to address an issue we have had with software we use: the Poco library is the backbone of the OpenWiFi uCentral Gateway software (link; referred to below as 'GW'), which is used as a cloud controller for uCentral / OpenWiFi capable and compatible Access Point devices.
The GW itself is built on top of Poco in an asynchronous programming style.
The part of the GW where we had an issue is the Observer / Reactor part, which gets triggered when EPOLLIN events arrive on the underlying websocket FD and then makes a blocking receiveFrame call - link 2.
After a thorough investigation, we found that the underlying receiveFrame calls receiveBytes -> receiveHeader / receiveMask / receivePayload and finally WebSocketImpl::receiveNBytes.
The problem with the receiveNBytes function is that it is loop-based: it keeps calling receiveSomeBytes until the amount of received data matches the payloadSize of the websocket frame.
In other words, this function expects the underlying socket to deliver the complete websocket frame / payload within that loop. Until the exact number of requested bytes has been read, the function does not return control to the caller.
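To make the failure mode concrete, here is a minimal sketch of a read-exactly-N loop of the kind described above. It is not the Poco source: `RecvSome` and `readExactly` are hypothetical names, and the socket is replaced by a callable so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a blocking socket read: copies up to `max` bytes
// into `dst` and returns how many bytes were copied.
using RecvSome = std::function<std::size_t(char* dst, std::size_t max)>;

// Read-exactly-N loop (illustration only, not Poco's implementation).
// Control returns to the caller only after `want` bytes have arrived.
// Returns the number of receive calls that were needed.
std::size_t readExactly(const RecvSome& recvSome, char* dst, std::size_t want)
{
    std::size_t got = 0;
    std::size_t calls = 0;
    while (got < want)
    {
        // With a real blocking socket, each of these calls may stall the
        // thread for as long as the peer takes to send more data.
        got += recvSome(dst + got, want - got);
        ++calls;
    }
    return calls;
}
```

If the peer trickles data one byte at a time, the loop needs one receive call per byte, and the thread that entered it cannot serve any other connection in the meantime.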
A different approach - proposed with this RFC / PR - would be to expose a new API call that buffers the data from each recv() call internally and, once the buffered data is sufficient to compose a single WSS frame, returns the complete frame to the caller.
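The buffering idea can be sketched roughly as follows. This is an illustration under assumptions, not the API added by this PR: `Frame`, `FrameAssembler`, `feed` and `poll` are hypothetical names, the header layout follows RFC 6455, and there is no limit on frame size or handling of fragmented messages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type for one assembled websocket frame.
struct Frame
{
    bool fin = false;
    std::uint8_t opcode = 0;
    std::string payload;
};

// Hypothetical assembler: feed() stores whatever one recv() returned,
// poll() yields a frame only once all of its bytes are buffered.
class FrameAssembler
{
public:
    void feed(const char* data, std::size_t n)
    {
        _buf.insert(_buf.end(), data, data + n);
    }

    // Returns true and fills `out` if a complete frame is buffered;
    // returns false (without blocking) if more bytes are still needed.
    bool poll(Frame& out)
    {
        if (_buf.size() < 2) return false;
        const std::uint8_t b0 = _buf[0];
        const std::uint8_t b1 = _buf[1];
        const bool masked = (b1 & 0x80) != 0;
        std::uint64_t len = b1 & 0x7F;
        std::size_t pos = 2;
        if (len == 126 || len == 127)
        {
            // Extended payload length: 2 or 8 bytes, big-endian.
            const std::size_t ext = (len == 126) ? 2 : 8;
            if (_buf.size() < pos + ext) return false;
            len = 0;
            for (std::size_t i = 0; i < ext; ++i) len = (len << 8) | _buf[pos + i];
            pos += ext;
        }
        std::uint8_t mask[4] = {0, 0, 0, 0};
        if (masked)
        {
            if (_buf.size() < pos + 4) return false;
            for (std::size_t i = 0; i < 4; ++i) mask[i] = _buf[pos + i];
            pos += 4;
        }
        if (_buf.size() - pos < len) return false; // payload still incomplete
        const std::size_t n = static_cast<std::size_t>(len);
        out.fin = (b0 & 0x80) != 0;
        out.opcode = b0 & 0x0F;
        out.payload.resize(n);
        for (std::size_t i = 0; i < n; ++i)
            out.payload[i] = static_cast<char>(_buf[pos + i] ^ mask[i % 4]);
        // Drop the consumed frame, keep any bytes of the next one.
        _buf.erase(_buf.begin(), _buf.begin() + static_cast<std::ptrdiff_t>(pos + n));
        return true;
    }

private:
    std::vector<std::uint8_t> _buf;
};
```

In a reactor-style handler, each EPOLLIN event would perform one non-blocking read, pass the bytes to `feed`, and then call `poll` in a loop until it returns false, so a slow peer costs one short call per event instead of a stalled thread.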
Final words:
While these changes address a specific issue we have had in our production deployments - APs with bad internet connections caused control / websocket processing threads to hang - their usefulness is not limited to better handling of bad connections.
These changes fundamentally challenge the synchronous nature of the underlying receiveFrame function.
CC: @phwhite @stephb9959 @i-chvets @SviatoslavBoichuk @serhiy-p @taraschornyiplv