Stream query response for Parquet format #25955
Comments
There is an … That isn't to say we shouldn't update hyper, but since the current hyper version also supports streaming responses, I don't think the hyper upgrade is necessarily blocking this work.
@mgattozzi was grappling with this in the issue and related PR linked above. Perhaps there is an approach we haven't tried yet. I thought this could be done by defining a type that implements …
The issue, @alamb, is that Body does not implement Write, so you can't just pass it into the writer. Hyper pulls bytes from Body in order to send them, and it wants exclusive ownership of the Body. If it's given a stream, that works. What we really need is for the parquet writer to be able to create a stream of Bytes for hyper to consume, not a buffer that we write all the data to and then pass to hyper, which is what the sync and async versions of the writer do. I don't think upgrading to hyper 1.0 solves anything here. Hopefully this makes sense. I don't think it's impossible, though.
Problem statement
When a query is made via the HTTP API (`/api/v3/query_sql`, `/api/v3/query_influxql`) with `parquet` as the output format, the record batches in the response are buffered into memory before being serialized to parquet. This could lead to OOMs for large query responses.
Additional context
The parquet writer is not compatible with writing to a streamed `Body` using the version of hyper that we currently use. We either need to upgrade to hyper 1.x, to see if its trait-based approach to `Body` can be used alongside the existing arrow writers, or there may be some upstream work in `arrow-rs` required to make this possible.

Related:
Proposed solution
We should start by upgrading `hyper` to 1.x. This may need to be done upstream in IOx before it can be brought in here, since we are leveraging code from IOx that uses hyper 0.14.x.

Then we can see if implementing a streamed writer for parquet with the new `Body` trait is possible, or if we need to open some issues upstream in `arrow-rs`.
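One way to picture the "stream of bytes instead of one big buffer" idea from the comments is a `Write` adapter that forwards every written chunk over a bounded channel. The sketch below is only an illustration under stated assumptions: `ChannelWriter` and `stream_chunks` are hypothetical names, not part of hyper, arrow-rs, or this repository, and it uses only the standard library (a thread and `std::sync::mpsc`) as a stand-in for the async runtime. The closure plays the role of a parquet writer that accepts a `Write` sink, and the receiver plays the role of the stream that hyper would consume as the response body.

```rust
use std::io::{self, Write};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Hypothetical adapter: a `Write` sink that forwards each written chunk
/// over a bounded channel instead of accumulating it in a single buffer.
struct ChannelWriter {
    tx: SyncSender<Vec<u8>>,
}

impl Write for ChannelWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // `send` blocks while the channel is full, so the producer cannot
        // run arbitrarily far ahead of the consumer (bounded memory).
        self.tx
            .send(buf.to_vec())
            .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "receiver dropped"))?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Nothing is buffered locally, so there is nothing to flush.
        Ok(())
    }
}

/// Runs `produce` on its own thread and returns the receiving end of the
/// channel. `produce` stands in for the code that would serialize record
/// batches into the writer.
fn stream_chunks<F>(capacity: usize, produce: F) -> Receiver<Vec<u8>>
where
    F: FnOnce(ChannelWriter) -> io::Result<()> + Send + 'static,
{
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        // Errors are ignored in this sketch; a real handler would need to
        // surface them to the client somehow.
        let _ = produce(ChannelWriter { tx });
    });
    rx
}

fn main() {
    // At most two chunks are in flight at any time.
    let rx = stream_chunks(2, |mut w| {
        for i in 0..4u8 {
            w.write_all(&[i; 3])?;
        }
        Ok(())
    });

    // The consumer sees chunks as they are produced; the loop ends once the
    // writer is dropped.
    for chunk in rx {
        println!("got {} bytes", chunk.len());
    }
}
```

In an actual handler the channel would presumably be an async one, with its receiving half wrapped as a stream and handed to hyper. Whether the existing parquet writers can be driven this way without upstream changes in `arrow-rs` is the open question in this issue.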