Could not find an example or guideline for partitioning using Parquet.Net. Is partitioning supported? #527
himensuthar asked this question in Q&A
Replies: 1 comment
-
Spark is a compute engine, and this is a library. The way Spark does partitioning is by applying some logic and writing to separate Parquet files using a Java library. In short, the answer is no; it's outside the scope of Parquet libraries to do that, regardless of the platform or language.
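So the partitioning logic has to live in application code when using Parquet.Net. Below is a minimal sketch of that approach, assuming Parquet.Net 4.x and its ParquetSerializer helper; the SaleRecord type, the Country partition key, the part-00000.parquet file name and the output path are illustrative, not part of the library.

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Parquet.Serialization;   // Parquet.Net's class-serialisation helper

// Plain POCO describing one row; property names become Parquet column names.
class SaleRecord
{
    public string Country { get; set; } = "";
    public string Product { get; set; } = "";
    public decimal Amount { get; set; }
}

static class PartitionedWriter
{
    // Mimics df.write.partitionBy("Country"): group rows by the partition key,
    // create a Hive-style "Country=<value>" subdirectory per key, and write
    // one Parquet file into each. Spark typically omits the partition column
    // from the file contents (it is encoded in the path); this sketch keeps it
    // for simplicity.
    public static async Task WriteAsync(IEnumerable<SaleRecord> rows, string outputDir)
    {
        foreach (IGrouping<string, SaleRecord> group in rows.GroupBy(r => r.Country))
        {
            string partitionDir = Path.Combine(outputDir, $"Country={group.Key}");
            Directory.CreateDirectory(partitionDir);

            string filePath = Path.Combine(partitionDir, "part-00000.parquet");
            await ParquetSerializer.SerializeAsync(group.ToList(), filePath);
        }
    }
}

class Program
{
    static async Task Main()
    {
        var rows = new List<SaleRecord>
        {
            new() { Country = "US", Product = "widget", Amount = 10m },
            new() { Country = "US", Product = "gadget", Amount = 20m },
            new() { Country = "DE", Product = "widget", Amount = 15m },
        };

        // Produces path_to_output_directory/Country=US/part-00000.parquet, etc.
        await PartitionedWriter.WriteAsync(rows, "path_to_output_directory");
    }
}

Engines such as Spark and Hive infer the partition column from the Country=US style directory names, so as long as the layout follows that convention it does not matter which library wrote the individual files.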
-
Hi, I see Apache Spark has a built-in partitionBy method for partitioning Parquet output, as below:
df.write.partitionBy("column_name").parquet("path_to_output_directory")
I can partition data manually by creating subdirectories based on the partition keys and storing Parquet files accordingly, but I am looking for functionality similar to Apache Spark's in Parquet.Net and could not find an example or guideline.