Dynamical schema using SchemaModel #1067
Question about pandera

In #465 two solutions are proposed to handle dynamic (only-known-at-runtime) column names. However, both solutions use DataFrameSchema and I would like to know if there is a way to handle such a case using SchemaModel instead. Thanks!
Replies: 2 comments
hi @alejandro-yousef so at the end of the day `SchemaModel` is converted to a `DataFrameSchema` when you call `SchemaModel.validate`, so you can do something like:

```python
import pandas as pd
import pandera as pa


class MySchema(pa.SchemaModel):
    dynamic_column: pa.typing.Series[float]


def fn(data: pd.DataFrame, runtime_column_name: str):
    ...  # do stuff
    return (
        MySchema.to_schema()
        # use any of the schema transformation methods:
        # https://pandera.readthedocs.io/en/stable/dataframe_schemas.html#dataframeschema-transformations
        # rename_columns swaps in the run-time name (update_column does not accept a new name)
        .rename_columns({"dynamic_column": runtime_column_name})
        .validate(data)
    )
```

If you wanted something fancier you can extend the schema class with its own validation classmethod:

```python
import pandas as pd
import pandera as pa


class MySchema(pa.SchemaModel):
    dynamic_column: pa.typing.Series[float]

    @classmethod
    def dynamic_validate(cls, check_obj: pd.DataFrame, runtime_column_name: str, **kwargs):
        # build the DataFrameSchema, then swap in the column name known only at run time
        schema = cls.to_schema().rename_columns({"dynamic_column": runtime_column_name})
        return schema.validate(check_obj, **kwargs)


def fn(data: pd.DataFrame, runtime_column_name: str):
    ...  # do stuff
    return MySchema.dynamic_validate(data, runtime_column_name)
```

This can be coupled with regex column matching if you have a bunch of dynamic column names at runtime.
Thanks @cosmicBboy! Good to know that SchemaModel gets converted to a DataFrameSchema anyway.