Feat: Adds foreign_type_info attribute to table class and adds unit tests. #2126

Open
wants to merge 10 commits into
base: main
from

Conversation

chalmerlowe
Collaborator

This PR adds support for the foreign_type_info attribute on the Table class and the associated tests.

@chalmerlowe chalmerlowe requested review from a team as code owners February 4, 2025 13:19
@chalmerlowe chalmerlowe requested a review from hongalex February 4, 2025 13:19
@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Feb 4, 2025
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Feb 4, 2025
@chalmerlowe chalmerlowe removed the request for review from hongalex February 4, 2025 13:20
@chalmerlowe chalmerlowe assigned tswast and Linchin and unassigned PhongChuong Feb 4, 2025
For details, see:
https://cloud.google.com/bigquery/docs/external-tables
https://cloud.google.com/bigquery/docs/datasets-intro#external_datasets

Contributor

I think we'll need some logic in the setter for schema to avoid overwriting the schema property entirely. Instead, it'll need to be responsible for just schema.fields.
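
To illustrate the concern, here is a minimal sketch using a stand-in class rather than the real Table. The class name SketchTable, the plain-dict field entries, and the "HIVE" value are assumptions made for illustration only. The setter writes just the fields key under schema, so a sibling foreignTypeInfo entry survives the assignment.

```python
class SketchTable:
    """Hypothetical stand-in for Table; it only models the nested schema resource."""

    def __init__(self):
        self._properties = {}

    @property
    def schema(self):
        # Read only schema.fields; fall back to an empty list when unset.
        return self._properties.get("schema", {}).get("fields", [])

    @schema.setter
    def schema(self, fields):
        # Update schema.fields in place instead of replacing the whole
        # "schema" dict, so sibling keys such as foreignTypeInfo are kept.
        self._properties.setdefault("schema", {})["fields"] = list(fields)


table = SketchTable()
# Placeholder value, assumed for the example.
table._properties["schema"] = {"foreignTypeInfo": {"typeSystem": "HIVE"}}
table.schema = [{"name": "id", "type": "INTEGER"}]
```

After the assignment, table._properties["schema"] holds both the fields list and the original foreignTypeInfo entry.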

Contributor

Also, I'm not sure if this format will render well in the docs. We might just move all the contents under NOTE: to after Table's schema.

Collaborator Author

I have not yet addressed Tim's comment here.
I spoke with Linchin about the note and with the revision I added, Sphinx should be able to handle the note with no problem.

https://cloud.google.com/bigquery/docs/datasets-intro#external_datasets
"""

prop = self._properties.get(self._PROPERTY_TO_API_FIELD["foreign_type_info"])
Contributor

Even if we are exposing this at the table level, it needs to be fetched from schema, still, right?

This is exactly the sort of thing the _get_sub_prop (def _get_sub_prop(container, keys, default=None)) and _set_sub_prop (def _set_sub_prop(container, keys, value)) helpers are intended to be used for.

We even use it in other Table properties, such as project:

return _helpers._get_sub_prop(
    self._properties, self._PROPERTY_TO_API_FIELD["project"]
)

Suggested change
- prop = self._properties.get(self._PROPERTY_TO_API_FIELD["foreign_type_info"])
+ prop = _helpers._get_sub_prop(self._properties, self._PROPERTY_TO_API_FIELD["foreign_type_info"])
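
For readers unfamiliar with the helpers, here is a rough sketch of how nested get/set helpers of this shape behave. The functions get_sub_prop and set_sub_prop below are simplified, hypothetical stand-ins written for this example. They mirror the signatures quoted above but are not the library's implementation.

```python
def get_sub_prop(container, keys, default=None):
    """Simplified stand-in: walk `keys` through nested dicts."""
    if isinstance(keys, str):
        keys = [keys]
    current = container
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def set_sub_prop(container, keys, value):
    """Simplified stand-in: create intermediate dicts, then set the leaf."""
    if isinstance(keys, str):
        keys = [keys]
    current = container
    for key in keys[:-1]:
        current = current.setdefault(key, {})
    current[keys[-1]] = value


properties = {"schema": {"fields": []}}
# Setting a nested leaf leaves the sibling "fields" key untouched.
set_sub_prop(properties, ["schema", "foreignTypeInfo"], {"typeSystem": "HIVE"})
info = get_sub_prop(properties, ["schema", "foreignTypeInfo"])
```

A plain self._properties.get("foreignTypeInfo") would only look at the top level of the resource, which is why a path-based lookup is suggested here.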

Collaborator Author

Complete.

@@ -411,6 +412,7 @@ class Table(_TableBase):
"max_staleness": "maxStaleness",
"resource_tags": "resourceTags",
"external_catalog_table_options": "externalCatalogTableOptions",
"foreign_type_info": "foreignTypeInfo",
Contributor

We should adjust this to the format used for _helpers._get_sub_prop and _helpers._set_sub_prop. Likewise, let's replace schema with something compatible with that.

Suggested change
- "foreign_type_info": "foreignTypeInfo",
+ "foreign_type_info": ["schema", "foreignTypeInfo"],
+ # TODO: remove "schema" from above (between "time_partitioning" and "snapshot_definition")
+ "schema": ["schema", "fields"],

Collaborator Author

Partially complete. Added a new value for schema, but not yet done with foreign_type_info.

Collaborator Author

Complete.

@@ -5993,6 +5994,71 @@ def test_external_catalog_table_options_from_api_repr(self):
assert result == expected


class TestForeignTypeInfo:
Contributor

I'd also like to see a test where we do Table.from_api_repr and Table.to_api_repr so that we can visually compare that the correct schema.foreignTypeInfo field of the REST API object is set.
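
A round-trip test of this kind might look roughly like the sketch below. RoundTripTable is a hypothetical stand-in defined only so the example runs on its own; a real test would call Table.from_api_repr and Table.to_api_repr from the library instead. The resource contents are placeholder values.

```python
import copy


class RoundTripTable:
    """Hypothetical stand-in for Table, not the library class."""

    def __init__(self, properties=None):
        self._properties = properties if properties is not None else {}

    @classmethod
    def from_api_repr(cls, resource):
        # Copy so later edits to the table do not mutate the caller's dict.
        return cls(copy.deepcopy(resource))

    def to_api_repr(self):
        return copy.deepcopy(self._properties)

    @property
    def foreign_type_info(self):
        return self._properties.get("schema", {}).get("foreignTypeInfo")


def test_foreign_type_info_round_trip():
    # Spelling out the full resource makes the schema.foreignTypeInfo
    # placement easy to compare visually.
    resource = {
        "schema": {
            "fields": [{"name": "id", "type": "INTEGER"}],
            "foreignTypeInfo": {"typeSystem": "HIVE"},
        }
    }
    table = RoundTripTable.from_api_repr(resource)
    assert table.foreign_type_info == {"typeSystem": "HIVE"}
    assert table.to_api_repr() == resource
```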

Collaborator Author

Not yet addressed.

table = self._make_one(self.TABLEREF)
assert table.foreign_type_info is None

def test_foreign_type_info_valid_inputs(self):
Contributor

Could we also add test cases for the setter for the other supported types, i.e., dict and None?
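
One possible shape for those cases is sketched below. SketchForeignTypeInfo and normalize_foreign_type_info are hypothetical names invented for this example. They approximate a setter that accepts an object, a dict, or None, and are not the PR's actual code.

```python
class SketchForeignTypeInfo:
    """Hypothetical stand-in for the library's ForeignTypeInfo class."""

    def __init__(self, type_system=None):
        self._properties = {"typeSystem": type_system}

    def to_api_repr(self):
        return dict(self._properties)


def normalize_foreign_type_info(value):
    """Convert a setter input into its REST representation (dict or None)."""
    if value is None or isinstance(value, dict):
        return value
    if isinstance(value, SketchForeignTypeInfo):
        return value.to_api_repr()
    raise TypeError(f"unsupported type: {type(value).__name__}")


def test_setter_inputs():
    # One case per supported input type: object, dict, and None.
    cases = [
        (SketchForeignTypeInfo("HIVE"), {"typeSystem": "HIVE"}),
        ({"typeSystem": "HIVE"}, {"typeSystem": "HIVE"}),
        (None, None),
    ]
    for value, expected in cases:
        assert normalize_foreign_type_info(value) == expected
```

In the real test suite the same cases could presumably be expressed as a parametrized test rather than a loop.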

Collaborator Author

Not yet addressed.

@@ -978,10 +978,12 @@ def _build_resource_from_properties(obj, filter_fields):
"""
  partial = {}
  for filter_field in filter_fields:
-     api_field = obj._PROPERTY_TO_API_FIELD.get(filter_field)
+     api_field = _get_sub_prop(obj._PROPERTY_TO_API_FIELD, filter_field)
Contributor

I'm a bit confused by this part. The key is the property name on the class (e.g., the name of the @property in the Table class). I don't think that should ever be a list.

Comment on lines 985 to 987
if isinstance(api_field, list):
    api_field = api_field[0]
partial[api_field] = obj._properties.get(api_field)
Contributor

Maybe we should use _get_sub_prop and _set_sub_prop here? Since some of the paths overlap (schema.fields and schema.foreignTypeInfo both live under schema), I fear that the schema dictionary from schema.fields will overwrite the dictionary from schema.foreignTypeInfo, or vice versa.

Suggested change
- if isinstance(api_field, list):
-     api_field = api_field[0]
- partial[api_field] = obj._properties.get(api_field)
+ _set_sub_prop(partial, api_field, _get_sub_prop(obj._properties, api_field))
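
The overwrite risk can be shown with a small sketch. Everything below (build_partial, the compact get_sub_prop and set_sub_prop helpers, and the sample mapping) is a simplified, hypothetical approximation and not the library's _build_resource_from_properties. It shows two paths that share the schema prefix being merged into one nested dict instead of one clobbering the other.

```python
def get_sub_prop(container, keys, default=None):
    # Compact stand-in: follow each key in turn, or return the default.
    for key in keys:
        if key not in container:
            return default
        container = container[key]
    return container


def set_sub_prop(container, keys, value):
    # Compact stand-in: reuse or create intermediate dicts, then set the leaf.
    for key in keys[:-1]:
        container = container.setdefault(key, {})
    container[keys[-1]] = value


def build_partial(properties, property_to_api_field, filter_fields):
    """Collect only the requested fields, merging paths with a shared prefix."""
    partial = {}
    for filter_field in filter_fields:
        api_field = property_to_api_field.get(filter_field)
        if api_field is None:
            raise ValueError(f"No property {filter_field}")
        set_sub_prop(partial, api_field, get_sub_prop(properties, api_field))
    return partial


mapping = {
    "schema": ["schema", "fields"],
    "foreign_type_info": ["schema", "foreignTypeInfo"],
}
properties = {
    "description": "not requested",
    "schema": {
        "fields": [{"name": "id", "type": "INTEGER"}],
        "foreignTypeInfo": {"typeSystem": "HIVE"},
    },
}
partial = build_partial(properties, mapping, ["schema", "foreign_type_info"])
```

With the api_field[0] approach, both fields would write to partial["schema"] and copy the entire schema dict, so a request for only one of them would still send the other.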

@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Feb 11, 2025
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. size: l Pull request size is large.
4 participants