Feat: Adds foreign_type_info attribute to table class and adds unit tests. by chalmerlowe · Pull Request #2126 · googleapis/python-bigquery · GitHub
Conversation

@chalmerlowe
Collaborator

This PR adds support for the foreign_type_info attribute on the Table class and the associated tests.

@chalmerlowe chalmerlowe requested review from a team as code owners February 4, 2025 13:19
@chalmerlowe chalmerlowe requested a review from hongalex February 4, 2025 13:19
@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Feb 4, 2025
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Feb 4, 2025
@chalmerlowe chalmerlowe removed the request for review from hongalex February 4, 2025 13:20
@chalmerlowe chalmerlowe assigned tswast and Linchin and unassigned PhongChuong Feb 4, 2025
For details, see:
https://cloud.google.com/bigquery/docs/external-tables
https://cloud.google.com/bigquery/docs/datasets-intro#external_datasets
Contributor

I think we'll need some logic in the setter for schema to avoid overwriting the schema property entirely. Instead, it'll need to be responsible for just schema.fields.

Contributor

Also, I'm not sure if this format will render well in the docs. We might just move all the contents under NOTE: to after Table's schema.

Collaborator Author

I have not yet addressed Tim's comment here.
I spoke with Linchin about the note and with the revision I added, Sphinx should be able to handle the note with no problem.

Collaborator Author

@chalmerlowe chalmerlowe Feb 20, 2025

@tswast

fields and schema are two separate items:

  • two different attributes (with setters and getters)
  • two separate nodes on the ._properties dict

It is possible for a user to supply an api_resource (a dict) that will overwrite both at the same time, such as: ._properties["schema"] = {"fields": [], "foreignTypeInfo": {"typeSystem": "hello world"}}. At that point, the end user should expect both to be overwritten.

Because fields and schema are nested separately in the ._properties dict, and because of how we write content to it (either with setters OR directly into ._properties), I am not aware of any way in which setting one will cause the other to be accidentally overwritten.

I don't think any additional checks are required.

Also, the test test_table.py::test_to_api_repr_w_schema_and_foreign_type_info is broken into several steps. Two of those steps specifically ensure that setting either item leaves the other unchanged.
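To make the point above concrete, here is a minimal sketch of the two nodes living side by side under "schema". The `properties` dict and both setter functions are hypothetical stand-ins for Table._properties and its setters, not the library's code, and "HIVE" is just an illustrative value.

```python
import copy

# Hypothetical stand-in for Table._properties; not taken from the library.
properties = {}


def set_schema_fields(props, fields):
    """Write only schema.fields, leaving sibling keys untouched."""
    props.setdefault("schema", {})["fields"] = fields


def set_foreign_type_info(props, info):
    """Write only schema.foreignTypeInfo, leaving sibling keys untouched."""
    props.setdefault("schema", {})["foreignTypeInfo"] = info


set_schema_fields(properties, [{"name": "id", "type": "INTEGER"}])
set_foreign_type_info(properties, {"typeSystem": "HIVE"})

# Updating one node does not disturb the other.
set_schema_fields(properties, [{"name": "name", "type": "STRING"}])
after_partial_update = copy.deepcopy(properties)

# Assigning the whole "schema" node, by contrast, replaces both at once.
properties["schema"] = {"fields": [], "foreignTypeInfo": {"typeSystem": "hello world"}}
```

Only the last assignment, which targets the parent node directly, changes both values in one step.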

https://cloud.google.com/bigquery/docs/datasets-intro#external_datasets
"""

prop = self._properties.get(self._PROPERTY_TO_API_FIELD["foreign_type_info"])
Contributor

Even if we are exposing this at the table level, it needs to be fetched from schema, still, right?

This is exactly the sort of thing the _get_sub_prop (def _get_sub_prop(container, keys, default=None):) and _set_sub_prop (def _set_sub_prop(container, keys, value):) helpers are intended to be used for.

We even use it in other Table properties, such as project:

return _helpers._get_sub_prop(
    self._properties, self._PROPERTY_TO_API_FIELD["project"]
)

Suggested change
- prop = self._properties.get(self._PROPERTY_TO_API_FIELD["foreign_type_info"])
+ prop = _helpers._get_sub_prop(self._properties, self._PROPERTY_TO_API_FIELD["foreign_type_info"])
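For readers unfamiliar with these helpers, the following is a simplified sketch of how nested get/set helpers with the signatures quoted above could behave. The functions `get_sub_prop` and `set_sub_prop` are illustrative re-implementations written for this example, not the library's actual code, and the sample data is made up.

```python
def get_sub_prop(container, keys, default=None):
    """Walk `keys` into nested dicts; return `default` if any key is missing."""
    if isinstance(keys, str):
        keys = [keys]
    node = container
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def set_sub_prop(container, keys, value):
    """Set a nested value, creating intermediate dicts as needed."""
    if isinstance(keys, str):
        keys = [keys]
    node = container
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


props = {"tableReference": {"projectId": "my-project"}}
project = get_sub_prop(props, ["tableReference", "projectId"])

# Nothing is stored under schema.foreignTypeInfo yet, so the default comes back.
missing = get_sub_prop(props, ["schema", "foreignTypeInfo"])

set_sub_prop(props, ["schema", "foreignTypeInfo"], {"typeSystem": "HIVE"})
found = get_sub_prop(props, ["schema", "foreignTypeInfo"])
```

A flat `props.get("foreignTypeInfo")` would look only at the top level and miss the value stored under "schema", which is why the path-based lookup matters here.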

Collaborator Author

Complete.

"max_staleness": "maxStaleness",
"resource_tags": "resourceTags",
"external_catalog_table_options": "externalCatalogTableOptions",
"foreign_type_info": "foreignTypeInfo",
Contributor

We should adjust this to the format used for _helpers._get_sub_prop and _helpers._set_sub_prop. Likewise, let's replace schema with something compatible with that.

Suggested change
- "foreign_type_info": "foreignTypeInfo",
+ "foreign_type_info": ["schema", "foreignTypeInfo"],
+ # TODO: remove "schema" from above (between "time_partitioning" and "snapshot_definition")
+ "schema": ["schema", "fields"],

Collaborator Author

Partially complete. Added a new value for schema, but not yet done with foreign_type_info.

Collaborator Author

Complete.

assert result == expected


class TestForeignTypeInfo:
Contributor

I'd also like to see a test where we do Table.from_api_repr and Table.to_api_repr so that we can visually compare that the correct schema.foreignTypeInfo field of the REST API object is set.
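A round-trip check of the kind requested here might look roughly like the sketch below. `MiniTable` is a hypothetical, heavily reduced stand-in written for this example so that it runs without the library installed; the real Table class does far more, and the resource values are made up.

```python
import copy


class MiniTable:
    """Hypothetical stand-in for Table that only stores the raw resource."""

    def __init__(self, properties=None):
        self._properties = properties if properties is not None else {}

    @classmethod
    def from_api_repr(cls, resource):
        # Copy so later mutation of the input cannot leak into the object.
        return cls(copy.deepcopy(resource))

    def to_api_repr(self):
        return copy.deepcopy(self._properties)

    @property
    def foreign_type_info(self):
        return self._properties.get("schema", {}).get("foreignTypeInfo")


resource = {
    "tableReference": {"projectId": "p", "datasetId": "d", "tableId": "t"},
    "schema": {
        "fields": [{"name": "id", "type": "INTEGER", "mode": "NULLABLE"}],
        "foreignTypeInfo": {"typeSystem": "HIVE"},
    },
}

table = MiniTable.from_api_repr(resource)
round_tripped = table.to_api_repr()
```

Comparing `round_tripped` against `resource` side by side makes it easy to see that schema.foreignTypeInfo survives next to schema.fields.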

Collaborator Author

Not yet addressed.

Collaborator Author

Done.

table = self._make_one(self.TABLEREF)
assert table.foreign_type_info is None

def test_foreign_type_info_valid_inputs(self):
Contributor

Could we also add test cases for the setter covering the other supported types, i.e., dict and None?

Collaborator Author

Not yet addressed.

Collaborator Author

Complete.
This test is now parametrized so that it tests under three input conditions:

  • dict
  • None
  • using a ForeignTypeInfo object
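As an illustration of the three input conditions listed above, here is a small table-driven sketch. `ForeignTypeInfoStub` and `set_foreign_type_info` are hypothetical stand-ins written for this example, not the classes or setter from the PR; in a pytest suite the `CASES` list would typically be passed to `pytest.mark.parametrize` instead of being looped over by hand.

```python
class ForeignTypeInfoStub:
    """Hypothetical stand-in for a ForeignTypeInfo object."""

    def __init__(self, type_system=None):
        self._properties = {"typeSystem": type_system}

    def to_api_repr(self):
        return dict(self._properties)


def set_foreign_type_info(props, value):
    """Accept a stub object, a dict, or None and store the API representation."""
    if value is not None and not isinstance(value, dict):
        value = value.to_api_repr()
    props.setdefault("schema", {})["foreignTypeInfo"] = value


# (input value, expected stored representation)
CASES = [
    ({"typeSystem": "HIVE"}, {"typeSystem": "HIVE"}),
    (None, None),
    (ForeignTypeInfoStub(type_system="HIVE"), {"typeSystem": "HIVE"}),
]


def run_cases():
    """Apply the setter to a fresh dict per case and record pass/fail."""
    outcomes = []
    for value, expected in CASES:
        props = {}
        set_foreign_type_info(props, value)
        outcomes.append(props["schema"]["foreignTypeInfo"] == expected)
    return outcomes


results = run_cases()
```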

Comment on lines 985 to 987
if isinstance(api_field, list):
    api_field = api_field[0]
partial[api_field] = obj._properties.get(api_field)
Contributor

Maybe we should use _get_sub_prop and _set_sub_prop here? Since some of the paths overlap, I fear that the schema dictionary from schema.fields will overwrite the dictionary from schema.foreignTypeInfo, or vice versa.

Suggested change
- if isinstance(api_field, list):
-     api_field = api_field[0]
- partial[api_field] = obj._properties.get(api_field)
+ _set_sub_prop(partial, api_field, _get_sub_prop(obj._properties, api_field))
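The difference between the two approaches can be sketched as follows. Everything here is a simplified assumption made for illustration: the helper functions, the `PROPERTY_TO_API_FIELD` mapping, and both `build_partial_*` functions are hypothetical and much smaller than the library's real code.

```python
def get_sub_prop(container, keys, default=None):
    """Read a nested value by path; a bare string is treated as a one-key path."""
    keys = [keys] if isinstance(keys, str) else keys
    node = container
    for key in keys:
        if key not in node:
            return default
        node = node[key]
    return node


def set_sub_prop(container, keys, value):
    """Write a nested value by path, creating intermediate dicts as needed."""
    keys = [keys] if isinstance(keys, str) else keys
    node = container
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Hypothetical mapping in the list-path format discussed in this PR.
PROPERTY_TO_API_FIELD = {
    "description": "description",
    "schema": ["schema", "fields"],
    "foreign_type_info": ["schema", "foreignTypeInfo"],
}


def build_partial_top_level(properties, filter_fields):
    """Copy the whole top-level node for list paths (the approach being replaced)."""
    partial = {}
    for name in filter_fields:
        api_field = PROPERTY_TO_API_FIELD[name]
        if isinstance(api_field, list):
            api_field = api_field[0]
        partial[api_field] = properties.get(api_field)
    return partial


def build_partial_sub_prop(properties, filter_fields):
    """Copy only the requested nested path (the suggested approach)."""
    partial = {}
    for name in filter_fields:
        path = PROPERTY_TO_API_FIELD[name]
        set_sub_prop(partial, path, get_sub_prop(properties, path))
    return partial


properties = {
    "description": "demo",
    "schema": {
        "fields": [{"name": "id", "type": "INTEGER"}],
        "foreignTypeInfo": {"typeSystem": "HIVE"},
    },
}

coarse = build_partial_top_level(properties, ["foreign_type_info"])
precise = build_partial_sub_prop(properties, ["foreign_type_info"])
both = build_partial_sub_prop(properties, ["schema", "foreign_type_info"])
```

In this sketch `coarse` carries schema.fields along even though only foreign_type_info was requested, while `precise` contains just the requested path and `both` merges the two paths under one "schema" node without either overwriting the other.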

Collaborator Author

Done.

@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Feb 11, 2025
Contributor

@tswast tswast left a comment

Thanks!

@Linchin Linchin self-requested a review February 20, 2025 18:35
Contributor

@Linchin Linchin left a comment

LGTM. Thank you!

@chalmerlowe chalmerlowe enabled auto-merge (squash) February 21, 2025 16:46
@chalmerlowe chalmerlowe disabled auto-merge February 21, 2025 16:47
@chalmerlowe chalmerlowe enabled auto-merge (squash) February 21, 2025 17:10
@chalmerlowe chalmerlowe merged commit 2c19681 into main Feb 21, 2025
15 of 23 checks passed
@chalmerlowe chalmerlowe deleted the feat-b358215039-adds-schema-mods branch February 21, 2025 17:45
chalmerlowe added a commit that referenced this pull request Feb 21, 2025
…ests. (#2126)

* adds foreign_type_info attribute to table

* feat: Adds foreign_type_info attribute and tests

* updates docstrings for foreign_type_info

* Updates property handling, especially as regards set/get_sub_prop

* Removes extraneous comments and debug expressions

* Refactors build_resource_from_properties w get/set_sub_prop

* updates to foreign_type_info, tests and wiring

* Adds logic to detect non-Sequence schema.fields value

* updates assorted tests and logic
chalmerlowe added a commit that referenced this pull request Feb 28, 2025
* Initial batch of changes to remove 3.7 and 3.8

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* more updates to remove 3.7 and 3.8

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* updates samples/geography/reqs

* updates samples/magics/reqs

* updates samples/notebooks/reqs

* updates linting

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* updates conf due to linting issue

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* updates reqs.txt, fix mypy, lint, and debug in noxfile

* Updates owlbot to correct spacing issue in conf.py

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* updates owlbot imports

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* removes kokoro samples configs for 3.7 & 3.8

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* removes owlbots attempt to restore kokoro samples configs

* removes kokoro system-3.8.cfg

* edits repo sync settings

* updates assorted noxfiles for samples and pyproject.toml

* update test-samples-impl.sh

* updates install_deps template

* Edits to the contributing documentation

* deps: use pandas-gbq to determine schema in `load_table_from_dataframe` (#2095)

* feat: use pandas-gbq to determine schema in `load_table_from_dataframe`

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* fix some unit tests

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* bump minimum pandas-gbq to 0.26.1

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* drop pandas-gbq from python 3.7 extras

* relax warning message text assertion

* use consistent time zone presence/absence in time datetime system test

* Update google/cloud/bigquery/_pandas_helpers.py

* Update google/cloud/bigquery/_pandas_helpers.py

Co-authored-by: Chalmer Lowe <chalmerlowe@google.com>

* remove pandas-gbq from at least 1 unit test and system test session

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Co-authored-by: Chalmer Lowe <chalmerlowe@google.com>

* Feat: Adds foreign_type_info attribute to table class and adds unit tests. (#2126)

* adds foreign_type_info attribute to table

* feat: Adds foreign_type_info attribute and tests

* updates docstrings for foreign_type_info

* Updates property handling, especially as regards set/get_sub_prop

* Removes extraneous comments and debug expressions

* Refactors build_resource_from_properties w get/set_sub_prop

* updates to foreign_type_info, tests and wiring

* Adds logic to detect non-Sequence schema.fields value

* updates assorted tests and logic

* deps: updates required checks list in github (#2136)

* deps: updates required checks list in github

* deps: updates snippet and system checks in github to remove 3.9

* changes the order of two items in the list.

* updates linting

* reverts pandas back to 1.1.0

* Revert changes related to pandas <1.5

* Revert noxfile.py changes related to pandas <1.5

* Revert constraints-3.9 changes related to pandas <1.5

* Revert test_query_pandas.py changes related to pandas <1.5

* Revert test__pandas_helpers.py changes related to pandas <1.5

* Revert test__versions_helpers.py changes related to pandas <1.5

* Revert noxfile.py changes related to pandas <1.5

* Revert test__versions_helpers.py changes related to pandas <1.5

* Revert test_table.py changes related to pandas <1.5

* Update noxfile changes related to pandas <1.5

* Update pyproject.toml changes related to pandas <1.5

* Update constraints-3.9.txt changes related to pandas <1.5

* Update test_legacy_types.py changes related to pandas <1.5

* Updates magics.py as part of reverting from pandas 1.5

* Updates noxfile.py in reverting from pandas 1.5

* Updates pyproject.toml in reverting from pandas 1.5

* Updates constraints.txt in reverting from pandas 1.5

* Updates test_magics in reverting from pandas 1.5

* Updates test_table in reverting from pandas 1.5

* Updates in tests re: reverting from pandas 1.5

* Updates pyproject to match constraints.txt

* updates pyproject.toml to mirror constraints

* remove limit on virtualenv

* updates owlbot.py for test-samples-impl.sh

* updates to owlbot.py

* updates to test-samples-impl.sh

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* further updates to owlbot.py

* removes unneeded files

* adds presubmit.cfg back in

---------

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Co-authored-by: Tim Sweña (Swast) <swast@google.com>

Labels

api: bigquery Issues related to the googleapis/python-bigquery API. size: l Pull request size is large.

4 participants