-
Notifications
You must be signed in to change notification settings - Fork 322
feat: adds condition class and assoc. unit tests #2159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: adds condition class and assoc. unit tests #2159
Conversation
| """str: The expression string for the condition.""" | ||
|
|
||
| # Cast assumes expression is always set due to __init__ validation | ||
| return typing.cast(str, self._properties.get("expression")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my education, why is typing.cast() necessary for expression, but not title or description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both mypy and pytype struggle with correctly assessing return types. In this case, both checkers think the return type should be Optional[Any] when we expect (and validate for) a str.
Despite the fact that we indicate via typehinting:
- that
expressionis astrin the__init__()method - the
expressiongetter returns astr - the
expressionsetter method signature requires that value be input as astr - the setter method has internal checks to ensure that the value stored in the
_propertiesdict is astr
neither mypy nor pytype will believe it.
It is not inherently clear why they believe our typehinting for title and description but not expression. I have come across this failure multiple times in adding objects to our repos.
There are approximately seven similar examples of this problem elsewhere in this file (that predate my taking over the repo) (and there are examples in other files in this codebase).
These are the error messages we see:
mypy
google/cloud/bigquery/dataset.py:1140: error: Incompatible return value type (got "Optional[Any]", expected "str") [return-value]
pytype
/repo/python-bigquery/google/cloud/bigquery/dataset.py:1140:1: error: in expression: bad option 'None' in return type [bad-return-type]
Expected: str
Actually returned: Optional[Any]
| condition = Condition(expression=self.EXPRESSION) | ||
| expected_api_repr = { | ||
| "expression": self.EXPRESSION, | ||
| "title": None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very familiar with Condition, do we need to distinguish between title and description being not set versus being an empty string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a pair of asserts to test_setters to confirm the use case that assigning empty strings is allowed and succeeds.
I think there are two issues here:
-
this test is ensuring that if I try to create a
Conditionobject using ONLY the requiredexpressionargument and leave the other arguments (title,description) blank, will it create the expected outcome: i.e. a create an object withNonevalues assigned to bothtitleanddescription? -
Do we need to test how the setter for
titleanddescriptionhandle a range of values (None,empty string, andnon-empty string, something besides the above, etc)
- In
test_setterswe confirm that we can reassign a new string to eithertitleordescriptionAND we confirm that we can assign a new value ofNone(WAI) - In
test_constructor_and_getters_fullwe confirm that we can set anon-empty string(WAI) - In
test_validation_setterswe confirm that something besides a string will fail (WAI)
Our logic does not hinge on whether a value assigned to title or description is Falsey OR not. We do not do a boolean check NOR do empty strings trigger some logic so there is no difference between assigning an empty string vs a non-empty string. I am not convinced that we need the test for empty strings, but I don't believe it will hurt us.
|
|
||
| def to_api_repr(self) -> Dict[str, Any]: | ||
| """Construct the API resource representation of this Condition.""" | ||
| return self._properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to make a deep copy of the dict, similar to other classes, such as Table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless we have a compelling reason to use deepcopy, I am disinclined. Thoughts?
In a previous experimental PR, Tim left a comment about some of the pitfalls of using deep copy, if not necessary.
It might be a bit wasteful to make deepcopy here and in `from_api_repr`. Indeed it's safer, but could add a lot of overhead. IIRC we actually removed some `deepcopy` calls from `SchemaField` because it was slowing down customers who build dynamic schemas in their code.
In all cases we are returning a dict (thus there should not be a significant risk of some underlying nested value being changed by another expression in the code)
I was curious about the overall cost of a deepcopy so I did a fairly simple experiment:
In [1]: import copy
In [2]: api_repr = {
...: "expression": "some_value",
...: "title": "some_value",
...: "unexpected_field": "some_value",
...: }
In [3]: _properties['condition'] = api_repr
From fastest to slowest:
Simply assign an alias to the dict
In [4]: %timeit new_prop = _properties
14.4 ns ± 0.0149 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
# NOTE: when we use `return _properties`, it is the same as using an alias.
Use the builtin copy method in the dict class
This creates a shallow copy of the dict
In [11]: %timeit new_prop = _properties.copy()
58.6 ns ± 0.228 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Create a copy.copy
This also creates a shallow copy of the dict
In [9]: %timeit new_prop = copy.copy(_properties)
145 ns ± 0.401 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Create a copy.deepcopy
In [5]: %timeit new_prop = copy.deepcopy(_properties)
3 µs ± 16.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Adds a Condition class and associated suite of tests.
🦕