-
-
Notifications
You must be signed in to change notification settings - Fork 3k
Add partial overload checks #5476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add partial overload checks #5476
Conversation
@ilevkivskyi -- So, just to summarize the status of all these PRs in one place... The following PRs are independent from one another and can be reviewed in any order:
This PR builds on top of code from 1 and 2. Once those two PRs are merged in, I'll rebase this PR to shrink the diff size. I've left #5280, the old PR for adding partial overlaps, open for now since it has some extra context/probably includes some feedback that I've forgotten to incorporate. Feel free to close it whenever though if you want to minimize clutter -- I can re-find it pretty easily. I've tested this PR on our internal codebases -- so far, I believe this PR introduces one regression that I've as of yet been unable to reproduce and fix. (Basically, the code is asserting some field is None, and mypy gets confused for some reason). I'll continue working on tracking that final regression and fixing it over the next few days while I'm waiting on your code reviews, but if you have time to take a look/can figure out how to produce a smaller repo, I'd be extremely grateful. (And sorry for dumping all of this code on you all of a sudden -- if you want me to break up the above PRs into even smaller ones, let me know, and I'll be happy to do it.) |
I would prefer to close it, just to not get lost :-)
No problem at all, thanks for working on this. The PR sizes are just right for my taste. (Btw, there is already a merge conflict, you can probably rebase after merging the subtype cache.) |
4b6c58a
to
b9c8d22
Compare
This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with... 1. TypedDicts 2. Tuples 3. Unions 4. TypeVars 5. Generic types containing variations of the above. It also swaps out the code for detecting overlaps with operators and removes some associated (and now unused) code. This pull request builds on top of python#5474 and python#5475 -- once those two PRs are merged, I'll rebase this diff if necessary. This pull request also supercedes python#5475 -- that PR contains basically the same code as these three PRs, just smushed together.
b9c8d22
to
af2109c
Compare
@ilevkivskyi -- the rebase is done now, fyi |
OK, thanks! It is a bit late here. I will finish this tomorrow morning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good! Here are few comments.
mypy/checker.py
Outdated
we do not check for partial overlaps. This lets us vary our error messages | ||
depending on the severity of the overlap. | ||
See 'is_unsafe_partially_overlapping_overload_signatures' for the full checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there two functions only for more tuned error messages? If yes, then I think it is not worth it. This is already complex part of the code and the error messages aren't much cleaner because of this.
mypy/checker.py
Outdated
# (We already have a rudimentary implementation of 'is_partially_overlapping', but it only | ||
# attempts to handle the obvious cases -- see its docstring for more info.) | ||
# if "foo" in signature.name or "bar" in signature.name or "chain_call" in signature.name: | ||
# print("in first") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debugging print?
mypy/checker.py
Outdated
The caller can then unify on all type variables whether or not the callable is originally | ||
from a class or not.""" | ||
type_list = typ.arg_types + [typ.ret_type] | ||
# old_type_list = list(type_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stray comment again?
mypy/meet.py
Outdated
# Note: doing 'return typ.items()' makes mypy | ||
# infer a too-specific return type of List[CallableType] | ||
out = [] # type: List[Type] | ||
out.extend(typ.items()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use return list(typ.iems())
(i.e. make a copy so that mypy is happy about invariance).
mypy/meet.py
Outdated
|
||
# We should never encounter these types, but if we do, we handle | ||
# them in the same way we handle 'Any'. | ||
illegal_types = (UnboundType, PartialType, ErasedType, DeletedType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PartialType
should be an assert False, "Something bad about partial types"
, for other it is OK to be Any
mypy/meet.py
Outdated
ignore_promotions: bool = False) -> bool: | ||
"""Returns true if left and right are overlapping tuples. | ||
Precondition: is_tuple(left) and is_tuple(right) are both true.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You already have asserts below, better add an assert message there instead.
if out is None: | ||
pass | ||
return out | ||
[builtins fixtures/isinstance.pyi] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I am not sure what these two tests test. Maybe add a note or somehow modify them so that it is more clear what is a potential failure?
def f(self, x): ... | ||
|
||
# TODO: This shouldn't trigger an error message. | ||
# Related to testTypeCheckOverloadImplementationTypeVarDifferingUsage2? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please create a follow-up issue for this kind of problems?
# We ought to have handled every case by now: we conclude the | ||
# two types are not overlapping, either completely or partially. | ||
|
||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be safer to have different defaults for overloads and isinstance()
checks. For the former, it is better to have False
to avoid false positives. But for isinstance()
it is probably better to have True
for the cases where it is not clear (to avoid inferring some branches as unreachable, as this may cause problems for mypyc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could I persuade you to keep this a return False
? Basically the way is_overlapping_types
works is by steadily working through each pair of types and comparing them against each other.
So, if the left and the right are both TypedDict, we compare those two directly and return either True or False, if the left and the right are both Instance, we do the same...
This final return False
case is meant to handle the situation where left and right are genuinely two unrelated types -- returning anything other then False here would lead to incorrect results in those cases.
It's possible that there are some overlap cases we've missed (e.g. can CallableType overlap with Type? Can Type overlap with Instance?), but I'd much rather prefer we handle those missing cases directly and fix is_overlapping_types
as they come up instead of silently letting the function continue to be imprecise.
to avoid inferring some branches as unreachable, as this may cause problems for mypyc
I think maybe a better way of solving this is to make mypy report branches that are accidentally unreachable -- it's fairly easy to accidentally add an isinstance check that'll never be reached, I think. (Of course, we wouldn't report branches that are intentionally unreachable due to system platform guards or things like 'if False' -- we also probably shouldn't report branches that just contain an assert/just throw an exception).
I'm hoping that such a check could just be enabled by default -- I can't really imagine people intentionally adding logic to a no-op branch.
(I'd be willing to work on adding this sort of check if you think it's valuable enough).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could I persuade you to keep this a
return False
?
Not just me :-)
Your proposal with the additional error may generate false positives. Consider this example:
class A: ...
class B: ...
if isinstance(x, A):
...
if isinstance(x, B):
# this is not reachable ...or is it?
I think we already discussed some time ago that there are special cases (like multiple inheritance) where we are deliberately unsafe. All such cases should be treated on per-use-case basis. For example, when checking for overloads, we treat A
and B
as non-overlapping, while when inferring reachability of a branch we might want to do the opposite.
Even if you will not add the default
argument for such corner cases, I am quite sure it will be added later (the above example is unreachable on master, but this may be changed soon).
@msullivan what do you think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, ok. In that case then, maybe a better approach might be to add a allow_multiple_inheritance
flag to this function? And when it's true, we'd skip all of the logic in the if isinstance(left, Instance) and isinstance(right, Instance)
check and just directly return True.
(I haven't had a chance to test this on our internal codebases yet, but it doesn't seem like it'd break too many things, based on what happens when I run mypy's own tests.)
As it stands, just swapping the return False
line at the very bottom with return default
won't correctly handle the multiple inheritance case -- the "left and right are instances" check sometimes returns False.
More broadly, I'd prefer whitelisting specific cases where we want to relax the overlap checks -- I think the code would be more maintainable that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a better approach might be to add a
allow_multiple_inheritance
flag to this function?
[snip]
More broadly, I'd prefer whitelisting specific cases where we want to relax the overlap checks -- I think the code would be more maintainable that way
You think it is more maintainable, I think it looks like "premature maintainability optimization" :-)
Currently all edge cases should be considered non-overlapping for overloads, but overlapping for inferring reachability. For example, empty containers: List[int]
and List[Type[int]]
are considered non-overlapping for overloads but should be overlapping for reachability.
Anyway, it looks like this is an independent issue, so could you please add a TODO item above the final return False
, and add a permalink to it in #3603 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently all edge cases should be considered non-overlapping for overloads, but overlapping for inferring reachability. For example, empty containers: List[int] and List[Type[int]] are considered non-overlapping for overloads but should be overlapping for reachability.
Well, List[int]
and List[Type[int]]
are both represented by Instance objects underneath -- so I think special-casing the "left and right are both Instances" tactic would work?
In any case, I'll make a TODO -- we can worry about this later once it comes up again :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I think special-casing the "left and right are both Instances" tactic would work?
It may be more subtle because it will recurse to checking arguments that are Instance
vs TypeType
. Also there are other cases if you want non-instances, e.g. Instance
vs CallableType
should be always overlapping for the purpose of isinstance()
and callable()
checks (that @msullivan implemented a while ago), but should not be overlapping for overloads.
In any case, I'll make a TODO -- we can worry about this later once it comes up again :)
OK.
mypy/meet.py
Outdated
return UnionType.make_simplified_union([narrow_declared_type(x, narrowed) | ||
for x in declared.relevant_items()]) | ||
elif not is_overlapping_types(declared, narrowed, use_promotions=True): | ||
elif not is_overlapping_types(declared, narrowed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a test for this, or you jut noticed the inconsistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, what happened here is that that I changed the behavior of is_overlapping_types
so it does use promotions by default. So, removing this param isn't actually causing any difference in behavior.
@ilevkivskyi -- ok, sorry for the delay! I can confirm that this is now (finally) ready for a second look, and can also confirm that it triggers no new errors in our internal codebases. Specific changes I made:
I've also left a few responses to your comments up above. |
Thanks for the updates! I will make another pass of review tomorrow (most likely). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now modulo the TODO comment I mentioned, feel free to merge after adding it.
Btw, there is a merge conflict, could you please double check this is still OK with internal codebases after fixing the conflict (just in case).
@ilevkivskyi -- it looks like everything checks out internally. Before I merge though, I do have one final concern: while I can confirm this diff doesn't introduce any new errors, it's still possible that this diff will start making mypy mark previously reachable branches as being unreachable. How big of a concern is this/do you think this is something we should try testing for first in some way? |
I think this is OK. I am not too concerned about rare false negatives, we are already aware of the problem, and the issues are filed. I think you can go ahead and merge this. |
This pull request adds more robust support for detecting partially overlapping types. Specifically, it detects overlaps with...
It also swaps out the code for detecting overlaps with operators and removes some associated (and now unused) code.
This pull request builds on top of #5474 and #5475 -- once those two PRs are merged, I'll rebase this diff if necessary.
This pull request also supercedes #5475 -- that PR contains basically the same code as these three PRs, just smushed together.