-
-
Notifications
You must be signed in to change notification settings - Fork 3k
Plugin for ctypes.Array #4869
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plugin for ctypes.Array #4869
Conversation
This provides accurate return types for ctypes.Array.__getitem__ and __iter__. __init__ is partially implemented but not yet working.
Hm, it looks like a forgot to press "Comment" a week ago, sorry. Trying to remember, I think my conclusion was: behavior of |
... and thanks for working on this! :-) |
No worries :)
OK, that means the following bugs exist at the moment:
Is that correct? Should I put these points into a separate issue (or multiple issues)? |
Yes, I think one common issue for these would be OK. |
This work has been in progress for a mighty long time. Is it still relevant? |
# Conflicts: # mypy/plugin.py
As far as I can tell, this is still blocked by #4964. I merged master into my branch and checked, and the method signature hooks aren't looked up correctly for I'm aware of #5379, but I think that doesn't help in this case. The Because of this, |
Are interested in fixing #4964 first (or some parts of it)? This is an important issue and you have a good context of what is needed in real world. |
@ilevkivskyi I think I have a possible fix for the remaining #4964 issues, see #5700. Based on my local testing, those fixes should be enough to make the ctypes plugin work correctly. |
Class instantiation intentionally does not call plugin hooks on __init__, instead function hooks for the class name are called. Since plugin hooks cannot modify the signature of a function call (there is no "function signature hook", unlike for methods), we instead perform manual argument type checking in a regular function hook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks very good. I also tried this agains our internal codebases, and it doesn't generate any errors.
Here I have some comments (mostly minor).
def get_function_hook(self, fullname: str | ||
) -> Optional[Callable[[FunctionContext], Type]]: | ||
from mypy.plugins import ctypes | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unfortunate, but it is not your fault. We already have an issue to refactor plugin modules to avoid these imports, see #5117. Could you please also add a comment (or comments?) about this also here?
Another idea is to move the ctypes
plugin to plugin.py
since all other stdlib plugins live there. In future we might add more stdlib plugins (e.g. for functools
) so this file will grow, but for now this is probably a better solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another idea is to move the
ctypes
plugin toplugin.py
since all other stdlib plugins live there
That would be possible, but I'm not sure if it's worth it here. The circular import issues aren't too big (the only affected parts are the argument type annotations of the hooks). And since the ctypes
plugin is a bit larger than the other hooks in plugin.py
, I think it makes more sense to have it in a separate module. (It's not the only stdlib plugin to have its own module - dataclasses
is separated as well, and it's roughly the same size as the ctypes
plugin.)
'Array constructor argument {} of type "{}"' | ||
' is not convertible to the array element type "{}"' | ||
.format(arg_num, arg_type, et), | ||
ctx.context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this also support calling Array
with *args
/**kwargs
/etc? If yes, I would add a test for this, if no, I would add a comment explaining this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it doesn't work with *args
at the moment, but it definitely should. **kwargs
should always be an error, since Array
doesn't take keyword arguments. (Yes, **{}
is technically allowed, but I really hope nobody does that.)
I'm not sure how to handle this in a function hook though - I don't see a way to find out if an argument is a regular arg, *args
, or **kwargs
.
In any case, I've added a (currently failing) test for *args
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the logic in function hook is sub-optimal, this is probably the most problematic hook currently, several people who tried it, found the API inconvenient. You also get corresponding expressions in ctx.args[0]
for every type in ctx.arg_types[0]
, maybe you can try checking if it is a StarExpr(...)
and then somehow "unpack" type if it maps to iterable supertype?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you can try checking if it is a
StarExpr(...)
This doesn't seem to work - if I print out ctx.args[0][0]
for a call like intarr4(*int_values)
, I get NameExpr(int_values [__main__.int_values])
, with no StarExpr
around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... This is unfortunate, there is a trivial fix to apply_function_plugin()
to pass arg_kinds
to FunctionContext
, but this looks like a partial solution of a bigger problem. Could you please add a TODO
comment here? We can fix it later after we decide what to do with the FunctionContext
API.
class slice: pass | ||
class str: pass | ||
class tuple(Generic[T]): pass | ||
class unicode: pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these two fixtures needed, or you can slightly tweak one of existing fixtures to accommodate your needs? If yes, then I think it is better than adding two new fixtures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The closest one would be floatdict.pyi
, if I add definitions for unicode
and slice
there, I can use it for all of the ctypes
tests.
The previous implementation was broken in a few ways, and the correct behavior can be easily achieved using map_instance_to_supertype.
This replaces the previous ctypes-specific fixtures.
This seems odd - shouldn't |
You are totally allowed to update that signature to be more reasonable :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updates! I have one more (hopefully last) round of comments here about few remaining corner cases (sorry for being pedantic here).
mypy/plugins/ctypes.py
Outdated
""" | ||
# Every type can be converted from itself (obviously). | ||
allowed_types = [tp] | ||
if isinstance(tp, Instance): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, should we care about unions here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should, but this isn't as straightforward as with the attribute hooks. The type returned from _autoconvertible_to_cdata
is not used as a return type, but an argument type, for example in Array.__setitem__
. This makes the rules for union types a little more complicated here.
For example, if arr
has the type Array[Union[c_int, c_uint]]
, arr[i] = x
is only valid if x
can be converted to both c_int
and c_uint
, i. e. the type of x
should be the intersection of _autoconvertible_to_cdata(c_int)
and _autoconvertible_to_cdata(c_uint)
.
As far as I know, mypy doesn't support intersection types, and I can't do an easy intersection calculation using sets, since Type
objects are not hashable. So I guess I'd have to write a manual intersection algorithm based on is_subtype
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mypy types are hashable (we have a subtype cache for example that uses this), but I am not sure this is what you need. Also I am not sure we need an intersection here, it is indeed safe, but I am afraid it may cause false positives. I am fine with a less strict check (especially if it is easy to implement). IOW I don't worry about false negatives because we don't allow unions here, I worry about false positives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mypy types are hashable
Ah, my mistake - I only looked at the Type
base class and saw that it had no __hash__
, but almost all of the concrete type classes define __hash__
.
IOW I don't worry about false negatives because we don't allow unions here, I worry about false positives.
In that case I can simply add another "union-loop" thing. But tbh even if this is changed later, I don't think it will give many unexpected errors - an Array[Union[...]]
type can only come from an explicit annotation, in which case the user perhaps wants this kind of strict checking.
mypy/plugins/ctypes.py
Outdated
if et is not None: | ||
types = [] | ||
for tp in union_items(et): | ||
if isinstance(tp, Instance) and tp.type.fullname() == 'ctypes.c_char': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should double-check that Any
is still valid here and elsewhere. Could you please add tests for this and update the code if necessary?
mypy/plugins/ctypes.py
Outdated
"""Callback to provide an accurate type for ctypes.Array.raw.""" | ||
et = _get_array_element_type(ctx.type) | ||
if et is not None: | ||
if isinstance(et, Instance) and et.type.fullname() == 'ctypes.c_char': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Union[c_char, Any]
be still valid here? I think it should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. I'm wondering though, what would a union like Union[c_char, Any]
be useful for? Every c_char
is an Any
, so why doesn't the union get simplified to Any
at some point? (The make_simplified_union
docstring points out that Any
s in unions are not simplified, but doesn't explain why.)
'Array constructor argument {} of type "{}"' | ||
' is not convertible to the array element type "{}"' | ||
.format(arg_num, arg_type, et), | ||
ctx.context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the logic in function hook is sub-optimal, this is probably the most problematic hook currently, several people who tried it, found the API inconvenient. You also get corresponding expressions in ctx.args[0]
for every type in ctx.arg_types[0]
, maybe you can try checking if it is a StarExpr(...)
and then somehow "unpack" type if it maps to iterable supertype?
Actually, after looking into it, it's not as easy to change as I thought. It turns out that |
@dgelessus I left two more comments, I would propose to apply simple fixes, land this (finally after many months :-) and then fix the remaining issues in a separate PR later. (The test that currently fails can be just skipped by adding |
This case cannot be handled correctly yet, because the function hook API doesn't expose the necessary information (see comment in code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I think this is ready now. The problem with *args
can be revisited after we improve the plugin hook API.
# unpacked. However, FunctionContext currently doesn't provide a way to differentiate | ||
# between normal arguments and *args, so the iterable type is considered invalid. | ||
# Once FunctionContext has an API for this, *args should be allowed here if the | ||
# iterable's element type is compatible with the array element type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any workaround for this? I currently have a list of integers that I want to convert into a ctypes array of bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just put a # type: ignore
on the line where error appears. We are already working on something that will allow to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Relevant issue: #5409 (comment))
Followup to python/typeshed#1986 (specifically this discussion). This adds a
mypy.plugins.ctypes
module, which provides accurate type info forctypes.Array
's methods. The corresponding Typeshed stub usesAny
in lots of places, becausectypes
' rules for what is accepted/returned are too complex. This plugin provides callbacks for the followingctypes.Array
methods and attributes:__init__
ctypes.Array
constructor__getitem__
__setitem__
__iter__
value
andraw
(see ctypes.create_unicode_buffer().value type bug typeshed#2111)Note: ignore the following paragraphs, the issues described there have now been fixed in mypy.
This is still a work in progress. I added some debug prints showing which plugin hooks are called when, because it seems they are not always called properly. Specifically:
__getitem__
,get_method_hook
is called as expected.__init__
,get_method_hook
is not called. Instead,get_function_hook
is called with"Array"
as the argument. This seems like a bug - shouldn't mypy figure out thatArray
is a class and look at its__init__
method? Even if this is intentional, the name should be"ctypes.Array"
and not"Array"
.__setitem__
,get_method_hook
is not called either. Instead,get_function_hook
is called with"__setitem__ of Array"
as the argument. This also seems like a bug -__setitem__
is clearly a method and not a function, and"__setitem__ of Array"
is not a fully qualified name.__iter__
, the behavior is like with__setitem__
.get_method_signature_hook
is never called for any of these dunder methods.Are these bugs, or am I simply misunderstanding how the plugin hooks are supposed to behave?
Here's the test code with which I observed this behavior: