-
-
Notifications
You must be signed in to change notification settings - Fork 33.2k
gh-108494: Argument Clinic: inline parsing code for positional-only parameters in the limited C API #108622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-108494: Argument Clinic: inline parsing code for positional-only parameters in the limited C API #108622
Conversation
1cef242
to
31238a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR avoids private _PyArg_BadArgument() and _PyArg_CheckPositional() functions in the limited C API, and generate similar code for that. What about adding these functions to the limited C API instead?
The advantage of your approach is that we can target limited C API older than version 3.13 which is appealing. But is it worth it?
pl = '' if min_pos == 1 else 's' | ||
parser_code.append(normalize_snippet(f""" | ||
if ({nargs} != {min_pos}) {{{{ | ||
PyErr_Format(PyExc_TypeError, "{{name}} expected {min_pos} argument{pl}, got %zd", {nargs}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generate many static C strings for similar error messages. Would it make sense to pass name and min_pos (and "s" for plural) as argument to PyErr_Format(), and have a unique format string? It would be faster but make the Python binary smaller and use less memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_PyArg_BadArgument()
is has 4 parameters, but it is not generic enough yet. It formats an error message in the style of some error messages formatted in PyArg_ParseTuple()
and PyArg_ParseTupleAndKeywords()
, but even these functions has cases which cannot be covered by _PyArg_BadArgument()
("must be %d-item sequence", "must be sequence of length %d"), and there are many other forms of error messages for wrong type of argument in the rest of code. If _PyArg_BadArgument()
be general enough and used in many other places outside of Argument Clinic generated code, it will bi worth to add it to the public API.
On other hand, PyErr_Format()
is extremely general and flexible variant of _PyArg_BadArgument()
.
indent=4)) | ||
else: | ||
parser_code = [normalize_snippet(f""" | ||
if (!_PyArg_CheckPositional("{{name}}", {nargs}, {min_pos}, {max_args})) {{{{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about moving _PyArg_CheckPositional() to the limited C API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can be done, together with _PyArg_NoKeywords()
, _PyArg_NoPositional()
and _PyArg_NoKwnames()
.
But the main problem is handling keyword arguments, so I want to add new API only when we see the whole image.
|
||
elif (not requires_defining_class and pos_only == len(parameters) and | ||
not pseudo_args and limited_capi): | ||
# positional-only for the limited C API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is no longer needed, because there is other PyArg_ParseTuple()
below which is used if inlining the parsing code failed. PyArg_UnpackTuple()
no longer used, because inlining is always successful in this simple case.
nargs = f"Py_MIN(nargs, {max_pos})" if max_pos else "0" | ||
|
||
if limited_capi: | ||
# positional-or-keyword arguments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was moved down, to a fallback code used if inlining failed. For now, it always failed for limited_capi, because there is no public variant of _PyArg_UnpackKeywords
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanatory review comments.
I don't have the bandwith to review the generated C param parsing code. I guess Victor does a better job of reviewing that, than me. If Alex is fine with the clinic.py changes, you are good to go. BTW, I really like the introduction of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto to what Erlend said: I can't really review the generated C code, but I approve of the concept and the clinic.py changes look good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The overall change LGTM. clinic.py coding style makes reviews really complicated :-( I prefer testing fastcall
than new_or_init
, that's a nice improvment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work! if there are bugs lurking in here (it is hard to visually spot everything in this kind of large clinic.py change) i'm sure they'll show themselves in generated code behaviors that can be fixed up later.
There's a new commit after the PR has been approved. @gpshead, @vstinner, @erlend-aasland, @AlexWaygood: please review the changes made to this pull request. |
It fixes some code which uses private converters, like
_PyLong_UnsignedShort_Converter
. Inlined code only uses the limited C API.But it still includes internal headers. It is difficult to not include them, because they are needed when the parsing code is not inlined, and this depends not only on the specified converter, but on other converters and other used features.
There is a behavior change: converters like
_PyLong_UnsignedShort_Converter
raise ValueError for negative integer, but the inlined code raise OverflowError. It is very difficult to detect negative integer in the limited C API. It can be fixed by changing exception inPyLong_AsUnsignedLong
(see #74019).I also reorganized the code for keyword parameters, so it will be easier to enable inlining once we add a public version of
_PyArg_UnpackKeywords
in the limited C API, which is the only stopper.All tests (except one) are passed with using the limited C API (if possible).