[mypyc] Use faster METH_FASTCALL wrapper functions on Python 3.7+ by JukkaL · Pull Request #9894 · python/mypy · GitHub

@JukkaL JukkaL commented Jan 9, 2021

Implement faster argument parsing based on METH_FASTCALL on supported
Python versions.

Use vgetargskeywordsfast extracted from Python 3.9 with some modifications:

  • Support required keyword-only arguments, *args and **kwargs
  • Only support the 'O' type (to reduce code size and speed things up)

The modifications are very similar to what we have in the old-style
argument parsing logic.
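
As a rough illustration of the calling convention this parser targets, here is a minimal Python model of how a `METH_FASTCALL | METH_KEYWORDS` wrapper receives its arguments: one flat sequence holding the positional values followed by the keyword values, plus a tuple of keyword names. The function name `parse_stack_and_keywords` and all of its parameters are hypothetical and are not mypyc's actual C helper. Like the modified parser, the model handles only plain objects (the 'O' type), and it covers required keyword-only arguments, *args and **kwargs.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


def parse_stack_and_keywords(
    stack: Sequence[Any],
    nargs: int,
    kwnames: Optional[Tuple[str, ...]],
    pos_names: List[str],
    num_required: int,
    kwonly_required: List[str],
) -> Tuple[Dict[str, Any], Tuple[Any, ...], Dict[str, Any]]:
    """Hypothetical model of FASTCALL argument parsing (not mypyc's real API).

    stack[:nargs] holds the positional values. stack[nargs:] holds the
    keyword values, whose names are given in the same order by kwnames.
    Returns (bound arguments, extra *args, extra **kwargs).
    """
    bound: Dict[str, Any] = {}
    # Bind positional values to parameter names; any overflow goes to *args.
    for name, value in zip(pos_names, stack[:nargs]):
        bound[name] = value
    star_args = tuple(stack[len(pos_names):nargs])

    star_kwargs: Dict[str, Any] = {}
    known = set(pos_names) | set(kwonly_required)
    for i, name in enumerate(kwnames or ()):
        value = stack[nargs + i]
        if name in bound:
            raise TypeError(f"got multiple values for argument '{name}'")
        if name in known:
            bound[name] = value
        else:
            star_kwargs[name] = value  # unknown keywords go to **kwargs

    # Required positional and required keyword-only arguments must be bound.
    for name in pos_names[:num_required] + kwonly_required:
        if name not in bound:
            raise TypeError(f"missing required argument '{name}'")
    return bound, star_args, star_kwargs
```

In this model, a call such as `f(1, 2, 3, key=4, extra=5)` arrives as `stack=[1, 2, 3, 4, 5]`, `nargs=3`, `kwnames=("key", "extra")`. No argument tuple or keyword dict has to be built before parsing starts, which is where the fast calling convention saves work compared with the legacy one.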

The legacy calling convention is still used for __init__ and __call__. I'll add
__call__ support in a separate PR. I haven't looked into supporting __init__
yet.

Here are some benchmark results (on Python 3.8):

  • keyword_args_from_interpreted: 3.5x faster than before
  • positional_args_from_interpreted: 1.4x faster than before

However, the above benchmarks are still slower when compiled. I'll continue
working on further improvements after this PR.

Fixes mypyc/mypyc#578.

@JukkaL JukkaL requested a review from msullivan January 9, 2021 13:53

@ilevkivskyi ilevkivskyi left a comment


LGTM, just one suggestion.

# This is because CPyArg_ParseStackAndKeywords format string requires
# them grouped in that way.
groups = make_arg_groups(real_args)
reordered_args = reorder_arg_groups(groups)

These preparatory steps here are very similar to the legacy wrapper below. Would it make sense to factor them out in a helper?

@JukkaL (Collaborator, Author) replied:

Refactored some of the shared code. I didn't share everything, since I'm planning further changes that may only be relevant for new-style wrapper functions.
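
For readers unfamiliar with the grouping step under review, it can be sketched roughly as follows. This is a simplified, self-contained model: the `ArgKind` enum, the `RuntimeArg` tuple and the exact group order are assumptions made for illustration, and the real helpers in mypyc may differ.

```python
from enum import Enum
from typing import Dict, List, NamedTuple


class ArgKind(Enum):
    """Hypothetical argument kinds (illustration only)."""
    POS = 0        # required positional
    OPT = 1        # optional positional
    NAMED = 2      # required keyword-only
    NAMED_OPT = 3  # optional keyword-only


class RuntimeArg(NamedTuple):
    name: str
    kind: ArgKind


def make_arg_groups(args: List[RuntimeArg]) -> Dict[ArgKind, List[RuntimeArg]]:
    """Group arguments by kind, preserving declaration order in each group."""
    return {kind: [a for a in args if a.kind == kind] for kind in ArgKind}


def reorder_arg_groups(groups: Dict[ArgKind, List[RuntimeArg]]) -> List[RuntimeArg]:
    """Concatenate the groups in the order a format string would list them.

    The order used here (required positional, optional positional,
    optional keyword-only, required keyword-only) is an assumption.
    """
    return (
        groups[ArgKind.POS]
        + groups[ArgKind.OPT]
        + groups[ArgKind.NAMED_OPT]
        + groups[ArgKind.NAMED]
    )


# Example: def f(a, *, k, b=0, o=None) style signature, declared out of group order.
example_args = [
    RuntimeArg("a", ArgKind.POS),
    RuntimeArg("k", ArgKind.NAMED),
    RuntimeArg("b", ArgKind.OPT),
    RuntimeArg("o", ArgKind.NAMED_OPT),
]
reordered_names = [a.name for a in reorder_arg_groups(make_arg_groups(example_args))]
```

Because both the legacy and the new-style wrapper need the arguments in this grouped order before emitting a format string, these two steps are natural candidates for a shared helper.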

@JukkaL JukkaL merged commit 5d2ea16 into master Jan 23, 2021
@JukkaL JukkaL deleted the vectorcall branch January 23, 2021 14:21
JukkaL added a commit that referenced this pull request Jan 23, 2021
Allocate a vectorcall function pointer as a struct field for native
classes that include `__call__`, including nested functions. This
lets us use METH_FASTCALL wrapper functions with `__call__`
methods.

See https://www.python.org/dev/peps/pep-0590/ for details of why
we jump through these hoops.

This makes the `nested_func` microbenchmark about 1.5x faster.

Follow-up to #9894.
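
The idea of storing a vectorcall function pointer in each instance can be modeled loosely in Python. The sketch below is not mypyc's generated C code: the `Adder` class, the `vectorcall` attribute and the `call` dispatcher are hypothetical stand-ins, and the flag bit that PEP 590 packs into its `nargsf` parameter is left out.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# Signature modeled on PEP 590: (callable, args, nargs, kwnames) -> result.
VectorcallFunc = Callable[[Any, Sequence[Any], int, Optional[Tuple[str, ...]]], Any]


def adder_vectorcall(
    self: "Adder",
    args: Sequence[Any],
    nargs: int,
    kwnames: Optional[Tuple[str, ...]],
) -> Any:
    """Fast path: read the argument straight from the flat sequence."""
    if nargs != 1 or kwnames:
        raise TypeError("expected exactly one positional argument")
    return self.base + args[0]


class Adder:
    """Hypothetical native class with __call__ (illustration only)."""

    def __init__(self, base: int) -> None:
        self.base = base
        # Stands in for the per-instance function pointer struct field.
        self.vectorcall: Optional[VectorcallFunc] = adder_vectorcall

    def __call__(self, x: int) -> int:
        # Legacy path: the interpreter packs arguments into a tuple and a dict.
        return self.base + x


def call(obj: Any, *args: Any, **kwargs: Any) -> Any:
    """Model of a caller that prefers the vectorcall slot when present."""
    func = getattr(obj, "vectorcall", None)
    if func is None:
        return obj(*args, **kwargs)  # fall back to the ordinary call path
    stack = list(args) + list(kwargs.values())
    kwnames = tuple(kwargs) or None
    return func(obj, stack, len(args), kwnames)
```

Here `call(Adder(10), 5)` takes the fast path through `adder_vectorcall`, while an object without the slot, such as the built-in `len`, falls back to a normal call.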

Successfully merging this pull request may close these issues.

Declare C API wrappers with METH_FASTCALL (on supported versions)
