Optimize class creation #132042

@JelleZijlstra


Currently, creating an empty class is about 70x slower than creating an empty function in my profiling. Classes are much more complex and it makes sense that they're slower to create, but 70x feels excessive. (Related: #118761.)

I ran some profiling on my Mac with a sample script that just made empty classes in a loop:

Image
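For anyone who wants to reproduce the comparison, here is a minimal sketch of such a measurement. It is not the script used for the profile above; the helper names (`make_function`, `make_class`, `measure`) are made up for illustration, and the ratio you see will depend on the Python version, build, and machine.

```python
import timeit


def make_function():
    # Create and return a new empty function object on every call.
    def f():
        pass
    return f


def make_class():
    # Create and return a new empty class object on every call.
    class C:
        pass
    return C


def measure(n=20_000):
    """Return (seconds for n function creations, seconds for n class creations)."""
    t_func = timeit.timeit(make_function, number=n)
    t_class = timeit.timeit(make_class, number=n)
    return t_func, t_class


if __name__ == "__main__":
    t_func, t_class = measure()
    print(f"function creation: {t_func:.4f}s")
    print(f"class creation:    {t_class:.4f}s")
    print(f"ratio:             {t_class / t_func:.1f}x")
```

Running the class loop under a C-level sampling profiler is what shows where the time goes inside type creation; `timeit` alone only gives the overall ratio.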

A few things stood out:

  • A lot of time is spent updating slot definitions, i.e. filling in all of the tp_*, nb_*, etc. functions in the C struct for the type. We do this by iterating over all the slots, then looking up the function name (e.g., __add__) in the MRO and placing it in the slot for this class.
  • Significant time is spent in resolve_slotdups, which has a comment "XXX Maybe this could be optimized more -- but is it worth it?". Sounds promising. It helps deal with cases where one name maps to multiple slots (e.g. __add__ is both nb_add and sq_concat), and does that by iterating over all the slotdefs and finding other slots with the same name. It uses some scratch space in the interpreter state for this, which does not look thread-safe. I feel we could precompute the data instead, so we don't have to figure it out at runtime. For example, the slotdef struct could grow a new member to indicate whether or not the name is unique.
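To make the precomputation idea concrete, here is a rough Python model of it. This is only a sketch: the real table is a C array in the type implementation, `RAW_SLOTDEFS` is a tiny illustrative subset, and `name_is_unique` and `build_slotdefs` are hypothetical names for the proposed new member and the one-time setup step.

```python
from collections import Counter
from typing import NamedTuple


class SlotDef(NamedTuple):
    name: str             # dunder name looked up in the MRO
    slot: str             # C-level slot the name maps to
    name_is_unique: bool  # hypothetical new member, computed once


# Illustrative subset of (name, slot) pairs. Some names map to two slots.
RAW_SLOTDEFS = [
    ("__repr__", "tp_repr"),
    ("__hash__", "tp_hash"),
    ("__add__", "nb_add"),
    ("__add__", "sq_concat"),
    ("__len__", "sq_length"),
    ("__len__", "mp_length"),
]


def build_slotdefs(raw):
    """Count each name once up front and record whether it is unique."""
    counts = Counter(name for name, _ in raw)
    return [SlotDef(name, slot, counts[name] == 1) for name, slot in raw]


SLOTDEFS = build_slotdefs(RAW_SLOTDEFS)


def needs_dup_resolution(slotdef):
    # With the flag in place, unique names skip the scan over all slotdefs.
    return not slotdef.name_is_unique
```

With something like this, the duplicate check at class creation time becomes a flag test for the common case, and no per-interpreter scratch space is needed for it.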

Most types will define very few of these slots, so it makes sense to try to look for an approach that does less work for slots without changes. I think something like this should work:

  • First fill in the slots table with all the slots from the first base class.
  • Then collect all slots for which we may need changes: either slots that have a non-NULL value in the second or any later base, or slots whose name appears in the new class's __dict__. Perform an update for those slots only.
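The two steps above can be modeled in Python roughly as follows. This is a sketch of the idea, not CPython code: slot tables are plain dicts, `None` stands in for NULL, `SLOT_NAMES` is a small illustrative subset, and `fill_slots` and `resolve` are hypothetical names, with `resolve` standing in for the existing per-slot MRO lookup.

```python
# Map from slot to the dunder name that fills it (illustrative subset).
SLOT_NAMES = {
    "tp_repr": "__repr__",
    "tp_hash": "__hash__",
    "nb_add": "__add__",
    "sq_length": "__len__",
}


def fill_slots(base_tables, namespace, resolve):
    """Build the slot table for a new class.

    base_tables: one slot table (dict slot -> implementation or None) per
                 base class, in order.
    namespace:   the new class's __dict__.
    resolve:     callable doing the expensive per-slot update.
    Returns the new table and the set of slots that were updated.
    """
    # Step 1: start from a copy of the first base's slots.
    slots = dict(base_tables[0])

    # Step 2: collect only the slots that may need changes.
    dirty = set()
    for table in base_tables[1:]:
        dirty.update(slot for slot, impl in table.items() if impl is not None)
    dirty.update(slot for slot, name in SLOT_NAMES.items() if name in namespace)

    # Run the expensive update for those slots only.
    for slot in dirty:
        slots[slot] = resolve(slot)
    return slots, dirty
```

For an empty class with a single base, `dirty` is empty and `resolve` is never called, which is the case the proposal is trying to make cheap.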

This should make it possible to make class creation something like 2x faster. I haven't started working on implementing this and I may not have time to do it; if you see this and are interested, feel free to pick it up!


Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)
