Description
Currently all references to objects in frame objects use _PyStackRef instead of PyObject *.
This is necessary for the free-threaded build to support deferred references.
For the default build _PyStackRef is just an alias for PyObject *.
We should change _PyStackRef to use proper tagged pointers in the default build for two important reasons:
- It will reduce the maintenance burden of tagged pointers if they are the same in both builds.
- It offers a lot of optimization potential. The overhead of reference counting operations is large, and tagged pointers will allow us to reduce that overhead considerably.
My initial implementation is 0.8% slower, although I'd like to get that closer to 0 before merging anything. There is some speedup in the GC due to streamlined immortality checks, and some slowdown due to increased overhead of turning new PyObject * references into _PyStackRefs.
This small slowdown will allow us a large speedup (maybe more than 5%) as we can do the following:
- Reduce the overhead of refcount operations by using tagged references for the majority of LOAD_ instructions in the interpreter.
- Completely eliminate many decref operations by tracking which references are tagged in the JIT.
The tagging scheme:
| Tag | Meaning |
|---|---|
| 00 | Normal pointers |
| 01 | Pointers with embedded reference count |
| 10 | Unused |
| 11 | Pointer to immortal object [1] (including NULL) |
This tagging scheme is chosen as it provides the best performance for the most common operations:
- PyStackRef_DUP: Can check to see if the object's reference count needs updating with a single check and no memory read: ptr & 1
- PyStackRef_CLOSE: As for PyStackRef_DUP, only a single bit check is needed.
- PyStackRef_XCLOSE: Since NULL is treated as immortal and tagged, this is the same as PyStackRef_CLOSE.
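To make the single-bit check concrete, here is a minimal sketch of the tagging scheme using toy types. It is not the CPython implementation: ToyObject, ToyStackRef and the toy_* functions are hypothetical stand-ins, and the sketch assumes that a reference tagged 01 does not own a count in the object, which is one reading of "embedded reference count".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyObject: just a reference count. */
typedef struct { intptr_t refcnt; } ToyObject;

/* Hypothetical stand-in for _PyStackRef: a pointer with a 2-bit tag. */
typedef struct { uintptr_t bits; } ToyStackRef;

#define TOY_TAG_NORMAL   ((uintptr_t)0)  /* 00: normal pointer */
#define TOY_TAG_EMBEDDED ((uintptr_t)1)  /* 01: embedded reference count */
#define TOY_TAG_IMMORTAL ((uintptr_t)3)  /* 11: immortal object or NULL */
#define TOY_TAG_MASK     ((uintptr_t)3)

/* NULL is represented as a null pointer carrying the immortal tag. */
static const ToyStackRef TOY_NULL = { TOY_TAG_IMMORTAL };

/* Combine a pointer and a tag; assumes objects are at least 4-byte aligned. */
static inline ToyStackRef toy_make(ToyObject *op, uintptr_t tag) {
    assert(((uintptr_t)op & TOY_TAG_MASK) == 0);
    return (ToyStackRef){ (uintptr_t)op | tag };
}

/* Strip the tag to recover the object pointer. */
static inline ToyObject *toy_as_object(ToyStackRef ref) {
    return (ToyObject *)(ref.bits & ~TOY_TAG_MASK);
}

/* DUP: tags 01 and 11 both have the low bit set, so one test on the
   reference itself decides whether the object must be touched at all. */
static inline ToyStackRef toy_dup(ToyStackRef ref) {
    if (!(ref.bits & 1)) {
        toy_as_object(ref)->refcnt++;
    }
    return ref;
}

/* CLOSE: same single-bit test; a real interpreter would deallocate at zero. */
static inline void toy_close(ToyStackRef ref) {
    if (!(ref.bits & 1)) {
        ToyObject *op = toy_as_object(ref);
        assert(op->refcnt > 0);
        op->refcnt--;
    }
}

/* XCLOSE: NULL carries tag 11, so no separate NULL test is needed. */
static inline void toy_xclose(ToyStackRef ref) {
    toy_close(ref);
}
```

In this sketch, duplicating or closing a reference tagged 01 or 11 never dereferences the pointer, and toy_xclose(TOY_NULL) is safe for the same reason.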
Maintaining the invariant that tag 11 is used for all immortal objects is a bit expensive, but can be mitigated by pushing the conversion from PyObject * to _PyStackRef down to a point where it is known whether an object is newly created or not.
For newly created objects PyStackRef_FromPyObjectStealMortal can be used which performs no immortality check.
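The difference between the two conversions can be sketched as follows. This is an illustrative model rather than the real code: the toy types, the TOY_IMMORTAL_REFCNT threshold and both toy_from_object_* functions are hypothetical, and immortality is modelled here as a simple refcount comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PyObject and _PyStackRef. */
typedef struct { intptr_t refcnt; } ToyObject;
typedef struct { uintptr_t bits; } ToyStackRef;

#define TOY_TAG_IMMORTAL ((uintptr_t)3)          /* tag 11 */
/* Hypothetical threshold: refcounts at or above it mean "immortal". */
#define TOY_IMMORTAL_REFCNT ((intptr_t)1 << 30)

static inline int toy_is_immortal(const ToyObject *op) {
    return op->refcnt >= TOY_IMMORTAL_REFCNT;
}

/* General conversion: has to read the object's refcount (a memory load)
   to decide whether the reference gets tag 11. NULL is tagged as well. */
static inline ToyStackRef toy_from_object_steal(ToyObject *op) {
    uintptr_t tag = (op == NULL || toy_is_immortal(op)) ? TOY_TAG_IMMORTAL : 0;
    return (ToyStackRef){ (uintptr_t)op | tag };
}

/* Fast path for objects known to be newly created: the caller promises
   the object is mortal, so no check is performed in release builds. */
static inline ToyStackRef toy_from_object_steal_mortal(ToyObject *op) {
    assert(op != NULL && !toy_is_immortal(op));  /* debug-only sanity check */
    return (ToyStackRef){ (uintptr_t)op };
}
```

Pushing the conversion down to the allocation site lets the caller pick the second function, so the immortality test is only paid where the object's origin is unknown.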
[1] Actually, any object that was immortal when the reference was created. References to objects that are made immortal after the reference is created would have the low bits set to 00 or 01. This is OK as immortal refcounts have a huge margin of error and the number of possible references to one of these immortal objects is very small.
Linked PRs
- GH-127705: Use _PyStackRefs in the default build. #127875
- GH-127705: Add debug mode for _PyStackRefs inspired by HPy debug mode #128121
- GH-127705: better double free message. #130785
- GH-127705: Check for immortality in refcount accounting #131072
- GH-127705: Fix _Py_RefcntAdd to handle objects becoming immortal #131140
- GH-127705: Handle trace refs in specialized decref #131198
- GH-127705: Adds the missing bits from #131198 #131365
- GH-127705: Revert "Move mortal decrefs to internal header and make sure _PyReftracerTrack is called" #131500
- GH-127705: Don't call _Py_ForgetReference before _Py_Dealloc #131508
- gh-127705: Move Py_INCREF_MORTAL() to the internal C API #136178
- [3.14] gh-127705: Move Py_INCREF_MORTAL() to the internal C API (GH-136178) #136206