KEMBAR78
gh-127119: Remove check on accidental deallocation of immortal objects for free-threaded build by eendebakpt · Pull Request #127120 · python/cpython · GitHub
Skip to content

Conversation

@eendebakpt
Copy link
Contributor

@eendebakpt eendebakpt commented Nov 21, 2024

(Description was updated) The python small ints are immortal with a fixed reference count (also some other types). However, old extension modules can still modify the reference count and end up deallocating these objects.

This should not be an issue on 64-bit builds (even when accidentically changing the reference count it will take a very long time for an object to reach refcount zero) or the free-threaded build (which used a different reference counts, so has no legacy extensions modules)

In this PR we remove the check on accidental deallocation for the free-threading build. This has performance impact for the long type (where the tp_dealloc is actually called often). For the other types we remove it to be consistent, but it should have no performance impact.

Also see:

@skirpichev
Copy link
Contributor

Apparently, assertion is broken in tests.

@eendebakpt
Copy link
Contributor Author

Apparently, assertion is broken in tests.

Yes, my mistake. Should be fixed now

ZeroIntensity
ZeroIntensity previously approved these changes Nov 22, 2024
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ZeroIntensity ZeroIntensity added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Nov 22, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @ZeroIntensity for commit 4dc97c9 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Nov 22, 2024
@ZeroIntensity ZeroIntensity dismissed their stale review November 22, 2024 15:12

CI is failing, because apparently you can build with assertions but not with debug enabled (that's what I get for prematurely approving, ha).

ZeroIntensity
ZeroIntensity previously approved these changes Nov 22, 2024
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (again)!

Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this changes from silently re-immortalizing the small ints to actually letting them be deallocated in non-debug builds. in theory this is supposed to be fine, but how confident are we of (buggy) reference counting code not relying in the former behavior?

(I'm not blocking this PR, I think the change is acceptable, just want that pondered...)

@ZeroIntensity
Copy link
Member

in theory this is supposed to be fine, but how confident are we of (buggy) reference counting code not relying in the former behavior?

I thought about that too, but I think issues around immortal objects being buggy was fixed by GH-125251 (at least, I thought that was implied by the removal of _Py_IsImmortalLoose). Someone should double check though (I guess maybe it's possible for stable ABI extensions to break things).

@eendebakpt
Copy link
Contributor Author

The possibility of de-allocating immortal objects is described in [PEP 683, section Accidental De-Immortalizing](https://peps.python.org/pep-0683/#accidental-de-immortalizing). Also see
[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount" round 3. Even with the changes from #125251 this is still possible. The check in long_dealloc is therefore not redundant (although it is unlikely). The condiderations from PEP 683 are only 2 years old, so I do not think enough time has passed to reconsider the tradeoffs. I will update the comments in the PR with a pointer to the PEP.

But if I understand the correctly the accidental-de-immortalizing cannot happen on: 64-bit systems, or with the free-threading build, or when Py_LIMITED_API is not defined. I removed the check for these cases. I will benchmark this as well to see wether this makes a difference.

And yes, this should be double checked!

@eendebakpt
Copy link
Contributor Author

Results on a microbenchmark:

bench_long: Mean +- std dev: [main] 194 ns +- 3 ns -> [pr_small_int_check] 188 ns +- 2 ns: 1.03x faster
bench_alloc: Mean +- std dev: [main] 407 us +- 14 us -> [pr_small_int_check] 393 us +- 21 us: 1.03x faster

Benchmark hidden because not significant (1): bench_collatz

Geometric mean: 1.02x faster

I expect no overall improvement on the pyperformance benchmark.

Benchmark script
# Quick benchmark for cpython long objects

import pyperf


def collatz(a):
    while a > 1:
        if a % 2 == 0:
            a = a // 2
        else:
            a = 3 * a + 1


def bench_collatz(loops):
    range_it = range(loops)
    t0 = pyperf.perf_counter()
    for ii in range_it:
        collatz(ii)
    return pyperf.perf_counter() - t0


def bench_long(loops):
    range_it = range(loops)
    t0 = pyperf.perf_counter()
    x = 10
    for ii in range_it:
        x = x * x
        y = x // 2
        x = y + ii + x
        if x > 10**10:
            x = x % 1000
    return pyperf.perf_counter() - t0


def bench_alloc(loops):
    range_it = range(loops)
    t0 = pyperf.perf_counter()
    for ii in range_it:
        for kk in range(20_000):
            del kk
    return pyperf.perf_counter() - t0


# %timeit bench_long(1000)

if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.bench_time_func("bench_collatz", bench_collatz)
    runner.bench_time_func("bench_long", bench_long)
    runner.bench_time_func("bench_alloc", bench_alloc)

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now incorrect :(

Comment on lines 3617 to 3618
#ifdef Py_LIMITED_API
#ifndef Py_GIL_DISABLED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's possible for us to know when we're using the limited API. Py_LIMITED_API is something that affects Python.h at the compile-time of an extension module, but this happens during the compilation of CPython, so it will never be true.

@markshannon
Copy link
Member

As an alternative to removing the check altogether we could make it cheaper by using the reserved bit https://github.com/python/cpython/blob/main/Include/cpython/longintrepr.h#L79 as a marker for immortal/small ints.

@eendebakpt
Copy link
Contributor Author

@markshannon Thanks for the idea! I think it works and would not be hard to implement, but I am a bit hesitant to use a reserved bit for a small optimization like this. I created an alternative PR with a faster check based on your idea though

@markshannon
Copy link
Member

That bit is reserved for exactly this use case, so it is fine to use it.

@eendebakpt eendebakpt changed the title gh-127119: Remove check on small ints in long_dealloc gh-127119: Remove check on accidental deallocation of immortal objects for free-threaded build Dec 7, 2024
@eendebakpt
Copy link
Contributor Author

Closing in favor of #127620

@eendebakpt eendebakpt closed this Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants