gh-132657: Make deepcopy and copy scale with free-threading by eendebakpt · Pull Request #138429 · python/cpython · GitHub
@eendebakpt eendebakpt commented Sep 3, 2025

This PR improves scaling by:

  • Using a frozenset instead of a set for the atomic type lookup (this avoids locks).
  • Removing the redundant _nil sentinel. _nil is not immortal, so it causes some refcount contention.
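The two changes above can be illustrated with a simplified sketch. This is not the actual Lib/copy.py source: the names `_ATOMIC_TYPES` and `deepcopy_sketch` are hypothetical, and only lists and dicts are handled. It shows an atomic-type check against a frozenset and a memo lookup that uses None as the "not found" marker instead of a separate sentinel.

```python
# Simplified illustration only; names and scope are assumptions, not CPython code.
_ATOMIC_TYPES = frozenset({type(None), int, float, bool, complex, bytes, str, type, range})


def deepcopy_sketch(x, memo=None):
    cls = type(x)
    if cls in _ATOMIC_TYPES:  # membership test on an immutable frozenset
        return x

    d = id(x)
    if memo is None:
        memo = {}
    else:
        # None doubles as the "missing" marker, so no extra sentinel object is shared.
        y = memo.get(d)
        if y is not None:
            return y

    if cls is list:
        y = []
        memo[d] = y  # register before recursing so cycles resolve to the copy
        for item in x:
            y.append(deepcopy_sketch(item, memo))
        return y
    if cls is dict:
        y = {}
        memo[d] = y
        for key, value in x.items():
            y[deepcopy_sketch(key, memo)] = deepcopy_sketch(value, memo)
        return y
    raise TypeError(f"deepcopy_sketch does not handle {cls.__name__}")
```

The sketch relies on the copies stored in the memo never being None, which is the point discussed in the review comments further down.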

On the benchmark from #132658 this results in:

Main:

shallow_copy               5.8x slower
deepcopy                   2.1x slower

PR:

shallow_copy               1.1x faster
deepcopy                   2.5x faster
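The benchmark from #132658 is not reproduced here. As a rough, hypothetical harness for this kind of measurement, the sketch below times the same total number of deepcopy calls on one thread and spread across several threads, and returns the ratio. The function names, the sample data, and the iteration counts are all assumptions made for illustration; the numbers it produces are not comparable to the figures above.

```python
import copy
import time
from concurrent.futures import ThreadPoolExecutor


def work(n_iter):
    # Each worker copies its own small nested structure repeatedly.
    data = {"a": [1, 2, 3], "b": ("x", 4.0), "c": {"nested": [5, 6]}}
    for _ in range(n_iter):
        copy.deepcopy(data)


def timed(n_threads, n_iter):
    # Run n_threads workers, each doing n_iter copies, and return elapsed seconds.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(work, n_iter) for _ in range(n_threads)]
        for future in futures:
            future.result()
    return time.perf_counter() - start


def scaling_ratio(n_threads=4, n_iter=2000):
    # Same total work in both runs; a ratio above 1 means the threaded run was faster.
    single = timed(1, n_iter * n_threads)
    multi = timed(n_threads, n_iter)
    return single / multi
```

On a build with the GIL enabled the ratio is expected to stay near or below 1; the interesting case is a free-threaded build, where contention on shared objects limits how far the ratio can rise.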

Remaining work to be done (followup PRs):

  • Module-level objects such as copy.deepcopy, _atomic_types and _deepcopy_dispatch do not have deferred reference counting or immortality, so some refcount contention remains.
  • _deepcopy_dispatch is a dict. Once there is a frozendict in CPython, we should use that instead.

@eendebakpt added the stdlib, topic-free-threading, and performance labels Sep 3, 2025

@vstinner vstinner left a comment

    if memo is None:
        memo = {}
    else:
        y = memo.get(d, _nil)

I'm surprised, but replacing _nil with None works well. It seems like memo values are never None.

Contributor Author

Correct, the only values set are integers (id of objects)
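To see what the memo holds in practice, one can pass an explicit dict to copy.deepcopy and inspect it afterwards. This is a small exploratory sketch: the memo layout is an implementation detail that the copy documentation asks callers to treat as opaque, so the checks below describe observed CPython behaviour rather than a guaranteed contract. The variable names are illustrative.

```python
import copy

original = [1, [2, 3], {"k": (4, 5)}]
memo = {}
clone = copy.deepcopy(original, memo)

# The copy of the top-level list is recorded under the id() of the original.
top_level_recorded = memo[id(original)] is clone

# Keys are integer ids; the stored values are copied objects (plus a keep-alive list).
keys_are_ints = all(isinstance(key, int) for key in memo)
no_none_values = all(value is not None for value in memo.values())
```

In this run no stored value is None, which is what allows None to stand in for the old _nil default in the lookup.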


@serhiy-storchaka serhiy-storchaka left a comment

This will break user code that patches these sets, but this is not a big deal. There is a slower alternative -- modify copyreg.dispatch_table by using the public API copyreg.pickle(). LGTM. 👍

As for _deepcopy_dispatch -- I am planning to add a public API for modifying it, so it cannot be frozen.

And note that this performance gain will not last long. We will need to get rid of None in favor of another sentinel object. See #109498.
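As a minimal sketch of the copyreg.pickle() route mentioned above: registering a reduction function for a type makes copy.copy and copy.deepcopy use it when the type defines no __copy__ or __deepcopy__ of its own. The `Handle` class and `reduce_handle` function are hypothetical names for illustration. Note that the registration is global and also affects pickle.

```python
import copy
import copyreg


class Handle:
    """Hypothetical user type, used only for this illustration."""

    def __init__(self, name):
        self.name = name


reducer_calls = []


def reduce_handle(obj):
    # Return (callable, args); the copy module rebuilds the object from these.
    reducer_calls.append(obj.name)
    return Handle, (obj.name,)


# Public API: adds an entry to copyreg.dispatch_table.
copyreg.pickle(Handle, reduce_handle)

shallow = copy.copy(Handle("a"))
deep = copy.deepcopy(Handle("b"))
```

After these two calls, `reducer_calls` records both names, showing that the registered function was consulted for the shallow and the deep copy.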

eendebakpt and others added 2 commits September 4, 2025 09:14
Co-authored-by: Victor Stinner <vstinner@python.org>
@kumaraditya303 kumaraditya303 merged commit e46d403 into python:main Sep 4, 2025
45 checks passed
@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot s390x Fedora Stable Refleaks 3.x (tier-3) has failed when building commit e46d403.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1641/builds/858) and take a look at the build logs.
  4. Check if the failure is related to this commit (e46d403) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1641/builds/858

Failed tests:

  • test_external_inspection

Failed subtests:

  • test_only_active_thread - test.test_external_inspection.TestGetStackTrace.test_only_active_thread

Summary of the results of the build (if available):

==

Traceback logs:
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-stable-s390x.refleak/build/Lib/test/test_external_inspection.py", line 1246, in test_only_active_thread
    self.assertEqual(
    ~~~~~~~~~~~~~~~~^
        len(gil_traces), 1, "Should have exactly one GIL holder"
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
AssertionError: 0 != 1 : Should have exactly one GIL holder

@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 CentOS9 NoGIL Refleaks 3.x (tier-1) has failed when building commit e46d403.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1610/builds/2013) and take a look at the build logs.
  4. Check if the failure is related to this commit (e46d403) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1610/builds/2013

Failed tests:

  • test_free_threading

Summary of the results of the build (if available):

==

Traceback logs:
remote: Enumerating objects: 9, done.        
remote: Counting objects: 100% (8/8), done.        
remote: Compressing objects: 100% (8/8), done.        
remote: Total 9 (delta 0), reused 3 (delta 0), pack-reused 1 (from 1)        
From https://github.com/python/cpython
 * branch                    main       -> FETCH_HEAD
Note: switching to 'e46d403d59ab28b49d2056b55cc871600816a2bb'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at e46d403d59a gh-132657: improve `deepcopy` and `copy` scaling on free-threading (#138429)
Switched to and reset branch 'main'

configure: WARNING: no system libmpdec found; falling back to pure-Python version for the decimal module

make: *** [Makefile:2486: buildbottest] Error 2
