-
-
Notifications
You must be signed in to change notification settings - Fork 33.2k
Description
Bug report
Bug description:
Apologies if this is a duplicate. I couldn’t find a similar report, though.
The issue and how to reproduce
We’re seeing a performance regression since 124af17. The best way to reproduce it is to spawn lots of processes from a ThreadPoolExecutor. For example:
#!/usr/bin/env python3
from concurrent.futures import ThreadPoolExecutor, wait
from subprocess import Popen
def target(i):
p = Popen(['touch', f'/tmp/reproc-{i}'])
p.communicate()
executor = ThreadPoolExecutor(max_workers=64)
futures = set()
for i in range(10_000):
futures.add(executor.submit(target, i))
wait(futures)Before, on 49cae39, it’s roughly this:
real 0m2.419s
user 0m4.524s
sys 0m0.976s
Since 124af17, it’s roughly this:
real 0m11.772s
user 0m10.287s
sys 0m14.409s
An attempt at an analysis and possible fix
strace shows that the new code doesn’t use vfork() anymore but clone(). I believe that the reason for this is an incorrect check of num_groups (or extra_group_size, as it is now called on main).
124af17 checks if extra_group_size is less than zero to determine if we can use vfork(). Is that correct? Maybe this should be equal to zero? I’m talking about these two locations (diff relative to main/9e56eedd018e1a46):
diff --git a/Modules/_posixsubprocess.c b/Modules/_posixsubprocess.c
index 2898eedc3e..fb6c235901 100644
--- a/Modules/_posixsubprocess.c
+++ b/Modules/_posixsubprocess.c
@@ -889,7 +889,7 @@ do_fork_exec(char *const exec_array[],
/* These are checked by our caller; verify them in debug builds. */
assert(uid == (uid_t)-1);
assert(gid == (gid_t)-1);
- assert(extra_group_size < 0);
+ assert(extra_group_size == 0);
assert(preexec_fn == Py_None);
/* Drop the GIL so that other threads can continue execution while this
@@ -1208,7 +1208,7 @@ subprocess_fork_exec_impl(PyObject *module, PyObject *process_args,
/* Use vfork() only if it's safe. See the comment above child_exec(). */
sigset_t old_sigs;
if (preexec_fn == Py_None && allow_vfork &&
- uid == (uid_t)-1 && gid == (gid_t)-1 && extra_group_size < 0) {
+ uid == (uid_t)-1 && gid == (gid_t)-1 && extra_group_size == 0) {
/* Block all signals to ensure that no signal handlers are run in the
* child process while it shares memory with us. Note that signals
* used internally by C libraries won't be blocked byextra_group_size is the result of the call to PySequence_Size(extra_groups_packed). If I understand the docs correctly, then this function only returns negative values to indicate errors. This error condition is already checked, right after the call itself:
extra_group_size = PySequence_Size(extra_groups_packed);
if (extra_group_size < 0)
goto cleanup;Later in the code, extra_group_size can never be less than zero. It can, however, be equal to zero if extra_groups is an empty list. I believe this is what was meant to be checked here.
I’ll happily open a PR for this if you agree that this is the way to go.
CPython versions tested on:
3.11, 3.12, 3.13, CPython main branch
Operating systems tested on:
Linux
Linked PRs
Metadata
Metadata
Assignees
Labels
Projects
Status