KEMBAR78
gh-109649: Add os.process_cpu_count() function by vstinner · Pull Request #109907 · python/cpython · GitHub
Skip to content

Conversation

@vstinner
Copy link
Member

@vstinner vstinner commented Sep 26, 2023

  • Fix test_posix.test_sched_getaffinity(): restore the old CPU mask when the test completes!
  • Doc: Specify that os.cpu_count() counts logicial CPUs.

📚 Documentation preview 📚: https://cpython-previews--109907.org.readthedocs.build/

@vstinner
Copy link
Member Author

Adding a new process_cpu_count() makes it easier to extend the function later for new use cases, instead of modifying the existing os.cpu_count():

  • Maybe read Linux cgroups
  • Maybe add a pid parameter to get the CPU count of another process.
  • -X cpu_count=value option
  • Future new exciting stuff :-)

About the pid parameter. First, I would like to implement this function on Windows (get process affinity), and then check if it would be possible to get the CPU affinity of another process on Windows, before considering to add a pid parameter.

@vstinner
Copy link
Member Author

In psutil, this function is called Process.cpu_affinity(): https://psutil.readthedocs.io/en/latest/#psutil.Process.cpu_affinity

@vstinner vstinner marked this pull request as draft September 27, 2023 00:41
@vstinner vstinner marked this pull request as ready for review September 29, 2023 13:33
@vstinner
Copy link
Member Author

I prefer adding a new os.process_cpu_count() function, instead of adding a process=True parameter to the existing os.cpu_count() function because:

  • I consider adding a pid optional argument to os.process_cpu_count()
  • IMO the word "process" makes it explicit that the result is specific to the process, wheras cpu_count() is system-wide

@corona10
Copy link
Member

corona10 commented Sep 29, 2023

I will take a look at it by tomorrow!

@vstinner
Copy link
Member Author

cc @gpshead @indygreg

@vstinner
Copy link
Member Author

@gpshead: I updated my PR to reimplement os.process_cpu_count() in Python.

@vstinner
Copy link
Member Author

cpu_set = sched_getaffinity(0)
num_cpus = cpu_count()
return min(len(cpu_set), num_cpus) if cpu_set else num_cpus

Why do you want to return cpu_count() if it's smaller than len(sched_getaffinity(0))? It should not happen. If it happens, I would prefer to not workaround bugs, but fix the OS instead. For me, it should not happen.

sched_getaffinity() is basically a "mask" on the list of all available CPUs. It cannot be larger, unless cpu_count() ... counts CPUs differently. In that case, we should fix cpu_count() instead, no?

When I added time.monotonic(), I added an assertion in debug mode to make sure that... the clock is monotonic (doesn't go backward). Surprise, surprise. The clock is not. On some virtual machines, I saw it jumping backwards sometimes. Honestly. I just removed the assertion. Python cannot and should not work around OS bugs, but just expose them. Maybe for the worst known OS bugs, we might detect them and issue a warning. But I'm not even sure if it's worth it.

I recall that Python detects broken poll() implementation on macOS. In that case, we go further: we remove the function at runtime! (at Python startup)

I prefer to not make assumptions about hypothetical bugs, but wait until we get real concrete bug reports, and then decide how to deal with them.

Anyway, thanks for thinking about all corner cases, it's a good thing!

@corona10
Copy link
Member

corona10 commented Sep 30, 2023

Why do you want to return cpu_count() if it's smaller than len(sched_getaffinity(0))? It should not happen. If it happens, I would prefer to not workaround bugs, but fix the OS instead. For me, it should not happen

If we pass the -Xcpu_count=limited_number, it will be a more efficient implementation, we have to override os.cpu_count only. (It will automatically override os.process_cpu_count)

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, for the detail, I will delegate it to @gpshead

@vstinner
Copy link
Member Author

If we pass the -Xcpu_count=limited_number, it will be a more efficient implementation, we have to override os.cpu_count only. (It will automatically override os.process_cpu_count)

I would prefer that the discussion separately the implementation and the expected behavior.

Also, can we discussion -Xcpu_count later? This PR is about adding process_cpu_count(), nothing more.


.. function:: process_cpu_count()

Get the number of logical CPUs usable by the current process. Returns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably more accurate to say "usable by the calling thread of the current process". I believe each thread can have its own affinity.

(For parallelism planning purposes, the main thread prior to spawning others is likely what people would be calling this from anyways)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, you're right. I updated the doc.

Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One documentation improvement suggested but otherwise good.

@vstinner
Copy link
Member Author

@gpshead is right, the number of CPUs is "per thread" in thread.

Example:

import os
import threading

def sched_remove_one_cpu():
    mask = os.sched_getaffinity(0)
    mask.pop()
    os.sched_setaffinity(0, mask)

def set_affinity():
    tid = threading.get_native_id()
    print(f"thread {tid}: remove a second CPU")
    sched_remove_one_cpu()
    print(f"thread {tid}: {os.process_cpu_count()=}")

tid = threading.get_native_id()

print(f"main thread {tid}: remove a CPU")
sched_remove_one_cpu()

print(f"main thread {tid} before: {os.cpu_count()=}")
print(f"main thread {tid} before: {os.process_cpu_count()=}")

print()
thread = threading.Thread(target=set_affinity)
thread.start()
thread.join()
print()

print(f"main thread {tid} after: {os.cpu_count()=}")
print(f"main thread {tid} after: {os.process_cpu_count()=}")

Output on Linux:

main thread 2059677: remove a CPU
main thread 2059677 before: os.cpu_count()=12
main thread 2059677 before: os.process_cpu_count()=11

thread 2059678: remove a second CPU
thread 2059678: os.process_cpu_count()=10

main thread 2059677 after: os.cpu_count()=12
main thread 2059677 after: os.process_cpu_count()=11

You can see that the main thread loses a CPU when sched_remove_one_cpu() is called: os.process_cpu_count() is affected, but os.cpu_count() is not affected.

When a thread removes a second CPU, it affects os.process_cpu_count() in the thread, but the second CPU removal does not affected the main thread.

New thread inherit the mask of the caller thread, but a change in a child thread does not affect the parent thread. Well, that's the expected behavior.

* Refactor os_sched_getaffinity_impl(): move variable definitions to
  their first assignment.
* Fix test_posix.test_sched_getaffinity(): restore the old CPU mask
  when the test completes!
* Doc: Specify that os.cpu_count() counts *logicial* CPUs.
* Doc: Specify that os.sched_getaffinity(0) is related to the calling
  thread.
@vstinner
Copy link
Member Author

I made a few further changes to clarify differences between cpu_count(), process_cpu_count() and sched_affinity().

@vstinner
Copy link
Member Author

Tests / Hypothesis tests on Ubuntu (pull_request) Failing after 6m

FAIL: test_find_periodic_pattern (test.test_userstring.UserStringTest.test_find_periodic_pattern) (p='', text='')

I created issue GH-110160 for this bug.

@vstinner
Copy link
Member Author

Unrelated CI failures:

@vstinner vstinner merged commit c815210 into python:main Sep 30, 2023
@vstinner vstinner deleted the process_cpu_count branch September 30, 2023 22:12
@vstinner
Copy link
Member Author

Merged, thanks for reviews @gpshead and @corona10!

@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot aarch64 RHEL8 3.x has failed when building commit c815210.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/529/builds/5071) and take a look at the build logs.
  4. Check if the failure is related to this commit (c815210) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/529/builds/5071

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/threading.py", line 1066, in _bootstrap_inner
    self.run()
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/threading.py", line 1003, in run
    self._target(*self._args, **self._kwargs)
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/test/test_interpreters.py", line 484, in task
    interp = interpreters.create()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/test/support/interpreters.py", line 25, in create
    id = _interpreters.create(isolated=isolated)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: interpreter creation failed
k


Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 1354, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1325, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 929, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 1004, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1100, in get_code
  File "<frozen importlib._bootstrap_external>", line 1199, in get_data
TypeError: descriptor 'close' for '_io.BufferedReader' objects doesn't apply to a '_io.FileIO' object

@gpshead
Copy link
Member

gpshead commented Sep 30, 2023

if you're inclined to do so, the documentation improvements to the existing APIs in this PR would be worthwhile backporting to the 3.12 branch so that they show up on the default /3/ docs on python.org soon.

@vstinner
Copy link
Member Author

vstinner commented Oct 1, 2023

@gpshead:

if you're inclined to do so, the documentation improvements to the existing APIs in this PR would be worthwhile backporting to the 3.12 branch so that they show up on the default /3/ docs on python.org soon.

It makes sense. I wrote PR gh-110169 for Python 3.12 (and added "backport to 3.11" label).

Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
* Refactor os_sched_getaffinity_impl(): move variable definitions to
  their first assignment.
* Fix test_posix.test_sched_getaffinity(): restore the old CPU mask
  when the test completes!
* Doc: Specify that os.cpu_count() counts *logicial* CPUs.
* Doc: Specify that os.sched_getaffinity(0) is related to the calling
  thread.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants