-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Use cpuinfo to determine c10::ThreadPool thread number #107010
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/107010
Note: Links to docs will display an error until the docs builds have been completed. ❌ 1 New FailureAs of commit 40d248c with merge base d392963 ( NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@pytorchbot label "topic: not user facing" |
afdf87f
to
b03c33f
Compare
3f88f7f
to
434cfb2
Compare
5fd145c
to
dd44c51
Compare
dd44c51
to
40d248c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks OK to me. @ezyang, what do you think? Is there someone who still maintains c10 ThreadPool?
I removed "topic: not user facing" because if we are changing the default number of threads, I think that's worth mentioning in release notes.
size_t num_threads = cpuinfo_get_processors_count(); | ||
if (num_threads > 0) { | ||
return num_threads; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make more sense to invoke cpuinfo_get_cores_count
instead? It is preferrable to only utilize physical cores in the case of HT on. cpuinfo_get_cores_count
makes sure we always utilize all physical CPU cores in both HT on and HT off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jgong5 cpuinfo_get_processors_count should return core count when HT off. As a reference implementation, caffe2/utils/threadpool/ThreadPool.cc also uses cpuinfo_get_processors_count for thread number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about the changes of the behavior when HT is on. Previously, only number of physical cores would be used. Now it would be all cores including HT cores, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this seems OK. I don't think anyone is full timing on thread pool, so if you're willing to keep pushing on these changes this would be a big help.
@pytorchmergebot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
@ezyang My pleasure to help the community |
@pytorchbot revert -m 'Breaks internal meta builds' -c ghfirst Hi, @cyyever, I'm terribly sorry, but I have to revert this for now, as it breaks internal meta builds with:
I'll look into forward fixing the builds ASAP and help you reland this PR. Again, apologies for the inconvenience. |
@pytorchbot successfully started a revert job. Check the current status here. |
@cyyever your PR has been successfully reverted. |
…)" This reverts commit ad04765. Reverted #107010 on behalf of https://github.com/izaitsevfb due to Breaks internal meta builds ([comment](#107010 (comment)))
@izaitsevfb Maybe some bazel target is missing cpuinfo dependency |
Just FYI, this could be captured in OSS CI by adding the |
This PR prefers "logical processor number" (the cpu cores shown in htop) returned by cpuinfo for determining c10 thread number. If that fails, it uses hardware_concurrency exactly. The motivation is that in a x86 host with 64 cores and Hyper-Threading disabled, the current behavior uses 32 threads, resulting half of cores being idle. Pull Request resolved: pytorch#107010 Approved by: https://github.com/ezyang
…ch#107010)" This reverts commit ad04765. Reverted pytorch#107010 on behalf of https://github.com/izaitsevfb due to Breaks internal meta builds ([comment](pytorch#107010 (comment)))
Relands PR #107010 and fixes BUCK builds. Pull Request resolved: #107339 Approved by: https://github.com/ezyang
This PR prefers "logical processor number" (the cpu cores shown in htop) returned by cpuinfo for determining c10 thread number. If that fails, it uses hardware_concurrency exactly.
The motivation is that in a x86 host with 64 cores and Hyper-Threading disabled, the current behavior uses 32 threads, resulting half of cores being idle.