KEMBAR78
Use cpuinfo to determine c10::ThreadPool thread number by cyyever · Pull Request #107010 · pytorch/pytorch · GitHub
Skip to content

Conversation

@cyyever
Copy link
Collaborator

@cyyever cyyever commented Aug 11, 2023

This PR prefers "logical processor number" (the cpu cores shown in htop) returned by cpuinfo for determining c10 thread number. If that fails, it uses hardware_concurrency exactly.
The motivation is that in a x86 host with 64 cores and Hyper-Threading disabled, the current behavior uses 32 threads, resulting half of cores being idle.

@pytorch-bot
Copy link

pytorch-bot bot commented Aug 11, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/107010

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 40d248c with merge base d392963 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@cyyever cyyever changed the title use cpuinfo to get thread num Use cpuinfo to determine thread pool size Aug 11, 2023
@cyyever
Copy link
Collaborator Author

cyyever commented Aug 11, 2023

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Aug 11, 2023
@cyyever cyyever changed the title Use cpuinfo to determine thread pool size Use cpuinfo to determine c10::ThreadPool size Aug 11, 2023
@cyyever cyyever changed the title Use cpuinfo to determine c10::ThreadPool size Use cpuinfo to determine c10::ThreadPool thread number Aug 11, 2023
@cyyever cyyever force-pushed the cpu_core_count branch 2 times, most recently from afdf87f to b03c33f Compare August 11, 2023 04:07
@cyyever cyyever marked this pull request as draft August 11, 2023 05:09
@cyyever cyyever force-pushed the cpu_core_count branch 3 times, most recently from 3f88f7f to 434cfb2 Compare August 11, 2023 06:19
@cyyever cyyever marked this pull request as ready for review August 12, 2023 01:54
@cyyever cyyever force-pushed the cpu_core_count branch 3 times, most recently from 5fd145c to dd44c51 Compare August 14, 2023 01:16
@colesbury colesbury added release notes: intel release notes category and removed topic: not user facing topic category labels Aug 14, 2023
@colesbury colesbury requested a review from ezyang August 14, 2023 19:43
Copy link
Member

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks OK to me. @ezyang, what do you think? Is there someone who still maintains c10 ThreadPool?

I removed "topic: not user facing" because if we are changing the default number of threads, I think that's worth mentioning in release notes.

@colesbury colesbury added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Aug 14, 2023
Comment on lines +9 to +12
size_t num_threads = cpuinfo_get_processors_count();
if (num_threads > 0) {
return num_threads;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make more sense to invoke cpuinfo_get_cores_count instead? It is preferrable to only utilize physical cores in the case of HT on. cpuinfo_get_cores_count makes sure we always utilize all physical CPU cores in both HT on and HT off.

Copy link
Collaborator Author

@cyyever cyyever Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jgong5 cpuinfo_get_processors_count should return core count when HT off. As a reference implementation, caffe2/utils/threadpool/ThreadPool.cc also uses cpuinfo_get_processors_count for thread number.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the changes of the behavior when HT is on. Previously, only number of physical cores would be used. Now it would be all cores including HT cores, isn't it?

@cyyever cyyever requested a review from jgong5 August 15, 2023 05:24
Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this seems OK. I don't think anyone is full timing on thread pool, so if you're willing to keep pushing on these changes this would be a big help.

@cyyever
Copy link
Collaborator Author

cyyever commented Aug 15, 2023

@pytorchmergebot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 15, 2023
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@cyyever
Copy link
Collaborator Author

cyyever commented Aug 15, 2023

@ezyang My pleasure to help the community

@izaitsevfb
Copy link
Contributor

@pytorchbot revert -m 'Breaks internal meta builds' -c ghfirst

Hi, @cyyever, I'm terribly sorry, but I have to revert this for now, as it breaks internal meta builds with:

caffe2/c10/core/thread_pool.cpp:3:10: fatal error: 'cpuinfo.h' file not found
#include <cpuinfo.h>
         ^~~~~~~~~~~
1 error generated.
    When running <c++ preprocess_and_compile>.

I'll look into forward fixing the builds ASAP and help you reland this PR. Again, apologies for the inconvenience.

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@cyyever your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Aug 16, 2023
…)"

This reverts commit ad04765.

Reverted #107010 on behalf of https://github.com/izaitsevfb due to Breaks internal meta builds ([comment](#107010 (comment)))
@cyyever
Copy link
Collaborator Author

cyyever commented Aug 16, 2023

@izaitsevfb Maybe some bazel target is missing cpuinfo dependency

@huydhn
Copy link
Contributor

huydhn commented Aug 16, 2023

Just FYI, this could be captured in OSS CI by adding the ciflow/periodic label to run run buck job there. Here is an example failure on periodic trunk buck job https://hud.pytorch.org/pytorch/pytorch/commit/22f5889753d38661aefeb8ea7110ff5787a1b5e9

summerdo pushed a commit to summerdo/pytorch that referenced this pull request Aug 17, 2023
This PR prefers "logical processor number" (the cpu cores shown in htop) returned by cpuinfo for determining c10 thread number. If that fails, it uses hardware_concurrency exactly.
The motivation is that in a x86 host with 64 cores and Hyper-Threading disabled, the current behavior uses 32 threads, resulting half of cores being idle.

Pull Request resolved: pytorch#107010
Approved by: https://github.com/ezyang
summerdo pushed a commit to summerdo/pytorch that referenced this pull request Aug 17, 2023
@cyyever cyyever deleted the cpu_core_count branch September 13, 2023 02:00
pytorchmergebot pushed a commit that referenced this pull request Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged open source release notes: intel release notes category Reverted triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants