Use PyTorch's p2p access enable function by banitag1 · Pull Request #1991 · pytorch/FBGEMM · GitHub

Use PyTorch's p2p access enable function #1991

Closed

banitag1 wants to merge 0 commits into pytorch:main from banitag1:export-D48939723

+0 −0

Contributor

banitag1 commented Sep 3, 2023

Summary:
Reland the diff after fixing some initialization issues.

cudaDeviceEnablePeerAccess only enables cross-device access for memory allocated with cudaMalloc. When memory is allocated through other CUDA APIs such as cuMemMap, peer access is managed differently.
With expandable_segments:True, PyTorch uses cuMemMap, so code that only calls cudaDeviceEnablePeerAccess is not sufficient to enable cross-device copies. This patch switches the p2p access enabling functions
to use PyTorch's get_p2p_access, which lets its allocator figure out how to correctly enable p2p access for that memory.

In the normal case (expandable_segments:False), this code performs exactly the same CUDA calls as before.

Differential Revision: D48939723
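
For readers who want the idea without reading the diff, here is a rough, pure-Python model of the pattern the summary describes: the caller asks one function whether a device pair has peer access, and the allocator-specific step that actually enables it is plugged in behind that function. This is only a sketch. `PeerAccessCache`, `enable_plain`, and `enable_mapped` are hypothetical names, no CUDA or PyTorch call is made, and the real `get_p2p_access` may differ in its details.

```python
from typing import Callable, Dict, List, Tuple


class PeerAccessCache:
    """Toy model of a lazily filled peer-access table (illustrative only)."""

    def __init__(
        self,
        num_devices: int,
        can_access: Callable[[int, int], bool],
        enable: Callable[[int, int], None],
    ) -> None:
        self.num_devices = num_devices
        self._can_access = can_access  # stand-in for a hardware capability query
        self._enable = enable          # stand-in for the allocator's enable step
        self._cache: Dict[Tuple[int, int], bool] = {}

    def get_p2p_access(self, dev: int, dev_to_access: int) -> bool:
        for d in (dev, dev_to_access):
            if not 0 <= d < self.num_devices:
                raise ValueError(f"device {d} is out of range")
        key = (dev, dev_to_access)
        if key not in self._cache:
            ok = self._can_access(dev, dev_to_access)
            if ok:
                # The allocator decides how to enable access for its memory.
                self._enable(dev, dev_to_access)
            self._cache[key] = ok
        return self._cache[key]


if __name__ == "__main__":
    log: List[Tuple[str, int, int]] = []

    def enable_plain(dev: int, peer: int) -> None:
        # Assumed stand-in for the expandable_segments:False path.
        log.append(("runtime peer enable", dev, peer))

    def enable_mapped(dev: int, peer: int) -> None:
        # Assumed stand-in for the expandable_segments:True path.
        log.append(("per-mapping access grant", dev, peer))

    for enable in (enable_plain, enable_mapped):
        cache = PeerAccessCache(2, lambda a, b: True, enable)
        cache.get_p2p_access(0, 1)
        cache.get_p2p_access(0, 1)  # second call is served from the cache
    print(log)
```

The caller's code is identical in both configurations; only the `enable` callback changes, which is the property the patch relies on.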

netlify bot commented Sep 3, 2023

✅ Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
🔨 Latest commit	`9ed959f`
🔍 Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/64f77c9eebb8990008be1198
😎 Deploy Preview	https://deploy-preview-1991--pytorch-fbgemm-docs.netlify.app

facebook-github-bot added the cla signed label

Contributor

facebook-github-bot commented Sep 3, 2023

This pull request was exported from Phabricator. Differential Revision: D48939723

facebook-github-bot added the fb-exported label

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

5a35c11

banitag1 force-pushed the export-D48939723 branch from de33565 to 5a35c11 Compare

September 4, 2023 16:23

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

d57362e

banitag1 force-pushed the export-D48939723 branch from 5a35c11 to d57362e Compare

September 4, 2023 16:24

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

077967a

banitag1 force-pushed the export-D48939723 branch from d57362e to 077967a Compare

September 4, 2023 16:24

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

643088e

banitag1 force-pushed the export-D48939723 branch from 077967a to 643088e Compare

September 4, 2023 23:44

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

52b7ea7

banitag1 force-pushed the export-D48939723 branch from 643088e to 52b7ea7 Compare

September 4, 2023 23:45

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

ac93de4

Reviewed By: zdevito

Differential Revision: D48939723

banitag1 force-pushed the export-D48939723 branch from 52b7ea7 to ac93de4 Compare

September 5, 2023 17:04

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

banitag1 force-pushed the export-D48939723 branch from ac93de4 to 9081083 Compare

September 5, 2023 17:05

banitag1 pushed a commit to banitag1/FBGEMM that referenced this pull request


Reland "Use PyTorch's p2p access enable function" after init fixes (pytorch#1991)

719a963

banitag1 force-pushed the export-D48939723 branch from 9081083 to 719a963 Compare

September 5, 2023 17:34

banitag1 closed this

banitag1 force-pushed the export-D48939723 branch from 719a963 to 9ed959f Compare

September 5, 2023 19:08

banitag1 mentioned this pull request

[PyTorch] Add the lazy init call for p2p access function (#1991) pytorch/pytorch#108589

Closed

banitag1 pushed a commit to banitag1/pytorch that referenced this pull request


          [PyTorch] Add the lazy init call for p2p access function (pytorch#1991)

4865c30

Summary: Pull Request resolved: pytorch/FBGEMM#1991

Test Plan: sandcastle

Reviewed By: zdevito

Differential Revision: D48939723

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request


[PyTorch] Add the lazy init call for p2p access function (#1991) (#108589)

3fe8417

Pull Request resolved: #108589
Approved by: https://github.com/zdevito

Labels

cla signed fb-exported