[ROCm] Update CUDAPluggableAllocator.h (#1984) by pytorchbot · Pull Request #153974 · pytorch/pytorch · GitHub
@pytorchbot
Collaborator

This change alters the preprocessor flag that selects the correct streamType in the CUDAPluggableAllocator class on ROCm GPUs. The TORCH_HIP_VERSION flag does not work as intended on ROCm, so it is replaced with USE_ROCM. The problem affects Distributed Fused Adam in ROCm/APEX when the nccl_ub feature is used. The fix has been tested with ROCm/APEX.

See PR ROCm/apex#184

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

Pull Request resolved: #150010
Approved by: https://github.com/jeffdaily

(cherry picked from commit a19b667)
@pytorch-bot

pytorch-bot bot commented May 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/153974

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 54e7bae with merge base 924a247 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the ciflow/rocm (Trigger "default" config CI on ROCm) and module: rocm (AMD GPU support for Pytorch) labels May 20, 2025
Contributor

@atalman atalman left a comment

lgtm

@atalman atalman merged commit 1ae9953 into release/2.7 May 22, 2025
191 of 196 checks passed
@github-actions github-actions bot deleted the cherry-pick-150010-by-pytorch_bot_bot_ branch June 23, 2025 02:20