rocblas -> hipblas changes for ROCm #5401
@rraminen Formatting checks failed with a trailing-whitespace error: https://github.com/microsoft/DeepSpeed/actions/runs/9115455800/job/25064328013?pr=5401#step:5:60 It should be a straightforward fix; can you please check?
Verified that the extension builds in the following cases:
- rocm/pytorch:rocm6.1_ubuntu20.04_py3.9_pytorch_1.13.1
This PR enables building the below extensions for AMD GPUs with warp size 32.
- transformer_inference
- quantizer
- random_ltd

This PR works stand-alone for torch version <=2.0. For the latest versions, #5401 is required to be merged in addition to this PR.

Unit test results (rocm/pytorch:rocm6.1_ubuntu20.04_py3.9_pytorch_2.1.2) on NAVI3x:

**transformer_inference:**
pytest --color=yes --durations=0 --verbose -s -m "inference_ops" -rF -n 4 unit/ops/transformer/inference
Before this PR:
===== 674 failed, 622 skipped, 8 warnings, 1728 errors in 69.37s (0:01:09) =====
After this PR:
========== 476 failed, 1062 passed, 1486 skipped, 8 warnings in 9.31s ==========

**quantizer:**
pytest --color=yes --durations=0 --verbose -s -m "inference_ops" -rF -n 4 unit/ops/quantizer
Before this PR:
==== 244 failed, 8 warnings in 30.53s ====
After this PR:
====== 186 failed, 58 passed, 8 warnings in 8.89s ======

I could not find random_ltd related unit tests to run.

Fixes: #4753 #5474 ROCm#68

cc: @jithunnair-amd

---------

Co-authored-by: rraminen@amd.com <rraminen>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Fixes #4989
In addition to this PR, the changes listed below are required to build the following extensions successfully. Note that not all unit tests for these extensions pass with this PR; details of the unit test results are given below. These unit tests are skipped in CI anyway, so they will not break the CI.

Extensions:
- transformer_inference
- quantizer
- random_ltd

Required changes:
- [ROCm] Add cublasGemmAlgo_t -> hipblasGemmAlgo_t pytorch/pytorch#121030
- Rocm warp size fix #5402
Unit test results (rocm/pytorch:rocm6.1_ubuntu20.04_py3.9_pytorch_2.1.2) on MI200:
transformer_inference:
pytest --color=yes --durations=0 --verbose -s -m "inference_ops" -rF -n 4 unit/ops/transformer/inference
Before this PR:
==== 674 failed, 622 skipped, 8 warnings, 1728 errors in 123.66s (0:02:03) =====
After this PR:
========== 555 failed, 983 passed, 1486 skipped, 8 warnings in 14.35s ==========
quantizer:
pytest --color=yes --durations=0 --verbose -s -m "inference_ops" -rF -n 4 unit/ops/quantizer
Before this PR:
==== 244 failed, 8 warnings in 48.02s ====
After this PR:
===== 187 failed, 57 passed, 8 warnings in 14.74s ====
I could not find random_ltd related unit tests to run.
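
For anyone comparing the before/after numbers above, the pytest summary lines can be reduced to counts with a few lines of Python. This is only a sketch for reading the results and is not part of the PR: the helper names `parse_summary` and `pass_rate` are hypothetical, and the pass rate is assumed to mean passed / (passed + failed + errors), with skipped tests excluded.

```python
import re

# Matches "<count> <label>" pairs such as "555 failed" or "1728 errors".
_COUNT_RE = re.compile(r"(\d+) (failed|passed|skipped|warnings?|errors?)")


def parse_summary(line):
    """Return the per-outcome counts from one pytest summary line (hypothetical helper)."""
    counts = {}
    for number, label in _COUNT_RE.findall(line):
        # Normalise singular/plural labels, e.g. "error" and "errors".
        key = label if label.endswith("ed") else label.rstrip("s") + "s"
        counts[key] = int(number)
    return counts


def pass_rate(counts):
    """Passed tests as a fraction of tests that ran (assumption: skips excluded)."""
    passed = counts.get("passed", 0)
    ran = passed + counts.get("failed", 0) + counts.get("errors", 0)
    return passed / ran if ran else 0.0


# transformer_inference summary lines on MI200, copied from the results above.
before = parse_summary(
    "==== 674 failed, 622 skipped, 8 warnings, 1728 errors in 123.66s (0:02:03) ====="
)
after = parse_summary(
    "========== 555 failed, 983 passed, 1486 skipped, 8 warnings in 14.35s =========="
)

print(f"passed: {before.get('passed', 0)} -> {after.get('passed', 0)}")
print(f"pass rate: {pass_rate(before):.1%} -> {pass_rate(after):.1%}")
```

The same two helpers apply unchanged to the quantizer summary lines.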