[ROCm] Prevent accidental enablement of efficient attention. (#134531) by xinyazhang · Pull Request #1565 · ROCm/pytorch

Conversation

@xinyazhang


[ROCm] Prevent accidental enablement of efficient attention. (pytorch#133331)

Currently, efficient attention and flash attention share the same set of GPU kernels on ROCm and therefore have common limitations on head sizes.

Fixes pytorch#132004

Pull Request resolved: pytorch#133331
Approved by: https://github.com/malfet, https://github.com/jithunnair-amd

(cherry picked from commit 46ecc67)

Co-authored-by: Xinya Zhang <Xinya.Zhang@amd.com>
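
For readers unfamiliar with how this guard surfaces, here is a minimal sketch using PyTorch's public SDPA backend selector (`torch.nn.attention.sdpa_kernel`, available in 2.4). This is not the PR's test code, and the head size of 257 is an assumption suggested by the `no257meff` branch name rather than a documented limit. On a ROCm build, forcing the memory-efficient backend on such an input should now fail loudly instead of silently running an unsupported kernel.

```python
# Hypothetical repro sketch (not from the PR): on ROCm, the memory-efficient
# and flash SDPA backends share one kernel family, so a head size that flash
# attention rejects must not be silently accepted by "efficient attention".
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# head_dim=257 is an assumed unsupported size, used for illustration only.
q = torch.randn(1, 8, 128, 257, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 257, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 257, device="cuda", dtype=torch.float16)

# Restrict dispatch to the memory-efficient backend only. With this fix,
# an unsupported head size raises a RuntimeError ("No available kernel")
# rather than dispatching to a kernel that cannot handle it.
try:
    with sdpa_kernel(SDPBackend.EFFICIENT_ATTENTION):
        out = F.scaled_dot_product_attention(q, k, v)
except RuntimeError as err:
    print(f"efficient attention unavailable, as expected: {err}")
```

Without the backend restriction, `scaled_dot_product_attention` would simply fall back to the math backend, which has no head-size limit.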
@pruthvistony merged commit 7e5ac3e into release/2.4 on Sep 9, 2024 (1 check failed).
@pruthvistony deleted the xinyazhang/no257meff-rocm-2.4 branch on September 9, 2024 at 06:25.
