Closed
Description
System Info
The Flash-Attention 2 check logic in `4.55.3` differs from `main`: the `4.55.3` code does not work on Ascend NPU, while the `main` branch works.
On Ascend NPU, the `4.55.3` code falls into the `else` branch and looks up the `flash-attn` package, which raises an `ImportError` because that package cannot be installed on Ascend NPU.
In `4.55.3`:
transformers/src/transformers/modeling_utils.py
Lines 2483 to 2492 in 7dbc054

    if not is_flash_attn_2_available():
        preface = "FlashAttention2 has been toggled on, but it cannot be used due to the following error:"
        install_message = "Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2."
        # package `flash-attn` can not be installed on Ascend NPU, ignore related validation logic
        if importlib.util.find_spec("flash_attn") is None and not is_torch_npu_available():
            raise ImportError(f"{preface} the package flash_attn seems to be not installed. {install_message}")
        else:
            # Check FA2 installed version compatibility
            flash_attention_version = version.parse(importlib.metadata.version("flash_attn"))
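To make the failure path concrete, here is a minimal standalone sketch of the `4.55.3` control flow. It does not import transformers: the function name `check_like_4_55_3`, the `npu_available` flag (standing in for `is_torch_npu_available()`), and the placeholder package name are assumptions made for illustration only.

```python
import importlib.metadata
import importlib.util

# Hypothetical stand-in for "flash_attn": a name assumed not to be installed.
MISSING_PACKAGE = "hypothetical_missing_pkg_for_fa2_demo"


def check_like_4_55_3(package: str, npu_available: bool) -> str:
    """Mimic the branch structure quoted above and report which path is taken."""
    if importlib.util.find_spec(package) is None and not npu_available:
        # Non-NPU machine without the package: explicit, friendly ImportError path.
        return "explicit-import-error"
    else:
        # On NPU the condition above is False even though the package is missing,
        # so the metadata lookup runs and fails.
        try:
            importlib.metadata.version(package)
        except importlib.metadata.PackageNotFoundError:
            return "metadata-lookup-failed"
        return "version-check-continues"


print(check_like_4_55_3(MISSING_PACKAGE, npu_available=False))
print(check_like_4_55_3(MISSING_PACKAGE, npu_available=True))
```

`importlib.metadata.PackageNotFoundError` is a subclass of `ImportError`, which would be consistent with the `ImportError` seen on Ascend NPU.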
In `main`:
transformers/src/transformers/modeling_utils.py
Lines 2479 to 2487 in 6b5bd11

    if not is_flash_attn_2_available():
        preface = "FlashAttention2 has been toggled on, but it cannot be used due to the following error:"
        install_message = "Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2."
        # package `flash-attn` can not be installed on Ascend NPU, following validation logics can be ignored.
        if is_torch_npu_available():
            logger.info("Detect using FlashAttention2 on Ascend NPU.")
            return True
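For comparison, the ordering used on `main` can be sketched the same way. This is an illustrative simplification, not the library code: `check_like_main` and its boolean parameters are hypothetical, and everything after the NPU early return is not shown in the quoted snippet, so the sketch simply assumes the remaining validation ends in an `ImportError`.

```python
def check_like_main(fa2_available: bool, npu_available: bool) -> bool:
    """Mimic the order of checks quoted from main, with injected flags."""
    if not fa2_available:
        # The NPU case is handled first, before any flash_attn package lookup.
        if npu_available:
            return True
        # Assumption: the remaining (elided) validation fails on non-NPU devices.
        raise ImportError("the package flash_attn seems to be not installed.")
    return True


# On Ascend NPU the check passes without touching the flash_attn package.
print(check_like_main(fa2_available=False, npu_available=True))
```

Because the NPU test comes first, the package metadata lookup is never reached on Ascend NPU.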
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Tested through LLaMA-Factory, although the error above is not closely related to LLaMA-Factory itself.
Expected behavior
The `4.55.3` branch should keep the same code logic as the `main` branch.