[QNN EP] Fuse scale into softmax by qti-yuduo · Pull Request #24809 · microsoft/onnxruntime · GitHub

Conversation

@qti-yuduo
Contributor

@qti-yuduo qti-yuduo commented May 19, 2025

The QNN Softmax op defines a pre-scale parameter (`beta`), so a constant scalar multiply feeding Softmax can be folded into it.
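To illustrate why the fold is safe, here is a minimal sketch in plain Python. It assumes `beta` multiplies the input before exponentiation, which is what "pre-scale" suggests. The helper names (`softmax`, `mul_then_softmax`, `fused_softmax`) are hypothetical and are not part of ONNX Runtime or the QNN SDK.

```python
import math


def softmax(xs, beta=1.0):
    """Softmax with a pre-scale: exp(beta * x_i) / sum_j exp(beta * x_j)."""
    scaled = [beta * x for x in xs]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mul_then_softmax(xs, scale):
    """Unfused pattern: elementwise Mul by a constant scalar, then plain Softmax."""
    return softmax([scale * x for x in xs])


def fused_softmax(xs, scale):
    """Fused pattern: the constant scalar is carried as the Softmax pre-scale."""
    return softmax(xs, beta=scale)


if __name__ == "__main__":
    logits = [0.5, -1.0, 2.0, 3.25]
    print(mul_then_softmax(logits, 0.125))
    print(fused_softmax(logits, 0.125))
```

Both functions should produce the same probabilities for any scalar `scale`. Note that the sketch only covers a scalar constant; a per-element multiplier would not map onto a single `beta` value.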

@qti-yuduo
Contributor Author

@microsoft-github-policy-service agree [company=Qualcomm]

@qti-yuduo qti-yuduo force-pushed the dev/yuduow/scale-softmax-fusion branch from 3b33063 to 2e8583a Compare May 19, 2025 18:36
@qti-yuduo
Contributor Author

@microsoft-github-policy-service agree

@yuslepukhin
Member

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,ONNX Runtime Web CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline

@yuslepukhin
Member

/azp run Linux QNN CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@edgchen1 edgchen1 added the ep:QNN issues related to QNN execution provider label May 20, 2025
@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@qti-yuduo qti-yuduo force-pushed the dev/yuduow/scale-softmax-fusion branch from f1b7014 to ecbbcf7 Compare May 20, 2025 21:14
@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@HectorSVC
Contributor

You only need to check the QNN-related builds. You should have access to the build pipeline and its build logs.
For Linux:
/mnt/vss/_work/_temp/bfb99205-f415-44a0-be47-77d7f5a6877e.sh: line 3: 118621 Segmentation fault (core dumped) ./build/Release/onnx_test_runner -e qnn -j 1 -i "backend_path|/mnt/vss/_work/_temp/qnn-v2.33.2.250410/lib/x86_64-linux-clang/libQnnCpu.so" cmake/external/onnx/onnx/backend/test/data/node
You can run onnx_test_runner to reproduce it.

For Windows:
1: [ FAILED ] 5 tests, listed below:
1: [ FAILED ] QnnHTPBackendTests.ScaleSoftmaxFusionScalarInitializer
1: [ FAILED ] QnnHTPBackendTests.ScaleSoftmaxFusionScalarConstant
1: [ FAILED ] QnnHTPBackendTests.ScaleSoftmaxFusionScalarInitializerReversed
1: [ FAILED ] QnnHTPBackendTests.ScaleSoftmaxFusionScalarConstantReversed
1: [ FAILED ] QnnHTPBackendTests.ScaleSoftmaxFusionSoftmaxNegativeAxis
1:
1: 5 FAILED TESTS
1: YOU HAVE 13 DISABLED TESTS
1:
1/9 Test #1: onnxruntime_test_all ....................***Failed 248.51 sec
You can run this command to reproduce it:
onnxruntime_test_all.exe --gtest_filter=QnnHTPBackendTests.ScaleSoftmaxFusionScalarInitializer

@qti-yuduo
Contributor Author

qti-yuduo commented May 22, 2025

@HectorSVC would you mind triggering CI again? Thank you!!

@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@qti-yuduo
Contributor Author

qti-yuduo commented May 23, 2025

I can repro the Linux error. It should be fixed now.

@qti-yuduo qti-yuduo force-pushed the dev/yuduow/scale-softmax-fusion branch from 5a40bde to 09a403d Compare May 23, 2025 20:06
@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@qti-yuduo
Contributor Author

ping.

Contributor

@HectorSVC HectorSVC left a comment


:shipit:

@HectorSVC HectorSVC merged commit f9739c2 into microsoft:main May 28, 2025
82 checks passed
@qti-yuduo qti-yuduo deleted the dev/yuduow/scale-softmax-fusion branch July 18, 2025 20:14
adrianlizarraga pushed a commit that referenced this pull request Aug 1, 2025
QNN [Softmax op defines pre-scale (`beta`)](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/MasterOpDef.html#softmax) that we can fold constant scalar multiply into it.
adrianlizarraga pushed a commit that referenced this pull request Aug 5, 2025
QNN [Softmax op defines pre-scale (`beta`)](https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/MasterOpDef.html#softmax) that we can fold constant scalar multiply into it.
adrianlizarraga added a commit that referenced this pull request Aug 11, 2025
### Description
- #24265
- #24616
- #24640
- #24707
- #24646
- #24750
- #24809
- #24895
- #24820
- #25002
- #25171
- #25283
- #24818
- #25351
- #25361
- #25388
- #25520
- #25158




### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

---------

Co-authored-by: quic-zhaoxul <quic_zhaoxul@quicinc.com>
Co-authored-by: Yuduo Wu <6426433+1duo@users.noreply.github.com>
Co-authored-by: Hector Li <hecli@microsoft.com>
Co-authored-by: chenweng-quic <168707118+chenweng-quic@users.noreply.github.com>
Co-authored-by: qti-yuduo <yuduow@qti.qualcomm.com>
Co-authored-by: Akupadhye <aupadhye@qti.qualcomm.com>
Co-authored-by: Jeff Kilpatrick <jkilpatrick@qti.qualcomm.com>
Co-authored-by: Jeff Kilpatrick <jkilpat@qti.qualcomm.com>
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: quic-calvnguy <quic_calvnguy@quicinc.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>

Labels

ep:QNN issues related to QNN execution provider


4 participants