[BLAS] Avoid downcasts for fp16fp16->fp32 BLAS by malfet · Pull Request #161999 · pytorch/pytorch · GitHub

Conversation

@malfet (Contributor) commented Sep 2, 2025

[ghstack-poisoned]
@pytorch-bot bot commented Sep 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161999

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 3 Pending

As of commit c3df264 with merge base 6737e2c:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

malfet added a commit that referenced this pull request Sep 2, 2025
Followup after #154012

Fixes CPU part of #160841

ghstack-source-id: 966f5fd
Pull Request resolved: #161999
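As context for the fix, the following standalone NumPy sketch (illustrative only, not PyTorch's actual kernel) shows why fp16 products should be accumulated in fp32 rather than downcast: summing many fp16 products in fp16 overflows, because fp16's largest finite value is 65504.

```python
import numpy as np

# Illustrative sketch (not PyTorch's kernel): summing many fp16 products in
# fp16 overflows, while fp32 accumulation of the same products is exact here.
a = np.full(4096, 8.0, dtype=np.float16)
b = np.full(4096, 8.0, dtype=np.float16)

# fp16 accumulation: the running sum of 4096 * 64 = 262144 exceeds 65504
# and overflows to inf along the way.
acc_fp16 = np.float16(0.0)
for x, y in zip(a, b):
    acc_fp16 = np.float16(acc_fp16 + np.float16(x * y))

# fp32 accumulation of the same fp16 inputs: exact result.
acc_fp32 = np.float32(a.astype(np.float32) @ b.astype(np.float32))

print(acc_fp16)  # inf
print(acc_fp32)  # 262144.0
```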
@malfet malfet requested review from Skylion007 and drisspg September 2, 2025 19:54
@malfet added the `module: linear algebra`, `topic: bug fixes`, and `ciflow/trunk` labels Sep 2, 2025
@malfet added the `release notes: linalg_frontend` label and removed the `module: linear algebra` label Sep 3, 2025
@pytorchmergebot (Collaborator)

Starting merge as part of PR stack under #162001

pytorchmergebot pushed a commit that referenced this pull request Sep 3, 2025
Followup after #154012

Since the introduction of `gemm_no_downcast_stub` it is no longer necessary to allocate a temporary array and then implement the `beta` scaling logic manually in the codebase
Pull Request resolved: #162001
Approved by: https://github.com/drisspg
ghstack dependencies: #161999
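The simplification described in that commit message can be sketched as follows; the function names here are illustrative, not PyTorch's internal API. Both variants compute `alpha * (A @ B) + beta * C` for fp16 `A`, `B` and an fp32 `C`, accumulating the product in fp32:

```python
import numpy as np

def gemm_with_manual_beta(alpha, a, b, beta, c):
    # Old pattern: run the product into a freshly allocated fp32 temporary
    # with beta = 0, then apply the beta scaling of C by hand.
    tmp = np.zeros(c.shape, dtype=np.float32)
    tmp += alpha * (a.astype(np.float32) @ b.astype(np.float32))
    return tmp + beta * c

def gemm_no_downcast(alpha, a, b, beta, c):
    # New pattern: one call produces the fp32 result directly, with no
    # temporary allocation and no intermediate downcast to fp16.
    return alpha * (a.astype(np.float32) @ b.astype(np.float32)) + beta * c

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float16)
b = rng.standard_normal((8, 4)).astype(np.float16)
c = rng.standard_normal((4, 4)).astype(np.float32)
assert np.allclose(gemm_with_manual_beta(2.0, a, b, 0.5, c),
                   gemm_no_downcast(2.0, a, b, 0.5, c))
```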
@jeanschmidt (Contributor)

@pytorchbot revert -m "break a few internal tests" -c ghfirst

@pytorchmergebot (Collaborator)

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Sep 4, 2025
This reverts commit b40d943.

Reverted #162001 on behalf of https://github.com/jeanschmidt due to break a few internal tests ([comment](#161999 (comment)))
pytorchmergebot added a commit that referenced this pull request Sep 4, 2025
This reverts commit 02c83f1.

Reverted #161999 on behalf of https://github.com/jeanschmidt due to break a few internal tests ([comment](#161999 (comment)))
@pytorchmergebot (Collaborator)

@malfet your PR has been successfully reverted.

@pytorchmergebot added the `Reverted` and `ci-no-td` labels Sep 4, 2025
@malfet (Contributor, Author) commented Sep 4, 2025

@pytorchbot merge -f "Not sure why it was reverted in the 1st place..."

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
Discovered while debugging pytorch#160841, where sdpa returned NaNs because intermediate values were cast back to fp16 before normalization during the computation, which was fixed by pytorch#161999
Pull Request resolved: pytorch#162401
Approved by: https://github.com/Skylion007, https://github.com/drisspg
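The NaN mechanism described in that commit message can be reproduced with a toy NumPy example (a bare softmax, not the actual sdpa kernel): `exp(12)` is about 162755, above fp16's maximum finite value of 65504, so casting the intermediate back to fp16 before normalization produces inf, and inf / inf is NaN.

```python
import numpy as np

scores = np.array([12.0, 12.0, 1.0], dtype=np.float32)
exp = np.exp(scores)

bad = exp.astype(np.float16)   # intermediate cast back to fp16: overflow to inf
bad = bad / bad.sum()          # inf / inf -> NaN; result is [nan, nan, 0.0]

good = exp / exp.sum()         # fp32 throughout: a valid probability vector
print(bad, good)
```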
@github-actions bot deleted the gh/malfet/504/head branch October 5, 2025 02:16