[BE][Ez]: Reserve vector for NT GEMM Matmul #141130
Conversation
Helpful links: artifacts and rendered test results are at hud.pytorch.org/pr/141130.

Dr. CI: No failures as of commit 3fa9708 with merge base a440a01. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
LGTM, though it would be nice to add a code comment noting that this PR is aimed at improving performance.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Easy fix: add the missing reserve calls in the NT Matmul CUDA kernel to improve performance.

Pull Request resolved: pytorch#141130
Approved by: https://github.com/malfet