-
Notifications
You must be signed in to change notification settings - Fork 25.7k
[Inductor] External callable registration API for Matmul tuning candidates #130774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/130774
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit f7e07bf with merge base b35f70d ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test.
torch/_inductor/kernel/mm.py
Outdated
| from ..external_callable import external_matmul | ||
|
|
||
| for k in external_matmul: | ||
| choices.append(ExternKernelChoice(k).bind((mat1, mat2), layout)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only construct ExternKernelChoice once, since I believe the constructor here registers a new name. Maybe we need a functools.lru_cache(None) to dedupe these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jansel , I see that ExternKernelChoice registers a new name for each new function. And we could run into duplication problem. But could you elaborate more how to use functools.lrucache(None) to dedupe these new name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@funtools.lru_cache(None)
def lazy_register_extern_choice(fn):
return ExternKernelChoice(fn)
will cause it to only get constructed once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added deduplication class
|
Hi Jason, I added a test and moved the list to config.py. |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: trunk / linux-focal-cuda12.4-py3.10-gcc9-experimental-split-build-test / test (default, 1, 5, linux.4xlarge.nvidia.gpu) Details for Dev Infra teamRaised by workflow job |
|
@jansel Could you help me with the failed test: Also test_flex_attention case is failing. I don't think it has anything to do with my PR. Do I need to update my upstream? |
|
Yes, you need to skip the test. I think the other inductor tests already skip M60 gpus, so you can look at test_torchinductor.py for an example. |
|
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
c765c6b to
df8d1b6
Compare
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: Apply lint suggestions / lintrunner-autoformat Details for Dev Infra teamRaised by workflow job |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 mandatory check(s) failed. The first few are: Dig deeper by viewing the failures on hud |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: trunk / win-vs2019-cuda12.1-py3 / build Details for Dev Infra teamRaised by workflow job |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Fixes #130769
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang