[Inductor] External callable registration API for Matmul tuning candidates by maxyanghu · Pull Request #130774 · pytorch/pytorch · GitHub

Conversation

@maxyanghu
Contributor

@maxyanghu maxyanghu commented Jul 15, 2024

@pytorch-bot

pytorch-bot bot commented Jul 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/130774

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit f7e07bf with merge base b35f70d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla

linux-foundation-easycla bot commented Jul 15, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

Contributor

@jansel jansel left a comment


Please add a test.

from ..external_callable import external_matmul

for k in external_matmul:
    choices.append(ExternKernelChoice(k).bind((mat1, mat2), layout))
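As a rough illustration of the suggested pattern, here is a self-contained sketch: the `ExternKernelChoice` stub below is a stand-in for Inductor's real class (which also registers a kernel name on construction), and `my_matmul` is a hypothetical user-registered callable.

```python
# Stand-in for torch._inductor's ExternKernelChoice; the real class also
# registers a kernel name when constructed.
class ExternKernelChoice:
    def __init__(self, fn):
        self.fn = fn

    def bind(self, inputs, layout):
        # The real bind() produces a tuning candidate; here we just record
        # enough to show one choice per registered callable.
        return (self.fn.__name__, inputs, layout)


# Hypothetical user-registered external matmul implementation.
def my_matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


external_matmul = [my_matmul]

# Mirror of the loop above: one tuning choice per registered callable.
choices = []
for k in external_matmul:
    choices.append(ExternKernelChoice(k).bind(("mat1", "mat2"), "layout"))
```

Each entry in `external_matmul` thus contributes one extra candidate to the autotuning choice list.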
Contributor


We should only construct ExternKernelChoice once, since I believe the constructor here registers a new name. Maybe we need a functools.lru_cache(None) to dedupe these.

Contributor Author

@maxyanghu maxyanghu Jul 18, 2024


Hi @jansel, I see that ExternKernelChoice registers a new name for each new function, so we could run into a duplication problem. But could you elaborate on how to use functools.lru_cache(None) to dedupe these new names?

Contributor


@functools.lru_cache(None)
def lazy_register_extern_choice(fn):
    return ExternKernelChoice(fn)

will cause it to only get constructed once.
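To see why this dedupes, here is a self-contained sketch: the `ExternKernelChoice` stub just counts constructions (the real class registers a name each time), and `my_matmul` is a hypothetical callable.

```python
import functools

construction_count = 0


class ExternKernelChoice:
    # Stand-in for Inductor's real class; counts constructions to show
    # the cache prevents repeated name registration.
    def __init__(self, fn):
        global construction_count
        construction_count += 1
        self.fn = fn


@functools.lru_cache(None)
def lazy_register_extern_choice(fn):
    # lru_cache keys on fn, so each callable is constructed exactly once;
    # later lookups return the cached ExternKernelChoice instance.
    return ExternKernelChoice(fn)


def my_matmul(a, b): ...


first = lazy_register_extern_choice(my_matmul)
second = lazy_register_extern_choice(my_matmul)
```

Because `lru_cache(None)` never evicts, `first is second` holds and only one construction (hence one name registration) ever happens per callable.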

Contributor Author


Added a deduplication class.

@maxyanghu
Contributor Author

Hi Jason, I added a test and moved the list to config.py.

@maxyanghu maxyanghu marked this pull request as ready for review July 22, 2024 20:46
@albanD albanD added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label Jul 23, 2024
@maxyanghu maxyanghu requested a review from jansel July 24, 2024 15:35
@jansel
Contributor

jansel commented Jul 27, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 27, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / linux-focal-cuda12.4-py3.10-gcc9-experimental-split-build-test / test (default, 1, 5, linux.4xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

@maxyanghu
Contributor Author

@jansel Could you help me with the failed test:
RuntimeError: Found Tesla M60 which is too old to be supported by the triton GPU compiler, which is used as the backend. Triton only supports devices of CUDA Capability >= 7.0, but your device is of CUDA capability 5.2
It seems that I need to skip tests on the Tesla M60? Is there a way to do that?

Also, the test_flex_attention case is failing; I don't think it has anything to do with my PR.

Do I need to update my upstream?

@jansel
Contributor

jansel commented Jul 30, 2024

Yes, you need to skip the test. I think the other inductor tests already skip M60 GPUs, so you can look at test_torchinductor.py for an example.
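A minimal sketch of such a skip condition, assuming a `get_device_capability()` helper in place of `torch.cuda.get_device_capability()` (the real inductor tests use shared helpers; here the helper is stubbed to simulate a Tesla M60):

```python
import unittest

MIN_TRITON_CAPABILITY = (7, 0)  # Triton requires CUDA capability >= 7.0


# Stand-in for torch.cuda.get_device_capability(); pretend we are on a
# Tesla M60, which reports CUDA capability (5, 2).
def get_device_capability():
    return (5, 2)


def requires_triton_gpu(fn):
    """Skip a test when the GPU is too old for the Triton backend."""
    return unittest.skipIf(
        get_device_capability() < MIN_TRITON_CAPABILITY,
        "GPU too old for Triton (needs CUDA capability >= 7.0)",
    )(fn)


class ExternalMatmulTest(unittest.TestCase):
    @requires_triton_gpu
    def test_external_matmul(self):
        pass  # would exercise the registered external callable
```

On the simulated M60, `unittest` marks the test as skipped instead of failing it.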

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Sep 28, 2024
@jansel
Contributor

jansel commented Sep 28, 2024

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased external-registration onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout external-registration && git pull --rebase)

@jansel
Contributor

jansel commented Sep 28, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: Apply lint suggestions / lintrunner-autoformat

Details for Dev Infra team Raised by workflow job

@jansel
Contributor

jansel commented Sep 28, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@jansel
Contributor

jansel commented Oct 2, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / win-vs2019-cuda12.1-py3 / build

Details for Dev Infra team Raised by workflow job

@jansel
Contributor

jansel commented Oct 2, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here


Labels

ciflow/trunk (Trigger trunk jobs on your pull request), Merged, module: inductor, open source, release notes: inductor, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
