Add CUTLASS-based support for mixed dtypes matrix multiplication by cpuhrsch · Pull Request #110981 · pytorch/pytorch · GitHub

@cpuhrsch cpuhrsch commented Oct 10, 2023

Resubmission without ghstack to make it easier to import https://github.com/pytorch/pytorch/pull/110934/commits

cc @albanD

@cpuhrsch cpuhrsch requested a review from jerryzh168 as a code owner October 10, 2023 20:41
@pytorch-bot pytorch-bot bot added the release notes: quantization release notes category label Oct 10, 2023
@pytorch-bot

pytorch-bot bot commented Oct 10, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110981

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4914b56 with merge base 95ff51d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@cpuhrsch cpuhrsch added skip-pr-sanity-checks and removed release notes: quantization release notes category labels Oct 10, 2023
@pytorch-bot pytorch-bot bot added the release notes: quantization release notes category label Oct 10, 2023
@facebook-github-bot
Contributor

@cpuhrsch has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cpuhrsch cpuhrsch requested a review from albanD October 10, 2023 21:21
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D50139631

…torch#110981)

Summary:
Resubmission without ghstack to make it easier to import https://github.com/pytorch/pytorch/pull/110934/commits

cc albanD


Reviewed By: soulitzer

Differential Revision: D50139631

Pulled By: cpuhrsch
@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 11, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

constexpr auto ElementsPerAccessB = 128 / cutlass::sizeof_bits<ElementInputB>::value;
constexpr auto ElementsPerAccessC = ElementsPerAccessA;
constexpr auto Stages = 4;
constexpr auto SplitKFactor = 1; // Wrong outputs if !=1, even if
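
For readers unfamiliar with the `128 / sizeof_bits` pattern in the snippet above: it computes how many elements of a given type fit into one 128-bit vectorized memory access. The following is a minimal sketch of the same arithmetic in plain C++, without CUTLASS. The helper name `elements_per_access` is hypothetical and not part of the PR; it also uses `sizeof`, which (unlike `cutlass::sizeof_bits`) cannot describe sub-byte types.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the PR or from CUTLASS): number of elements
// of type T that fit in a single 128-bit vectorized memory access.
// Assumption: T occupies a whole number of bytes, so sizeof(T) * CHAR_BIT
// gives its width in bits.
template <typename T>
constexpr std::size_t elements_per_access() {
  return 128 / (sizeof(T) * CHAR_BIT);
}

// Compile-time checks for a few common element widths.
static_assert(elements_per_access<std::int8_t>() == 16,
              "sixteen 8-bit elements per 128-bit access");
static_assert(elements_per_access<std::uint16_t>() == 8,
              "eight 16-bit elements (e.g. half precision) per 128-bit access");
static_assert(elements_per_access<float>() == 4,
              "four 32-bit elements per 128-bit access");
```

Under these assumptions, an 8-bit input operand would give 16 elements per access and a 16-bit operand would give 8, which is why the per-operand constants in a mixed-dtype kernel can differ.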


Hi, I am encountering the same issue: SplitKFactor must be 1, otherwise I get an error. Do you know why? I am looking forward to your reply. Thank you!


Labels

ciflow/trunk, fb-exported, Merged, release notes: quantization, skip-pr-sanity-checks


5 participants