[Quant][Inductor][X86] add fusion pass for linear_dynamic_fp16 with relu by Xia-Weiwen · Pull Request #141556 · pytorch/pytorch

Conversation

@Xia-Weiwen (Collaborator) commented Nov 26, 2024

Stack from ghstack (oldest at bottom):

Description
Fuse `linear_dynamic_fp16` with its relu post-op and prepack the weight. In Inductor, the pattern we see is

```
fp32 activation
  |
(reshape)
  |
mm/addmm <- t <- to_fp32 <- to_fp16 <- weight
  |
(reshape) <- relu
```
Or

```
fp32 activation
  |
expand
  |
 bmm <- expand <- t <- to_fp32 <- to_fp16 <- weight
  |
(add) <- relu
```

The second pattern matches the case where `x.ndim > 2` and `x` is not contiguous; the first pattern covers all other cases.
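For context, here is a minimal eager-mode sketch (illustrative, not code from this PR) of a module whose traced graph contains the `to_fp16 -> to_fp32 -> t` weight chain matched above; the class name is made up:

```python
import torch
import torch.nn.functional as F

class LinearReLUDynamicFp16(torch.nn.Module):
    """Illustrative only: emulates linear_dynamic_fp16 followed by relu."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.bias = torch.nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The weight round-trips through fp16 (to_fp16 -> to_fp32 in the
        # traced graph) before feeding the transposed weight input of
        # mm/addmm; relu is the post-op this PR fuses.
        w = self.weight.to(torch.float16).to(torch.float32)
        return torch.relu(F.linear(x, w, self.bias))
```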

Fuse the pattern with weight prepack, and we get

```
fp32 activation
  |
onednn.linear_relu_dynamic_fp16 <- onednn.linear_prepack_fp16 <- weight
```

After freezing, the prepack op is gone.
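A hedged usage sketch, reusing the illustrative module above; `torch._inductor.config.freezing` is the Inductor flag that enables the constant folding that removes the prepack op:

```python
import torch

# Freezing lets Inductor constant-fold onednn.linear_prepack_fp16 into a
# packed weight, leaving only onednn.linear_relu_dynamic_fp16 at run time.
torch._inductor.config.freezing = True

mod = LinearReLUDynamicFp16(64, 32).eval()
x = torch.randn(8, 64)

with torch.no_grad():
    out = torch.compile(mod)(x)
```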

Test plan

```
python test/inductor/test_mkldnn_pattern_matcher.py -k test_linear_relu_dynamic_fp16
```
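Outside the test suite, one way to check that the rewrite fired is to inspect Inductor's generated code; a sketch assuming `run_and_get_code` from `torch._inductor.utils` (not the actual test harness):

```python
import torch
from torch._inductor.utils import run_and_get_code

torch._inductor.config.freezing = True
mod = LinearReLUDynamicFp16(64, 32).eval()
x = torch.randn(8, 64)

with torch.no_grad():
    _, code_list = run_and_get_code(torch.compile(mod), x)

# If the fusion fired, the fused op name appears in the generated wrapper code.
print(any("linear_relu_dynamic_fp16" in code for code in code_list))
```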

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

pytorch-bot bot commented Nov 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141556

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 974bb79 with merge base 795f28a:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Xia-Weiwen added commits that referenced this pull request from Nov 27 to Dec 2, 2024
Xia-Weiwen marked this pull request as ready for review December 2, 2024 10:56
@Xia-Weiwen (Collaborator, Author) commented

@pytorchbot merge

pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Dec 9, 2024
@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

AmdSampsa pushed a commit to AmdSampsa/pytorch that referenced this pull request Dec 9, 2024

[Quant][Inductor][X86] add fusion pass for linear_dynamic_fp16 with relu (pytorch#141556)

Pull Request resolved: pytorch#141556
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: pytorch#141549
@github-actions github-actions bot deleted the gh/Xia-Weiwen/22/head branch January 9, 2025 02:20

Labels

ciflow/inductor · ciflow/trunk · intel · Merged · module: inductor · open source · topic: not user facing
