[Quant][Inductor][X86] add fusion pass for linear_dynamic_fp16 with relu by Xia-Weiwen · Pull Request #141556 · pytorch/pytorch

Conversation

@Xia-Weiwen (Collaborator) commented Nov 26, 2024

Stack from ghstack (oldest at bottom):

Description
Fuse `linear_dynamic_fp16` with its relu post-op and prepack the weight. In Inductor, the pattern we see is

```
fp32 activation
  |
(reshape)
  |
mm/addmm <- t <- to_fp32 <- to_fp16 <- weight
  |
(reshape) <- relu
```
Or

```
fp32 activation
  |
expand
  |
 bmm <- expand <- t <- to_fp32 <- to_fp16 <- weight
  |
(add) <- relu
```

The second pattern matches the case where `x.ndim > 2` and `x` is not contiguous; the first pattern covers all other cases.
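For context, here is a minimal eager-mode sketch (illustrative, not code from this PR) of a module whose traced graph contains the `to_fp16 -> to_fp32 -> t` weight chain matched above; the class name is made up:

```python
import torch
import torch.nn.functional as F

class LinearReLUDynamicFp16(torch.nn.Module):
    """Illustrative only: emulates linear_dynamic_fp16 followed by relu."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.bias = torch.nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The weight round-trips through fp16 (to_fp16 -> to_fp32 in the
        # traced graph) before feeding the transposed weight input of
        # mm/addmm; relu is the post-op this PR fuses.
        w = self.weight.to(torch.float16).to(torch.float32)
        return torch.relu(F.linear(x, w, self.bias))
```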

Fuse the pattern with weight prepack, and we get

```
fp32 activation
  |
onednn.linear_relu_dynamic_fp16 <- onednn.linear_prepack_fp16 <- weight
```

After freezing, the prepack op is gone.
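A hedged usage sketch, reusing the illustrative module above; `torch._inductor.config.freezing` is the Inductor flag that enables the constant folding that removes the prepack op:

```python
import torch

# Freezing lets Inductor constant-fold onednn.linear_prepack_fp16 into a
# packed weight, leaving only onednn.linear_relu_dynamic_fp16 at run time.
torch._inductor.config.freezing = True

mod = LinearReLUDynamicFp16(64, 32).eval()
x = torch.randn(8, 64)

with torch.no_grad():
    out = torch.compile(mod)(x)
```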

Test plan

```
python test/inductor/test_mkldnn_pattern_matcher.py -k test_linear_relu_dynamic_fp16
```
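Outside the test suite, one way to check that the rewrite fired is to inspect Inductor's generated code; a sketch assuming `run_and_get_code` from `torch._inductor.utils` (not the actual test harness):

```python
import torch
from torch._inductor.utils import run_and_get_code

torch._inductor.config.freezing = True
mod = LinearReLUDynamicFp16(64, 32).eval()
x = torch.randn(8, 64)

with torch.no_grad():
    _, code_list = run_and_get_code(torch.compile(mod), x)

# If the fusion fired, the fused op name appears in the generated wrapper code.
print(any("linear_relu_dynamic_fp16" in code for code in code_list))
```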

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

pytorch-bot bot commented Nov 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141556

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 974bb79 with merge base 795f28a:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Xia-Weiwen added commits that referenced this pull request from Nov 27 to Dec 2, 2024
Xia-Weiwen marked this pull request as ready for review December 2, 2024 10:56
@Xia-Weiwen (Collaborator, Author) commented

@pytorchbot merge

pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Dec 9, 2024
@pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

AmdSampsa pushed a commit to AmdSampsa/pytorch that referenced this pull request Dec 9, 2024

[Quant][Inductor][X86] add fusion pass for linear_dynamic_fp16 with relu (pytorch#141556)

Pull Request resolved: pytorch#141556
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: pytorch#141549
@github-actions github-actions bot deleted the gh/Xia-Weiwen/22/head branch January 9, 2025 02:20

Labels

ciflow/inductor · ciflow/trunk · intel · Merged · module: inductor · open source · topic: not user facing
