Enable oneDNN QLinear FP32/BF16 output #112126

leslie-fang-intel · 2023-10-26T06:01:58Z

Stack from ghstack (oldest at bottom):

Summary

PR 2 in the series enabling Int8-Mixed-BF16 PT2E PTQ Quantization with Inductor (RFC: #111640).
Enable QLinear (relu) with BFloat16 or Float32 output.
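
For readers unfamiliar with what "BFloat16 or Float32 output" means for a quantized linear op, here is a minimal pure-Python sketch of the intended numerics. It is not the oneDNN kernel or the code in this PR. The names `qlinear_reference` and `round_to_bf16` are hypothetical, and the sketch assumes a per-tensor quantized uint8 activation and a symmetric per-channel int8 weight. Instead of requantizing the result to int8, the accumulator is rescaled to floating point, optionally passed through ReLU, and either kept as FP32 or rounded to BF16.

```python
import struct


def round_to_bf16(x: float) -> float:
    """Round a value to bfloat16 precision (round-to-nearest-even).

    Hypothetical helper for illustration; NaN/Inf handling is omitted.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the upper 16 bits of the float32 encoding.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def qlinear_reference(x_q, x_scale, x_zp, w_q, w_scales, bias,
                      relu=False, out_dtype="float32"):
    """Reference quantized linear with a floating-point output.

    Assumptions (not taken from the PR): x_q holds uint8 rows with a
    per-tensor scale/zero point; w_q holds int8 rows, one per output
    channel, with symmetric per-channel scales (zero point 0).
    """
    out = []
    for row in x_q:
        out_row = []
        for oc, w_row in enumerate(w_q):
            # Integer accumulation, as an int8 GEMM would do.
            acc = sum((xv - x_zp) * wv for xv, wv in zip(row, w_row))
            # Rescale to floating point instead of requantizing to int8.
            y = acc * x_scale * w_scales[oc] + bias[oc]
            if relu:
                y = max(y, 0.0)
            if out_dtype == "bfloat16":
                y = round_to_bf16(y)
            out_row.append(y)
        out.append(out_row)
    return out


x_q = [[130, 126]]
w_q = [[2, 4], [3, -1]]
fp32_out = qlinear_reference(x_q, 0.5, 128, w_q, [0.25, 0.5], [0.0, 1.0])
relu_bf16_out = qlinear_reference(x_q, 0.5, 128, w_q, [0.25, 0.5], [0.0, 1.0],
                                  relu=True, out_dtype="bfloat16")
```

With these inputs the first output channel is negative before ReLU and is clamped to zero when `relu=True`; the second channel is representable exactly in both FP32 and BF16.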

Test Plan

python -u -m pytest -s -v test_quantized_op.py -k test_qlinear_pt2e

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler

[ghstack-poisoned]

pytorch-bot · 2023-10-26T06:02:02Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/112126

✅ You can merge normally! (2 Unrelated Failures)

As of commit 98ba52d with merge base 0d95378:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

trunk / macos-12-py3-arm64-mps / test (mps, 1, 1, macos-m1-12) (gh)

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

trunk / macos-12-py3-arm64 / test (default, 2, 3, macos-m1-12) (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

leslie-fang-intel · 2023-11-03T08:18:06Z

@pytorchbot merge

pytorchmergebot · 2023-11-03T08:20:09Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

…torQuantizer (#112140)

**Summary**
- PR 3 for enabling Int8-Mixed-BF16 PT2E PTQ Quantization with Inductor #111640.
- Remove the output annotation of QConv/QLinear in X86InductorQuantizer.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d
python -m pytest test_mkldnn_pattern_matcher.py -k test_qlinear
python -m pytest test_x86inductor_quantizer.py -k Conv2d
python -m pytest test_x86inductor_quantizer.py -k Linear
```

Pull Request resolved: #112140
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #112010, #112126

**Summary**
- PR 2 for enabling Int8-Mixed-BF16 PT2E PTQ Quantization with Inductor pytorch#111640.
- Enable QLinear (relu) with BFloat16 or Float32 output.

**Test Plan**
```
python -u -m pytest -s -v test_quantized_op.py -k test_qlinear_pt2e
```

Pull Request resolved: pytorch#112126
Approved by: https://github.com/jerryzh168, https://github.com/jgong5
ghstack dependencies: pytorch#112010

Enable oneDNN qlinear FP32/BF16 output

3c10d36

[ghstack-poisoned]

leslie-fang-intel requested review from digantdesai, jerryzh168, jianyuh, kimishpatel and salilsdesai as code owners October 26, 2023 06:01

leslie-fang-intel mentioned this pull request Oct 26, 2023

Enable oneDNN QConv FP32/BF16 output #112010

Closed

pytorch-bot bot added the "release notes: quantization" (release notes category) label Oct 26, 2023

leslie-fang-intel marked this pull request as draft October 26, 2023 06:02

github-actions bot added the "module: cpu" (CPU specific problem, e.g., perf, algorithm), "module: inductor", and "ciflow/inductor" labels Oct 26, 2023

leslie-fang-intel added a commit that referenced this pull request Oct 26, 2023

Enable oneDNN qlinear FP32/BF16 output

e382e2f

ghstack-source-id: b74527c Pull Request resolved: #112126

leslie-fang-intel added the "ciflow/trunk" (Trigger trunk jobs on your pull request) and "open source" labels Oct 26, 2023

This was referenced Oct 26, 2023

[Quant] [PT2] Remove the output Annotation of Conv/Linear in x86InductorQuantizer #112140

Closed

[Quant] [PT2] Enable Decomposed quant per tensor/channel to accept bfloat16 input #112225

Closed

leslie-fang-intel changed the title ~~Enable oneDNN qlinear FP32/BF16 output~~ Enable oneDNN QLinear FP32/BF16 output Oct 30, 2023

jerryzh168 approved these changes Oct 31, 2023

This was referenced Oct 31, 2023

[Inductor] [Quant] Enable QConv2d int8-mixed-bf16 Lowering #112469

Closed

[Inductor] [Quant] Enable QLinear int8-mixed-bf16 Lowering #112486

Closed

leslie-fang-intel requested review from Xia-Weiwen and jgong5 November 1, 2023 02:13

This was referenced Nov 1, 2023

[Inductor] [Quant] Enable QConv2d Unary int8-mixed-bf16 Lowering #112550

Closed

[Inductor] [Quant] Enable QConv2d Binary int8-mixed-bf16 Lowering #112551

Closed

leslie-fang-intel marked this pull request as ready for review November 1, 2023 02:15

jgong5 approved these changes Nov 1, 2023

leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Nov 3, 2023

Enable oneDNN qlinear FP32/BF16 output

2b2a1aa

ghstack-source-id: 45b6cd0 Pull Request resolved: pytorch#112126

leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Nov 3, 2023

Enable oneDNN qlinear FP32/BF16 output

366e860

ghstack-source-id: 807ea00 Pull Request resolved: pytorch#112126

pytorchmergebot added the merging label Nov 3, 2023

pytorchmergebot added the Merged label Nov 3, 2023

pytorchmergebot removed the merging label Nov 3, 2023

pytorchmergebot closed this in a53d29c Nov 3, 2023

facebook-github-bot deleted the gh/leslie-fang-intel/35/head branch November 6, 2023 15:25
