KEMBAR78
[quant][pt2e][xnnpack] Add support for QAT dynamic quantization for linear in XNNPACKQuantizer by jerryzh168 · Pull Request #113288 · pytorch/pytorch · GitHub
Skip to content

Conversation

@jerryzh168
Copy link
Contributor

@jerryzh168 jerryzh168 commented Nov 8, 2023

Stack from ghstack (oldest at bottom):

Summary:
FX graph mode quant workflow and also pt2e flow relies on the is_dynamic flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D51124725

…for linear in XNNPACKQuantizer

Summary:
att

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Nov 8, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/113288

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit be173fa with merge base 4cb7dd0 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: quantization release notes category label Nov 8, 2023
jerryzh168 added a commit that referenced this pull request Nov 8, 2023
…for linear in XNNPACKQuantizer

Summary:
att

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 6647f68
Pull Request resolved: #113288
@jerryzh168
Copy link
Contributor Author

@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

…antization for linear in XNNPACKQuantizer"

Summary:
att

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
jerryzh168 added a commit that referenced this pull request Nov 30, 2023
…for linear in XNNPACKQuantizer

Summary:
att

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: f4bb78c
Pull Request resolved: #113288
@jerryzh168 jerryzh168 changed the title [wip][quant][pt2e][xnnpack] Add support for QAT dynamic quantization for linear in XNNPACKQuantizer [quant][pt2e][xnnpack] Add support for QAT dynamic quantization for linear in XNNPACKQuantizer Nov 30, 2023
Copy link
Contributor

@kimishpatel kimishpatel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add context for the need of this change. not clear to me

…ation for linear in XNNPACKQuantizer"

Summary:
att

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
jerryzh168 added a commit that referenced this pull request Nov 30, 2023
…inear in XNNPACKQuantizer

Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dyanmic quantization pattern is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: babea59
Pull Request resolved: #113288
…ation for linear in XNNPACKQuantizer"

Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dyanmic quantization pattern is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
@jerryzh168
Copy link
Contributor Author

Can you add context for the need of this change. not clear to me

yeah sure, just added.

…ation for linear in XNNPACKQuantizer"


Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
jerryzh168 added a commit that referenced this pull request Nov 30, 2023
…inear in XNNPACKQuantizer

Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dyanmic quantization pattern is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 1900c0c
Pull Request resolved: #113288
…ation for linear in XNNPACKQuantizer"


Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
…ation for linear in XNNPACKQuantizer"


Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
…ation for linear in XNNPACKQuantizer"


Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
jerryzh168 added a commit that referenced this pull request Dec 2, 2023
…inear in XNNPACKQuantizer

Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dyanmic quantization pattern is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 2497cbb
Pull Request resolved: #113288
…ation for linear in XNNPACKQuantizer"


Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)

[ghstack-poisoned]
jerryzh168 added a commit that referenced this pull request Dec 4, 2023
…inear in XNNPACKQuantizer

Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dyanmic quantization pattern is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 552fb1a
Pull Request resolved: #113288
@jerryzh168
Copy link
Contributor Author

@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jerryzh168
Copy link
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 4, 2023
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@facebook-github-bot facebook-github-bot deleted the gh/jerryzh168/931/head branch December 8, 2023 15:29
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
…inear in XNNPACKQuantizer (pytorch#113288)

Summary:
FX graph mode quant workflow and also pt2e flow relies on the `is_dynamic` flag in observer/quantizationspec to
convert an observer to dynamic quantization patterns (choose_qparams -> q -> dq), this PR added is_dynamic flag
for all observers so that it's possible to convert these observers to the pattern.

However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1)
for the computation before convert and after convert to match in the context of QAT. So we'll have some sanity
checks in other observers to make sure the is_dynamic is False.

Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)
Pull Request resolved: pytorch#113288
Approved by: https://github.com/kimishpatel
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: quantization release notes category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants