[quant][pt2] Fix custom dtype per channel weight in QAT by andrewor14 · Pull Request #112612 · pytorch/pytorch · GitHub

Conversation

@andrewor14 (Contributor) commented Nov 1, 2023

Stack from ghstack (oldest at bottom):

Summary: Previously we only copied over the quantize/dequantize
(q/dq) args for the per-tensor case. This was because the qparams
for `quantize_per_tensor` are literals, while the qparams for
`quantize_per_channel` are `get_attr` nodes (tensors), which
disappear from the original nodes in the graph after subgraph
rewriting.
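
Concretely, here is a minimal sketch (illustrative, not taken from the
PR) of how the two decomposed ops are called; the qparams are the args
immediately after the input:

import torch
import torch.ao.quantization.fx._decomposed  # registers the quantized_decomposed ops

x = torch.randn(2, 4)
w = torch.randn(4, 4)

# Per tensor: the qparams (scale, zero_point) are plain literals.
xq = torch.ops.quantized_decomposed.quantize_per_tensor(
    x, 0.02, 0, -128, 127, torch.int8)

# Per channel: the qparams (scales, zero_points) are tensors, which
# show up as get_attr nodes in the captured graph.
scales = torch.full((4,), 0.02)
zero_points = torch.zeros(4, dtype=torch.int64)
wq = torch.ops.quantized_decomposed.quantize_per_channel(
    w, scales, zero_points, 0, -127, 127, torch.int8)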

However, this is problematic because, in the per-channel case,
not all q/dq args are tensors. In particular, the args after
the qparams (axis, qmin, qmax, dtype) are all literals. For
these literal args we simply used hardcoded values
(0, -127, 127, and torch.int8, respectively), even if the user
explicitly specified a different weight dtype. This commit
fixes that by copying over these literal args in the
per-channel case as well.
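
A minimal sketch of what the fix does (the helper name is
hypothetical; this assumes the per-channel q/dq arg order
(input, scales, zero_points, axis, quant_min, quant_max, dtype)):

from torch.fx import Node

def _copy_per_channel_literal_args(orig_node: Node, new_node: Node) -> None:
    # args[0:3] are (input, scales, zero_points); scales/zero_points are
    # get_attr nodes handled separately by the subgraph rewriter, so we
    # copy only the trailing literals: axis, qmin, qmax, dtype.
    num_tensor_args = 3
    new_args = list(new_node.args)
    new_args[num_tensor_args:] = orig_node.args[num_tensor_args:]
    new_node.args = tuple(new_args)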

Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_per_channel_weight_custom_dtype

Reviewers: jerryzh168, kimishpatel

Subscribers: jerryzh168, kimishpatel, supriyar

@pytorch-bot bot commented Nov 1, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/112612

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7bccb52 with merge base 27e31ab:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot added the release notes: quantization label Nov 1, 2023
self.assertEqual(dq_dtype, torch.int32)


def _get_conv_bn_getitem_nodes(model: torch.fx.GraphModule):

oh OK you copied this, then it's probably fine
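
For context, the assertion above is from the new test; a rough sketch
of how one could read the dq dtype back out of the graph (assuming
dtype is the last positional arg of the decomposed op; names are
illustrative):

import torch
import torch.ao.quantization.fx._decomposed  # registers the quantized_decomposed ops

def _get_weight_dq_dtype(model: torch.fx.GraphModule):
    for node in model.graph.nodes:
        if node.target == torch.ops.quantized_decomposed.dequantize_per_channel.default:
            return node.args[-1]  # dtype is the last positional arg
    return None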

andrewor14 added a commit that referenced this pull request Nov 1, 2023
ghstack-source-id: 8998a5c
Pull Request resolved: #112612
andrewor14 added a commit that referenced this pull request Nov 2, 2023
ghstack-source-id: c56ddff
Pull Request resolved: #112612
@andrewor14 (Contributor, Author) commented:

@pytorchbot merge

@pytorch-bot added the ciflow/trunk label Nov 6, 2023
@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@pytorchmergebot (Collaborator) commented:

Merge failed

Reason: 1 job failed: trunk / macos-12-py3-arm64 / test (default, 1, 3, macos-m1-12)

Details for Dev Infra team: raised by workflow job.

@andrewor14 (Contributor, Author) commented:

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:


Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@facebook-github-bot deleted the gh/andrewor14/39/head branch November 11, 2023 15:24
Skylion007 pushed a commit to Skylion007/pytorch that referenced this pull request Nov 14, 2023
Pull Request resolved: pytorch#112612
Approved by: https://github.com/jerryzh168
desai0007 pushed a commit to desai0007/test-repo-pytorch that referenced this pull request Feb 26, 2025
ghstack-source-id: df92283
Pull Request resolved: pytorch/pytorch#112612

Labels

ciflow/trunk · Merged · release notes: quantization
