[Quant] Add DQ duplication pass by kimishpatel · Pull Request #107900 · pytorch/pytorch · GitHub

Conversation

@kimishpatel
Contributor

@kimishpatel commented Aug 24, 2023

Stack from ghstack (oldest at bottom):

Summary:
During the convert step, observers are first replaced by a Q-DQ pair. In some
scenarios, like the following, the output DQ has a fan-out.

                 ---> OP2 -> Q -> DQ
                /
OP -> Q -> DQ -
                \
                 ---> OP3 -> Q -> DQ

If either OP2 or OP3 is configured to be quantized, then its input is expected
to be quantized. In that case the quantized equivalent of a pattern that the
quantizer asked to be quantized should look like [DQ -> {pattern} -> Q].
However, in a scenario like the one above, where the DQ node is shared between
multiple "quantized" patterns, the boundary of a "quantized" pattern is not
clear because the DQ now belongs to multiple quantized patterns.

This poses challenges for:

  • Porting metadata: it is unclear which "quantized" partition this DQ node
    belongs to.
  • The quantized representation, which likewise needs to identify a
    self-contained quantized pattern that can be replaced by an equivalent
    pattern capturing the compute in the quantized precision.

The pass added here resolves this by duplicating the shared DQ node so that
each user gets its own copy; a sketch of the idea follows.
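
To make the intent concrete, here is a minimal sketch of the duplication idea in torch.fx. This is an illustration, not the exact pass added in this PR; the helper name and the dequantize op set below are assumptions.

```python
# Illustrative sketch of DQ duplication over an FX graph. The op set and the
# helper name are assumptions for this example, not the exact code in the PR.
import torch
import torch.fx

_DEQUANTIZE_OPS = {
    torch.ops.quantized_decomposed.dequantize_per_tensor.default,
    torch.ops.quantized_decomposed.dequantize_per_tensor.tensor,
    torch.ops.quantized_decomposed.dequantize_per_channel.default,
}

def duplicate_shared_dq(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target not in _DEQUANTIZE_OPS:
            continue
        users = list(node.users)
        # Keep the original DQ for the first user; clone it for every other
        # user so that each "quantized" pattern owns its own DQ node.
        for user in users[1:]:
            with gm.graph.inserting_after(node):
                dq_copy = gm.graph.node_copy(node)
            user.replace_input_with(node, dq_copy)
    gm.graph.eliminate_dead_code()
    gm.recompile()
    return gm
```

With each user owning its own DQ, the [DQ -> {pattern} -> Q] boundary is unambiguous for every quantized pattern, which is what both metadata porting and the quantized representation rely on.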

Test Plan:
test_duplicate_dq_pass

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D48663147

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @anijain2305

@pytorch-bot

pytorch-bot bot commented Aug 24, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/107900

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d614744 with merge base 0f1a225:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@kimishpatel
Contributor Author

@kimishpatel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

return False


def _find_q_dq_node_for_user(
Contributor

I feel this will be hard to debug and maintain. What is the problem with the existing duplicate dq implementation: https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/pt2e/qat_utils.py#L580?

Contributor

Oh, I looked again; is this for test only? Can you move this to the test files?

Contributor Author

No, this is also used later in the stack for metadata porting.

_copy_and_replace_use(gm, producer, user)


class DuplicateDQPass(_ExportPassBase):
Contributor

I'm wondering if it makes sense to just always duplicate dequant for now, or do a duplicate and then deduplicate like we did before in https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/pt2e/qat_utils.py#L580-L627

Contributor Author

Oh, you mean move this before fold_conv_bn? If so, yes, that makes sense.

Contributor

@jerryzh168 Aug 25, 2023

I was wondering if we can do:

duplicate all dequant -> port metadata -> deduplicate dequantize ops when we didn't find metadata on them

so that we don't need to refer to quantization_annotation here again

Contributor Author

Why is that better? It might be cleaner if we don't duplicate and dedup.

Contributor

So the current quantization_annotation doesn't really convey any information about whether dq nodes are shared or not; I think it would be better not to rely on it.

Actually, I'm wondering when this has to land? I'm thinking of doing a proper design for this part next week; right now this information is missing from QuantizationSpec.

Contributor Author

There is no sharing of dq nodes at all. The only reason the annotation is taken into account is so that you don't unnecessarily duplicate dq nodes. I can very much remove quantization_annotation from the decision making; it will just have a performance implication.

Comment on lines 21 to 23
"filter_sym_size_users",
"find_q_dq_node_for_user",
"is_valid_annotation",
Contributor

nit: could you make these private?

Contributor Author

I did, and exporting it from the utils was complaining that if this is to be used elsewhere the leading _ should be removed. If you have suggestions, let me know and I can do it in subsequent diffs.

Contributor

I see, maybe I can try to remove it; I do remember we have successfully been using private utils somewhere.

return False


def find_q_dq_node_for_user(
Contributor

@jerryzh168 Aug 30, 2023

I feel this function is a bit too complicated and might be error-prone as well; is there a way to simplify this?

Contributor Author

It is just finding q, dq from [producer -> ... -> q -> dq -> user]. I can add a docstring if you want.
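
For readers following along, a rough sketch of what such a lookup might do (hypothetical names and traversal; the actual helper in this PR may differ):

```python
# Hypothetical sketch: find the (q, dq) pair on the path
# produce_node -> ... -> q -> dq -> user. The op sets are assumptions.
from typing import Optional, Tuple
import torch
import torch.fx

_QUANTIZE_OPS = {
    torch.ops.quantized_decomposed.quantize_per_tensor.default,
    torch.ops.quantized_decomposed.quantize_per_channel.default,
}
_DEQUANTIZE_OPS = {
    torch.ops.quantized_decomposed.dequantize_per_tensor.default,
    torch.ops.quantized_decomposed.dequantize_per_channel.default,
}

def find_q_dq_node_for_user(
    produce_node: torch.fx.Node, user: torch.fx.Node
) -> Tuple[Optional[torch.fx.Node], Optional[torch.fx.Node]]:
    # For each input of `user`, check whether it is a DQ fed by a Q that
    # (transitively) originates at `produce_node`; if so, return (q, dq).
    for arg in user.all_input_nodes:
        if arg.op != "call_function" or arg.target not in _DEQUANTIZE_OPS:
            continue
        maybe_q = arg.all_input_nodes[0]
        if maybe_q.op != "call_function" or maybe_q.target not in _QUANTIZE_OPS:
            continue
        cur = maybe_q
        while cur.all_input_nodes:
            cur = cur.all_input_nodes[0]
            if cur is produce_node:
                return maybe_q, arg
    return None, None
```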

Contributor

Yeah, a docstring would be helpful.



"""
This API is specifically to replace a specific use of producer node
Contributor

is this comment still up to date?

Contributor Author

@kimishpatel Aug 30, 2023

Oh, the internal API part. Yes, let me update that.

"""


def _copy_and_replace_use(
Contributor

also maybe merge this function with _maybe_duplicate_dq?

Contributor Author

Sure



def _maybe_duplicate_dq(
gm: torch.fx.GraphModule, producer: torch.fx.Node, user: torch.fx.Node
Contributor

nit: renaming producer to dq_node might be clearer

Contributor Author

sounds good
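
For concreteness, the merged helper with the suggested rename might look roughly like this (a sketch under those assumptions, not necessarily the final code):

```python
import torch.fx

def _maybe_duplicate_dq(
    gm: torch.fx.GraphModule, dq_node: torch.fx.Node, user: torch.fx.Node
) -> None:
    # Clone the DQ node right after the original and point this one user at
    # the clone; all other users keep using the original DQ.
    with gm.graph.inserting_after(dq_node):
        new_dq = gm.graph.node_copy(dq_node)
    user.replace_input_with(dq_node, new_dq)
```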

def call(self, graph_module: torch.fx.GraphModule) -> PassResult:
for node in graph_module.graph.nodes:
if node.op == "call_function" and node.target in _DEQUANTIZE_OPS:
dq_users = filter_sym_size_users(node)
Contributor

Is this needed? Can we just replace all users? Also, after the IR changes we don't have sym size nodes anymore, right?

Contributor Author

Well, I don't know if a sym size node will show up due to user code, like querying the size of a tensor. I think here we are just making a call to not duplicate dq for sym size nodes. Either choice is fine. I would say we can go with this for now and change the behavior later.
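
A sketch of the filtering being discussed (the helper name matches the one quoted above; the sym_size target is an assumption about the export IR):

```python
from typing import List
import torch
import torch.fx

def filter_sym_size_users(node: torch.fx.Node) -> List[torch.fx.Node]:
    # Return the users of `node` that are not sym_size queries; size queries
    # do not need their own copy of the dequantize node.
    return [
        user
        for user in node.users
        if not (
            user.op == "call_function"
            and user.target == torch.ops.aten.sym_size.int
        )
    ]
```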

"""

def call(self, graph_module: torch.fx.GraphModule) -> PassResult:
for node in graph_module.graph.nodes:
Contributor

Also, I'm wondering if you can reuse https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/pt2e/qat_utils.py#L585, actually; I think we don't need to maintain multiple implementations of the same thing.

Contributor Author

So that doesn't account for all the things being done here, like sym_size filtering, accounting for valid annotations, etc. I agree with unifying the code though; I would imagine we can update that code or move fold_conv_bn after duplicate dq, although that needs some code update because fold_conv_bn removes duplicate dq at some point.

@mcr229
Contributor

mcr229 commented Aug 30, 2023

Does this mean DQ duplication will be a default in prepare pt2e?

@jerryzh168
Contributor

Does this mean DQ duplication will be a default in prepare pt2e?

Yeah, this will be handled in the quant flow for normal cases.
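
In other words, nothing extra is needed from the user. A rough usage sketch of the normal pt2e flow in which the duplication would happen under the hood (the capture and quantizer APIs shown are assumptions about the PyTorch version in use):

```python
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = torch.nn.Linear(8, 8)
        self.linear2 = torch.nn.Linear(8, 8)

    def forward(self, x):
        # One producer feeding two quantized consumers -> shared-DQ fan-out.
        y = torch.relu(x)
        return self.linear1(y) + self.linear2(y)

example_inputs = (torch.randn(1, 8),)
m = capture_pre_autograd_graph(M(), example_inputs)
quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)   # calibrate
m = convert_pt2e(m)  # the quant flow takes care of DQ duplication here
```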

@kimishpatel
Contributor Author

@kimishpatel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@andrewor14 left a comment

Sorry Kimish but I think we want to keep move_model_to_eval under torch.ao.quantization directly (cc @jerryzh168). We're also cherry picking this back in 2.1 so it would break BC for those users if we moved this. Could we look into other ways to work around the circular dependency issue?

return node_users


def _is_valid_annotation(annotation: QuantizationAnnotation) -> bool:
Contributor

This is the only place where we use QuantizationAnnotation right? Maybe we can leave this as Any for now since it's an internal function?

Contributor Author

Let me see if there is a workaround.
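
One possible shape for that workaround, typed as Any as suggested above (a sketch only; whether the check relies on an `_annotated` flag is an assumption):

```python
from typing import Any

def _is_valid_annotation(annotation: Any) -> bool:
    # Typed as Any to avoid importing QuantizationAnnotation here and
    # creating a circular dependency. The check only needs the annotation
    # to exist and to be marked as annotated (assumed attribute name).
    if annotation is None:
        return False
    return bool(getattr(annotation, "_annotated", False))
```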

Contributor

Hey @kimishpatel, I have this PR which should take care of the circular deps issue: #108363. Please rebase on top of it when you have the chance; then you don't have to do anything special.

@kimishpatel
Contributor Author

@kimishpatel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@kimishpatel
Contributor Author

@kimishpatel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@kimishpatel
Contributor Author

@leslie-fang-intel
Collaborator

Sure, I will take a look soon @kimishpatel

@leslie-fang-intel
Collaborator

Hi @kimishpatel, I think we need to make corresponding changes in test_mkldnn_pattern_matcher.py. Please kindly refer to this commit: leslie-fang-intel@59e5ce9 to adjust the match numbers. Please let me know if anything is missing.

Contributor

@andrewor14 left a comment

move_model_to_eval changes look good to me

@kimishpatel
Contributor Author

Hi @kimishpatel, I think we need to make corresponding changes in test_mkldnn_pattern_matcher.py. Please kindly refer to this commit: leslie-fang-intel@59e5ce9 to adjust the match numbers. Please let me know if anything is missing.

Thanks Leslie. Please review changes to test_mkldnn_pattern_matcher.py.

As I was reviewing those, I realized that the decomposition for quantize/dequantize does not match the one here: https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/pt2e/representation/rewrite.py#L428. Specifically, in the inductor decomposition of quantize_per_tensor, the addition of the zero point still happens in fp, which I don't think is accurate because of the rounding errors that can happen when rounding the zero point to fp and then whatever rounding happens during the add. So I think we should sync those. cc: @jerryzh168
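
To make the rounding concern concrete, a toy example where the two orderings disagree (pure illustration, not the actual decomposition code on either side):

```python
import torch

# Toy values where adding the zero point before vs. after rounding differs
# because of round-half-to-even.
x, scale, zero_point = torch.tensor([0.5]), 1.0, 1

# Ordering A: round the scaled value, then add the integer zero point.
q_int_add = torch.round(x / scale).to(torch.int32) + zero_point        # round(0.5)=0 -> 1

# Ordering B: add the zero point in fp first, then round.
q_fp_add = torch.round(x / scale + float(zero_point)).to(torch.int32)  # round(1.5)=2

print(q_int_add.item(), q_fp_add.item())  # 1 2
```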

Also, please see how this diff affects quantization for x86 inductor. I see there was a pass to remove the convert node that appears in the quant/dequant pattern. But since a quant node can now have more than one dequant user, that pattern won't match. If you still need to remove such redundant convert nodes, you will have to change the pattern/replacement.

@kimishpatel
Contributor Author

@kimishpatel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


# 1. Pair of to_int8 and to_fp32 at conv input * 1, extra input of add * 1, and graph output * 1
# matched in pointless_convert pass at
# torch/_inductor/fx_passes/joint_graph.py: [convert_element_type, convert_element_type_1]
# NB: since quant workflow now duplicates DQ node, for each user, we wont necessarily see
Collaborator

Hi @kimishpatel, could you also update the comments here https://github.com/leslie-fang-intel/pytorch/blob/59e5ce92f8b5ca68205df0abc444d0b55f16fc58/test/inductor/test_mkldnn_pattern_matcher.py#L514-L515 and in other similar places, so they are aligned with the expected count values?

@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) on Sep 2, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Sep 2, 2023
…107107)

Summary:
This diff adds metadata to Q-DQ nodes by inferring the quantization intent from
node annotations. Annotations on a node are a way for the user to specify how a
node or subgraph is supposed to be quantized. We continue to use that
information to copy metadata onto Q/DQ nodes from the appropriate nodes.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D48488416](https://our.internmc.facebook.com/intern/diff/D48488416)
Pull Request resolved: #107107
Approved by: https://github.com/jerryzh168
ghstack dependencies: #107105, #107106, #107899, #107900
@facebook-github-bot facebook-github-bot deleted the gh/kimishpatel/177/head branch September 5, 2023 14:22