[export] fix re-export custom metadata by yiming0416 · Pull Request #135282 · pytorch/pytorch · GitHub

@yiming0416
Contributor

@yiming0416 yiming0416 commented Sep 5, 2024

Fixes #134778

When a model is exported and debug handles are added to the "custom" field of the non-placeholder, non-output nodes in its graph, re-exporting it changes the metadata of placeholder nodes: the "custom" field is either added or copied to these nodes, depending on whether an ExportedProgram or ExportedProgram.module() is passed to generate_numeric_debug_handle().

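This occurs because, on re-export, placeholder nodes are unlifted to get_attr nodes, and they remain get_attr nodes after being exported to gm_torch_level. Their metadata is then modified at https://github.com/pytorch/pytorch/blob/main/torch/export/_trace.py#L1347 based on params_buffers_to_node_meta, which is collected at https://github.com/pytorch/pytorch/blob/main/torch/export/_trace.py#L1312.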
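
The mechanism described above can be pictured with a toy model. The sketch below is pure Python and does not import torch; `ToyNode`, `collect_param_meta`, `apply_param_meta`, and the `debug_handle` key are hypothetical names invented for illustration, and the real logic in torch/export/_trace.py may well differ. It assumes only what the description states: metadata is gathered into a mapping like params_buffers_to_node_meta and later written onto get_attr nodes, which is how a "custom" entry set on a call_function node can end up on a parameter node.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyNode:
    """Hypothetical stand-in for a graph node (this is not torch.fx.Node)."""
    name: str
    op: str
    inputs: list[ToyNode] = field(default_factory=list)
    meta: dict[str, Any] = field(default_factory=dict)


def collect_param_meta(nodes: list[ToyNode]) -> dict[str, dict[str, Any]]:
    """For each get_attr node, gather the meta of the call_function nodes that consume it."""
    collected: dict[str, dict[str, Any]] = {}
    for node in nodes:
        if node.op != "call_function":
            continue
        for arg in node.inputs:
            if arg.op == "get_attr":
                collected.setdefault(arg.name, {}).update(node.meta)
    return collected


def apply_param_meta(nodes: list[ToyNode], collected: dict[str, dict[str, Any]]) -> None:
    """Write every collected field onto the matching get_attr node, with no filtering."""
    for node in nodes:
        if node.op == "get_attr":
            node.meta.update(collected.get(node.name, {}))


# A parameter node consumed by one call_function node that carries a debug handle.
weight = ToyNode("linear_weight", "get_attr")
linear = ToyNode("linear", "call_function", inputs=[weight],
                 meta={"custom": {"debug_handle": 0}})  # hypothetical handle
graph = [weight, linear]

apply_param_meta(graph, collect_param_meta(graph))
print(weight.meta)  # the parameter node now carries the consumer's "custom" entry
```

In this toy, nothing distinguishes "custom" from any other metadata field, so the handle that was meant for the call_function node leaks onto the parameter node.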

@pytorch-bot

pytorch-bot bot commented Sep 5, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135282

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit f86c445 with merge base f65a564:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@yiming0416 yiming0416 force-pushed the yiming0416/fix_re_export_custom_meta branch 2 times, most recently from 99de798 to 78d5349, September 6, 2024 15:18
@yiming0416 yiming0416 marked this pull request as ready for review September 6, 2024 16:30
@facebook-github-bot
Contributor

@yiming0416 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@jerryzh168 jerryzh168 left a comment

Thanks for the fix!

Comment on lines 168 to 169
    if k == "custom":
        continue
Contributor

@yiming0416 just curious why do we need to skip over this?

Contributor Author

Because the issue @jerryzh168 reported is that the custom field is unexpectedly copied from the call_function nodes to the get_attr nodes, which causes the quantization test to fail.

I don't understand why we need the copy during the populating process though...

Contributor

As discussed offline with @yiming0416, I think the fix should be in: https://github.com/pytorch/pytorch/blob/main/torch/_export/utils.py#L136-L143

Contributor

Hmm, this seems a bit hacky. Let's discuss offline whether we actually need this logic in the first place.
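
For reference, the guard quoted at the top of this thread can be sketched in isolation. This is a toy model under stated assumptions, not the code that was eventually merged: `copy_meta`, its `skip_keys` parameter, and the sample field values are hypothetical, and plain dicts stand in for node metadata. It only shows how skipping the "custom" key leaves a parameter node without a debug handle while other fields are still copied.

```python
from __future__ import annotations

from typing import Any


def copy_meta(dst: dict[str, Any], src: dict[str, Any],
              skip_keys: tuple[str, ...] = ("custom",)) -> None:
    """Copy metadata fields from src into dst, leaving out any key in skip_keys."""
    for k, v in src.items():
        if k in skip_keys:
            continue
        dst[k] = v


# Hypothetical metadata collected for a parameter from the node that consumes it.
collected = {"stack_trace": "model.py:12", "custom": {"debug_handle": 0}}

get_attr_meta: dict[str, Any] = {}
copy_meta(get_attr_meta, collected)
print(get_attr_meta)  # only the non-"custom" fields are copied
```

Passing `skip_keys=()` restores the unfiltered copy, which makes it easy to compare both behaviours in a test.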

@yiming0416 yiming0416 force-pushed the yiming0416/fix_re_export_custom_meta branch from 78d5349 to 0d894d5, September 9, 2024 18:26
@yiming0416 yiming0416 force-pushed the yiming0416/fix_re_export_custom_meta branch 3 times, most recently from 8a783dc to 7b47c69, September 9, 2024 21:33
@yiming0416 yiming0416 force-pushed the yiming0416/fix_re_export_custom_meta branch from 7b47c69 to b12e134, September 9, 2024 23:32
@yiming0416 yiming0416 force-pushed the yiming0416/fix_re_export_custom_meta branch from b12e134 to f86c445, September 10, 2024 15:30
@facebook-github-bot
Contributor

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.


Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Sep 20, 2024
Fixes pytorch#134778

When a model is exported and debug handles are added to the "custom" field of the non-placeholder, non-output nodes in its graph, re-exporting it changes the metadata of placeholder nodes: the "custom" field is either added or copied to these nodes, depending on whether an `ExportedProgram` or `ExportedProgram.module()` is passed to `generate_numeric_debug_handle()`.

This occurs because, on re-export, `placeholder` nodes are unlifted to `get_attr` nodes, and they remain `get_attr` nodes after being exported to `gm_torch_level`. Their metadata is then modified [here](https://github.com/pytorch/pytorch/blob/main/torch/export/_trace.py#L1347) based on `params_buffers_to_node_meta`, which is collected [here](https://github.com/pytorch/pytorch/blob/main/torch/export/_trace.py#L1312).
Pull Request resolved: pytorch#135282
Approved by: https://github.com/jerryzh168, https://github.com/zhxchen17, https://github.com/tugsbayasgalan
@github-actions github-actions bot deleted the yiming0416/fix_re_export_custom_meta branch October 12, 2024 02:06
Development

Successfully merging this pull request may close these issues.

torch.export didn't fully preserve "custom" metadata during re-export and run_decomposition

6 participants