[AOTI] Use `len(serialized_weights)` when calculating `consts_size` #139054
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/139054

Note: Links to docs will display an error until the docs builds have been completed.

✅ No failures as of commit 99f163f with merge base a7479fa.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
class AOTInductorTestABICompatibleCpu(AOTITestCase):
    device = "cpu"
    device_type = "cpu"
```
`device_type` is needed in `skipCUDAIf`
```python
@unittest.skipIf(sys.platform == "darwin", "No CUDA on MacOS")
class AOTInductorTestABICompatibleCuda(AOTITestCase):
    device = "cuda"
    device_type = "cuda"
```
ditto
```python
class AOTInductorTestABICompatibleCpuWithStackAllocation(AOTITestCase):
    device = "cpu"
    device_type = "cpu"
```
ditto
```python
    TestCase
):
    device = "cpu"
    device_type = "cpu"
```
ditto
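The review comments above all make the same point: a device-conditional skip decorator like `skipCUDAIf` decides whether to apply by reading the test class's `device_type` attribute, so `device = "cuda"` alone is not enough. This can be illustrated with a simplified stand-in decorator (hypothetical, not PyTorch's actual `skipCUDAIf` implementation):

```python
import unittest

def skip_cuda_if(condition, reason):
    # Simplified stand-in for skipCUDAIf: the skip fires only when the
    # decorated test class declares device_type == "cuda".
    def decorator(cls):
        if condition and getattr(cls, "device_type", None) == "cuda":
            return unittest.skip(reason)(cls)
        return cls
    return decorator

@skip_cuda_if(True, "simulated: no CUDA available")
class FakeCudaTests:
    device = "cuda"
    device_type = "cuda"  # without this attribute, the skip never applies

@skip_cuda_if(True, "simulated: no CUDA available")
class FakeCpuTests:
    device = "cpu"
    device_type = "cpu"

print(getattr(FakeCudaTests, "__unittest_skip__", False))  # True
print(getattr(FakeCpuTests, "__unittest_skip__", False))   # False
```

If `device_type` were missing from the CUDA test class, the condition would never match and the tests would run (and fail) on machines without CUDA, which is why the reviewer asked for it on each class.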
@pytorchbot merge
Merge started: your change will be merged once all checks pass (ETA 0-4 hours).
Merge failed. Reason: Command … (raised by workflow job).
@pytorchbot merge
Merge started: your change will be merged once all checks pass (ETA 0-4 hours).
@chunyuan-w Hi, I noticed that this PR uses …
…ytorch#139054)

Fixes the failure of INT8 DLRM using AOTI.

The previous code calculates `consts_size` directly using `tensor` from `graph.constants`:

```python
consts_size = sum(
    get_nbytes_of_tensor(tensor, all_cuda)
    for (name, tensor) in graph.constants.items()
    if name not in graph.folded_constants
)
```

Meanwhile, the actual bytes to serialize (`serialized_weights`) are produced using `graph.get_original_value_of_constant(name)`:

```python
serialized_weights = b"".join(
    _to_bytes(graph.get_original_value_of_constant(name), all_cuda)
    for name in graph.constants.keys()
    if name not in graph.folded_constants
)
```

`tensor` from `graph.constants` can differ from `graph.get_original_value_of_constant(name)`, making `consts_size` inconsistent with the actual byte size of `serialized_weights` and resulting in the runtime error `weights_offset must be aligned to 16K boundary`, similar to what happened in pytorch#135205.

This PR directly computes `consts_size` using `len(serialized_weights)`, which fixes the inconsistency.

We also added a `reduce_range` argument to the `get_default_x86_inductor_quantization_config` function, which is needed in the unit test to avoid accuracy issues on CI machines (earlier CPUs without VNNI).

Pull Request resolved: pytorch#139054
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/desertfire
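The essence of the fix is to serialize first and derive the size from the serialized bytes, rather than computing the size from a separate (possibly stale) tensor view. A minimal self-contained sketch, using a hypothetical float-packing serializer in place of AOTI's real `_to_bytes`:

```python
import struct

ALIGN = 16 * 1024  # the runtime requires weights_offset aligned to a 16K boundary

def _to_bytes(value):
    # Hypothetical serializer stand-in: pack a list of floats as little-endian f32.
    return b"".join(struct.pack("<f", v) for v in value)

# Two hypothetical constants; the "original value" is what actually gets
# serialized, so its byte length is the ground truth for consts_size.
original_values = {"w1": [1.0, 2.0, 3.0], "w2": [4.0] * 5}

serialized_weights = b"".join(_to_bytes(v) for v in original_values.values())

# The fix: consts_size comes from the serialized bytes themselves, so it can
# never disagree with what is written out; then round up to the alignment.
consts_size = len(serialized_weights)
aligned_size = (consts_size + ALIGN - 1) // ALIGN * ALIGN

print(consts_size, aligned_size)  # 32 16384
```

Computing the size from any other source (as the old code did from `graph.constants`) reintroduces the risk that the declared size and the emitted bytes drift apart, which is exactly what triggered the alignment error.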
Stack from ghstack (oldest at bottom):
- `len(serialized_weights)` when calculating `consts_size` #139054

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov
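For context on the `reduce_range` argument added for the unit test: on x86 CPUs without VNNI, int8 matmul accumulation can overflow, and the standard mitigation is to quantize activations to a 7-bit range instead of 8-bit. A small illustrative helper (hypothetical, not the actual PyTorch observer code) showing the effect on the quantization range:

```python
def int8_quant_range(reduce_range: bool, dtype_bits: int = 8):
    # With reduce_range=True the effective bit width is reduced by one,
    # halving the quantization range to leave headroom against
    # intermediate-accumulation overflow on pre-VNNI CPUs.
    bits = dtype_bits - 1 if reduce_range else dtype_bits
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    return qmin, qmax

print(int8_quant_range(False))  # (-128, 127)
print(int8_quant_range(True))   # (-64, 63)
```

The narrower range costs a small amount of precision, which is why it is exposed as an opt-in flag rather than enabled unconditionally; CI machines with older CPUs need it to reproduce the accuracy the test expects.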