[AOTI] don't allow int32 indices if {non-inf, > int32_max} upper bound is provided by davidberard98 · Pull Request #159433 · pytorch/pytorch · GitHub

Conversation

@davidberard98
Contributor

@davidberard98 davidberard98 commented Jul 29, 2025

Stack from ghstack (oldest at bottom):

Motivation / Context: (what I think is happening here)

In "eager"/just-in-time PT2 usage, dynamo/inductor will guard on whether indices fit in int32 or not. So it's generally safe in Inductor code to rely on the example values for symbolic ints in order to determine whether indices fit in int32, because the indices will be guarded on anyway; and if the inputs ever increase to >int32_max, dynamo will cause a recompilation.

But with AOTI, those int32 guards aren't respected; so if the example input is < int32_max but can grow past int32_max during a future execution, that execution might fail or hit an illegal memory access (IMA).
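To make the failure mode concrete, here is a minimal sketch (hypothetical function names, not the actual dynamo/inductor internals) of why relying on the example value is safe under JIT guarding but not under AOTI:

```python
INT32_MAX = 2**31 - 1

def jit_compile(example_index, installed_guards):
    """Eager/JIT PT2: pick an index dtype from the example value and
    install a guard that re-checks the decision on every call."""
    fits = example_index <= INT32_MAX
    # If a later input flips this predicate, the guard fails and dynamo
    # recompiles, so the hint-based choice is always revalidated.
    installed_guards.append(lambda idx: (idx <= INT32_MAX) == fits)
    return "int32" if fits else "int64"

def aoti_compile(example_index):
    """AOTI: the same hint-based choice, but no guard runs at execution
    time. An input > int32_max silently runs with int32 indices and can
    fail or trigger an illegal memory access (IMA)."""
    return "int32" if example_index <= INT32_MAX else "int64"
```
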

Solution space

Export allows users to specify which dimensions are dynamic, and to provide ranges of valid sizes.

One solution idea is to always respect the upper bound of the dynamic shape range when doing AOTI; if the index's range includes values >int32_max, then don't use the hint and assume that this index doesn't fit in int32.

However, the problem with this is that many users may specify dynamism without specifying a range of values - the upper bound of the range will be set to the default of inf. Such use cases could potentially experience a perf regression if we implemented the idea above.

To prevent any such regressions, this implementation ignores the hints/example values for AOTI, and relies solely on the specified range, only when the upper bound of that range isn't inf. If users explicitly specify a range that extends past int32_max, we can be fairly sure that they actually do need values > int32_max.

If we continue to see correctness issues even with this implementation, we could consider more aggressively relying on the ranges.
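The resulting decision rule can be sketched as follows (a simplified model with hypothetical names; the real logic lives in Inductor's sizevars handling):

```python
import math

INT32_MAX = 2**31 - 1

def use_int32_indices(hint, upper_bound, aot_compile):
    """Decide whether int32 indices are safe for a symbolic size.

    hint: example value observed at trace time.
    upper_bound: upper bound of the user-specified dynamic range
        (math.inf when no range was given).
    aot_compile: True for AOTI, where int32 guards are not enforced.
    """
    if aot_compile and not math.isinf(upper_bound):
        # An explicit, finite bound was provided: trust it instead of
        # the hint, since AOTI will never recompile for larger inputs.
        return upper_bound <= INT32_MAX
    # JIT path, or AOTI with the default inf bound: fall back to the
    # hint to avoid regressing users who never specified a range.
    return hint <= INT32_MAX
```

Under this rule, an AOTI export that declares a dimension with an explicit max of, say, 2**40 gets int64 indices even if the example input is small, while an unbounded dimension keeps the existing hint-based behavior.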

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

Differential Revision: D79220301

@pytorch-bot

pytorch-bot bot commented Jul 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159433

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit e9a9ff1 with merge base 9a680e1:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

davidberard98 added a commit that referenced this pull request Jul 29, 2025
…d is provided

ghstack-source-id: dcf376c
Pull Request resolved: #159433
@davidberard98
Contributor Author

@davidberard98 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 30, 2025

```python
if V.aot_compilation:
    # check whether value has an upper bound
    if V.graph.sizevars.statically_known_true(e < 1e100):
```

A reviewer (Contributor) commented on this check:

> 2^64 (~1e20) should be good enough?

@davidberard98 davidberard98 marked this pull request as ready for review August 2, 2025 04:47
@davidberard98
Contributor Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased gh/davidberard98/393/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/159433)

@ColinPeppler
Contributor

I ran an A100 benchmark and there's no significant performance regression. LGTM!

@davidberard98
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


@github-actions github-actions bot deleted the gh/davidberard98/393/head branch September 4, 2025 02:07
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…d is provided (pytorch#159433)

Differential Revision: [D79220301](https://our.internmc.facebook.com/intern/diff/D79220301)
Pull Request resolved: pytorch#159433
Approved by: https://github.com/jingsh, https://github.com/ColinPeppler