NJT <-> padded dense conversions by jbschlosser · Pull Request #125947 · pytorch/pytorch · GitHub

Conversation

@jbschlosser (Contributor) commented May 10, 2024

Stack from ghstack (oldest at bottom):

This PR:

  • Implements the pre-existing nt.to_padded_tensor(padding_val) ATen op via the FBGEMM kernel + appropriate view gymnastics (since that kernel only handles 2D values)
  • Introduces a new _nested_from_padded_tensor op for the reverse conversion, implemented via the reverse FBGEMM kernel + view gymnastics
    • Note: there is currently no public API for this; the design is deferred to a future PR (a usage sketch of the forward conversion follows the TODO list below)

TODO:

  • Propagate min / max sequence length via the new factory function _nested_from_padded_tensor
  • Verify that Inductor does computation fusion via test logic
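
As a rough illustration of the forward direction, here is a minimal sketch using the public `torch.nested.to_padded_tensor()` API (shapes and values are illustrative; the reverse padded-to-NJT op introduced here is private, so it is not shown):

```python
import torch

# Build a jagged-layout NJT with a ragged dim, then densify it with a
# padding value.
nt = torch.nested.nested_tensor(
    [torch.randn(2, 8), torch.randn(5, 8)],  # ragged lengths 2 and 5
    layout=torch.jagged,
)
padded = torch.nested.to_padded_tensor(nt, 0.0)  # dense, shape (2, 5, 8)
```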

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @rec

@pytorch-bot (bot) commented May 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125947

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 6b9f037 with merge base bc1b8f0:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jbschlosser marked this pull request as draft May 10, 2024 18:54
@vadimkantorov (Contributor) commented May 10, 2024

Related old discussion on this being a useful primitive (back in the day, for collate_fn in data loading) that deserves more fame :)

One useful thing here would also be to support "padding multiples" per dimension (a sketch of approximating this with today's API follows).
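
A hedged sketch of what per-dimension padding multiples could look like today, via the `output_size` argument of `torch.nested.to_padded_tensor()` (the multiple-of-8 policy is illustrative, and `output_size` support for the jagged layout is an assumption here):

```python
import torch

nt = torch.nested.nested_tensor(
    [torch.randn(3, 16), torch.randn(10, 16)],
    layout=torch.jagged,
)
# Round the padded sequence length up to a multiple of 8 (10 -> 16).
max_len = max(t.shape[0] for t in nt.unbind())
padded_len = -(-max_len // 8) * 8
padded = torch.nested.to_padded_tensor(
    nt, 0.0, output_size=(nt.size(0), padded_len, 16)
)
```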

jbschlosser added the topic: not user facing label May 14, 2024
@vadimkantorov (Contributor)

Maybe one way to auto-construct an NJT from the torch.stack([...]) call in the default collate could be:

  1. return dense tensors from dataset.__getitem__, but wrapped in NJT (the internal representation would be just a regular dense tensor)
  2. support torch.stack([...]) returning an NJT if the elements of the input list are NJTs (even if inside they are just dense tensors)

Like this, the collate_fn code could be kept unchanged, but if the inputs are wrapped as NJTs, it would start to produce an NJT... (see the sketch below)
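
A hedged sketch of the idea with today's API, using a custom collate_fn as a stand-in for the proposed NJT-aware torch.stack (names like `VarLenDataset` and `njt_collate` are hypothetical):

```python
import torch
from torch.utils.data import DataLoader, Dataset

class VarLenDataset(Dataset):
    """Hypothetical dataset returning variable-length dense tensors."""

    def __len__(self):
        return 4

    def __getitem__(self, i):
        return torch.randn(i + 1, 8)  # ragged first dim

def njt_collate(batch):
    # Stand-in for the proposed behavior: stacking variable-length
    # samples yields an NJT directly instead of erroring on ragged shapes.
    return torch.nested.nested_tensor(batch, layout=torch.jagged)

loader = DataLoader(VarLenDataset(), batch_size=4, collate_fn=njt_collate)
batch = next(iter(loader))  # an NJT with shape (4, j1, 8)
```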

@huydhn (Contributor) commented Sep 9, 2024

@pytorchbot revert -m 'Sorry for reverting your change but it is failing dynamo test https://hud.pytorch.org/pytorch/pytorch/commit/09a5e88bef04d5485b70d8f65f46a675aaa52942, maybe a landrace' -c landrace

@pytorchmergebot (Collaborator)

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot (Collaborator)

@jbschlosser your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Sep 9, 2024
This reverts commit 09a5e88.

Reverted #125947 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing dynamo test https://hud.pytorch.org/pytorch/pytorch/commit/09a5e88bef04d5485b70d8f65f46a675aaa52942, maybe a landrace
@github-actions (bot) commented Sep 9, 2024

Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.


@jbschlosser (Contributor, Author)

@pytorchbot merge

@jbschlosser (Contributor, Author)

Changing the meta registration for _padded_dense_to_jagged_forward() to be a fake tensor impl fixes the failing dynamo test. It has to be the latter because the op must create an unbacked SymInt when sum_S is not specified. (A sketch of this pattern follows.)
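
As a hedged illustration of the pattern (not the actual ATen registration), a hypothetical custom op with a fake impl that allocates an unbacked SymInt for the data-dependent output length might look like this:

```python
import torch

# Hypothetical custom op standing in for the real ATen
# _padded_dense_to_jagged_forward(); all names here are made up.
@torch.library.custom_op("mylib::padded_to_jagged", mutates_args=())
def padded_to_jagged(padded: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
    # Eager impl: pack each batch element's valid prefix into one jagged dim.
    lengths = offsets.diff().tolist()
    return torch.cat([padded[i, :l] for i, l in enumerate(lengths)], dim=0)

@padded_to_jagged.register_fake
def _(padded, offsets):
    # The packed length depends on the *values* in offsets, which a fake
    # tensor impl cannot read, so it allocates an unbacked SymInt instead.
    ctx = torch.library.get_ctx()
    total_len = ctx.new_dynamic_size()
    return padded.new_empty(total_len, padded.shape[-1])
```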

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Sep 17, 2024
`rms_norm()` is a nice-to-have for ViT :)

This PR:
* SymInt-ifies `rms_norm()`, allowing NJT to use the same decomp.
* Adds torch_function-based input validation logic for nested-specific constraints (no normalization supported over the ragged dim for now) on the Python NJT side.
* Adds multi-dim support (on non-ragged, non-batch dims) to `mean()` for NJT.
Pull Request resolved: #135872
Approved by: https://github.com/mikaylagawarecki
ghstack dependencies: #125947
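
A hedged usage sketch of what that commit describes (assuming `F.rms_norm()` accepts an NJT and normalizes over a non-ragged trailing dim):

```python
import torch
import torch.nn.functional as F

nt = torch.nested.nested_tensor(
    [torch.randn(3, 16), torch.randn(7, 16)],
    layout=torch.jagged,
)
# RMS-normalize over the last (non-ragged) feature dim of each element;
# normalizing over the ragged dim remains unsupported.
out = F.rms_norm(nt, normalized_shape=(16,))
```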
Chao1Han pushed commits to Chao1Han/pytorch that referenced this pull request Sep 20, 2024.
github-actions bot deleted the gh/jbschlosser/140/head branch October 13, 2024 02:09