Initial NJT testing over dim type / views #140161

jbschlosser · 2024-11-08T18:22:16Z

Stack from ghstack (oldest at bottom):

This PR introduces ExtraOpData, a structure that holds op metadata: whether the op is a view and which dim-related args it accepts. It also populates a huge database with this info for dim-wise / view ops.

Test logic (sample input generation, references) has been updated to utilize this data. This allows for a fairly generic set of sample inputs and a reference for the class of ops that accept a single NJT and operate dim-wise (AKA "unary dimwise ops").
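To make the idea concrete, here is a minimal sketch of what such a metadata record and database could look like. It is not the structure from this PR: the class name `ExtraOpDataSketch`, the field names `is_view` and `dim_args`, the `EXTRA_OP_DATA` dict, and the `dim_kwargs()` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ExtraOpDataSketch:
    # Hypothetical fields: the PR only states that the structure records
    # whether an op is a view and which dim-related args it accepts.
    is_view: bool = False
    dim_args: Tuple[str, ...] = ()


# Hypothetical database keyed by op name (the real one is much larger).
EXTRA_OP_DATA: Dict[str, ExtraOpDataSketch] = {
    "chunk": ExtraOpDataSketch(is_view=True, dim_args=("dim",)),
    "unsqueeze": ExtraOpDataSketch(is_view=True, dim_args=("dim",)),
    "sum": ExtraOpDataSketch(is_view=False, dim_args=("dim",)),
}


def dim_kwargs(op_name: str, ndim: int) -> List[Dict[str, int]]:
    """Enumerate one kwargs dict per (dim arg, dim index) for sample inputs."""
    data = EXTRA_OP_DATA[op_name]
    return [{arg: d} for arg in data.dim_args for d in range(ndim)]
```

With metadata like this, a generic sample-input generator can iterate over `dim_kwargs(op_name, nt.dim())` instead of hand-writing inputs per op.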

Testing is added over the following ops:

* `chunk()`
* `narrow()`
* `select()`
* `split()`
* `split_with_sizes()`
* `squeeze()`
* `unflatten()`
* `unsqueeze()`

Most of the above do not operate on the ragged / batch dims or on non-contiguous NJTs, so the proper xfails are added as needed.

I also slipped in a couple minor fixes (sorry):

1. The `_wrap_jagged_dim()` helper no longer assumes `nt._ragged_idx == 1` and accepts the batch dim as a valid input. It disambiguates the converted inner dim as necessary through an additional `operating_on_batch` return value (i.e. both dim=0 and dim=1 map to dim=0 on the inner values tensor, since that dim represents a packed ragged dim for all batch items)
2. Padded dense -> NJT conversion requires shape gymnastics to operate with the restrictive FBGEMM kernel. The gymnastics were slightly wrong for the transposed NJT case, and this PR fixes that
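The dim mapping in the first fix can be sketched as follows. This is a hypothetical stand-in written for illustration, not the actual `_wrap_jagged_dim()` code: the function name, signature, and error messages are assumptions. It only models the behavior described above, where the batch dim and the ragged dim share one packed dim on the inner values tensor.

```python
from typing import Tuple


def wrap_jagged_dim_sketch(
    ndim: int, dim: int, ragged_idx: int, allow_batch_dim: bool = False
) -> Tuple[int, bool]:
    """Map an outer NJT dim to a dim on the inner values tensor.

    Returns (inner_dim, operating_on_batch). Hypothetical helper, not
    PyTorch's implementation.
    """
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for {ndim} dims")
    wrapped = dim % ndim  # canonicalize negative dims
    operating_on_batch = wrapped == 0
    if operating_on_batch and not allow_batch_dim:
        raise RuntimeError("operating on the batch dim is not supported")
    # The values tensor has no batch dim. The batch dim is packed into the
    # ragged dim, which sits at ragged_idx - 1 in inner space. Every other
    # outer dim shifts left by one.
    inner = ragged_idx - 1 if operating_on_batch else wrapped - 1
    return inner, operating_on_batch
```

For an NJT of shape `(B, j1, D)` with `ragged_idx == 1`, both `dim=0` and `dim=1` map to inner dim 0, and the boolean is what tells them apart.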

[ghstack-poisoned]

pytorch-bot · 2024-11-08T18:22:20Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140161

❌ 1 New Failure

As of commit a3051d6 with merge base efec302:

NEW FAILURE - The following job has failed:

trunk / linux-focal-rocm6.2-py3.10 / test (distributed, 1, 1, linux.rocm.gpu) (gh)
##[error]Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorchmergebot · 2024-11-25T22:05:59Z

Merge failed

Reason: New commits were pushed while merging. Please rerun the merge command.

jbschlosser · 2024-11-25T22:06:21Z

@pytorchbot merge

pytorchmergebot · 2024-11-25T22:09:15Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

pytorchmergebot · 2024-11-25T22:30:31Z

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Lint / lintrunner-noclang / linux-job

Failing merge rule: Core Maintainers

jbschlosser · 2024-11-26T21:33:50Z

@pytorchbot merge -i

pytorchmergebot · 2024-11-26T21:35:32Z

Merge started

Your change will be merged while ignoring the following 1 checks: trunk / linux-focal-rocm6.2-py3.10 / test (distributed, 1, 1, linux.rocm.gpu)

This PR contains three `unsqueeze()`-related fixes for NJT:

1. Adjusts the output's `_ragged_idx` when `unsqueeze()` inserts a dim before the ragged dim
2. Corrects the unbind reference for `unsqueeze()` after the last input dim. For this case, the dim kwarg canonicalization logic needs to be applied wrt `inp.dim() + 1` to account for `dim=-1` properly
3. Adds ragged dim support to `unsqueeze()`, allowing for e.g. `(B, j1, D) -> (B, 1, j1, D)`. This is okay now after #137125

Note that `unsqueeze()` still doesn't support batch dim operation, and arguably should never support this.

Pull Request resolved: #141392
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #141500, #140736, #140161
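The first two points can be illustrated with plain index arithmetic. The helper below is a hypothetical sketch, not code from #141392: it canonicalizes `dim` against `ndim + 1` (the number of dims in the output) and shifts the ragged index when the new dim lands at or before it. Rejecting the batch dim mirrors the restriction stated above.

```python
from typing import Tuple


def unsqueeze_dims_sketch(ndim: int, dim: int, ragged_idx: int) -> Tuple[int, int]:
    """Return (canonical insert position, output ragged_idx). Hypothetical."""
    out_ndim = ndim + 1  # unsqueeze() accepts dim in [-(ndim + 1), ndim]
    if not -out_ndim <= dim < out_ndim:
        raise IndexError(f"dim {dim} out of range for unsqueeze on {ndim} dims")
    wrapped = dim % out_ndim
    if wrapped == 0:
        raise NotImplementedError("unsqueeze() on the batch dim is unsupported")
    # A dim inserted at or before the ragged dim pushes it one slot right.
    new_ragged_idx = ragged_idx + 1 if wrapped <= ragged_idx else ragged_idx
    return wrapped, new_ragged_idx
```

For a `(B, j1, D)` input, `dim=1` gives `(B, 1, j1, D)` with the ragged dim now at index 2, and `dim=-1` resolves to position 3, giving `(B, j1, D, 1)`. Canonicalizing against `ndim` alone would resolve `dim=-1` to position 2 and insert the new dim before `D`.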

) This fixes some bugs when performing reductions / select() on dims before the ragged dim. In this case, the output NJT has a smaller number of dims, and its ragged_idx should reflect that correctly.

Pull Request resolved: pytorch#141506
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
ghstack dependencies: pytorch#141500, pytorch#140736, pytorch#140161, pytorch#141392
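The adjustment described here amounts to a one-line index shift. The function below is a hypothetical sketch, not the code from the fix. It assumes `removed_dim` is already canonicalized (non-negative) and that the op drops the dim, as `select()` or a reduction without `keepdim` would.

```python
def ragged_idx_after_dim_removal(ragged_idx: int, removed_dim: int) -> int:
    """Ragged index of the output after one non-ragged, non-batch dim is dropped."""
    if removed_dim <= 0:
        raise ValueError("this sketch does not model removing the batch dim")
    if removed_dim == ragged_idx:
        raise ValueError("this sketch does not model removing the ragged dim")
    # Dropping a dim that sits before the ragged dim shifts it left by one.
    return ragged_idx - 1 if removed_dim < ragged_idx else ragged_idx
```

For example, a transposed NJT of shape `(B, D, j1)` has `ragged_idx == 2`. After `select(dim=1, ...)` the output has shape `(B, j1)`, so its ragged index must be 1.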

…141604) Old logic was completely wrong, returning `chunk_size` chunks instead of the intended number. The original test didn't catch this because `chunk_size == num_chunks` :p New OpInfo-based testing covers it though.

Pull Request resolved: #141604
Approved by: https://github.com/soulitzer
ghstack dependencies: #141500, #140736, #140161, #141392, #141506
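The class of bug described here is easy to reproduce on a plain Python list. The two functions below are illustrative only and are not the old or new NJT code. `chunk_batch()` follows the usual `chunk()` convention of `ceil(n / chunks)` items per chunk, while `chunk_batch_buggy()` makes the mistake of producing `chunk_size` outputs.

```python
import math
from typing import List


def chunk_batch(items: List[int], chunks: int) -> List[List[int]]:
    """Split items into chunks of size ceil(len(items) / chunks)."""
    if chunks <= 0:
        raise ValueError("chunks must be positive")
    chunk_size = max(1, math.ceil(len(items) / chunks))
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def chunk_batch_buggy(items: List[int], chunks: int) -> List[List[int]]:
    """Same split, but wrongly emits chunk_size outputs."""
    if chunks <= 0:
        raise ValueError("chunks must be positive")
    chunk_size = max(1, math.ceil(len(items) / chunks))
    # Bug: the loop bound should be the number of chunks, not chunk_size.
    return [items[i * chunk_size:(i + 1) * chunk_size] for i in range(chunk_size)]
```

With 4 items and 2 chunks, the chunk size equals the number of chunks, so both functions agree and a test using those values passes. With 6 items and 3 chunks, the buggy version returns only 2 chunks and drops the last two items.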

This PR introduces `ExtraOpData`, a structure that contains op metadata regarding whether the op is a view and the dim-related args it accepts. It also populates a huge database for dim-wise / view ops with this info.

Test logic (sample input generation, references) has been updated to utilize this data. It allows for a fairly generic set of sample inputs & a reference for the class of ops that accept a single NJT and operate dim-wise (AKA "unary dimwise ops").

Testing is added over the following ops:

* `chunk()`
* `narrow()`
* `select()`
* `split()`
* `split_with_sizes()`
* `squeeze()`
* `unflatten()`
* `unsqueeze()`

Most of the above do not operate on the ragged / batch dims or on non-contiguous NJTs, so the proper xfails are added as needed.

I also slipped in a couple minor fixes (sorry):

1. The `_wrap_jagged_dim()` helper now avoids assuming `nt._ragged_idx == 1` and allows for a batch dim to be a valid input, disambiguating the converted inner dim as necessary through an additional `operating_on_batch` return value (i.e. both dim=0 and dim=1 map to dim=0 on the inner values tensor, since that dim represents a packed ragged dim for all batch items)
2. Padded dense -> NJT conversion requires shape gymnastics to operate with the restrictive FBGEMM kernel. The gymnastics were slightly wrong for the transposed NJT case, and this PR fixes that

Pull Request resolved: pytorch#140161
Approved by: https://github.com/Skylion007, https://github.com/cpuhrsch
ghstack dependencies: pytorch#140736

This PR contains three `unsqueeze()`-related fixes for NJT:

1. Adjusts the output's `_ragged_idx` when `unsqueeze()` inserts a dim before the ragged dim
2. Corrects the unbind reference for `unsqueeze()` after the last input dim. For this case, the dim kwarg canonicalization logic needs to be applied wrt `inp.dim() + 1` to account for `dim=-1` properly
3. Adds ragged dim support to `unsqueeze()`, allowing for e.g. `(B, j1, D) -> (B, 1, j1, D)`. This is okay now after pytorch#137125

Note that `unsqueeze()` still doesn't support batch dim operation, and arguably should never support this.

Pull Request resolved: pytorch#141392
Approved by: https://github.com/cpuhrsch
ghstack dependencies: pytorch#140736, pytorch#140161

This reverts commit 730caf0. Reverted pytorch#140161 on behalf of https://github.com/malfet due to Sorry for reverting your change but its tests are failing in trunk ([comment](pytorch#140736 (comment)))

This PR introduces `ExtraOpData`, a structure that contains op metadata regarding whether the op is a view and the dim-related args it accepts. It also populates a huge database for dim-wise / view ops with this info.

Test logic (sample input generation, references) has been updated to utilize this data. It allows for a fairly generic set of sample inputs & a reference for the class of ops that accept a single NJT and operate dim-wise (AKA "unary dimwise ops").

Testing is added over the following ops:

* `chunk()`
* `narrow()`
* `select()`
* `split()`
* `split_with_sizes()`
* `squeeze()`
* `unflatten()`
* `unsqueeze()`

Most of the above do not operate on the ragged / batch dims or on non-contiguous NJTs, so the proper xfails are added as needed.

I also slipped in a couple minor fixes (sorry):

1. The `_wrap_jagged_dim()` helper now avoids assuming `nt._ragged_idx == 1` and allows for a batch dim to be a valid input, disambiguating the converted inner dim as necessary through an additional `operating_on_batch` return value (i.e. both dim=0 and dim=1 map to dim=0 on the inner values tensor, since that dim represents a packed ragged dim for all batch items)
2. Padded dense -> NJT conversion requires shape gymnastics to operate with the restrictive FBGEMM kernel. The gymnastics were slightly wrong for the transposed NJT case, and this PR fixes that

Pull Request resolved: pytorch#140161
Approved by: https://github.com/Skylion007, https://github.com/cpuhrsch
ghstack dependencies: pytorch#141500, pytorch#140736

This PR contains three `unsqueeze()`-related fixes for NJT:

1. Adjusts the output's `_ragged_idx` when `unsqueeze()` inserts a dim before the ragged dim
2. Corrects the unbind reference for `unsqueeze()` after the last input dim. For this case, the dim kwarg canonicalization logic needs to be applied wrt `inp.dim() + 1` to account for `dim=-1` properly
3. Adds ragged dim support to `unsqueeze()`, allowing for e.g. `(B, j1, D) -> (B, 1, j1, D)`. This is okay now after pytorch#137125

Note that `unsqueeze()` still doesn't support batch dim operation, and arguably should never support this.

Pull Request resolved: pytorch#141392
Approved by: https://github.com/cpuhrsch
ghstack dependencies: pytorch#141500, pytorch#140736, pytorch#140161

NJT testing over dim type / views

e02e5f9

[ghstack-poisoned]

This was referenced Nov 8, 2024

Misc. non-contig NJT fixes #140160

Closed

NJT OpInfo tests v2 #138370

Closed

Skylion007 approved these changes Nov 8, 2024

jbschlosser marked this pull request as draft November 8, 2024 18:53

jbschlosser added the topic: not user facing topic category label Nov 8, 2024

Update on "NJT testing over dim type / views"

8de4f1f

[ghstack-poisoned]

Update on "NJT testing over dim type / views"

84987a1

[ghstack-poisoned]

Update on "NJT testing over dim type / views"

05bc1dc

[ghstack-poisoned]

jbschlosser added a commit that referenced this pull request Nov 8, 2024

NJT testing over dim type / views

cf859ad

ghstack-source-id: e6579f7 Pull Request resolved: #140161

Update on "NJT testing over dim type / views"

18a1666

[ghstack-poisoned]

jbschlosser added a commit that referenced this pull request Nov 11, 2024

NJT testing over dim type / views

de5e138

ghstack-source-id: db28ba8 Pull Request resolved: #140161

Update on "NJT testing over dim type / views"

587a2ec

[ghstack-poisoned]

jbschlosser mentioned this pull request Nov 12, 2024

General per-SampleInput xfail / skip system #140443

Closed

Update on "NJT testing over dim type / views"

7fea6e0

[ghstack-poisoned]

Update on "NJT testing over dim type / views"

e0845f9

[ghstack-poisoned]

jbschlosser added a commit that referenced this pull request Nov 12, 2024

NJT testing over dim type / views

ac2b38c

ghstack-source-id: 4ad6765 Pull Request resolved: #140161

Update on "NJT testing over dim type / views"

6119fdd

[ghstack-poisoned]

jbschlosser added a commit that referenced this pull request Nov 13, 2024

NJT testing over dim type / views

7335e2f

ghstack-source-id: a295dcb Pull Request resolved: #140161

Update on "NJT testing over dim type / views"

d93d02e

[ghstack-poisoned]

jbschlosser added a commit that referenced this pull request Nov 14, 2024

NJT testing over dim type / views

78e1255

ghstack-source-id: 6a58a23 Pull Request resolved: #140161

Update on "NJT testing over dim type / views"

31db0c3

[ghstack-poisoned]

jbschlosser mentioned this pull request Nov 14, 2024

Forward / backward NJT support for several activation functions #140736

Closed

jbschlosser added a commit that referenced this pull request Nov 14, 2024

NJT testing over dim type / views

5f732ac

ghstack-source-id: de67a63 Pull Request resolved: #140161

pytorchmergebot removed the merging label Nov 25, 2024

pytorchmergebot added the merging label Nov 25, 2024

pytorchmergebot removed the merging label Nov 25, 2024

jbschlosser added 2 commits November 26, 2024 12:16

jbschlosser mentioned this pull request Nov 26, 2024

NJT: Return correct number of outputs for chunk() on the batch dim #141604

Closed

pytorchmergebot added the merging label Nov 26, 2024

pytorchmergebot closed this in 9ee5d6f Nov 26, 2024

pytorchmergebot removed the merging label Nov 26, 2024

github-actions bot deleted the gh/jbschlosser/198/head branch December 27, 2024 02:07

4 participants