[DTensor][1/N] add forward layer norm support #113105
Conversation
[ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/113105
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 400ef8f with merge base 56e514a.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
First pass; please see the inline comments. I think the semantics of `replicate_dim_start_at` look wrong: it might make all the tensor dims replicated.
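For context, a minimal sketch of the semantics the reviewer appears to expect from such a helper (the signature and import path are assumptions, not the PR's actual code): only `Shard` placements on the normalized inner dims should be converted to `Replicate`, leaving outer-dim shardings intact.

```python
from typing import List, Sequence, Tuple

from torch.distributed._tensor.placement_types import Placement, Replicate, Shard


def replicate_dim_start_at(
    placements: Sequence[Placement], start_dim: int
) -> Tuple[Placement, ...]:
    # Hypothetical sketch: convert only Shard placements whose tensor dim is
    # a normalized (inner) dim into Replicate; shardings on outer dims are
    # kept. If the `placement.dim >= start_dim` check were missing, every
    # tensor dim would become replicated, which is the concern raised above.
    new_placements: List[Placement] = []
    for placement in placements:
        if isinstance(placement, Shard) and placement.dim >= start_dim:
            new_placements.append(Replicate())
        else:
            new_placements.append(placement)
    return tuple(new_placements)
```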
```python
# must be in form of OpStrategy
assert isinstance(input_schema, OpStrategy)
assert isinstance(normalized_shape, (int, Sequence, torch.Size))
normalized_size = (
```
I thought you should call `normalize_to_torch_size`; that's why we refactored it, right?
Oops, the code was overwritten in a rebase...
Note: forward-fixed `normalize_to_torch_size` in #113244 to handle the size argument of type `int`, `Sequence`, or `torch.Size`. This fix also enables 3 DTensor op tests.
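For reference, a minimal sketch of the behavior described by that forward fix (the real helper lives in DTensor's op utils; its exact body may differ):

```python
from typing import Sequence, Union

import torch


def normalize_to_torch_size(size: Union[int, Sequence[int], torch.Size]) -> torch.Size:
    # Accept an int, a Sequence[int], or a torch.Size and always return a
    # torch.Size. The pre-fix version only handled Sequence[int] input.
    if isinstance(size, torch.Size):
        return size
    if isinstance(size, int):
        return torch.Size([size])
    return torch.Size(size)
```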
**Summary**: This PR adds a DTensor implementation for the ATen op `native_layer_norm`.
**Test**: `pytest test/distributed/_tensor/test_dtensor_ops.py -s -k layer_norm`
[ghstack-poisoned]
Have a few comments inline; otherwise good to go!
```diff
@@ -1,5 +1,5 @@
 # Copyright (c) Meta Platforms, Inc. and affiliates
-from typing import cast, List, Optional, Tuple
+from typing import cast, List, Optional, Sequence, Tuple
```
In addition to the op-level tests, I think we should add a test in test_math_ops.py that uses `distribute_module` to partition an `nn.LayerNorm`, and marks the input as sharded in `prepare_input_fn`, to make sure things work for the forward case when we apply this to an `nn.Module`. This can be a follow-up PR; a rough sketch is below.
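A rough sketch of that follow-up test (the class name, mesh setup, and the `input_fn` signature are assumptions; some releases pass `input_fn(inputs, device_mesh)` without the module argument):

```python
import torch
from torch import nn
from torch.distributed._tensor import (
    DeviceMesh,
    Shard,
    distribute_module,
    distribute_tensor,
)
from torch.testing._internal.distributed._tensor.common_dtensor import (
    DTensorTestBase,
    with_comms,
)


class DistLayerNormTest(DTensorTestBase):
    @with_comms
    def test_layer_norm_fwd(self):
        device_mesh = DeviceMesh(self.device_type, list(range(self.world_size)))

        def prepare_input_fn(mod, inputs, device_mesh):
            # shard the activation on the batch dim; parameters stay replicated
            # (distribute_module replicates parameters when no partition_fn is given)
            return tuple(
                distribute_tensor(t, device_mesh, [Shard(0)]) for t in inputs
            )

        norm = nn.LayerNorm(normalized_shape=8, elementwise_affine=True)
        dist_norm = distribute_module(norm, device_mesh, input_fn=prepare_input_fn)

        x = torch.rand(16, 8)
        out = dist_norm(x)
        # the forward output should remain sharded on the batch dim
        self.assertEqual(out.placements, (Shard(0),))
```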
**Summary**: This PR adds a DTensor implementation for the ATen op `native_layer_norm`.
**Test**: `pytest test/distributed/_tensor/test_dtensor_ops.py -s -k layer_norm`
[ghstack-poisoned]
lgtm!
…ize to accept single int size (#113244)
**Summary**: In #113105 I used the util function `normalize_to_torch_size` to unify the `size` argument, which may come in multiple formats. However, that function only handled input of type `Sequence[int]`, so I submitted this forward fix to make `normalize_to_torch_size` also handle a size argument of type `int` or `torch.Size`. A side product of this fix is that it also enables 3 DTensor op tests (see `test_dtensor_ops.py`).
**Test**: `pytest test/distributed/_tensor/test_dtensor_ops.py`
Pull Request resolved: #113244
Approved by: https://github.com/wanchaol
ghstack dependencies: #113105
**Summary**: This PR adds a DTensor implementation for the ATen op `native_layer_norm`.
**Test**: `pytest test/distributed/_tensor/test_dtensor_ops.py -s -k layer_norm`
Pull Request resolved: pytorch#113105
Approved by: https://github.com/wanchaol
Stack from ghstack (oldest at bottom):
Summary:
This PR adds a DTensor implementation for the ATen op `native_layer_norm`.
Test: `pytest test/distributed/_tensor/test_dtensor_ops.py -s -k layer_norm`
cc @wanchaol