KEMBAR78
[RFC][FSDP] Don't move ignored params / buffers to device by rohan-varma · Pull Request #108033 · pytorch/pytorch · GitHub
Skip to content

Conversation

@rohan-varma
Copy link
Contributor

@rohan-varma rohan-varma commented Aug 28, 2023

Since these are ignored by FSDP, don't move them.

Differential Revision: [D48727044](https://our.internmc.facebook.com/intern/diff/D48727044/)

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Aug 28, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108033

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Merge Blocking SEVs

There is 1 active merge blocking SEVs. Please view them below:

If you must merge, use @pytorchbot merge -f.

✅ No Failures

As of commit 9a5eae4 with merge base a20fac8 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: distributed (fsdp) release notes category label Aug 28, 2023
rohan-varma added a commit that referenced this pull request Aug 28, 2023
Since these are ignored by FSDP, don't move them.

Differential Revision: [D48727044](https://our.internmc.facebook.com/intern/diff/D48727044/)

ghstack-source-id: 198857863
Pull Request resolved: #108033
import torch.distributed.fsdp._traversal_utils as traversal_utils
import torch.distributed.fsdp.fully_sharded_data_parallel as fsdp_file
import torch.nn as nn
from torch import Tensor
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If we do this, I feel like we should replace all torch.Tensor in the file with Tensor, or else it gets fragmented.

Maybe just use torch.Tensor on L857?

for submodule in curr_module.children():
if not isinstance(submodule, fsdp_file.FullyShardedDataParallel):
queue.append(submodule)
# NOTE: This includes moving ignored modules' parameters. If we
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove this note.

@awgu
Copy link
Collaborator

awgu commented Aug 31, 2023

I am on board with the change.

Since these are ignored by FSDP, don't move them.

Differential Revision: [D48727044](https://our.internmc.facebook.com/intern/diff/D48727044/)

[ghstack-poisoned]
rohan-varma added a commit that referenced this pull request Sep 1, 2023
Pull Request resolved: #108033

Since these are ignored by FSDP, don't move them.
ghstack-source-id: 199417931
@exported-using-ghexport

Differential Revision: [D48727044](https://our.internmc.facebook.com/intern/diff/D48727044/)
@rohan-varma rohan-varma requested a review from awgu September 1, 2023 00:43
Since these are ignored by FSDP, don't move them.

Differential Revision: [D48727044](https://our.internmc.facebook.com/intern/diff/D48727044/)

[ghstack-poisoned]
@awgu
Copy link
Collaborator

awgu commented Sep 1, 2023

Please check lint before landing. Thanks!

Since these are ignored by FSDP, don't move them.

Differential Revision: [D48727044](https://our.internmc.facebook.com/intern/diff/D48727044/)

[ghstack-poisoned]
Since these are ignored by FSDP, don't move them.

Differential Revision: [D48727044](https://our.internmc.facebook.com/intern/diff/D48727044/)

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Sep 5, 2023
…NTRANT (#108435)

We should use no_reentrant. There are a lot of users of this API, but
it is in a prototype state so should be fine to change.

Differential Revision: [D48898148](https://our.internmc.facebook.com/intern/diff/D48898148/)
Pull Request resolved: #108435
Approved by: https://github.com/awgu
ghstack dependencies: #108032, #108033
@facebook-github-bot facebook-github-bot deleted the gh/rohan-varma/735/head branch September 9, 2023 14:22
ignored_buffers: Set[torch.Tensor],
device_from_device_id: Optional[torch.device],
) -> None:
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the docstring below should be updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants