Fix mismatched tensor metadata between FakeTensor and Intel XPU concrete tensor when running `F.logsigmoid` by chunhuanMeng · Pull Request #141333 · pytorch/pytorch · GitHub

Conversation

@chunhuanMeng
Contributor

@chunhuanMeng chunhuanMeng commented Nov 22, 2024

Fixes #141332
`F.logsigmoid` returns two outputs: `output` and `buffer`.
On the CPU path, `F.logsigmoid` uses `buffer` to store intermediate values that are reused when computing gradients, so it returns a `buffer` tensor with nonzero size. On the CUDA and XPU paths, `buffer` is unused, so XPU's `F.logsigmoid`, like CUDA's, returns a zero-size `buffer`. The root cause of the issue is that the code in `decompositions.py` (ref: https://github.com/pytorch/pytorch/blob/main/torch/_decomp/decompositions.py#L2803) only handles the CUDA case: when a fake tensor on an XPU device reaches it, the CPU path runs and returns a `buffer` with nonzero size, which conflicts with the Intel XPU concrete tensor implementation. This PR therefore adds a condition to handle the XPU case, so that the two returned buffer sizes match.
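The behavior described above can be observed through the underlying `aten` op, which exposes both outputs (`F.logsigmoid` itself only surfaces the first); a minimal sketch, assuming a CPU tensor and, optionally, a CUDA device:

```python
import torch
import torch.nn.functional as F

x = torch.randn(4)

# The underlying op returns both outputs; F.logsigmoid only returns the first.
out, buf = torch.ops.aten.log_sigmoid_forward(x)

torch.testing.assert_close(out, F.logsigmoid(x))
print(buf.numel())  # nonzero on CPU: the buffer holds intermediates for backward

# On CUDA (and, with this PR, XPU) the buffer is an empty placeholder:
if torch.cuda.is_available():
    _, buf_cuda = torch.ops.aten.log_sigmoid_forward(x.cuda())
    print(buf_cuda.numel())  # 0
```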

cc @gujinghui @EikanWang @fengyuan14 @guangyey

@pytorch-bot

pytorch-bot bot commented Nov 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141333

Note: Links to docs will display an error until the doc builds have completed.

❌ 3 New Failures, 5 Unrelated Failures

As of commit d47f443 with merge base 5deca07 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@chunhuanMeng
Contributor Author

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing (topic category) label Nov 22, 2024
@chunhuanMeng
Contributor Author

@pytorchbot label "module: xpu"

@pytorch-bot pytorch-bot bot added the module: xpu (Intel XPU related issues) label Nov 22, 2024
@guangyey guangyey requested a review from EikanWang November 22, 2024 07:35
@guangyey guangyey added the ciflow/xpu (Run XPU CI tasks) label Nov 22, 2024
@guangyey
Collaborator

Could you add a UT for XPU?

@chunhuanMeng
Contributor Author

Could you add a UT for XPU?

Will test these cases in the torch-xpu-ops project.

@EikanWang
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased main onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout main && git pull --rebase)

@chunhuanMeng chunhuanMeng changed the title Add conditions to allow XPU to use zero buffer Fix mismatched tensor metadata between FakeTensor and Intel XPU concrete tensors when running F.logsigmoid Nov 26, 2024
@chunhuanMeng chunhuanMeng changed the title Fix mismatched tensor metadata between FakeTensor and Intel XPU concrete tensors when running F.logsigmoid Fix mismatched tensor metadata between FakeTensor and Intel XPU concrete tensor when running F.logsigmoid Nov 26, 2024
Collaborator

@EikanWang EikanWang left a comment


Pls. fix the lint issue.

@colesbury colesbury added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label Nov 26, 2024
    min = torch.minimum(self.new_zeros(()), self)
    z = torch.exp(-torch.abs(self))
-   if self.is_cuda:
+   if self.is_cuda or self.is_xpu:
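Read in context, the changed condition (`self.is_cuda` becoming `self.is_cuda or self.is_xpu`) sits inside the `log_sigmoid_forward` decomposition; a minimal sketch of the fixed function, reconstructed from the lines shown and the `decompositions.py` link in the description (the exact surrounding code is assumed):

```python
import torch

# Sketch of the log_sigmoid_forward decomposition after the fix:
# CPU returns a real buffer (reused in backward); CUDA/XPU return an
# empty one, matching their concrete kernels.
def log_sigmoid_forward(self: torch.Tensor):
    min = torch.minimum(self.new_zeros(()), self)
    z = torch.exp(-torch.abs(self))
    if self.is_cuda or self.is_xpu:   # the changed condition
        buffer = self.new_zeros((0,)) # zero-size buffer, as on the device
    else:
        buffer = z                    # CPU keeps exp(-|x|) for backward
    return min - torch.log1p(z), buffer
```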
Collaborator

@guangyey guangyey Nov 27, 2024


Add `is_xpu` to the tensor docs; refer to

pytorch/torch/_tensor_docs.py

Lines 6688 to 6693 in 6e61ff4

add_docstr_all(
    "is_cuda",
    r"""
Is ``True`` if the Tensor is stored on the GPU, ``False`` otherwise.
""",
)
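A sketch of what the suggested `is_xpu` entry might look like, mirroring the `is_cuda` pattern above (the exact wording is an assumption, not the merged text):

```python
# Hypothetical docstring for an is_xpu entry in torch/_tensor_docs.py,
# following the is_cuda pattern shown in the review excerpt.
IS_XPU_DOC = r"""
Is ``True`` if the Tensor is stored on the XPU, ``False`` otherwise.
"""

# In the real file it would be registered via:
# add_docstr_all("is_xpu", IS_XPU_DOC)
```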

Collaborator


Add `is_xpu` to the pyi stubs; refer to

pytorch/tools/pyi/gen_pyi.py

Lines 1205 to 1208 in 5ca75ac

"is_cpu": ["is_cpu: _bool"],
"is_cuda": ["is_cuda: _bool"],
"is_leaf": ["is_leaf: _bool"],
"is_nested": ["is_nested: _bool"],
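The corresponding pyi addition might look like this, following the property table excerpted above (the dict name and exact placement are assumptions):

```python
# Hypothetical fragment of the property table in tools/pyi/gen_pyi.py,
# with the suggested is_xpu entry added alongside the existing ones.
tensor_properties = {
    "is_cpu": ["is_cpu: _bool"],
    "is_cuda": ["is_cuda: _bool"],
    "is_xpu": ["is_xpu: _bool"],  # new entry suggested in review
    "is_leaf": ["is_leaf: _bool"],
    "is_nested": ["is_nested: _bool"],
}
```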

@guangyey
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased main onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout main && git pull --rebase)

@guangyey guangyey added the ciflow/trunk (Trigger trunk jobs on your pull request) label Nov 28, 2024
@guangyey guangyey added this to the 2.6.0 milestone Nov 28, 2024
@EikanWang
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased main onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout main && git pull --rebase)

@EikanWang
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased main onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout main && git pull --rebase)

@guangyey
Collaborator

guangyey commented Dec 2, 2024

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Tried to rebase and push PR #141333, but it was already up to date. Try rebasing against main by issuing:
@pytorchbot rebase -b main

@ezyang
Contributor

ezyang commented Dec 2, 2024

@pytorchbot merge -f "this looks fine to force land"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f only as a last resort; consider -i/--ignore-current instead to continue the merge while ignoring current failures, which allows currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
…ete tensor when running `F.logsigmoid` (pytorch#141333)

Fixes pytorch#141332

Pull Request resolved: pytorch#141333
Approved by: https://github.com/guangyey, https://github.com/EikanWang, https://github.com/ezyang

Labels

ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), ciflow/xpu (Run XPU CI tasks), Merged, module: xpu (Intel XPU related issues), open source, topic: not user facing (topic category), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

XPU fake tensor failed when using F.logsigmoid with CrossRefFakeMode()

7 participants