KEMBAR78
[MPS] Add `shifted_chebyshev_polynomial_[tuvw]` by malfet · Pull Request #157488 · pytorch/pytorch · GitHub
Skip to content

Conversation

@malfet
Copy link
Contributor

@malfet malfet commented Jul 2, 2025

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Jul 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/157488

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 5 Pending

As of commit 41a44e7 with merge base 0f9c1b3 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/inductor ciflow/mps Run MPS tests (subset of trunk) module: inductor release notes: mps Release notes category labels Jul 2, 2025
malfet added a commit that referenced this pull request Jul 2, 2025
For eager and inductor

ghstack-source-id: 3cc7415
Pull Request resolved: #157488
@malfet malfet requested review from dcci and manuelcandales July 2, 2025 19:52
@malfet malfet added the topic: improvements topic category label Jul 2, 2025
@malfet malfet requested a review from jansel July 2, 2025 19:52
@github-actions
Copy link
Contributor

github-actions bot commented Jul 2, 2025

Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.


Caused by:

@malfet
Copy link
Contributor Author

malfet commented Jul 2, 2025

@pytorchbot merge -f "Lint + MPS are green"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

@malfet your PR has been reverted as part of the stack under #157464.

@pytorchmergebot pytorchmergebot added Reverted ci-no-td Do not run TD on this PR labels Jul 3, 2025
[ghstack-poisoned]
@malfet
Copy link
Contributor Author

malfet commented Jul 3, 2025

@pytorchbot merge -f "Let's land this one first than"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pytorchmergebot pushed a commit that referenced this pull request Jul 5, 2025
They might have been slow on CUDA-11.3, but this version of CUDA is long gone. More fundamental underlying issue were linear complexity of the recursive polynomial definitions for higher order polynomials, for example see this loop from implementation of Chebyshev polynomial of the first kind
https://github.com/pytorch/pytorch/blob/7081b8233a64c350c64e9f00c9b9d00e52020241/aten/src/ATen/native/Math.h#L2969-L2973
which were tested by `test_compare_cpu` using following values (as sample index 16)
https://github.com/pytorch/pytorch/blob/7081b8233a64c350c64e9f00c9b9d00e52020241/torch/testing/_internal/opinfo/core.py#L2079

Luckily chebyshev polynomials for absolute values higher than 1 pretty quickly reach infinity, see below
```
python3 -c "import torch;print(torch.special.chebyshev_polynomial_v(torch.nextafter(torch.tensor(1.0), torch.tensor(2.0)), torch.tensor(1e6)))"
tensor(nan)
```
Which is not the case for Laguerre polynomials, but it's probably fine to just limit it to 1e7

Before
```
$ PYTORCH_TEST_WITH_SLOW=1 python test_ops.py -k chebyshev_polynomial_
ssssssss..ssssss..ssssss..ssssssssssssssssssssss..ssssss/home/ubuntu/py3.10-nightly/lib/python3.10/site-packages/torch/backends/cuda/__init__.py:131: UserWarning: This API is going to be deprecated, please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:78.)
  return torch._C._get_cublas_allow_tf32()
....ssssssssssss..ssssss..ssssss............ssssssssssssssssssssssssssssssssssss..ssssssssssssss..ssssss..ssssssssssssssssssssssssssssss..ssssss....ssssssssssss..ssssss..ssssss............ssssssssssssssssssssssssssssssssssss..ssssss..ssssssssssssss..ssssss..ssssss..ssssssssssssss..ssssss..ssssss..ssssss..ssssss..ssssss..ssssss..ssssss..ssssss..ssssss..ssssss..ssssssssssssss
----------------------------------------------------------------------
Ran 432 tests in 8.575s

OK (skipped=344)
```
After
```
$ PYTORCH_TEST_WITH_SLOW=1 python test_ops.py -k chebyshev_polynomial_
ssssssss........................ssssssssssssssss......../home/ubuntu/pytorch/torch/backends/cuda/__init__.py:131: UserWarning: This API is going to be deprecated, please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /home/ubuntu/pytorch/aten/src/ATen/Context.cpp:78.)
  return torch._C._get_cublas_allow_tf32()
........................................................................................xxxxxxxx................ssssssssssssssssssssssss........................................................................................................ssssssss........................ssssssss........................................................................................ssssssss
----------------------------------------------------------------------
Ran 432 tests in 45.580s

OK (skipped=72, expected failures=8)
```

Fixes #79528

Pull Request resolved: #157464
Approved by: https://github.com/Skylion007, https://github.com/dcci
ghstack dependencies: #157488
@github-actions github-actions bot deleted the gh/malfet/428/head branch August 3, 2025 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-no-td Do not run TD on this PR ciflow/inductor ciflow/mps Run MPS tests (subset of trunk) Merged module: inductor release notes: mps Release notes category Reverted topic: improvements topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants