Fix docstring for clip_grads_with_norm_ to reflect clamping behavior by RajeshvShiyal · Pull Request #158200 · pytorch/pytorch · GitHub

Conversation

@RajeshvShiyal
Copy link
Contributor

@RajeshvShiyal RajeshvShiyal commented Jul 13, 2025

Fix docstring for clip_grads_with_norm_ to reflect clamping behavior
This PR updates the docstring for torch.nn.utils.clip_grads_with_norm_ to accurately reflect the implementation behavior. The current documentation suggests that gradients are always scaled by:

grad = grad * (max_norm / (total_norm + eps))

However, the actual implementation clamps the scale coefficient to a maximum of 1.0, ensuring gradients are only scaled down, not up. This PR corrects the formula and adds a clarifying note to avoid confusion for users.

Updated the formula in the docstring to:

grad = grad * min(max_norm / (total_norm + eps), 1.0)

Added a note explaining the rationale for clamping (to prevent gradient amplification).
Ensured consistency with the behavior of clip_grad_norm_.
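
To make the clamping concrete, here is a minimal sketch, assuming PyTorch 2.6+ where clip_grads_with_norm_ and get_total_norm are exposed under torch.nn.utils. It shows that a gradient whose norm is already below max_norm is left untouched rather than scaled up:

import torch

# Small gradient: total_norm = sqrt(4 * 0.1**2) = 0.2, well below max_norm.
p = torch.nn.Parameter(torch.ones(4))
p.grad = torch.full((4,), 0.1)

total_norm = torch.nn.utils.get_total_norm([p.grad])
torch.nn.utils.clip_grads_with_norm_([p], max_norm=1.0, total_norm=total_norm)

# Without the clamp, the scale would be 1.0 / (0.2 + eps) ≈ 5.0 and the
# gradient would be amplified; with min(..., 1.0) it stays unchanged.
print(p.grad)  # tensor([0.1000, 0.1000, 0.1000, 0.1000])

For a gradient with total_norm = 4.0 and the same max_norm = 1.0, the call scales the gradient by roughly 0.25, matching the behavior of clip_grad_norm_.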

Fixes #151554

@pytorch-bot
Copy link

pytorch-bot bot commented Jul 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/158200

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 7e384dc with merge base fac0be7:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@RajeshvShiyal
Copy link
Contributor Author

@pytorchbot label "release notes: python_frontend"
@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the release notes: python_frontend python frontend release notes category label Jul 13, 2025
@RajeshvShiyal RajeshvShiyal marked this pull request as draft July 14, 2025 02:53
@RajeshvShiyal RajeshvShiyal marked this pull request as ready for review July 14, 2025 02:55
@RajeshvShiyal
Copy link
Contributor Author

Hello @spzala,
I have raised a pull request. Please review it.

@albanD albanD removed their request for review July 14, 2025 15:45
Copy link
Contributor

@mikaylagawarecki mikaylagawarecki left a comment


Please fix lint

@albanD albanD added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Jul 14, 2025
@RajeshvShiyal RajeshvShiyal marked this pull request as draft July 15, 2025 04:33
@RajeshvShiyal RajeshvShiyal marked this pull request as ready for review July 15, 2025 04:38
@mikaylagawarecki
Copy link
Contributor

lint is still failing

Copy link
Contributor

@spzala spzala left a comment


@RajeshvShiyal thanks for the PR and for addressing review comments quickly. It seems your new changes are failing lint. Make sure to run it successfully locally.

@RajeshvShiyal
Copy link
Contributor Author

Hello @mikaylagawarecki, @spzala,

The following error seems infra-specific.

Collecting uv==0.1.45
Downloading uv-0.1.45-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (32 kB)
Requirement already satisfied: setuptools in /var/lib/jenkins/ci_env/lib/python3.9/site-packages (79.0.1)
Downloading uv-0.1.45-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.8/12.8 MB 180.6 MB/s eta 0:00:00
Installing collected packages: uv
Attempting uninstall: uv
Found existing installation: uv 0.7.20
Uninstalling uv-0.7.20:
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/var/lib/jenkins/ci_env/bin/uv'
Check the permissions.

main()

File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/test-infra/.github/scripts/run_with_env_secrets.py", line 98, in main
run_cmd_or_die(f"docker exec -t {container_name} /exec")
File "/home/ec2-user/actions-runner/_work/pytorch/pytorch/test-infra/.github/scripts/run_with_env_secrets.py", line 39, in run_cmd_or_die
raise RuntimeError(f"Command {cmd} failed with exit code {exit_code}")
RuntimeError: Command docker exec -t 131240263c68dfcc261644d3bf605add8ddee94f1a4d3080a27b482d269f0ce3 /exec failed with exit code 1

Error: Process completed with exit code 1.

Note: I have also run lintrunner locally, but no lint errors were found on my machine.

@spzala
Copy link
Contributor

spzala commented Jul 16, 2025

@RajeshvShiyal thanks much for verifying locally. I agree the error seems related to the build process. Hopefully rerunning the tests will clear it.

@mikaylagawarecki
Copy link
Contributor

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Copy link
Collaborator

Successfully rebased main onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout main && git pull --rebase)

@mikaylagawarecki
Copy link
Contributor

mikaylagawarecki commented Jul 16, 2025

The lint failure is real https://github.com/pytorch/pytorch/actions/runs/16324470514/job/46120316635?pr=158200

@spzala
Copy link
Contributor

spzala commented Jul 16, 2025

The lint failure is real https://github.com/pytorch/pytorch/actions/runs/16324470514/job/46120316635?pr=158200

@mikaylagawarecki yes, it's real this time :) Thanks! The previous failure only had ERROR: Could not install packages due to an OSError.....
@RajeshvShiyal please fix. Thanks!

I have run lintrunner locally and reproduced the failure. It should be resolved now.
@RajeshvShiyal
Copy link
Contributor Author

RajeshvShiyal commented Jul 17, 2025

Hello @mikaylagawarecki, @spzala,

I have run lintrunner locally and resolved the errors.

Note: Lint also suggested the changes below, so those changes are also done.

I used the format below:

] = _group_tensors_by_device_and_dtype(
    [grads]
)  # type: ignore[assignment]

instead of this:

] = _group_tensors_by_device_and_dtype([grads])  # type: ignore[assignment]

@RajeshvShiyal
Copy link
Contributor Author

RajeshvShiyal commented Jul 17, 2025

Hello @mikaylagawarecki, @spzala,

There is still a framework/infra-specific lint error.


@spzala
Copy link
Contributor

spzala commented Jul 17, 2025

@RajeshvShiyal the error seems real if you see here, https://github.com/pytorch/pytorch/actions/runs/16336731501/job/46188865026?pr=158200#step:15:163
It seems related to changes you made that are unrelated to your PR. I would suggest keeping only your original changes. Thanks!

@RajeshvShiyal
Copy link
Contributor Author

Hello @mikaylagawarecki, @spzala,

The changes that are not related to this PR have been reverted. Thank you.

@RajeshvShiyal
Copy link
Contributor Author

Hello @mikaylagawarecki, @spzala

Now there is one remaining failure. It seems specific to "sudo: setrlimit(RLIMIT_STACK): Operation not permitted".


@RajeshvShiyal
Copy link
Contributor Author

Hello, sorry for the commits below. While trying to raise a PR for another issue, I mistakenly committed those changes to this PR.


@RajeshvShiyal
Copy link
Contributor Author

RajeshvShiyal commented Jul 25, 2025

Hello @mikaylagawarecki, @spzala

There is still one failure, the same as the previous one.


@mikaylagawarecki
Copy link
Contributor

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 25, 2025
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

yangw-dev pushed a commit that referenced this pull request Aug 1, 2025
…158200)


Pull Request resolved: #158200
Approved by: https://github.com/mikaylagawarecki

Labels

ciflow/trunk Trigger trunk jobs on your pull request
Merged
open source
release notes: python_frontend python frontend release notes category
triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Documentation Clarification Needed for Clamping of Scale Coefficient in clip_grads_with_norm_

6 participants