Description
📚 The doc issue
In the current documentation for `torch.nn.utils.clip_grads_with_norm_`, the formula for the scale coefficient is given as:

```
grad = grad * max_norm / (total_norm + 1e-6)
```
However, in practical usage, the scale coefficient is clamped to a maximum value of 1, which prevents the function from ever scaling gradients up. This behavior, while important for the correct functioning of gradient clipping, is not currently mentioned in the documentation.
Suggested Change:
It would be helpful to state explicitly in the documentation that the scale coefficient is clamped to 1. This would give users a clearer picture of how the function behaves in practice and prevent misunderstandings.
Suggest a potential alternative/fix
The formula should be updated to reflect the clamping, e.g.:

```
grad = grad * min(max_norm / (total_norm + 1e-6), 1.0)
```
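The clamping behavior can be illustrated with a minimal sketch (plain Python, not the actual PyTorch implementation; the function name and `eps` default here are illustrative):

```python
# Sketch of the scale coefficient as described in this issue: the raw
# ratio max_norm / (total_norm + eps) is clamped so it never exceeds 1,
# i.e. gradients are only ever scaled down, never up.
def clip_coefficient(max_norm: float, total_norm: float, eps: float = 1e-6) -> float:
    scale = max_norm / (total_norm + eps)
    return min(scale, 1.0)  # the clamp this issue asks to document

# Gradient norm above max_norm: gradients are scaled down.
print(clip_coefficient(1.0, 4.0))   # ~0.25
# Gradient norm already below max_norm: coefficient is clamped to 1
# (without the clamp, the raw ratio would be 2.0).
print(clip_coefficient(1.0, 0.5))   # 1.0
```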
cc @svekars @sekyondaMeta @AlannaBurke @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki