KEMBAR78
Fix RMSNorm doc per #136597 by mikaylagawarecki · Pull Request #136727 · pytorch/pytorch · GitHub
Skip to content

Conversation

@mikaylagawarecki
Copy link
Contributor

@mikaylagawarecki mikaylagawarecki commented Sep 26, 2024

Fixes #136597 (remove incorrect sqrt around RMS(x))

Stack from ghstack (oldest at bottom):

Screenshot 2024-09-26 at 11 46 32 AM

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Sep 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136727

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 25c7820 with merge base 13b0baf (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@mikaylagawarecki mikaylagawarecki added release notes: nn release notes category topic: docs topic category labels Sep 26, 2024
mikaylagawarecki added a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: 23a7a88
Pull Request resolved: #136727
@mikaylagawarecki mikaylagawarecki marked this pull request as ready for review September 26, 2024 15:54
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Do we actually have the epsilon inside the square root? Or we add it outside? We had issues with this kind of things in the optimizer making the eps actually zero so you might want to double check with @janeyx99 on this. (not for this PR obviously)

@mikaylagawarecki
Copy link
Contributor Author

It's really inside

auto result = input.mul(at::rsqrt(at::pow(input, 2).mean(dims_to_reduce_ref, /*keep_dim=*/true).add_(eps_val)));

@mikaylagawarecki
Copy link
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 26, 2024
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

The merge job was canceled or timed out. This most often happen if two merge requests were issued for the same PR, or if merge job was waiting for more than 6 hours for tests to finish. In later case, please do not hesitate to reissue the merge command
For more information see pytorch-bot wiki.

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: Comment with id 2378345814 not found

Details for Dev Infra team Raised by workflow job

@mikaylagawarecki
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@d-kleine
Copy link
Contributor

@mikaylagawarecki @albanD I have noticed a small detail: It says

  • $RMS[x]$ where $RMS(x)$

Using [] and () might be confusing for some readers of the docs. The formula in the RMSNorm paper is

$$y_i = \frac{x_i}{\text{RMS}(x)} \gamma_i, \quad \text{where} \quad \text{RMS}(x) = \sqrt{\epsilon + \frac{1}{n} \sum_{i=1}^{n} x_i^2}$$

y_i = \frac{x_i}{\text{RMS}(x)} \gamma_i, \quad \text{where} \quad \text{RMS}(x) = \sqrt{\epsilon + \frac{1}{n} \sum_{i=1}^{n} x_i^2}

What do you think about that?

@github-actions github-actions bot deleted the gh/mikaylagawarecki/264/head branch October 28, 2024 02:08
@d-kleine
Copy link
Contributor

d-kleine commented Nov 4, 2024

When will this be uploaded to the website?

@mikaylagawarecki
Copy link
Contributor Author

It's on main https://pytorch.org/docs/main/generated/torch.nn.RMSNorm.html#torch.nn.RMSNorm (will be in the next release)

@d-kleine
Copy link
Contributor

d-kleine commented Nov 4, 2024

Ah, okay, thanks - I was wondering why it was not on stable yet:
https://pytorch.org/docs/stable/generated/torch.nn.RMSNorm.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: nn release notes category topic: docs topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants