[doc][hackathon] To add Adagrad Optimizer to the documentation #63254
Conversation
💊 CI failures summary and remediations
As of commit 464a9fa (more details on the Dr. CI page):
🕵️ 2 new failures recognized by patterns
The following CI failures do not appear to be due to upstream breakages.
Force-pushed from 348a347 to 2d807b3.
Codecov Report

@@            Coverage Diff             @@
##           master   #63254      +/-   ##
==========================================
- Coverage   66.73%   66.33%   -0.41%
==========================================
  Files         695      703       +8
  Lines       90833    92223    +1390
==========================================
+ Hits        60618    61175     +557
- Misses      30215    31048     +833
Force-pushed from 46d58f3 to a3b12d1.
I don't see initial_accumulator_value and lr_decay? Also, should we mention what the behavior is when the gradient is sparse?
Let me know if you have feedback regarding these two points. Regarding the sparse behavior, I would be happy to follow your suggestion.
Is that done on purpose? It looks like the args doc needs to be updated. In any case, in the rendered version, the argument is present in the signature, so I think it should be documented properly.
I agree with your point that we might want to discourage people from using this argument.
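For context, here is a minimal usage sketch of the two arguments discussed above (assuming the current torch.optim.Adagrad constructor signature; the concrete values are illustrative only):

```python
import torch
from torch import nn

model = nn.Linear(10, 2)

# Assumed signature: Adagrad(params, lr=0.01, lr_decay=0, weight_decay=0,
#                            initial_accumulator_value=0, eps=1e-10)
optimizer = torch.optim.Adagrad(
    model.parameters(),
    lr=0.01,
    lr_decay=1e-4,                  # shrinks the effective step size as iterations grow
    initial_accumulator_value=0.1,  # starting value of the per-parameter sum of squared gradients
    eps=1e-10,                      # added to the denominator for numerical stability
)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
```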
Force-pushed from f3e7b0e to 4564618.
It's done! Thanks for the comments.
torch/optim/adagrad.py
Outdated
Nit: you don't want to assign to gamma here, right? The new value should not be used by the next step, right?
Update is done.
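For reference, a sketch of the point being made: the decayed step size is recomputed from the base learning rate γ at every step rather than written back into γ (η denotes the lr_decay argument; the exact symbols are assumptions, not a quote from the rendered docs).

```latex
% Decayed learning rate at step t; gamma itself is left unchanged,
% so the decay is always applied to the original base learning rate.
\tilde{\gamma}_t = \frac{\gamma}{1 + (t - 1)\,\eta}
```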
torch/optim/adagrad.py
Outdated
Isn't this g_t^2?
Oh, I guess that works with your definition of G.
But why not just use g_t^2 and remove the diag on the line below?
Updated to the shape you recommended.
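For context, the two notations discussed here coincide for the diagonal accumulator; a sketch using the outer-product matrix G_t from the Adagrad paper (the symbols are assumptions where the thread does not define them):

```latex
% With G_t = \sum_{\tau \le t} g_\tau g_\tau^\top, only the diagonal is ever used,
% and that diagonal is exactly the element-wise running sum of squared gradients:
\operatorname{diag}(G_t) = \sum_{\tau=1}^{t} g_\tau \odot g_\tau
                         = \operatorname{diag}(G_{t-1}) + g_t \odot g_t
```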
torch/optim/adagrad.py
Outdated
Nit: missing epsilon.
Added it.
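Putting the thread's points together, the element-wise update being documented should look roughly like the following (a sketch; s_t is the running sum of squared gradients, \tilde{\gamma}_t the decayed step size, and ε the eps argument):

```latex
\begin{aligned}
s_t          &= s_{t-1} + g_t \odot g_t \\
\theta_{t+1} &= \theta_t - \tilde{\gamma}_t \, \frac{g_t}{\sqrt{s_t} + \epsilon}
\end{aligned}
```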
Force-pushed from 4564618 to 9cc53de.
Force-pushed from 9cc53de to 464a9fa.
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@iramazanli merged this pull request in d4b09db.
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch Core documentation could serve as a nice optimization research tutorial. The tracking issue #63236 lists all the necessary algorithms together with links to the originally published papers.
In this PR we add a description of Adagrad to the documentation. For more details, we refer to the paper:
http://jmlr.org/papers/v12/duchi11a.html
cc @vincentqb @iramazanli
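Regarding the sparse-gradient question raised in review, here is a minimal sketch of how the optimizer is typically exercised with sparse gradients (this assumes Adagrad's support for sparse gradients, e.g. from an nn.Embedding created with sparse=True; the toy sizes and names are illustrative):

```python
import torch
from torch import nn

# A toy embedding layer that produces a sparse gradient for its weight.
emb = nn.Embedding(1000, 16, sparse=True)
opt = torch.optim.Adagrad(emb.parameters(), lr=0.1)

idx = torch.tensor([1, 5, 7])
loss = emb(idx).pow(2).sum()
loss.backward()   # emb.weight.grad is a sparse tensor touching only rows 1, 5, 7
opt.step()        # the accumulator and weights are updated only for those rows
opt.zero_grad()
```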