Improve custom function docs by soulitzer · Pull Request #60312 · pytorch/pytorch · GitHub

Conversation

soulitzer (Contributor) commented Jun 18, 2021

  • Adds code examples for the ctx methods and makes the requirements on their arguments clearer (see the sketch after this list)
  • Type annotations for save_for_backward, mark_dirty, mark_non_differentiable, and set_materialize_grads (BC-breaking?)
  • Refactor torch.autograd.Function doc
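
For reference, a minimal sketch of the kind of ctx usage the new examples cover; Exp is the canonical example from the extending notes, though the exact wording added by the PR may differ:

```python
import torch
from torch.autograd import Function

class Exp(Function):
    @staticmethod
    def forward(ctx, input):
        result = input.exp()
        # save_for_backward accepts only tensors; other state can be
        # stashed directly on ctx (e.g., ctx.constant = 2).
        ctx.save_for_backward(result)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        result, = ctx.saved_tensors
        return grad_output * result

# Call .apply rather than instantiating the class.
x = torch.randn(3, requires_grad=True)
y = Exp.apply(x)
y.sum().backward()
```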

facebook-github-bot (Contributor) commented Jun 18, 2021


💊 CI failures summary and remediations

As of commit b3d6c9c (more details on the Dr. CI page):


  • 6/6 failures possibly* introduced in this PR
    • 1/6 non-scanned failure(s)

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (1/3)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .github/templates/linux_ci_workflow.yml.j2
Auto-merging .github/templates/linux_ci_workflow.yml.j2
CONFLICT (add/add): Merge conflict in .github/scripts/generate_ci_workflows.py
Auto-merging .github/scripts/generate_ci_workflows.py
CONFLICT (add/add): Merge conflict in .circleci/scripts/setup_ci_environment.sh
Auto-merging .circleci/scripts/setup_ci_environment.sh
CONFLICT (add/add): Merge conflict in .circleci/scripts/binary_ios_upload.sh
Auto-merging .circleci/scripts/binary_ios_upload.sh
CONFLICT (add/add): Merge conflict in .circleci/scripts/binary_ios_test.sh
Auto-merging .circleci/scripts/binary_ios_test.sh
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1

See CircleCI build pytorch_linux_xenial_py3_clang7_asan_test1 (2/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Aug 18 00:43:26 test_remote_message_script_de...yUniqueId(created_on=0, local_id=0) to be created.
Aug 18 00:42:46 frame #13: <unknown function> + 0x1982c720 (0x7f68e5a69720 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:42:46 frame #14: c10::ThreadPool::main_loop(unsigned long) + 0x7f1 (0x7f68c2d2ab91 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:42:46 frame #15: <unknown function> + 0xb8c80 (0x7f690f294c80 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
Aug 18 00:42:46 frame #16: <unknown function> + 0x76ba (0x7f690f92f6ba in /lib/x86_64-linux-gnu/libpthread.so.0)
Aug 18 00:42:46 frame #17: clone + 0x6d (0x7f690f66551d in /lib/x86_64-linux-gnu/libc.so.6)
Aug 18 00:42:46 
Aug 18 00:42:46 ok (7.197s)
Aug 18 00:42:58   test_remote_message_dropped_pickle (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... ok (11.210s)
Aug 18 00:43:09   test_remote_message_dropped_pickle_to_self (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... ok (11.110s)
Aug 18 00:43:19   test_remote_message_script_delay_timeout (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... ok (10.307s)
Aug 18 00:43:26   test_remote_message_script_delay_timeout_to_self (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... [E request_callback_no_python.cpp:559] Received error while processing request type 260: falseINTERNAL ASSERT FAILED at "/var/lib/jenkins/workspace/torch/csrc/distributed/rpc/rref_context.cpp":387, please report a bug to PyTorch. Expected OwnerRRef with id GloballyUniqueId(created_on=0, local_id=0) to be created.
Aug 18 00:43:26 Exception raised from getOwnerRRef at /var/lib/jenkins/workspace/torch/csrc/distributed/rpc/rref_context.cpp:387 (most recent call first):
Aug 18 00:43:26 frame #0: <unknown function> + 0x1a231c (0x7f794056331c in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #1: std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const + 0x6d (0x7f7961f56f8d in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:43:26 frame #2: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x160 (0x7f7940561800 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #3: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x18a (0x7f794055c66a in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #4: c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x115 (0x7f794055cd75 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #5: torch::distributed::rpc::RRefContext::getOwnerRRef(torch::distributed::rpc::GloballyUniqueId const&, bool) + 0xd62 (0x7f79631bfb92 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:43:26 frame #6: torch::distributed::rpc::RequestCallbackNoPython::assignOwnerRRef(torch::distributed::rpc::GloballyUniqueId const&, torch::distributed::rpc::GloballyUniqueId const&, c10::intrusive_ptr<c10::ivalue::Future, c10::detail::intrusive_target_default_null_type<c10::ivalue::Future> >) const + 0x223 (0x7f7963184a03 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:43:26 frame #7: torch::distributed::rpc::RequestCallbackImpl::processScriptRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x8e3 (0x7f79858bbb13 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
Aug 18 00:43:26 frame #8: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x78d (0x7f7963181b8d in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (3/3)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed with the same add/add conflicts as build (1/3) above; exited with code exit status 1.


2 failures not recognized by patterns:

  • GitHub Actions: linux-bionic-py3.8-gcc9-coverage / calculate-docker-image (step: Checkout PyTorch)
  • GitHub Actions: Lint / clang-format (step: Run clang-format)

1 job timed out:

  • pytorch_linux_xenial_py3_clang7_asan_test1

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch 2 times, most recently from e151ea3 to cea0cb9 Compare June 22, 2021 19:09
@soulitzer soulitzer marked this pull request as ready for review June 24, 2021 15:32
@soulitzer soulitzer requested a review from albanD as a code owner June 24, 2021 15:32
albanD (Collaborator) left a comment:

Read through the custom function part. Will do extending next.

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from d5f437e to 408a039 Compare June 28, 2021 17:54
^^^^^^^^^^^
In general, implement a custom function if you want to perform computations in your model
that are not differentiable or rely on non-PyTorch libraries (e.g., NumPy), but
still wish for your operation to compose with built-in PyTorch operators and
Collaborator:

Not sure what you mean by compose with built-in operators?

Contributor Author:

something like "chain with other ops", but that's also kind of implied by "work with the autograd engine" so I'm just removing it for now

custom Module. See the section below for more information on extending :mod:`torch.nn`.

If you'd like to alter the gradients during the backward pass or perform a side
effect, consider using function or module hooks.
Collaborator:

capitalization for Module

Collaborator:

Actually maybe link to the hooks here.
Also I think function -> tensor
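
For reference, a hedged sketch of the tensor-hook alternative this comment points to, which alters a gradient in flight without a custom Function:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2

# register_hook runs during the backward pass; returning a tensor
# replaces the gradient flowing into y.
y.register_hook(lambda grad: grad.clamp(max=0.5))

y.sum().backward()
print(x.grad)  # reflects the clamped gradient of y
```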

comparing the value element-wise with the Jacobian computed numerically using
finite-differencing.

.. note::
Collaborator:

Is this note still necessary?

Contributor Author:

Merging it into the section about mark_non_differentiable
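
For context, gradcheck as described in the excerpt is typically invoked like this sketch; double-precision inputs keep the numerical Jacobian accurate:

```python
import torch
from torch.autograd import gradcheck

# Compares analytical gradients against a Jacobian computed by
# finite differences, element-wise, within the given tolerances.
input = torch.randn(4, dtype=torch.double, requires_grad=True)
assert gradcheck(torch.exp, (input,), eps=1e-6, atol=1e-4)
```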

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from beff35b to 116ace3 Compare June 30, 2021 18:48
before the call, and their use will be registered in the graph. Note that this
logic won't traverse lists/dicts/any other data structures and will only
consider :class:`Tensor` s that are direct arguments to the call. You can
consider :class:`Tensor`s that are direct arguments to the call. You can
Contributor Author:

Note to self: check if the space between the s and class is necessary

Contributor Author:

From the rendered doc https://docs-preview.pytorch.org/60312/notes/extending.html?highlight=custom%20function
looks like it is necessary. However, with the space it looks bad as well, see a couple lines above where we have ":class:Function s". So I'm okay with just removing the link in the plural case.

Contributor Author:

Also replacing Tensors -> tensors for consistency (unless we mean to refer to the actual class, in which case it would need to be uppercase ":class:`Tensor`").
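
A minimal sketch of the "direct arguments" rule under discussion: only tensors passed directly to apply are registered in the graph, and backward returns one gradient per forward input, with None for non-tensor inputs (Scale is a hypothetical example):

```python
import torch
from torch.autograd import Function

class Scale(Function):
    @staticmethod
    def forward(ctx, x, factor):
        # x is a direct tensor argument, so autograd tracks it; a tensor
        # hidden inside a list or dict argument would not be tracked.
        ctx.factor = factor
        return x * factor

    @staticmethod
    def backward(ctx, grad_output):
        # One return value per forward input; the non-tensor factor
        # gets None.
        return grad_output * ctx.factor, None

x = torch.randn(2, requires_grad=True)
y = Scale.apply(x, 2.0)
```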

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from 116ace3 to 72e4b59 Compare July 1, 2021 18:31
albanD (Collaborator) left a comment:

Nice update on the notes!


you should explicitly declare this by decorating backward with the
:func:`~function.once_differentiable`. With this decorator, attempts to
perform double backward through your function will produce an error.
See our double backward tutorial for more information on double backward.
Collaborator:

Is it already up? Can we link it here?

Contributor Author:

Soon :) pytorch/tutorials#1603 should be ready for review!
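
For reference, a hedged sketch of the decorator in use (MyOp is a hypothetical example):

```python
import torch
from torch.autograd import Function
from torch.autograd.function import once_differentiable

class MyOp(Function):
    @staticmethod
    def forward(ctx, x):
        return x.floor()

    @staticmethod
    @once_differentiable  # this backward is itself not differentiable
    def backward(ctx, grad_output):
        # Straight-through-style gradient; attempting double backward
        # through MyOp raises an error instead of silently failing.
        return grad_output
```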

@soulitzer soulitzer requested a review from albanD August 12, 2021 15:59
@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from 79df552 to 01224d2 Compare August 12, 2021 18:30
@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from 01224d2 to 0337521 Compare August 12, 2021 19:20
soulitzer (Contributor Author):

@albanD It has been a while (sorry!), but I have addressed your latest comments in this update, so it should be ready for another look!

Still running into flaky CI issues with the double backward tutorial, so that will need a little longer to land. We can probably land a follow-up PR to this once it does.

The latest build is here: https://15333920-65600975-gh.circle-artifacts.com/0/docs/notes/extending.html

albanD (Collaborator) left a comment:

LGTM

facebook-github-bot (Contributor):
@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):
@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):
@soulitzer merged this pull request in 2f615f6.

@malfet malfet deleted the pr/improve-custom-function-docs branch September 24, 2021 13:11