Improve custom function docs by soulitzer · Pull Request #60312 · pytorch/pytorch · GitHub

Conversation

soulitzer (Contributor) commented Jun 18, 2021

  • Adds code examples for the ctx methods and makes the requirements on their arguments clearer (see the sketch after this list)
  • Type annotations for save_for_backward, mark_dirty, mark_non_differentiable, and set_materialize_grads (BC-breaking?)
  • Refactor torch.autograd.Function doc
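
For reference, a minimal sketch of the kind of ctx usage the new examples cover; Exp is the canonical example from the extending notes, though the exact wording added by the PR may differ:

```python
import torch
from torch.autograd import Function

class Exp(Function):
    @staticmethod
    def forward(ctx, input):
        result = input.exp()
        # save_for_backward accepts only tensors; other state can be
        # stashed directly on ctx (e.g., ctx.constant = 2).
        ctx.save_for_backward(result)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        result, = ctx.saved_tensors
        return grad_output * result

# Call .apply rather than instantiating the class.
x = torch.randn(3, requires_grad=True)
y = Exp.apply(x)
y.sum().backward()
```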

facebook-github-bot (Contributor) commented Jun 18, 2021


💊 CI failures summary and remediations

As of commit b3d6c9c (more details on the Dr. CI page):


  • 6/6 failures possibly* introduced in this PR
    • 1/6 non-scanned failure(s)

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (1/3)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .github/templates/linux_ci_workflow.yml.j2
Auto-merging .github/templates/linux_ci_workflow.yml.j2
CONFLICT (add/add): Merge conflict in .github/scripts/generate_ci_workflows.py
Auto-merging .github/scripts/generate_ci_workflows.py
CONFLICT (add/add): Merge conflict in .circleci/scripts/setup_ci_environment.sh
Auto-merging .circleci/scripts/setup_ci_environment.sh
CONFLICT (add/add): Merge conflict in .circleci/scripts/binary_ios_upload.sh
Auto-merging .circleci/scripts/binary_ios_upload.sh
CONFLICT (add/add): Merge conflict in .circleci/scripts/binary_ios_test.sh
Auto-merging .circleci/scripts/binary_ios_test.sh
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1

See CircleCI build pytorch_linux_xenial_py3_clang7_asan_test1 (2/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Aug 18 00:43:26 test_remote_message_script_de...yUniqueId(created_on=0, local_id=0) to be created.
Aug 18 00:42:46 frame #13: <unknown function> + 0x1982c720 (0x7f68e5a69720 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:42:46 frame #14: c10::ThreadPool::main_loop(unsigned long) + 0x7f1 (0x7f68c2d2ab91 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:42:46 frame #15: <unknown function> + 0xb8c80 (0x7f690f294c80 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
Aug 18 00:42:46 frame #16: <unknown function> + 0x76ba (0x7f690f92f6ba in /lib/x86_64-linux-gnu/libpthread.so.0)
Aug 18 00:42:46 frame #17: clone + 0x6d (0x7f690f66551d in /lib/x86_64-linux-gnu/libc.so.6)
Aug 18 00:42:46 
Aug 18 00:42:46 ok (7.197s)
Aug 18 00:42:58   test_remote_message_dropped_pickle (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... ok (11.210s)
Aug 18 00:43:09   test_remote_message_dropped_pickle_to_self (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... ok (11.110s)
Aug 18 00:43:19   test_remote_message_script_delay_timeout (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... ok (10.307s)
Aug 18 00:43:26   test_remote_message_script_delay_timeout_to_self (__main__.FaultyFaultyAgentRpcTestWithSpawn) ... [E request_callback_no_python.cpp:559] Received error while processing request type 260: falseINTERNAL ASSERT FAILED at "/var/lib/jenkins/workspace/torch/csrc/distributed/rpc/rref_context.cpp":387, please report a bug to PyTorch. Expected OwnerRRef with id GloballyUniqueId(created_on=0, local_id=0) to be created.
Aug 18 00:43:26 Exception raised from getOwnerRRef at /var/lib/jenkins/workspace/torch/csrc/distributed/rpc/rref_context.cpp:387 (most recent call first):
Aug 18 00:43:26 frame #0: <unknown function> + 0x1a231c (0x7f794056331c in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #1: std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const + 0x6d (0x7f7961f56f8d in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:43:26 frame #2: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x160 (0x7f7940561800 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #3: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x18a (0x7f794055c66a in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #4: c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x115 (0x7f794055cd75 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
Aug 18 00:43:26 frame #5: torch::distributed::rpc::RRefContext::getOwnerRRef(torch::distributed::rpc::GloballyUniqueId const&, bool) + 0xd62 (0x7f79631bfb92 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:43:26 frame #6: torch::distributed::rpc::RequestCallbackNoPython::assignOwnerRRef(torch::distributed::rpc::GloballyUniqueId const&, torch::distributed::rpc::GloballyUniqueId const&, c10::intrusive_ptr<c10::ivalue::Future, c10::detail::intrusive_target_default_null_type<c10::ivalue::Future> >) const + 0x223 (0x7f7963184a03 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)
Aug 18 00:43:26 frame #7: torch::distributed::rpc::RequestCallbackImpl::processScriptRemoteCall(torch::distributed::rpc::RpcCommandBase&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x8e3 (0x7f79858bbb13 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
Aug 18 00:43:26 frame #8: torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const + 0x78d (0x7f7963181b8d in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so)

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (3/3)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed with the same add/add conflicts as build (1/3) above; exited with code exit status 1.


2 failures not recognized by patterns:

  • GitHub Actions: linux-bionic-py3.8-gcc9-coverage / calculate-docker-image (step: Checkout PyTorch)
  • GitHub Actions: Lint / clang-format (step: Run clang-format)

1 job timed out:

  • pytorch_linux_xenial_py3_clang7_asan_test1

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch 2 times, most recently from e151ea3 to cea0cb9 Compare June 22, 2021 19:09
@soulitzer soulitzer marked this pull request as ready for review June 24, 2021 15:32
@soulitzer soulitzer requested a review from albanD as a code owner June 24, 2021 15:32
albanD (Collaborator) left a comment:

Read through the custom function part. Will do extending next.

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from d5f437e to 408a039 Compare June 28, 2021 17:54
^^^^^^^^^^^
In general, implement a custom function if you want to perform computations in your model
that are not differentiable or rely on non-PyTorch libraries (e.g., NumPy), but
still wish for your operation to compose with built-in PyTorch operators and
Collaborator:

Not sure what you mean by compose with built-in operators?

Contributor Author:

something like "chain with other ops", but that's also kind of implied by "work with the autograd engine" so I'm just removing it for now

custom Module. See the section below for more information on extending :mod:`torch.nn`.

If you'd like to alter the gradients during the backward pass or perform a side
effect, consider using function or module hooks.
Collaborator:

capitalization for Module

Collaborator:

Actually maybe link to the hooks here.
Also I think function -> tensor
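
For reference, a hedged sketch of the tensor-hook alternative this comment points to, which alters a gradient in flight without a custom Function:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2

# register_hook runs during the backward pass; returning a tensor
# replaces the gradient flowing into y.
y.register_hook(lambda grad: grad.clamp(max=0.5))

y.sum().backward()
print(x.grad)  # reflects the clamped gradient of y
```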

comparing the value element-wise with the Jacobian computed numerically using
finite-differencing.

.. note::
Collaborator:

Is this note still necessary?

Contributor Author:

Merging it into the section about mark_non_differentiable
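
For context, gradcheck as described in the excerpt is typically invoked like this sketch; double-precision inputs keep the numerical Jacobian accurate:

```python
import torch
from torch.autograd import gradcheck

# Compares analytical gradients against a Jacobian computed by
# finite differences, element-wise, within the given tolerances.
input = torch.randn(4, dtype=torch.double, requires_grad=True)
assert gradcheck(torch.exp, (input,), eps=1e-6, atol=1e-4)
```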

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from beff35b to 116ace3 Compare June 30, 2021 18:48
before the call, and their use will be registered in the graph. Note that this
logic won't traverse lists/dicts/any other data structures and will only
consider :class:`Tensor` s that are direct arguments to the call. You can
consider :class:`Tensor`s that are direct arguments to the call. You can
Contributor Author:

Note to self: check if the space between the s and class is necessary

Contributor Author:

From the rendered doc https://docs-preview.pytorch.org/60312/notes/extending.html?highlight=custom%20function
looks like it is necessary. However, with the space it looks bad as well, see a couple lines above where we have ":class:Function s". So I'm okay with just removing the link in the plural case.

Contributor Author:

Also replacing Tensors -> tensors for consistency (unless we mean to refer to the actual class, in which case it would need to be uppercase ":class:`Tensor`").
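
A minimal sketch of the "direct arguments" rule under discussion: only tensors passed directly to apply are registered in the graph, and backward returns one gradient per forward input, with None for non-tensor inputs (Scale is a hypothetical example):

```python
import torch
from torch.autograd import Function

class Scale(Function):
    @staticmethod
    def forward(ctx, x, factor):
        # x is a direct tensor argument, so autograd tracks it; a tensor
        # hidden inside a list or dict argument would not be tracked.
        ctx.factor = factor
        return x * factor

    @staticmethod
    def backward(ctx, grad_output):
        # One return value per forward input; the non-tensor factor
        # gets None.
        return grad_output * ctx.factor, None

x = torch.randn(2, requires_grad=True)
y = Scale.apply(x, 2.0)
```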

@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from 116ace3 to 72e4b59 Compare July 1, 2021 18:31
albanD (Collaborator) left a comment:

Nice update on the notes!


you should explicitly declare this by decorating backward with the
:func:`~function.once_differentiable`. With this decorator, attempts to
perform double backward through your function will produce an error.
See our double backward tutorial for more information on double backward.
Collaborator:

Is it already up? Can we link it here?

Contributor Author:

Soon :) pytorch/tutorials#1603 should be ready for review!
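
For reference, a hedged sketch of the decorator in use (MyOp is a hypothetical example):

```python
import torch
from torch.autograd import Function
from torch.autograd.function import once_differentiable

class MyOp(Function):
    @staticmethod
    def forward(ctx, x):
        return x.floor()

    @staticmethod
    @once_differentiable  # this backward is itself not differentiable
    def backward(ctx, grad_output):
        # Straight-through-style gradient; attempting double backward
        # through MyOp raises an error instead of silently failing.
        return grad_output
```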

@soulitzer soulitzer requested a review from albanD August 12, 2021 15:59
@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from 79df552 to 01224d2 Compare August 12, 2021 18:30
@soulitzer soulitzer force-pushed the pr/improve-custom-function-docs branch from 01224d2 to 0337521 Compare August 12, 2021 19:20
soulitzer (Contributor Author):

@albanD It has been a while (sorry!), but I have addressed your latest comments in this update, so it should be ready for another look!

Still running into flaky CI issues with the double backward tutorial, so that will need a little longer to land. We can probably land a follow-up PR to this once it does.

The latest build is here: https://15333920-65600975-gh.circle-artifacts.com/0/docs/notes/extending.html

albanD (Collaborator) left a comment:

LGTM

facebook-github-bot (Contributor):
@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):
@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor):
@soulitzer merged this pull request in 2f615f6.

@malfet malfet deleted the pr/improve-custom-function-docs branch September 24, 2021 13:11