Broadcast output requires_grad only if corresponding input requires_grad by ssnl · Pull Request #5061 · pytorch/pytorch · GitHub

Conversation

@ssnl ssnl commented Feb 5, 2018

This avoids unnecessary computation when fine-tuning only certain layers with DataParallel.

Fixes #5041.
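
For context, a minimal sketch of the fine-tuning pattern this change targets (the model, layer sizes, and device setup below are illustrative assumptions, not from this PR): parameters frozen with requires_grad_(False) are broadcast to each replica, and with this change their replicas are marked non-differentiable, so the backward pass skips gradient work for them.

import torch
import torch.nn as nn

# Hypothetical model: only the final layer is fine-tuned.
model = nn.Sequential(
    nn.Linear(128, 64),  # frozen backbone layer
    nn.ReLU(),
    nn.Linear(64, 10),   # trainable head
)
for p in model[0].parameters():
    p.requires_grad_(False)

# Requires at least one CUDA device.
model = nn.DataParallel(model.cuda())

x = torch.randn(32, 128, device="cuda")
model(x).sum().backward()

# Only the head accumulates gradients; the frozen layer's broadcast
# replicas no longer require grad, so no backward work is wasted on them.
print(model.module[0].weight.grad)              # None
print(model.module[2].weight.grad is not None)  # True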

@ssnl ssnl changed the title Broadcast output requires_grad if only corresponding input requires_grad Broadcast output requires_grad only if corresponding input requires_grad Feb 5, 2018
@soumith soumith merged commit 8056399 into pytorch:master Feb 6, 2018
@soumith soumith added the 0.3.1 label Feb 6, 2018
@ssnl ssnl deleted the broadacast_requiers_grad branch February 6, 2018 05:07
.. warning::
Forward and backward hooks defined on :attr:`module` and its submodules
won't be invoked anymore, unless the hooks are initialized in the
:meth:`forward` method.
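
The workaround the warning refers to can be sketched as follows (the Net module and print hook are hypothetical, for illustration only): registering the hook inside forward() means each replica created during data-parallel execution installs its own hook when it runs.

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        # Hooks registered in __init__ are not carried over to replicas,
        # so register (and clean up) the hook here instead.
        handle = self.fc.register_forward_hook(
            lambda module, inputs, output: print(output.shape))
        out = self.fc(x)
        handle.remove()
        return out

net = Net()
net(torch.randn(2, 8))  # the hook prints torch.Size([2, 4])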

non_differentiables = []
for idx, input_requires_grad in enumerate(ctx.input_requires_grad):
    if not input_requires_grad:
        # Mark every replica of a non-grad input as non-differentiable.
        for output in outputs:
            non_differentiables.append(output[idx])
ctx.mark_non_differentiable(*non_differentiables)
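
The effect of ctx.mark_non_differentiable can be seen in isolation with a toy custom Function (this PassThrough example is illustrative only, not the PR's Broadcast): an output that corresponds to an input without requires_grad comes back with requires_grad=False, so autograd builds no graph for it.

import torch
from torch.autograd import Function

class PassThrough(Function):
    @staticmethod
    def forward(ctx, a, b):
        out_a, out_b = a.clone(), b.clone()
        if not b.requires_grad:
            # Same idea as the PR: mark the output whose input does not
            # require grad, so autograd never tracks it.
            ctx.mark_non_differentiable(out_b)
        return out_a, out_b

    @staticmethod
    def backward(ctx, grad_a, grad_b):
        # b never needs a gradient in this sketch.
        return grad_a, None

a = torch.randn(3, requires_grad=True)
b = torch.randn(3)  # stands in for a frozen parameter
out_a, out_b = PassThrough.apply(a, b)
print(out_a.requires_grad, out_b.requires_grad)  # True False
out_a.sum().backward()
print(a.grad)  # tensor of ones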

Development

Successfully merging this pull request may close these issues.

nn.DataParallel ignores requires_grad setting when running
