Closed
Labels
module: autograd - Related to torch.autograd, and the autograd engine in general
module: cuda - Related to torch.cuda, and CUDA support in general
triaged - This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Description
If a leaf tensor's .grad is None before backward, its AccumulateGrad function may steal a reference to the incoming gradient from whatever backward op produced it (instead of accumulating it into an existing .grad). @mruberry, @ngimel, and I are semi-confident that the AccumulateGrad functions and the autograd engine insert the right leaf stream syncs (such that ops following backward() can safely use stolen .grads immediately), but I should double-check the code and PR a dedicated test. Filing so I don't forget.
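A rough sketch of the scenario such a test might exercise, not the dedicated test itself. The function name `use_grad_right_after_backward` and the tensor sizes are made up for illustration. It assumes the forward pass runs on a side stream (so the backward op producing the gradient also runs there) and that the leaf's .grad is then consumed on the ambient stream with no extra sync after backward(). On a machine without CUDA it falls back to a CPU run that only checks the values.

```python
import torch


def use_grad_right_after_backward(device: str) -> torch.Tensor:
    """Consume a freshly populated leaf .grad right after backward() (illustrative only)."""
    # Leaf with no existing .grad, so AccumulateGrad may keep the incoming
    # gradient by reference instead of accumulating into a buffer.
    leaf = torch.ones(1024, device=device, requires_grad=True)
    assert leaf.grad is None

    if device == "cuda":
        side = torch.cuda.Stream()
        # Make the side stream wait for the leaf's initialization.
        side.wait_stream(torch.cuda.current_stream())
        with torch.cuda.stream(side):
            # Forward on the side stream; the matching backward op
            # (which produces the gradient) will run there as well.
            loss = (leaf * 3.0).sum()
        # Ordinary user-side sync so the ambient stream sees `loss`.
        torch.cuda.current_stream().wait_stream(side)
        loss.backward()
    else:
        (leaf * 3.0).sum().backward()

    # Deliberately no extra synchronization here: this read relies on the
    # engine having synced the leaf's gradient with the ambient stream.
    return leaf.grad * 2.0


device = "cuda" if torch.cuda.is_available() else "cpu"
result = use_grad_right_after_backward(device)
```

If the syncs are in place, every element of `result` should equal 6.0 (a gradient of 3.0 per element, doubled by the consumer op). A single passing run does not prove the absence of a race; a real test would likely need larger workloads or repeated runs to make a missing sync observable.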
cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @lezcano @ngimel