RNN Transducer Loss Autograd Test #1532
Conversation
Investigation with autograd test from carolineechen#2.
Force-pushed 65eb141 to 69029b1.
Patch: Before: After:

Patch with same result as comment above.
FIX: The fix for the numpy transducer is to not use in-place operations, as shown in 8129432:

    def backward(ctx, grad_output):
        grad_output = grad_output.view(-1, 1, 1, 1).to(ctx.grads)
        return ctx.grads.mul(grad_output), None, None, None, None, None, None, None, None

BUG: warp-transducer and warp-rnnt modify the gradient in place, which can make the backward pass non-reentrant: calling backward on the loss multiple times without calling forward again can return different gradients. The torchaudio C++ custom autograd function is fine thanks to #1507, see here.

BUG: the numpy transducer issue is different: the Jacobian-gradient product is never formed and only the stored gradient is returned (i.e. grad_output is ignored).
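To make the reentrancy point concrete, here is a minimal, self-contained sketch (a toy autograd Function, not the actual transducer bindings) contrasting the in-place pattern with the out-of-place fix; the in-place version returns a different gradient on the second backward call:

```python
import torch

class InplaceBackward(torch.autograd.Function):
    """Buggy pattern: backward scales the cached gradient in place, so a
    second backward call starts from an already-scaled ctx.grads."""

    @staticmethod
    def forward(ctx, x):
        ctx.grads = 2.0 * torch.ones_like(x)   # stand-in for precomputed RNNT gradients
        return (2.0 * x).sum()

    @staticmethod
    def backward(ctx, grad_output):
        ctx.grads.mul_(grad_output)            # in place: mutates the cached state
        return ctx.grads

class OutOfPlaceBackward(torch.autograd.Function):
    """Fixed pattern: the Jacobian-gradient product is formed out of place."""

    @staticmethod
    def forward(ctx, x):
        ctx.grads = 2.0 * torch.ones_like(x)
        return (2.0 * x).sum()

    @staticmethod
    def backward(ctx, grad_output):
        return ctx.grads.mul(grad_output)      # fresh tensor on every call

x = torch.randn(3, requires_grad=True)
v = torch.tensor(2.0)                          # non-trivial incoming gradient
for fn in (InplaceBackward, OutOfPlaceBackward):
    y = fn.apply(x)
    # snapshot the first gradient, since the buggy backward returns its internal buffer
    g1 = torch.autograd.grad(y, x, grad_outputs=v, retain_graph=True)[0].clone()
    g2 = torch.autograd.grad(y, x, grad_outputs=v)[0]
    print(fn.__name__, "reentrant:", torch.allclose(g1, g2))
# Expected: InplaceBackward is not reentrant (the second call returns twice the
# first result); OutOfPlaceBackward is.
```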
Trying to reproduce gradcheck.

gradcheck is passing. The option …
Great find! See my comment about fused_log_softmax vs. reuse_logits_for_grads set to False.
Force-pushed 80534ac to 619fd57.
Failing gradcheck means that this test will fail.
    (get_numpy_data_B2_T4_U3_D3, ),
    (get_numpy_data_B1_T2_U3_D5, ),
])
def test_RNNTLoss_gradcheck(self, data_func):
great find, and thanks for looking into this! can you add autograd tests for the functional version as well?
good point, added :)
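For illustration, such a functional-version gradcheck test might look like the following sketch. It is an assumed example, not the exact code added in this PR: the import path and keyword arguments of the loss may differ in this torchaudio version (the loss lived in a prototype namespace at the time), and the data helper is inlined here.

```python
import torch
from torch.autograd import gradcheck

# Import path assumed; in the version under review the loss may live under a
# prototype namespace rather than torchaudio.functional.
from torchaudio.functional import rnnt_loss

def test_rnnt_loss_functional_gradcheck():
    # Tiny problem matching the B1_T2_U3_D5 helper: batch=1, T=2, U=3, D=5.
    torch.manual_seed(0)
    logits = torch.rand(1, 2, 3, 5, dtype=torch.float32, requires_grad=True)
    targets = torch.randint(0, 4, (1, 2), dtype=torch.int32)
    logit_lengths = torch.tensor([2], dtype=torch.int32)
    target_lengths = torch.tensor([2], dtype=torch.int32)

    def loss(logits):
        # Only the logits are perturbed by gradcheck; the integer tensors are captured.
        return rnnt_loss(logits, targets, logit_lengths, target_lengths)

    # float32 kernel, hence the loose eps/atol (see the tolerance discussion below).
    assert gradcheck(loss, (logits,), eps=1e-3, atol=1e-3, rtol=1e-3, nondet_tol=0.)
```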
Left a minor comment but otherwise lgtm, thanks for working on this!
        if enable_all_grad:
            i.requires_grad = True
        inputs_.append(i)
    assert gradcheck(loss, inputs, eps=1e-03, atol=1e-03, rtol=1e-03, nondet_tol=0.)
can you check if atol can be reduced any further?
This needs to be documented in the test code. This looks much lower precision than other autograd tests we have in torchaudio.
can you check if atol can be reduced any further?
unfortunately, that's the lowest for atol and eps as expected for float32. rtol can be reduced to 0, but this seems too good, so let's use the default value (0.001).
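An illustrative aside (not from the PR itself): gradcheck estimates the Jacobian with central differences, and in float32 the round-off in f(x ± eps) limits how small eps, and therefore atol, can usefully be. A toy experiment on a function with a known derivative shows the effect:

```python
import math
import torch

def central_diff_error(dtype, eps):
    # Central difference of exp(x) at x = 1.0; the exact derivative is e.
    x = torch.tensor(1.0, dtype=dtype)
    h = torch.tensor(eps, dtype=dtype)
    est = (torch.exp(x + h) - torch.exp(x - h)) / (2 * h)
    return abs(est.item() - math.e)

for dtype in (torch.float32, torch.float64):
    errors = {eps: central_diff_error(dtype, eps) for eps in (1e-2, 1e-3, 1e-4, 1e-5)}
    print(dtype, errors)

# Typical outcome: in float32 the error stops improving somewhere around
# eps ~ 1e-3..1e-2 and grows again for smaller steps (round-off dominates),
# while in float64 it keeps shrinking. Hence eps = atol = 1e-3 is roughly the
# practical floor when the kernel only supports float32.
```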
This needs to be documented in the test code. This looks much lower precision than other autograd tests we have in torchaudio.
As commented here, this is due to the use of float32. Added a comment above the line.
This PR adds autograd (gradcheck) tests for the RNN transducer loss, run with reuse_logits_for_grads=False and float32. Follow-up noted below:

Keep reuse_logits_for_grads=True for now; see "remove reuse logits for grads from python interface" #1536. This is used for performance but requires mark_dirty, see "Autograd Follow-Up" vincentqb/audio#10.
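For context on the mark_dirty requirement mentioned above, a minimal toy sketch (assumed, not the torchaudio binding): when a custom autograd Function mutates one of its inputs in forward, as reusing the logits buffer for gradients would, the mutated tensor has to be registered with ctx.mark_dirty so autograd tracks the in-place change.

```python
import torch

class ScaleInPlace(torch.autograd.Function):
    """Toy Function that overwrites its input buffer in forward,
    roughly the way reusing the logits buffer for gradients would."""

    @staticmethod
    def forward(ctx, x):
        x.mul_(2.0)          # the input tensor is modified in place
        ctx.mark_dirty(x)    # required whenever forward mutates an input in place
        return x

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output * 2.0

x = torch.randn(4, requires_grad=True)
# Apply to a clone: a leaf that requires grad may not be mutated in place.
y = ScaleInPlace.apply(x.clone())
y.sum().backward()
print(x.grad)  # all twos, consistent with y = 2 * x
```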