-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Add option to automatically handle unsorted variable-length sequences in RNNs #15225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
torch/nn/modules/rnn.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: don't we always return contiguous tensors from index_select?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, you have a good point, I forgot about that
torch/nn/modules/rnn.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by this line. It looks like permute_hidden returns the result and does not modify its inputs, bu this line doesn't use the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch -- it should be hx = self.permute_hidden, I will update that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed and added a test for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lgtm, but I'm not super familiar with the rnn code so it might be useful to get someone that knows it better to look over it too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
… in RNNs Fixes pytorch#3584. Motivation: manually sorting sequences, packing them, and then unsorting them is something a lot of users have complained about doing, especially when we can offer library support for them. Overview: we internally sort sequences before packing them and store a list of `unsorted_indices` that represent how to unsort the sequences inside PackedSequence. The packing helper functions return PackedSequence with the `permutation` field and the unpacking helper functions use it to unsort. To implement this, the following changes were made: - PackedSequence now keeps `sorted_indices` and `unsorted_indices`. These two can be thought of as permutations and are inverses of each other. `sorted_indices` is how the sequences were sorted; `unsorted_indices` is how to unsort the sequences. - Added an `enforce_sorted` argument to pack_sequence and pack_padded_sequence that maintains the legacy behavior of error-ing out on unsorted-sequences. When `enforce_sorted=True`, these functions maintain their ONNX exportability. - pack_sequence(sequences, enforce_sorted) takes in unsorted sequences. - pack_padded_sequence can take in a padded tensor that represents padded, unsorted sequences. - pad_packed_sequence unsorts the PackedSequence such that it is still the inverse operation of packed_padded_sequence. - RNNs apply `sort_indices` to their input hidden state and apply `unsort_indices` to their output hidden state. This is to ensure that the hidden state batches correspond to the user's ordering of input sequences. NOT BC-Breaking - The default for pack_sequence and pack_padded_sequence is `enforce_sorted=True` to avoid breaking ONNX export. To use the new functionality, pass in `enforce_sorted=False` Testing Plan - Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence, and TestNN.test_variable_sequence (RNN test) to check the behavior of unsorted sequences, sorted sequences, and sorted sequences with enforce_sorted=True
08876a6 to
ee98e08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
ee98e08 to
ff52ed4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
… in RNNs (#15225) Summary: Fixes #3584. Motivation: manually sorting sequences, packing them, and then unsorting them is something a lot of users have complained about doing, especially when we can offer library support for them. Overview: we internally sort sequences before packing them and store a list of `unsorted_indices` that represent how to unsort the sequences inside PackedSequence. The packing helper functions return PackedSequence with the `permutation` field and the unpacking helper functions use it to unsort. To implement this, the following changes were made: - PackedSequence now keeps `sorted_indices` and `unsorted_indices`. These two can be thought of as permutations and are inverses of each other. `sorted_indices` is how the sequences were sorted; `unsorted_indices` is how to unsort the sequences. - Added an `enforce_sorted` argument to pack_sequence and pack_padded_sequence that maintains the legacy behavior of error-ing out on unsorted-sequences. When `enforce_sorted=True`, these functions maintain their ONNX exportability. - pack_sequence(sequences, enforce_sorted) takes in unsorted sequences. - pack_padded_sequence can take in a padded tensor that represents padded, unsorted sequences. - pad_packed_sequence unsorts the PackedSequence such that it is still the inverse operation of packed_padded_sequence. - RNNs apply `sort_indices` to their input hidden state and apply `unsort_indices` to their output hidden state. This is to ensure that the hidden state batches correspond to the user's ordering of input sequences. NOT BC-Breaking - The default for pack_sequence and pack_padded_sequence is `enforce_sorted=True` to avoid breaking ONNX export. To use the new functionality, pass in `enforce_sorted=False` Testing Plan - Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence, and TestNN.test_variable_sequence (RNN test) to check the behavior of unsorted sequences, sorted sequences, and sorted sequences with enforce_sorted=True - test/test_jit.py has a test to see if RNNs are exportable with enforce_sorted=True cc colesbury Pull Request resolved: #15225 Reviewed By: soumith Differential Revision: D13507138 Pulled By: zou3519 fbshipit-source-id: b871dccd6abefffca81bc4e3efef1873faa242ef
Fixes #3584.
Motivation: manually sorting sequences, packing them, and then unsorting them
is something a lot of users have complained about doing, especially when we can
offer library support for them.
Overview: we internally sort sequences before packing them and store a list of
unsorted_indicesthat represent how to unsort the sequences insidePackedSequence. The packing helper functions return PackedSequence with the
permutationfield and the unpacking helper functions use it to unsort.To implement this, the following changes were made:
sorted_indicesandunsorted_indices.These two can be thought of as permutations and are inverses of each other.
sorted_indicesis how the sequences were sorted;unsorted_indicesis howto unsort the sequences.
enforce_sortedargument to pack_sequence and pack_padded_sequencethat maintains the legacy behavior of error-ing out on unsorted-sequences.
When
enforce_sorted=True, these functions maintain their ONNX exportability.unsorted sequences.
inverse operation of packed_padded_sequence.
sort_indicesto their input hidden state and applyunsort_indicesto their output hidden state. This is to ensure that thehidden state batches correspond to the user's ordering of input sequences.
NOT BC-Breaking
enforce_sorted=Trueto avoid breaking ONNX export. To use the newfunctionality, pass in
enforce_sorted=FalseTesting Plan
and TestNN.test_variable_sequence (RNN test) to check the behavior
of unsorted sequences, sorted sequences, and sorted sequences with
enforce_sorted=True
enforce_sorted=True
cc @colesbury