KEMBAR78
Add option to automatically handle unsorted variable-length sequences in RNNs by zou3519 · Pull Request #15225 · pytorch/pytorch · GitHub
Skip to content

Conversation

@zou3519
Copy link
Contributor

@zou3519 zou3519 commented Dec 14, 2018

Fixes #3584.

Motivation: manually sorting sequences, packing them, and then unsorting them
is something a lot of users have complained about doing, especially when we can
offer library support for them.

Overview: we internally sort sequences before packing them and store a list of
unsorted_indices that represent how to unsort the sequences inside
PackedSequence. The packing helper functions return PackedSequence with the
permutation field and the unpacking helper functions use it to unsort.

To implement this, the following changes were made:

  • PackedSequence now keeps sorted_indices and unsorted_indices.
    These two can be thought of as permutations and are inverses of each other.
    sorted_indices is how the sequences were sorted; unsorted_indices is how
    to unsort the sequences.
  • Added an enforce_sorted argument to pack_sequence and pack_padded_sequence
    that maintains the legacy behavior of error-ing out on unsorted-sequences.
    When enforce_sorted=True, these functions maintain their ONNX exportability.
  • pack_sequence(sequences, enforce_sorted) takes in unsorted sequences.
  • pack_padded_sequence can take in a padded tensor that represents padded,
    unsorted sequences.
  • pad_packed_sequence unsorts the PackedSequence such that it is still the
    inverse operation of packed_padded_sequence.
  • RNNs apply sort_indices to their input hidden state and apply
    unsort_indices to their output hidden state. This is to ensure that the
    hidden state batches correspond to the user's ordering of input sequences.

NOT BC-Breaking

  • The default for pack_sequence and pack_padded_sequence is
    enforce_sorted=True to avoid breaking ONNX export. To use the new
    functionality, pass in enforce_sorted=False

Testing Plan

  • Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence,
    and TestNN.test_variable_sequence (RNN test) to check the behavior
    of unsorted sequences, sorted sequences, and sorted sequences with
    enforce_sorted=True
  • test/test_jit.py has a test to see if RNNs are exportable with
    enforce_sorted=True

cc @colesbury

@zou3519 zou3519 added the module: bc-breaking Related to a BC-breaking change label Dec 14, 2018
@zou3519 zou3519 requested a review from colesbury December 14, 2018 18:13
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: don't we always return contiguous tensors from index_select?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, you have a good point, I forgot about that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this line. It looks like permute_hidden returns the result and does not modify its inputs, bu this line doesn't use the output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch -- it should be hx = self.permute_hidden, I will update that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed and added a test for this case.

Copy link
Member

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lgtm, but I'm not super familiar with the rnn code so it might be useful to get someone that knows it better to look over it too.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

… in RNNs

Fixes pytorch#3584.

Motivation: manually sorting sequences, packing them, and then unsorting them
is something a lot of users have complained about doing, especially when we can
offer library support for them.

Overview: we internally sort sequences before packing them and store a list of
`unsorted_indices` that represent how to unsort the sequences inside
PackedSequence. The packing helper functions return PackedSequence with the
`permutation` field and the unpacking helper functions use it to unsort.

To implement this, the following changes were made:
- PackedSequence now keeps `sorted_indices` and `unsorted_indices`.
  These two can be thought of as permutations and are inverses of each other.
  `sorted_indices` is how the sequences were sorted; `unsorted_indices` is how
  to unsort the sequences.
- Added an `enforce_sorted` argument to pack_sequence and pack_padded_sequence
  that maintains the legacy behavior of error-ing out on unsorted-sequences.
  When `enforce_sorted=True`, these functions maintain their ONNX exportability.
- pack_sequence(sequences, enforce_sorted) takes in unsorted sequences.
- pack_padded_sequence can take in a padded tensor that represents padded,
  unsorted sequences.
- pad_packed_sequence unsorts the PackedSequence such that it is still the
  inverse operation of packed_padded_sequence.
- RNNs apply `sort_indices` to their input hidden state and apply
  `unsort_indices` to their output hidden state. This is to ensure that the
  hidden state batches correspond to the user's ordering of input sequences.

NOT BC-Breaking
- The default for pack_sequence and pack_padded_sequence is
  `enforce_sorted=True` to avoid breaking ONNX export. To use the new
  functionality, pass in `enforce_sorted=False`

Testing Plan
- Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence,
  and TestNN.test_variable_sequence (RNN test) to check the behavior
  of unsorted sequences, sorted sequences, and sorted sequences with
  enforce_sorted=True
@zou3519 zou3519 removed the module: bc-breaking Related to a BC-breaking change label Dec 20, 2018
@zou3519 zou3519 changed the title Automatically handle unsorted variable-length sequences in RNNs Add option to automatically handle unsorted variable-length sequences in RNNs Dec 20, 2018
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Dec 21, 2018
… in RNNs (#15225)

Summary:
Fixes #3584.

Motivation: manually sorting sequences, packing them, and then unsorting them
is something a lot of users have complained about doing, especially when we can
offer library support for them.

Overview: we internally sort sequences before packing them and store a list of
`unsorted_indices` that represent how to unsort the sequences inside
PackedSequence. The packing helper functions return PackedSequence with the
`permutation` field and the unpacking helper functions use it to unsort.

To implement this, the following changes were made:
- PackedSequence now keeps `sorted_indices` and `unsorted_indices`.
  These two can be thought of as permutations and are inverses of each other.
  `sorted_indices` is how the sequences were sorted; `unsorted_indices` is how
  to unsort the sequences.
- Added an `enforce_sorted` argument to pack_sequence and pack_padded_sequence
  that maintains the legacy behavior of error-ing out on unsorted-sequences.
  When `enforce_sorted=True`, these functions maintain their ONNX exportability.
- pack_sequence(sequences, enforce_sorted) takes in unsorted sequences.
- pack_padded_sequence can take in a padded tensor that represents padded,
  unsorted sequences.
- pad_packed_sequence unsorts the PackedSequence such that it is still the
  inverse operation of packed_padded_sequence.
- RNNs apply `sort_indices` to their input hidden state and apply
  `unsort_indices` to their output hidden state. This is to ensure that the
  hidden state batches correspond to the user's ordering of input sequences.

NOT BC-Breaking
- The default for pack_sequence and pack_padded_sequence is
  `enforce_sorted=True` to avoid breaking ONNX export. To use the new
  functionality, pass in `enforce_sorted=False`

Testing Plan
- Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence,
  and TestNN.test_variable_sequence (RNN test) to check the behavior
  of unsorted sequences, sorted sequences, and sorted sequences with
  enforce_sorted=True
- test/test_jit.py has a test to see if RNNs are exportable with
  enforce_sorted=True

cc colesbury
Pull Request resolved: #15225

Reviewed By: soumith

Differential Revision: D13507138

Pulled By: zou3519

fbshipit-source-id: b871dccd6abefffca81bc4e3efef1873faa242ef
@zou3519 zou3519 deleted the sort-packing branch December 27, 2018 17:08
@ezyang ezyang added the merged label Jun 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sort sequences internally in pack_padded_sequence

5 participants