Abstract base class for converting between text and integers.
A note on padding:
Because text data is typically variable length and nearly always requires
padding during training, ID 0 is always reserved for padding. To accommodate
this, all TextEncoders behave in certain ways:
- encode: never returns ID 0 (all IDs are 1+)
- decode: drops 0 in the input IDs
- vocab_size: includes ID 0

New subclasses should be careful to match this behavior.
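The padding convention above can be illustrated with a small sketch. The `ToyCharEncoder` class below is hypothetical and deliberately does not subclass the real abstract class, so that it runs without the library installed. It only shows one way the three rules (no ID 0 from `encode`, 0 dropped by `decode`, 0 counted in `vocab_size`) might be honored.

```python
class ToyCharEncoder:
    """Hypothetical character-level encoder illustrating the ID 0 convention."""

    def __init__(self, alphabet):
        # Sort and deduplicate so that IDs are deterministic.
        self._chars = sorted(set(alphabet))
        # IDs start at 1 because 0 is reserved for padding.
        self._char_to_id = {c: i + 1 for i, c in enumerate(self._chars)}

    @property
    def vocab_size(self):
        # The +1 accounts for the reserved padding ID 0.
        return len(self._chars) + 1

    def encode(self, s):
        # Never returns 0: every known character maps to an ID >= 1.
        # Unknown characters raise KeyError in this toy version.
        return [self._char_to_id[c] for c in s]

    def decode(self, ids):
        # Drop padding (ID 0) before mapping IDs back to characters.
        return "".join(self._chars[i - 1] for i in ids if i != 0)


encoder = ToyCharEncoder("abc")
ids = encoder.encode("cab")
padded = ids + [0, 0]  # pad the sequence to length 5
assert encoder.decode(padded) == "cab"
assert all(1 <= i < encoder.vocab_size for i in ids)
```

Because `decode` discards zeros, a padded batch row decodes to the same text as the unpadded sequence.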
| Attributes | |
|---|---|
| vocab_size | Size of the vocabulary, including the reserved padding ID 0. Encode produces ints in [1, vocab_size). |
Methods
decode
@abc.abstractmethod
decode(ids)
Decodes a list of integers into text.
encode
@abc.abstractmethod
encode(s)
Encodes text into a list of integers.
load_from_file
@classmethod
@abc.abstractmethod
load_from_file(filename_prefix)
Load from file. Inverse of save_to_file.
save_to_file
@abc.abstractmethod
save_to_file(filename_prefix)
Store to file. Inverse of load_from_file.
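As a rough illustration of the save/load round trip described above, the following sketch uses a hypothetical `ToyVocabEncoder` that stores its vocabulary as JSON. The class name, the `.toy_vocab.json` suffix, and the file layout are assumptions made for this example and are not the library's actual on-disk format.

```python
import json
import os
import tempfile


class ToyVocabEncoder:
    """Hypothetical word-level encoder with save/load; ID 0 is padding."""

    _SUFFIX = ".toy_vocab.json"  # assumed suffix, not the library's format

    def __init__(self, vocab_list):
        self._tokens = list(vocab_list)
        # IDs start at 1 so that 0 stays reserved for padding.
        self._token_to_id = {t: i + 1 for i, t in enumerate(self._tokens)}

    def encode(self, s):
        # Whitespace tokenization; unknown tokens raise KeyError here.
        return [self._token_to_id[t] for t in s.split()]

    def decode(self, ids):
        # Skip padding IDs, then join tokens with single spaces.
        return " ".join(self._tokens[i - 1] for i in ids if i != 0)

    def save_to_file(self, filename_prefix):
        # Write everything needed to rebuild the encoder.
        with open(filename_prefix + self._SUFFIX, "w", encoding="utf-8") as f:
            json.dump({"tokens": self._tokens}, f)

    @classmethod
    def load_from_file(cls, filename_prefix):
        # Inverse of save_to_file: rebuild the encoder from the stored tokens.
        with open(filename_prefix + cls._SUFFIX, "r", encoding="utf-8") as f:
            return cls(json.load(f)["tokens"])


with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "encoder")
    original = ToyVocabEncoder(["hello", "world"])
    original.save_to_file(prefix)
    restored = ToyVocabEncoder.load_from_file(prefix)
    assert restored.encode("world hello") == original.encode("world hello")
```

Taking a prefix rather than a full path leaves each subclass free to pick its own suffix, which is the design choice this sketch assumes.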