TensorFlow 2 version
|
View source on GitHub
|
Configuration for parsing a sparse input feature from an Example.
tf.io.SparseFeature(
index_key, value_key, dtype, size, already_sorted=False
)
Note: prefer VarLenFeature (possibly in combination with a
SequenceExample) over SparseFeature when parsing out SparseTensors,
because VarLenFeature is simpler to configure.
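As a minimal sketch of the simpler alternative mentioned in the note, the snippet below parses a variable-length float feature with VarLenFeature. It assumes TensorFlow 2.x with eager execution; the feature name "val" and the variable names are hypothetical and chosen only for illustration.

```python
import tensorflow as tf

# Assumption: TensorFlow 2.x with eager execution enabled (the default).
# "val" is a hypothetical feature name used only for this sketch.
example = tf.train.Example(features=tf.train.Features(feature={
    "val": tf.train.Feature(float_list=tf.train.FloatList(value=[0.5, -1.0])),
}))

# A VarLenFeature only needs the dtype; no index keys or sizes are declared.
parsed = tf.io.parse_single_example(
    example.SerializeToString(),
    {"val": tf.io.VarLenFeature(tf.float32)},
)
var_len = parsed["val"]  # a tf.SparseTensor holding the two values
```

With VarLenFeature the positions are implied by the order of the values, so there is no way to place a value at an arbitrary index; that is the case SparseFeature covers.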
Closely mimicking the SparseTensor that will be obtained by parsing an
Example with a SparseFeature config, a SparseFeature contains a
* value_key: The name of the key for a Feature in the Example whose parsed Tensor will be the resulting SparseTensor.values.
* index_key: A list of names, one for each dimension in the resulting SparseTensor, whose indices[i][dim], indicating the position of the i-th value in the dim dimension, will be equal to the i-th value in the Feature with key named index_key[dim] in the Example.
* size: A list of ints for the resulting SparseTensor.dense_shape.
For example, we can represent the following 2D SparseTensor
SparseTensor(indices=[[3, 1], [20, 0]],
values=[0.5, -1.0],
dense_shape=[100, 3])
with an Example input proto
features {
feature { key: "val" value { float_list { value: [ 0.5, -1.0 ] } } }
feature { key: "ix0" value { int64_list { value: [ 3, 20 ] } } }
feature { key: "ix1" value { int64_list { value: [ 1, 0 ] } } }
}
and SparseFeature config with 2 index_keys
SparseFeature(index_key=["ix0", "ix1"],
value_key="val",
dtype=tf.float32,
size=[100, 3])
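The example above can be sketched end to end as follows. This assumes TensorFlow 2.x with eager execution; the output key "sparse" and the variable names are hypothetical, while the feature keys, values, and config are taken from the example.

```python
import tensorflow as tf

# Assumption: TensorFlow 2.x with eager execution enabled (the default).
# Build the Example proto shown above: one value feature, two index features.
example = tf.train.Example(features=tf.train.Features(feature={
    "val": tf.train.Feature(float_list=tf.train.FloatList(value=[0.5, -1.0])),
    "ix0": tf.train.Feature(int64_list=tf.train.Int64List(value=[3, 20])),
    "ix1": tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 0])),
}))

# "sparse" is a hypothetical name for the parsed output entry.
feature_spec = {
    "sparse": tf.io.SparseFeature(
        index_key=["ix0", "ix1"],
        value_key="val",
        dtype=tf.float32,
        size=[100, 3],
    ),
}

parsed = tf.io.parse_single_example(example.SerializeToString(), feature_spec)
sparse = parsed["sparse"]  # a tf.SparseTensor with dense_shape [100, 3]

# Densify only to inspect the result; large shapes are better kept sparse.
dense = tf.sparse.to_dense(sparse)
```

Here `sparse.indices` pairs the i-th entry of "ix0" with the i-th entry of "ix1", so the value 0.5 lands at position [3, 1] and -1.0 at [20, 0].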
Fields:
* index_key: A single string name or a list of string names of index features. For each key, the underlying feature's type must be int64 and its length must always match that of the value_key feature. To represent SparseTensors with a dense_shape of rank higher than 1, a list of length rank should be used.
* value_key: Name of the value feature. The underlying feature's type must be dtype and its length must always match that of all the index_keys' features.
* dtype: Data type of the value_key feature.
* size: A Python int or a list of ints specifying the dense shape. Should be a list if and only if index_key is a list, in which case its length must equal the length of index_key. For each entry i, all values in the index_key[i] feature must be in [0, size[i]).
* already_sorted: A Python boolean specifying whether the values in value_key are already sorted by their index position. If so, sorting is skipped. Optional; False by default.
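The rank-1 case described in the fields above, with a single string index_key and an int size, might look like the sketch below. It assumes TensorFlow 2.x with eager execution; the feature names "ix" and "val", the output key "sparse_1d", and the data are all hypothetical. The indices are deliberately stored out of order to show the effect of leaving already_sorted at its default of False.

```python
import tensorflow as tf

# Assumption: TensorFlow 2.x with eager execution enabled (the default).
# Hypothetical 1-D data: value 1.0 at index 5 and value 2.0 at index 2,
# stored out of index order in the Example.
example_1d = tf.train.Example(features=tf.train.Features(feature={
    "val": tf.train.Feature(float_list=tf.train.FloatList(value=[1.0, 2.0])),
    "ix": tf.train.Feature(int64_list=tf.train.Int64List(value=[5, 2])),
}))

spec_1d = {
    "sparse_1d": tf.io.SparseFeature(
        index_key="ix",        # single string, so size is a single int
        value_key="val",
        dtype=tf.float32,
        size=10,               # every index must lie in [0, 10)
        already_sorted=False,  # default: entries are reordered by index
    ),
}

parsed_1d = tf.io.parse_single_example(example_1d.SerializeToString(), spec_1d)
sparse_1d = parsed_1d["sparse_1d"]
```

Because the entries are not declared as already sorted, the parsed SparseTensor lists index 2 before index 5, with the values reordered to match.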
| Attributes | |
|---|---|
| index_key | |
| value_key | |
| dtype | |
| size | |
| already_sorted | |