official/vision/docs/read_custom_datasets.md
TFRecord, a simple
format for storing a sequence of binary records, is the default and recommended
data format supported by TensorFlow Model Garden (TMG) for performance reasons.
The
tf.train.Example
message (or protobuf) is a flexible message type that represents a {"string": value} mapping. It is designed for use with TensorFlow and is used throughout
the higher-level APIs such as TFX.
If your dataset is already encoded as tf.train.Example and in TFRecord format,
please check the various
dataloaders
we have created to handle standard input formats for classification, detection
and segmentation. If the dataset is not in the recommended format, or not in a
standard structure that can be handled by the provided
dataloaders,
we have outlined the steps in the following sections to:
- Encode the data using the
tf.train.Example
message, and then serialize, write, and read
tf.train.Example
messages to and from .tfrecord files.
- Customize the dataloader to read, decode, and parse the input data.
The primary reason for converting a dataset into TFRecord format in TensorFlow is to improve input data reading performance during training. Reading data from disk or over a network can be a bottleneck in the training process, and using the TFRecord format can help to streamline this process and improve overall training speed.
The TFRecord format is a binary format that stores data in a compressed, serialized format. This makes it more efficient for reading, as the data can be read quickly and without the need for decompression or deserialization.
Additionally, the TFRecord format is designed to be scalable and efficient for large datasets. It can be split into multiple files and read from multiple threads in parallel, improving overall input pipeline performance.
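A minimal sketch of this sharded write-and-parallel-read pattern (the file names and the single feature key are invented for the illustration):

```python
import os
import tempfile

import tensorflow as tf

# Write a toy dataset into two shard files (paths are illustrative).
tmp_dir = tempfile.mkdtemp()
num_shards = 2
paths = [
    os.path.join(tmp_dir, 'data-%05d-of-%05d.tfrecord' % (i, num_shards))
    for i in range(num_shards)
]
writers = [tf.io.TFRecordWriter(p) for p in paths]
for idx in range(10):
  example = tf.train.Example(features=tf.train.Features(feature={
      'id': tf.train.Feature(int64_list=tf.train.Int64List(value=[idx])),
  }))
  # Round-robin the records across the shard files.
  writers[idx % num_shards].write(example.SerializeToString())
for w in writers:
  w.close()

# Read all shards back; num_parallel_reads interleaves reads across files.
dataset = tf.data.TFRecordDataset(paths, num_parallel_reads=num_shards)
num_records = sum(1 for _ in dataset)
```

Because each shard is an independent file, the reading side can scale the number of parallel readers with the number of shards.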
To convert a dataset into TFRecord format in TensorFlow, you need to
first convert the data to TensorFlow's Feature format;
then create a feature message using tf.train.Example;
and lastly serialize the tf.train.Example message into a TFRecord file using tf.io.TFRecordWriter. The tf.train.Example message is the protobuf that holds the data.
More concretely:
1. Convert your data to TensorFlow's Feature format using tf.train.Feature:
A tf.train.Feature is a protocol message holding a list of values that can be
serialized to the TFRecord format. The tf.train.Feature message type can accept
one of the following three types: tf.train.BytesList, tf.train.FloatList, and
tf.train.Int64List.
Based on the type of the values in the dataset, the user must first convert them
into one of the above types. Below are simple helper functions that perform the
conversion and return a tf.train.Feature object. Refer to the helper builder
class
here.
tf.train.Int64List: This type represents a list of 64-bit integer values. Below is an example of how to put int data into an Int64List.

```python
def add_ints_feature(self, key: str,
                     value: Union[int, Sequence[int]]) -> TfExampleBuilder:
  ....
  return self.add_feature(key, tf.train.Feature(
      int64_list=tf.train.Int64List(value=_to_array(value))))
```
tf.train.BytesList: This type represents a list of byte strings, which can store arbitrary data as strings of bytes.

```python
def add_bytes_feature(self, key: str,
                      value: BytesValueType) -> TfExampleBuilder:
  ....
  return self.add_feature(key, tf.train.Feature(
      bytes_list=tf.train.BytesList(value=_to_bytes_array(value))))
```
tf.train.FloatList: This type represents a list of floating-point values. Below is a conversion example.

```python
def add_floats_feature(self, key: str,
                       value: Union[float, Sequence[float]]) -> TfExampleBuilder:
  ....
  return self.add_feature(key, tf.train.Feature(
      float_list=tf.train.FloatList(value=_to_array(value))))
```
Note: The exact steps for converting your data to TensorFlow's Feature format will depend on the structure of your data. You may need to create multiple Feature objects for each record, depending on the number of features in your data.
2. Map the features using tf.train.Example:
Fundamentally, a tf.train.Example is a {"string": tf.train.Feature}
mapping. With the tf.train.Feature values created above, we can now map them
into a tf.train.Example. The keys-to-features mapping of a tf.train.Example
varies based on the use case.
For example:
```python
feature = {
    'feature0': _int64_feature(feature0),
    'feature1': _int64_feature(feature1),
    'feature2': _bytes_feature(feature2),
    'feature3': _float_feature(feature3),
}
tf.train.Example(features=tf.train.Features(feature=feature))
```
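To make the mapping concrete, the snippet below (feature names and values are invented for the illustration) builds a tf.train.Example, reads a value back from the proto, and shows that the whole message serializes to a plain byte string:

```python
import tensorflow as tf

feature = {
    'feature0': tf.train.Feature(int64_list=tf.train.Int64List(value=[7])),
    'feature2': tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[b'hello'])),
    'feature3': tf.train.Feature(float_list=tf.train.FloatList(value=[1.5])),
}
example = tf.train.Example(features=tf.train.Features(feature=feature))

# Fields can be read straight back from the proto...
label = example.features.feature['feature0'].int64_list.value[0]

# ...and the whole message serializes to bytes, ready for a TFRecord writer.
serialized = example.SerializeToString()
```

The serialized bytes are exactly what gets written to a .tfrecord file in the next step.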
Sample usage of the helper builder class:

```python
>>> example_builder = TfExampleBuilder()
>>> example = (
...     example_builder.add_bytes_feature('feature_a', 'foobarbaz')
...     .add_ints_feature('feature_b', [1, 2, 3])
...     .example)
```
3. Serialize the data:
To serialize the tf.train.Example message into a TFRecord file, use the
TensorFlow APIs tf.io.TFRecordWriter and SerializeToString() to serialize
the data. Here is some code to iterate over annotations, process them, and write
them into TFRecords. Refer to the
code
here.
```python
def write_tf_record_dataset(output_path, tf_example_iterator, num_shards):
  writers = [
      tf.io.TFRecordWriter(
          output_path + '-%05d-of-%05d.tfrecord' % (i, num_shards))
      for i in range(num_shards)
  ]
  ....
  for idx, record in enumerate(tf_example_iterator):
    if idx % LOG_EVERY == 0:
      ....  # periodically log progress
    tf_example = process_features(record)
    writers[idx % num_shards].write(tf_example.SerializeToString())
```
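A runnable sketch of the same sharding scheme, with hypothetical stand-ins for LOG_EVERY and process_features, writing ten records round-robin across two shards:

```python
import os
import tempfile

import tensorflow as tf

LOG_EVERY = 100  # hypothetical; the real value lives in the conversion script.

def process_features(record):
  # Stand-in for real per-record processing: here each record is already
  # a dict of tf.train.Feature objects.
  return tf.train.Example(features=tf.train.Features(feature=record))

def write_tf_record_dataset(output_path, tf_example_iterator, num_shards):
  writers = [
      tf.io.TFRecordWriter(
          output_path + '-%05d-of-%05d.tfrecord' % (i, num_shards))
      for i in range(num_shards)
  ]
  for idx, record in enumerate(tf_example_iterator):
    if idx % LOG_EVERY == 0:
      print('On record %d' % idx)
    tf_example = process_features(record)
    writers[idx % num_shards].write(tf_example.SerializeToString())
  for writer in writers:
    writer.close()

output_path = os.path.join(tempfile.mkdtemp(), 'train')
records = [
    {'id': tf.train.Feature(int64_list=tf.train.Int64List(value=[i]))}
    for i in range(10)
]
write_tf_record_dataset(output_path, iter(records), num_shards=2)
shard_files = sorted(os.listdir(os.path.dirname(output_path)))
```

The `'-%05d-of-%05d.tfrecord'` pattern yields the conventional shard names, e.g. `train-00000-of-00002.tfrecord`.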
Here is an example of how to create a TFRecord file in TensorFlow. In this example, we convert the raw COCO dataset to TFRecord format. The resulting TFRecord files can then be used to train the model.
With a customized dataset in TFRecord format, a customized Decoder is typically needed. The decoder decodes a TF Example record and returns a dictionary of decoded tensors. Below are the essential steps to customize a decoder.
To create a custom data loader for a new dataset, users need to follow the steps below:
Create a class CustomizeDecoder(decoder.Decoder). The CustomizeDecoder class should be a subclass of the generic decoder interface and must implement all the abstract methods. In particular, it must implement the abstract method decode, which decodes the serialized example into tensors.
The constructor defines the mapping between the field names and the values from an input tf.Example. There is no limit on the number of fields to decode; it depends on the use case.
Below are the tf.Example decoders for the classification and object detection tasks. We define two fields (image bytes and label) for the classification task, and ten fields for object detection.
```python
class Decoder(decoder.Decoder):

  def __init__(self):
    self._keys_to_features = {
        'image/encoded':
            tf.io.FixedLenFeature((), tf.string, default_value=''),
        'image/class/label':
            tf.io.FixedLenFeature((), tf.int64, default_value=-1)
    }
    ....
```
Sample constructor for Object Detection:
```python
class Decoder(decoder.Decoder):

  def __init__(self):
    self._keys_to_features = {
        'image/encoded': tf.io.FixedLenFeature((), tf.string),
        'image/height': tf.io.FixedLenFeature((), tf.int64, -1),
        'image/width': tf.io.FixedLenFeature((), tf.int64, -1),
        'image/object/bbox/xmin': tf.io.VarLenFeature(tf.float32),
        'image/object/bbox/xmax': tf.io.VarLenFeature(tf.float32),
        'image/object/bbox/ymin': tf.io.VarLenFeature(tf.float32),
        'image/object/bbox/ymax': tf.io.VarLenFeature(tf.float32),
        'image/object/class/label': tf.io.VarLenFeature(tf.int64),
        'image/object/area': tf.io.VarLenFeature(tf.float32),
        'image/object/is_crowd': tf.io.VarLenFeature(tf.int64),
    }
    ....
```
The decode() method decodes the serialized example into tensors. It takes a
serialized string tensor that encodes the data, and returns the decoded
tensors, i.e., a dictionary mapping field keys to decoded tensors. The output
will be consumed by methods in the Parser.
```python
class Decoder(decoder.Decoder):

  def __init__(self):
    ....

  def decode(self,
             serialized_example: tf.train.Example) -> Mapping[str, tf.Tensor]:
    return tf.io.parse_single_example(
        serialized_example, self._keys_to_features)
```
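To see the decode path end to end (the feature values here are invented), the sketch below serializes a minimal classification example and parses it back with the same keys-to-features mapping. Note that the VarLenFeature fields of the detection decoder come back as tf.sparse.SparseTensor values and typically need tf.sparse.to_dense downstream:

```python
import tensorflow as tf

keys_to_features = {
    'image/encoded':
        tf.io.FixedLenFeature((), tf.string, default_value=''),
    'image/class/label':
        tf.io.FixedLenFeature((), tf.int64, default_value=-1),
}

# Build and serialize a minimal classification example.
example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[b'raw-image-bytes'])),
    'image/class/label': tf.train.Feature(
        int64_list=tf.train.Int64List(value=[3])),
}))
serialized = example.SerializeToString()

# This is what the decode() method does internally.
decoded = tf.io.parse_single_example(serialized, keys_to_features)
label = int(decoded['image/class/label'].numpy())
```

Fields missing from a record fall back to the declared default_value, which is why the classification decoder declares '' and -1 as defaults.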
Creating a Decoder is an optional step, and it varies with the use case. Below are some use cases showing whether a Decoder and/or a Parser is needed, based on the requirements.
| Use case | Decoder/Parser |
|---|---|
| Classification | Both Decoder and Parser |
| Object Detection | Only Decoder |
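The Parser mentioned above consumes the decoder's output; its exact interface lives in the Model Garden dataloaders. As a simplified, self-contained sketch of the kind of per-example transform a classification parser performs (the image size, normalization, and field names are assumptions for this illustration, not the real Parser API):

```python
import tensorflow as tf

def parse_fn(decoded_tensors):
  """Simplified stand-in for a Parser's per-example transform."""
  # Decode the image bytes, resize to a fixed shape, and normalize to [0, 1].
  image = tf.io.decode_jpeg(decoded_tensors['image/encoded'], channels=3)
  image = tf.image.resize(image, [32, 32])
  image = tf.cast(image, tf.float32) / 255.0
  label = tf.cast(decoded_tensors['image/class/label'], tf.int32)
  return image, label

# Fake decoded tensors: an 8x8 all-black JPEG and a class id.
jpeg_bytes = tf.io.encode_jpeg(tf.zeros([8, 8, 3], tf.uint8))
decoded = {'image/encoded': jpeg_bytes,
           'image/class/label': tf.constant(5, tf.int64)}
image, label = parse_fn(decoded)
```

In a real pipeline this transform is mapped over the dataset after the decoder, so the model sees fixed-shape, normalized tensors rather than raw protos.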