Overview

This CodeLab demonstrates how to build a fused TFLite LSTM model for MNIST recognition using Keras, and how to convert it to TensorFlow Lite.

The CodeLab is very similar to the Keras LSTM CodeLab. However, we're creating fused LSTM ops rather than the unfused versoin.

Also note: We're not trying to build the model to be a real world application, but only demonstrate how to use TensorFlow Lite. You can a build a much better model using CNN models. For a more canonical lstm codelab, please see here.

Step 0: Prerequisites

It's recommended to try this feature with the newest TensorFlow nightly pip build.

!pip install tf-nightly

Step 1: Build the MNIST LSTM model.

import numpy as np
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(28, 28), name='input'),
    tf.keras.layers.LSTM(20, time_major=False, return_sequences=True),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax, name='output')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

Step 2: Train & Evaluate the model.

We will train the model using MNIST data.

# Load MNIST dataset.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = x_train.astype(np.float32)
x_test = x_test.astype(np.float32)

# Change this to True if you want to test the flow rapidly.
# Train with a small dataset and only 1 epoch. The model will work poorly
# but this provides a fast way to test if the conversion works end to end.
_FAST_TRAINING = False
_EPOCHS = 5
if _FAST_TRAINING:
  _EPOCHS = 1
  _TRAINING_DATA_COUNT = 1000
  x_train = x_train[:_TRAINING_DATA_COUNT]
  y_train = y_train[:_TRAINING_DATA_COUNT]

model.fit(x_train, y_train, epochs=_EPOCHS)
model.evaluate(x_test, y_test, verbose=0)

Step 3: Convert the Keras model to TensorFlow Lite model.

run_model = tf.function(lambda x: model(x))
# This is important, let's fix the input size.
BATCH_SIZE = 1
STEPS = 28
INPUT_SIZE = 28
concrete_func = run_model.get_concrete_function(
    tf.TensorSpec([BATCH_SIZE, STEPS, INPUT_SIZE], model.inputs[0].dtype))

# model directory.
MODEL_DIR = "keras_lstm"
model.save(MODEL_DIR, save_format="tf", signatures=concrete_func)

converter = tf.lite.TFLiteConverter.from_saved_model(MODEL_DIR)
tflite_model = converter.convert()

Step 4: Check the converted TensorFlow Lite model.

Now load the TensorFlow Lite model and use the TensorFlow Lite python interpreter to verify the results.

# Run the model with TensorFlow to get expected results.
TEST_CASES = 10

# Run the model with TensorFlow Lite
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

for i in range(TEST_CASES):
  expected = model.predict(x_test[i:i+1])
  interpreter.set_tensor(input_details[0]["index"], x_test[i:i+1, :, :])
  interpreter.invoke()
  result = interpreter.get_tensor(output_details[0]["index"])

  # Assert if the result of TFLite model is consistent with the TF model.
  np.testing.assert_almost_equal(expected, result, decimal=5)
  print("Done. The result of TensorFlow matches the result of TensorFlow Lite.")

  # Please note: TfLite fused Lstm kernel is stateful, so we need to reset
  # the states.
  # Clean up internal states.
  interpreter.reset_all_variables()

Step 5: Let's inspect the converted TFLite model.

Let's check the model, you can see the LSTM will be in it's fused format.