tensorflow/compiler/mlir/tosa/g3doc/legalization.md
This document provides pseudo-code lowerings from TensorFlow and TensorFlow Lite MLIR Dialects (https://www.tensorflow.org/mlir/dialects) to the TOSA Dialect (https://mlir.llvm.org/docs/Dialects/TOSA/).
The documentation is a work-in-progress: sections with missing legalizations are in the process of being written.
The pseudo-code syntax used in this document is described below.
* int8: signed 8-bit integer
* uint8: unsigned 8-bit integer
* int16: signed 16-bit integer
* int32: signed 32-bit integer
* int64: signed 64-bit integer
* uint32: unsigned 32-bit integer
* float32: IEEE-754 32-bit floating point format
* float64: IEEE-754 64-bit floating point format
* bool: boolean
In pseudo-code, a symbol starting with "%" indicates a value. A value is evaluated by an operator at run time, and an operator can consume only a list of values as operands. Note that a value's tensor type is determined at compile time; only the evaluation happens at run time. One can easily construct a data-flow subgraph by looking at the producer/consumer relationships.
Tensor type is an attribute determined by legalization at compile time, describing the shape and element data type. It is noted as tensor<shape, dtype>, or shortened to tensor<%t.type>.
In pseudocode, a TOSA operator is prototyped in the following format.
%<output_value> = tosa.<OPERATOR>(%<input_value>) {<attribute> = …}
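For example, a floating-point clamp of %input to the range [0.0, 6.0] (using the CLAMP attribute names that appear in the lowerings later in this document) would be written:
%output = tosa.CLAMP(%input) {min_fp=0.0, max_fp=6.0}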
For brevity and clarity, the pseudocode in this document allows the following shorthand notation on value attributes.
| Shorthand | Description |
|---|---|
| %t.shape | Shape vector for the tensor |
| %t.shape[i] | Size of dimension i for the tensor |
| %t.rank | Rank of the tensor |
| %t.dtype | Datatype of the tensor |
| %t.scale | Quantized scaling parameter (float64) |
| %t.zp | Quantized zero-point (int64) |
| %t.signed | Boolean indicating the type is signed |
| %t.num_bits | Number of bits in the datatype |
| %t.num_elements | Number of elements in the tensor |
| %t.type | Tuple of tensor<%t.shape, %t.dtype> |
| %t.size | For tensor lists: the number of tensors in the list |
Where the TOSA Specification allows the use of named dimensions, the following names may be used.
| Name | Description |
|---|---|
| N | Batch dimension |
| H | Height dimension |
| W | Width dimension |
| C | Channel dimension |
| M | Depthwise multiplier |
Each of these may be prefixed with I for the input dimension, O for the output dimension, or K for kernel dimensions.
The following pseudocode helper functions are used to canonicalize arguments from different frameworks to the TOSA dialect.
Wherever %tensor.as_constant() is specified, a constant vector will be created to hold the value of %tensor at compile time. This only succeeds if %tensor is produced by a constant-type operator. If constant matching fails, the lowering fails and is terminated.
vector<int64> get_padding_values_from_explicit_pad_attr(vector<int64> explicit_pad,
tensorflow::TensorFormat data_format_tf)
{
int64 pad_before, pad_after
vector<int64> computed_paddings
for (int32 i = 0; i < 2; i++) {
int64 dim = GetTensorSpatialDimIndex(4, data_format_tf, i)
pad_before = explicit_pad[dim * 2]
pad_after = explicit_pad[dim * 2 + 1]
computed_paddings.push_back(pad_before)
computed_paddings.push_back(pad_after)
}
return computed_paddings
}
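As an illustrative example (values chosen for illustration only): with NHWC data format, the spatial dimensions are H (dimension 1) and W (dimension 2), so explicit_pad = {0, 0, 1, 1, 2, 2, 0, 0} yields computed_paddings = {1, 1, 2, 2}.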
Calculate explicit padding array based on pad type
vector<int64> get_padding_values_from_pad_type(tensorflow::Padding padding, tensorflow::TensorFormat data_format,
uint32 first_filter_spatial_dim, type input_type, type filter_type,
vector strides, vector dilations)
{
assert(padding != tensorflow::Padding::EXPLICIT);
vector<int64> computed_paddings;
// Padding over H and W dimensions
for (int32 i = 0; i < 2; i++) {
int32 ifm_dim = get_tensor_spatial_dim_index(4, data_format, i);
int32 filter_dim = first_filter_spatial_dim + i;
int32 dim_dilation = dilations[ifm_dim];
int32 dim_stride = strides[ifm_dim];
int64 op_size, pad_before_tf, pad_after_tf;
tensorflow::GetWindowedOutputSizeVerbose(input_type.shape[ifm_dim], filter_type.shape[filter_dim],
dim_dilation, dim_stride, padding,
// Outputs
&op_size, &pad_before_tf, &pad_after_tf);
computed_paddings.push_back(pad_before_tf);
computed_paddings.push_back(pad_after_tf);
}
return computed_paddings;
}
// Canonicalize a scalar axis attribute to a positive axis attribute
int32 positive_axis(int32 axis, int32 rank)
{
if (axis < 0)
axis += rank;
return axis;
}
void compute_scale_32(float64 scale, int32& multiplier, int32& shift)
{
/* Generates a mantissa and shift such that scale = mantissa * 2^shift,
where the mantissa is in [-1.0, -0.5] or [0.5, 1.0] and
multiplier is the mantissa in Q31 fixed point */
const float64 mantissa = std::frexp(scale, &shift);
auto shifted_m = std::round(mantissa * (int64(1) << 31));
assert(shifted_m <= (int64(1) << 31)); // can't be greater than 1.0
if (shifted_m == (int64(1) << 31)) {
shifted_m /= 2;
shift++;
}
// TOSA expect right shift to be positive, and embed (1 << 31) into right
// shift bits
shift = (-shift) + 31;
assert(shifted_m <= std::numeric_limits<int32>::max());
multiplier = static_cast<int32>(shifted_m);
}
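As a worked example (the input scale is chosen for illustration): for scale = 0.25, std::frexp returns mantissa 0.5 and shift -1, so shifted_m = round(0.5 * 2^31) = 2^30 and the final shift becomes -(-1) + 31 = 32. The decomposition is exact here: 2^30 * 2^-32 = 0.25.
int32 multiplier, shift
compute_scale_32(0.25, multiplier, shift)
// multiplier == (1 << 30), shift == 32, and (1 << 30) * 2^-32 == 0.25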
Value lower_batch_to_space_nd_op(Value %input, Value %block_shape, Value %crops, shape_t output_shape)
{
vector <std::pair<size_t, size_t>> crops_arr(%crops.shape[0])
size_t remaining_shape_rank = %input.rank - %block_shape.rank - 1
size_t crops_dim = %crops.shape[0]
for (int32 i = 0; i < crops_dim; i++) {
crops_arr[i] = std::make_pair(%crops.as_constant()[i * 2 + 0],
%crops.as_constant()[i * 2 + 1])
}
// Step 1: Reshape input to
// [block_shape[0],
// ...
// [block_shape[M-1],
// [batch / prod(block_shape)]
// [input_shape[1],
// ...
// [input_shape[N-1]
vector <size_t> a1_shape(%block_shape.rank + %input.rank)
for (int32 i = 0; i < %block_shape.rank; i++) {
a1_shape[i] = %block_shape.shape[i]
}
a1_shape[%block_shape.rank] = %input.shape[0] / %block_shape.num_elements
for (int32 i = 1; i < %input.rank; i++) {
a1_shape[i + %block_shape.rank] = %input.shape[i]
}
// Step 2. Permute to shape:
// [ batch / prod(block_shape) ],
// [ input_shape[1] ], [ block_shape[0] ]
// ...
// [ input_shape[M] ], [ block_shape[M-1]
// + remaining_input_shapes input_shape[M+1 .. N-1]
vector <size_t> a2_perm(%block_shape.rank + %input.rank)
a2_perm[0] = %block_shape.rank
for (int32 i = 0; i < %block_shape.rank; i++) {
a2_perm[1 + i * 2 + 0] = %block_shape.rank + 1 + i
a2_perm[1 + i * 2 + 1] = i
}
for (int32 i = (1 + %block_shape.rank * 2); i < (%block_shape.rank + %input.rank); i++) {
a2_perm[i] = i
}
// Step 3. Reshape to
// [ batch / prod(block_shape) ],
// [input_shape[1] * block_shape[0] ],
// ..
// [input_shape[M * block_shape[M-1],
// + remaining input shapes [input_shape[M+1.. N-1]]
vector <size_t> a3_shape(%input.rank)
a3_shape[0] = %input.shape[0] / %block_shape.num_elements
for (int32 i = 0; i < %block_shape.rank; i++) {
a3_shape[i + 1] = %input.shape[i + 1] * %block_shape.shape[i]
}
for (int32 i = 0; i < remaining_shape_rank; i++) {
a3_shape[1 + %block_shape.rank + i] = %input.shape[1 + %block_shape.rank + i]
}
// Step 4 Crop the start/end dimensions using slice
vector <size_t> a4_begin(%input.rank), a4_size(%input.rank)
for (int32 i = 0; i < %input.rank; i++) {
if (i == 0 || i > crops_dim) {
a4_begin[i] = 0
a4_size[i] = output_shape[i]
} else {
a4_begin[i] = crops_arr[i - 1].first
a4_size[i] = a3_shape[i] - crops_arr[i - 1].first - crops_arr[i - 1].second
}
}
%a1_reshape = tosa.RESHAPE(%input) {new_shape=a1_shape}
%a2_transpose = tosa.TRANSPOSE(%a1_reshape) {perms=a2_perm}
%a3_reshape = tosa.RESHAPE(%a2_transpose) {new_shape=a3_shape}
%output = tosa.SLICE(%a3_reshape) {begin=a4_begin, size=a4_size}
return %output
}
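As an illustrative shape trace (example values): for %input of shape [8, 2, 2, 1], block_shape = {2, 2}, and zero crops:
// a1_shape = {2, 2, 2, 2, 2, 1} (block dims, batch / 4, then H, W, C)
// a2_perm = {2, 3, 0, 4, 1, 5}
// a3_shape = {2, 4, 4, 1}
// a4_begin = {0, 0, 0, 0}, a4_size = {2, 4, 4, 1} (nothing cropped)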
Value lower_concatv2_op(Type output_type, Value %values, int32 axis)
{
int32 tosa_axis = positive_axis(axis, %values:0.rank)
assert(%values.size >= 2)
// Convert scalar inputs to a tensor
if (%values:0.rank == 0) {
for (int32 i = 0; i < %values.size; i++) {
%values:i = tosa.RESHAPE(%values:i) {new_shape=1}
}
}
for (int32 i = 0; i < %values.size; i++) {
%val = %values:i
if (%val.zp != output_type.zp || %val.scale != output_type.scale) {
float64 rescale_scale = %val.scale / output_type.scale
%values:i = tosa.RESCALE(%val) {scale=rescale_scale, input_zp=%values:0.zp, output_zp=output_type.zp}
}
}
%concat_op = tosa.CONCAT(%values:0, %values:1) {axis=tosa_axis}
for (int32 i = 2; i < %values.size; i++) {
%concat_op = tosa.CONCAT(%concat_op, %values:i) {axis=tosa_axis}
}
return %concat_op
}
Value lower_depth_to_space_op(Value %input, size_t block_size[], Format_t data_format)
{
assert(data_format == 'NHWC')
vector <size_t> a2_shape = {%input.shape[0],
%input.shape[1],
%input.shape[2],
block_size[0],
block_size[1],
%input.shape[3] / (block_size[0] * block_size[1])}
vector <size_t> a4_shape = {%input.shape[0],
%input.shape[1] * block_size[0],
%input.shape[2] * block_size[1],
%input.shape[3] / (block_size[0] * block_size[1])}
%a2_reshape = tosa.RESHAPE(%input) {new_shape=a2_shape}
%a3_transpose = tosa.TRANSPOSE(%a2_reshape) {perms={0, 1, 3, 2, 4, 5}}
%output = tosa.RESHAPE(%a3_transpose) {new_shape=a4_shape}
return %output
}
Value lower_elu_op(Value %value)
{
// elu(x) = x < 0 ? (exp(x) - 1) : x
// Create constants for 0/1 and reshape to match the rank
// of %value
%one_const = tosa.CONST() {value={1}}
%zero_const = tosa.CONST() {value={0}}
vector bcast_shape
for (int32 i = 0; i < %value.rank; i++) {
bcast_shape.push_back(1)
}
%one_reshape = tosa.RESHAPE(%one_const) {new_shape=bcast_shape}
%zero_reshape = tosa.RESHAPE(%zero_const) {new_shape=bcast_shape}
%exp_in = tosa.EXP(%value)
%sub = tosa.SUB(%exp_in, %one_reshape)
%ge = tosa.GREATER_EQUAL(%value, %zero_reshape)
%output = tosa.SELECT(%ge, %value, %sub)
return %output
}
Value lower_expand_dims(Value %input, int32 axis)
{
vector<size_t> reshape_dims
if (axis < 0 || axis >= %input.rank) {
// Insert at the end of the tensor
axis += %input.rank
for (int32 i = 0; i < %input.rank; i++) {
reshape_dims.push_back(%input.shape[i])
}
reshape_dims.push_back(1)
} else {
for (int32 i= 0 ; i < %input.rank; i++) {
if (i == axis) {
reshape_dims.push_back(1)
}
reshape_dims.push_back(%input.shape[i])
}
}
%output = tosa.RESHAPE(%input) {new_shape=reshape_dims}
return %output
}
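For example (illustrative shapes): %input of shape [2, 3] with axis = 1 produces reshape_dims = {2, 1, 3}, while axis = -1 takes the insert-at-the-end branch and produces {2, 3, 1}.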
Value lower_fake_quant_op(Value %inputs, type output_type, float64 min, float64 max,
int64 num_bits, bool narrow_range)
{
assert(num_bits == 8 || num_bits == 16)
int64 qmax = (1L << (num_bits - 1)) - 1;
int64 qmin = -(1L << (num_bits - 1))
if (narrow_range) {
qmin = qmin + 1
}
float64 scale = (max - min) / float64(qmax - qmin)
int64 zeropoint = (int64)std::round((-min) / scale + float64(qmin))
%quantized = lower_quantize_op(%inputs.type, %inputs, 1.0 / scale, zeropoint)
%dequantized = lower_dequantize_op(output_type, %quantized, scale, zeropoint)
return %dequantized
}
Value lower_floor_div(Value %lhs, Value %rhs)
{
%recip = tosa.RECIPROCAL(%rhs)
%mul = tosa.MUL(%lhs, %recip)
%output = tosa.FLOOR(%mul)
return %output
}
Value lower_floor_mod(Value %lhs, Value %rhs)
{
%recip = tosa.RECIPROCAL(%rhs)
%mul = tosa.MUL(%lhs, %recip)
%floor = tosa.FLOOR(%mul)
%output = tosa.SUB(%mul, %floor)
return %output
}
Value lower_quantize_op(Type output_type, Value %input, float64 scale, int64 zeropoint)
{
%const_scale = tosa.CONST() {value={scale}}
%const_zp = tosa.CONST() {value={zeropoint}}
%op1_mul_in_scale = tosa.MUL(%input, %const_scale)
%op2_add_op1_zp = tosa.ADD(%op1_mul_in_scale, %const_zp)
%op3_cast_op2 = tosa.CAST(%op2_add_op1_zp) // f32->%output.dtype
return %op3_cast_op2
}
Value lower_dequantize_op(Type output_type, Value %input, float64 scale, int64 zeropoint)
{
%const_scale = tosa.CONST() {value={scale}}
%const_zp = tosa.CONST() {value={(float64)zeropoint}}
%op1_cast_in = tosa.CAST(%input) // %input.dtype->f32
%op2_sub_op1_zp = tosa.SUB(%op1_cast_in, %const_zp)
%op3_mul_op2_scale = tosa.MUL(%op2_sub_op1_zp, %const_scale)
return %op3_mul_op2_scale
}
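As a numeric round trip (example values): with scale = 0.05 and zeropoint = 10, lower_fake_quant_op passes 1.0 / scale = 20.0 into the quantize helper, so an input of 1.0 maps to cast(1.0 * 20.0 + 10) = 30, and dequantizing gives (30 - 10) * 0.05 = 1.0.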
Value lower_log_softmax_op(Value %logits)
{
%op1 = tosa.EXP(%logits)
%op2 = tosa.REDUCE_SUM(%op1) {axis=(%logits.rank-1)}
%op3 = tosa.RECIPROCAL(%op2)
%op4 = tosa.MUL(%op1, %op3)
%op5 = tosa.LOG(%op4)
return %op5
}
Value lower_pack_op(Value %input[], size_t axis)
{
size_t concat_axis = positive_axis(axis, %input[0].rank + 1)
size_t input_tensor_rank = %input[0].rank
// Convert any rank 0 to rank 1 with reshape
if (input_tensor_rank == 0) {
for (int32 i = 0; i < %input.size; i++) {
%input[i] = tosa.RESHAPE(%input[i], {1})
}
}
vector<size_t> output_shape
for (int32 i = 0; i < input_tensor_rank; i++) {
output_shape.push_back(%input[0].shape[i])
}
output_shape[concat_axis] = output_shape[concat_axis] * %input.size
// First pair of tensors
%concat = tosa.CONCAT(%input[0], %input[1]) {axis=concat_axis}
// Remaining tensors
for (int32 i = 2; i < %input.size; i++) {
%concat = tosa.CONCAT(%concat, %input[i]) {axis=concat_axis}
}
if (input_tensor_rank == 0) {
// No reshape needed for rank 0, already done
%output = %concat
} else {
%reshape = tosa.RESHAPE(%concat) {new_shape=output_shape}
if (concat_axis == input_tensor_rank) {
// Output shape is [A, B, C, .. n] in this case,
// need to reshape to [N, A, B, C, ..] with perm [1, 2, 3, .. 0]
concat_axis = 0
vector <size_t> perms
for (int32 i = 0; i < %input[0].rank; i++)
perms.push_back(i + 1)
perms.push_back(0)
%output = tosa.TRANSPOSE(%reshape) {perms=perms}
} else {
%output = %reshape
}
}
return %output
}
Value lower_reduce_op<tosa_op_t OP>(Value %input, shape_t output_shape, Value %axes, bool keep_dims, float64 input_scale=1.0f, int32 input_zp=0, float64 output_scale=1.0f, int32 output_zp=0)
{
vector axes_vec = %axes.as_constant();
// Special case of no axes means no transformation
if (axes_vec.size() == 0) {
return tosa.IDENTITY(%input)
}
bool is_quantized = isa<QuantizedType>(%input.dtype) ? true : false
shape_t shape = %input.shape;
%output = %input;
if (is_quantized) {
%output = tosa.RESCALE(%output) {scale=input_scale, input_zp=input_zp, output_zp=0}
}
for (int32 i = 0; i < axes_vec.size(); i++) {
int32 axis = positive_axis(axes_vec[i], %input.rank);
shape[axis] = 1;
%output = tosa.OP(%output) {axis=axis}
}
if (!keep_dims) {
%output = tosa.RESHAPE(%output) {new_shape=output_shape}
}
if (is_quantized) {
%output = tosa.RESCALE(%output) {scale=output_scale, input_zp=0, output_zp=output_zp}
}
return %output;
}
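As an illustrative trace (example values): reducing %input of shape [2, 3, 4] over %axes = {1, 2} with keep_dims = false emits the reduction with axis=1 (shape [2, 1, 4]), then axis=2 (shape [2, 1, 1]), and finally a tosa.RESHAPE to the output shape [2].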
Value lower_resize_op(Value %images, Value %size, shape output_shape, dtype output_dtype, mode_t mode)
{
int32 input_height = %images.shape[1]
int32 input_width = %images.shape[2]
int32 output_height = output_shape[1]
int32 output_width = output_shape[2]
float64 in_center_h = static_cast<float64>(input_height - 1) / 2.0
float64 in_center_w = static_cast<float64>(input_width - 1) / 2.0
float64 out_center_h = static_cast<float64>(output_height - 1) / 2.0
float64 out_center_w = static_cast<float64>(output_width - 1) / 2.0
float64 fp_stride_y, fp_stride_x
if (align_corners && output_height > 1)
fp_stride_y = static_cast<float64>(input_height - 1) / static_cast<float64>(output_height - 1)
else
fp_stride_y = static_cast<float64>(input_height) / static_cast<float64>(output_height)
if (align_corners && output_width > 1)
fp_stride_x = static_cast<float64>(input_width - 1) / static_cast<float64>(output_width - 1)
else
fp_stride_x = static_cast<float64>(input_width) / static_cast<float64>(output_width)
float64 fp_offset_y = 0.0, fp_offset_x = 0.0
if (half_pixel_centers) {
fp_offset_y = fp_stride_y * 0.5f - 0.5f
fp_offset_x = fp_stride_x * 0.5f - 0.5f
}
if (output_dtype == float32) {
%op1_resize_in = tosa.RESIZE(%images) {stride={fp_stride_y, fp_stride_x}, offset={fp_offset_y, fp_offset_x}, shift=0, resize_mode=mode}
return %op1_resize_in
}
else {
int32 shift = 10
float64 unit = static_cast<float64>(1 << shift)
int32 stride_y = fp_stride_y * unit
int32 stride_x = fp_stride_x * unit
int32 offset_y = fp_offset_y * unit
int32 offset_x = fp_offset_x * unit
%op1_resize_in = tosa.RESIZE(%images) {stride={stride_y, stride_x}, offset={offset_y, offset_x}, shift=shift, resize_mode=mode}
if (mode == "BILINEAR") {
%const_zero = tosa.CONST() {value={0}}
%const_twenty = tosa.CONST() {value={20}}
%op2_ge_op1 = tosa.GREATER_EQUAL(%op1_resize_in, %const_zero)
%op3_abs_op1 = tosa.ABS(%op1_resize_in)
%op4_rshift_op3 = tosa.ARITHMETIC_RIGHT_SHIFT(%op3_abs_op1, %const_twenty)
%op5_negate_op4 = tosa.NEGATE(%op4_rshift_op3)
%op6_select_op2_op4_op5 = tosa.SELECT(%op2_ge_op1, %op4_rshift_op3, %op5_negate_op4)
%op7_cast_op6 = tosa.CAST(%op6_select_op2_op4_op5) // i32/i48->%output.dtype
return %op7_cast_op6
}
return %op1_resize_in
}
}
Value lower_reverse_v2_op(Value %tensor, Value %axis)
{
Value %output = %tensor
if (%axis.num_elements == 0) {
%output = tosa.IDENTITY(%tensor)
} else {
for (int32 i = 0; i < %axis.shape[0]; i++) {
size_t axis_val = positive_axis(%axis.as_constant()[i], %tensor.rank)
%output = tosa.REVERSE(%output) {axis=axis_val}
}
}
return %output
}
Value lower_round_op(Value %x)
{
%half = tosa.CONST() {value={0.5}}
%add = tosa.ADD(%x, %half)
%output = tosa.FLOOR(%add)
return %output
}
Value lower_selectv2_op(Value %condition, Value %t, Value %e, shape output_shape)
{
// Reshape condition so that ranks match to support
// broadcasting (if necessary)
if (%condition.rank != output_shape.size) {
vector <size_t> cond_shape = %condition.shape
for (int32 i = 0; i < (output_shape.size - %condition.rank); i++) {
cond_shape.push_front(1)
}
%condition = tosa.RESHAPE(%condition) {new_shape=cond_shape}
}
%output = tosa.SELECT(%condition, %t, %e)
return %output
}
Value lower_shape_op(Value %input)
{
vector <size_t> input_shape = %input.shape
%shape = tosa.CONST() {value={input_shape}}
return %shape
}
Value lower_space_to_batch_nd_op(Value %input, Value %block_shape, Value %padding)
{
size_t block_rank = %block_shape.shape[0]
size_t remaining_shape_rank = %input.rank - block_rank - 1;
// Step 1. Pad based on paddings operand (flattened representation of [input.rank][2]-shaped array)
vector <size_t> a1_padding
a1_padding[0] = 0
a1_padding[1] = 0
for (int32 i = 0; i < %padding.shape[0]; i++) {
a1_padding[i + 2] = %padding.as_constant()[i]
}
%a1_pad = tosa.PAD(%input) {padding=a1_padding}
// Step 2. Reshape to
// [batch + padded_shape[1] / block_shape[0], block_shape[0], ...
// padded_shape[M] / block_shape[M-1], block_shape[M-1]] +
// remaining_shape
vector <size_t> a2_shape(1 + block_rank * 2 + remaining_shape_rank)
a2_shape[0] = %input.shape[0]
for (int32 i = 0; i < block_rank; i++) {
a2_shape[1 + i * 2 + 0] = %a1_pad.shape[1 + i] / %block_shape.as_constant()[i]
a2_shape[1 + i * 2 + 1] = %block_shape.as_constant()[i]
}
for (int32 i = 0; i < remaining_shape_rank; i++) {
a2_shape[1 + block_rank * 2 + i] = %input.shape[1 + block_rank + i]
}
%a2_reshape = tosa.RESHAPE(%a1_pad) {new_shape=a2_shape}
// Step 3 transpose to
// block-shape +
// [batch] +
// [padded_shape[1] / block_shape[0],
// ...
// [padded_shape[M] / block_shape[M-1]] +
// remaining_shape
vector <size_t> a3_perm(%a2_reshape.rank)
size_t block_num_elems = 1
for (int32 i = 0; i < block_rank; i++) {
a3_perm[i] = 1 + 2 * i + 1
a3_perm[block_rank + 1 + i] = 2 * i + 1
block_num_elems *= %block_shape.as_constant()[i]
}
a3_perm[block_rank] = 0
for (int32 i = (1 + block_rank * 2); i < %a2_reshape.rank; i++) {
a3_perm[i] = i
}
%a3_transpose = tosa.TRANSPOSE(%a2_reshape) {perms=a3_perm}
// Step 4. Reshape transposed tensor to
// [ batch * prod(block_shape)] +
// [ padded_shape[1] / block_shape[0],
// ...,
// padded_shape[M] / block_shape[M-1]] +
// remaining_shape
vector <size_t> a4_shape(%input.rank)
a4_shape[0] = %input.shape[0] * block_num_elems
for (int32 i = 0; i < block_rank; i++) {
a4_shape[i + 1] = %a1_pad.shape[i + 1] / %block_shape.as_constant()[i]
}
for (int32 i = 0; i < remaining_shape_rank; i++) {
a4_shape[1 + block_rank + i] = %input.shape[1 + block_rank + i]
}
%output = tosa.RESHAPE(%a3_transpose) {new_shape=a4_shape}
return %output
}
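As an illustrative shape trace (example values): for %input of shape [1, 4, 4, 1], block_shape = {2, 2}, and zero padding:
// a2_shape = {1, 2, 2, 2, 2, 1}
// a3_perm = {2, 4, 0, 1, 3, 5}
// a4_shape = {4, 2, 2, 1}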
Value lower_space_to_depth_op(Value %input, size_t block_size[], Format_t data_format)
{
assert(data_format == 'NHWC')
vector <size_t> a2_shape = {%input.shape[0],
%input.shape[1] / block_size[0],
block_size[0],
%input.shape[2] / block_size[1],
block_size[1],
%input.shape[3]}
%a2_reshape = tosa.RESHAPE(%input) {new_shape=a2_shape}
%a3_transpose = tosa.TRANSPOSE(%a2_reshape) {perm={0, 1, 3, 2, 4, 5}}
vector <size_t> a4_shape = {%input.shape[0],
%input.shape[1] / block_size[0],
%input.shape[2] / block_size[1],
%input.shape[3] * block_size[0] * block_size[1]}
%output = tosa.RESHAPE(%a3_transpose) {new_shape=a4_shape}
return %output
}
Value lower_split_op(Value %value, size_t axis, size_t num_split)
{
Value %output[]
size_t slice_size = %value.shape[axis] / num_split
for (int32 i = 0; i < num_split; i++) {
vector <size_t> begin_vals, size_vals
for (int32 j = 0; j < %value.rank; j++) {
if (j == axis) {
begin_vals.push_back(slice_size * i)
size_vals.push_back(slice_size)
} else {
begin_vals.push_back(0)
size_vals.push_back(%value.shape[j])
}
}
%output[i] = tosa.SLICE(%value) {start=begin_vals, size=size_vals}
}
%output_list = tosa.IDENTITYN(%output)
return %output_list
}
Value lower_splitv_op(Value %value, vector <size_t> size_split, size_t axis)
{
Value %output[]
size_t curr_split_start = 0
for (int32 i = 0; i < size_split.size(); i++) {
vector <size_t> begin_vals, size_vals
for (int32 j = 0; j < %value.rank; j++) {
if (j == axis) {
begin_vals.push_back(curr_split_start)
size_vals.push_back(size_split[i])
} else {
begin_vals.push_back(0)
size_vals.push_back(%value.shape[j])
}
}
%output[i] = tosa.SLICE(%value) {start=begin_vals, size=size_vals}
curr_split_start += size_split[i]
}
%output_list = tosa.IDENTITYN(%output)
return %output_list
}
Value lower_squeeze_op(Value %input, vector<size_t> squeeze_dims)
{
vector <size_t> reshape_dims
if (squeeze_dims.size() == 0) {
// Remove all 1-dims
for (int32 i = 0; i < %input.rank; i++) {
if (%input.shape[i] != 1) {
reshape_dims.push_back(%input.shape[i])
}
}
} else {
// Remove the specified dimensions
for (int32 i = 0; i < %input.rank; i++) {
if (!squeeze_dims.contains(i) || %input.shape[i] != 1) {
reshape_dims.push_back(%input.shape[i])
}
}
}
%output = tosa.RESHAPE(%input) {new_shape=reshape_dims}
return %output
}
Value lower_strided_slice_op(Value %input, Value %begin_val, Value %end_val, Value %strides_val,
size_t begin_mask, size_t end_mask, size_t ellipsis_mask,
size_t new_axis_mask, size_t shrink_axis_mask)
{
// Note: does not implement ellipsis_mask or reverse stride at this time
assert(ellipsis_mask == 0)
vector <size_t> begin(%begin_val.as_constant()), end(%end_val.as_constant()), strides(%strides_val.as_constant())
vector <size_t> a1_start, a1_size, a2_shape, a3_start, a3_size, a4_shape
for (int32 i = 0; i < %input.rank; i++) {
if (begin_mask & (1 << i)) {
begin[i] = 0
}
if (end_mask & (1 << i)) {
end[i] = %input.shape[i]
}
// Wrap around index if begin and end are negative
if (begin[i] < 0) {
begin[i] += %input.shape[i]
}
if (end[i] < 0) {
end[i] += %input.shape[i]
}
a1_start[i] = begin[i]
a1_size[i] = end[i] - begin[i]
a2_shape[i*2 + 0] = a1_size[i] / strides[i]
a2_shape[i*2 + 1] = strides[i]
a3_start[i*2 + 0] = 0
a3_start[i*2 + 1] = 0
if (shrink_axis_mask & (1 << i)) {
a3_size[i*2 + 0] = 1
} else {
a3_size[i*2 + 0] = a1_size[i] / strides[i]
}
a3_size[i*2 + 1] = 1
if (!(shrink_axis_mask & (1 << i))) {
if (new_axis_mask & (1 << i)) {
a4_shape.push_back(1)
}
a4_shape.push_back(a1_size[i] / strides[i])
}
}
// Step 1: Slice the input array
%a1_slice = tosa.SLICE(%input) {start=a1_start, size=a1_size}
// Step 2: Reshape the sliced array: 2x as many dimensions as %input
%a2_reshape = tosa.RESHAPE(%a1_slice) {new_shape=a2_shape}
// Step 3: Take a slice of the [0] index along each of the strided dimensions (even dimensions)
%a3_slice = tosa.SLICE(%a2_reshape) {start=a3_start, size=a3_size}
// Step 4: Reshape the now-strided tensor back down to the desired number of dimensions
%output = tosa.RESHAPE(%a3_slice) {new_shape=a4_shape}
return %output
}
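As an illustrative trace (example values): for %input of shape [6] with begin = {1}, end = {5}, strides = {2}, and all masks zero:
// a1_start = {1}, a1_size = {4} (elements 1..4)
// a2_shape = {2, 2} (one row per stride group)
// a3_start = {0, 0}, a3_size = {2, 1} (keep index 0 of each group)
// a4_shape = {2} (elements 1 and 3 of %input)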
Value lower_unpack_op(Value %value, size_t axis, uint64_t num)
{
axis = positive_axis(axis)
Value %output_arr[]
// Step 1: transpose 'axis' to left-most dimension, if necessary
Value %transposed_value
if (axis != 0) {
vector <size_t> perms
perms.push_back(axis)
for (int32 i = 0; i < %value.rank; i++) {
if (i != axis)
perms.push_back(i)
}
%transposed_value = tosa.TRANSPOSE(%value) {perms=perms}
} else {
%transposed_value = %value
}
// Step 2: Slice [N, A, B, C] into [N] [A, B, C]
for (int32 i = 0; i < %transposed_value.shape[0]; i++) {
vector <size_t> begin_vals, size_vals, shape_vals
begin_vals.push_back(i)
size_vals.push_back(1)
for (int32 j = 1; j < %transposed_value.rank; j++) {
begin_vals.push_back(0)
size_vals.push_back(%transposed_value.shape[j])
shape_vals.push_back(%transposed_value.shape[j])
}
%slice = tosa.SLICE(%transposed_value) {start=begin_vals, size=size_vals}
%output_arr[i] = tosa.RESHAPE(%slice) {new_shape=shape_vals}
}
// Combine array of sliced tensors into a list of tensors
%output = tosa.IDENTITYN(%output_arr)
return %output
}
vector<int64> get_transpose_conv2d_padding_values_from_pad_type(tensorflow::Padding padding, tensorflow::TensorFormat data_format,
uint32 first_filter_spatial_dim, type input_type, type filter_type,
vector<int64> output_dims, vector strides, vector dilations)
{
int64 pad_before, pad_after;
vector<int64> computed_padding
for (int32 i = 0; i < 2; i++) {
int64 ifm_dim = GetTensorSpatialDimIndex(4, data_format, i);
int64 ofm_dim = GetTensorSpatialDimIndex(4, data_format, i);
int64 filter_dim = first_filter_spatial_dim + i
int64 ifm_size = input_type.shape[ifm_dim]
int64 ofm_size = output_dims[ofm_dim]
int64 filter_size = filter_type.shape[filter_dim]
int64 dim_dilation = dilations[i]
int64 dim_stride = strides[i]
int32 effective_filter_size = (filter_size - 1) * dim_dilation + 1
int32 total_padding = ((ifm_size - 1) * dim_stride + effective_filter_size - ofm_size)
total_padding = total_padding > 0 ? total_padding : 0
pad_before = total_padding / 2
pad_after = total_padding - pad_before
computed_padding.push_back(pad_before)
computed_padding.push_back(pad_after)
}
return computed_padding
}
Value lower_fused_activation(Value %input, string activation)
{
bool is_quantized = isa<QuantizedType>(%input.dtype) ? true : false
if (is_quantized) {
if (activation == "NONE") {
return %input
}
else if (activation == "RELU") {
int32 quantized_0 = %input.zp
int32 quantized_max = %input.storage_max
return tosa.CLAMP(%input) {min_int=quantized_0, max_int=quantized_max}
}
else if (activation == "RELU6") {
int32 quantized_0 = %input.zp
int32 quantized_6 = %input.zp + (6.0 / %input.scale)
return tosa.CLAMP(%input) {min_int=quantized_0, max_int=quantized_6}
}
else if (activation == "RELU_N1_TO_1") {
int32 quantized_n1 = %input.zp + (-1.0 / %input.scale)
int32 quantized_1 = %input.zp + (1.0 / %input.scale)
return tosa.CLAMP(%input) {min_int=quantized_n1, max_int=quantized_1}
}
}
else {
if (activation == "NONE") {
return %input
}
else if (activation == "RELU") {
return tosa.RELUN(%input) {max_fp=numeric_limits<float32>::max()}
}
else if (activation == "RELU6") {
return tosa.RELUN(%input) {max_fp=6.0}
}
else if (activation == "RELU_N1_TO_1") {
return tosa.CLAMP(%input) {min_fp=-1.0, max_fp=1.0}
}
else if (activation == "TANH") {
return tosa.TANH(%input)
}
}
}
Value get_table_const_tensor(function func)
{
array<int16, 513> table_array
for (int32 i = -256; i <= 256; i++) {
table_array[i + 256] = func(i)
}
return tosa.CONST() {value=table_array}
}
Value lower_gather_op(Value %params, Value %indices, int32 batch_dims, int32 axis)
{
assert(batch_dims <= %indices.rank)
assert(axis >= batch_dims)
int32 N = W = K = C = 1
for (int32 i = 0; i < batch_dims; i++) N *= %params.shape[i]
for (int32 i = batch_dims; i < %indices.rank; i++) W *= %indices.shape[i]
K = %params.shape[axis]
for (int32 i = batch_dims; i < axis; i++) C *= %params.shape[i]
for (int32 i = (axis + 1); i < %params.rank; i++) C *= %params.shape[i]
vector<int32> params_idx_batch, params_idx_left, params_idx_indices, params_idx_right
for (int32 i = 0; i < %params.rank; i++) {
if (i < batch_dims && i < axis)
params_idx_batch.push_back(i)
else if (i < axis)
params_idx_left.push_back(i)
else if (i < (axis + 1))
params_idx_indices.push_back(i)
else
params_idx_right.push_back(i)
}
vector<int32> params_perm = {params_idx_batch, params_idx_left, params_idx_indices, params_idx_right}
vector<int32> result_perm
for (int32 i = 0; i < batch_dims; i++)
result_perm.push_back(i)
for (int32 i = 0; i < params_idx_left.size(); i++)
result_perm.push_back(params_idx_left[i])
for (int32 i = batch_dims; i < %indices.rank; i++)
result_perm.push_back(i)
for (int32 i = 0; i < params_idx_right.size(); i++)
result_perm.push_back(params_idx_right[i])
%const_params_perm = tosa.CONST() {value=params_perm}
%const_result_perm = tosa.CONST() {value=result_perm}
%op1_transpose_params = tosa.TRANSPOSE(%params, %const_params_perm)
%op2_reshape_op1 = tosa.RESHAPE(%op1_transpose_params) {shape={N,K,C}}
%op3_reshape_indices = tosa.RESHAPE(%indices) {shape={N,W}}
%op4_gather_op2_op3 = tosa.GATHER(%op2_reshape_op1, %op3_reshape_indices)
%op5_reshape_op4 = tosa.RESHAPE(%op4_gather_op2_op3) {shape={N,W,C}}
%op6_transpose_op5 = tosa.TRANSPOSE(%op5_reshape_op4, %const_result_perm)
return %op6_transpose_op5
}
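As an illustrative trace (example values): for %params of shape [2, 3, 4], %indices of shape [2, 5], batch_dims = 1, and axis = 1: N = 2, W = 5, K = 3, C = 4. Both permutations work out to the identity here, the tosa.GATHER consumes tensors of shape {2, 3, 4} and {2, 5}, and the final output has shape {2, 5, 4}.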
Value lower_gather_nd_op(Value %params, Value %indices)
{
int32 N = W = K = C = ND = 1
ND = %indices.shape[%indices.rank - 1]
assert(ND <= %params.rank)
for (int32 i = 0; i < (%indices.rank - 1); i++) W *= %indices.shape[i]
for (int32 i = 0; i < ND; i++) K *= %params.shape[i]
for (int32 i = ND; i < %params.rank; i++) C *= %params.shape[i]
// Flatten an ND-dimensional index into a single offset using the
// row-major strides of the leading ND dimensions of %params
vector<int32> flatten_coeff_vec
for (int32 i = 1; i < ND; i++) flatten_coeff_vec.push_back(%params.shape[i])
flatten_coeff_vec.push_back(1)
for (int32 i = ND - 2; i >= 0; i--) flatten_coeff_vec[i] *= flatten_coeff_vec[i + 1]
%const_flatten_coeff = tosa.CONST() {value=flatten_coeff_vec}
%op1_reshape_params = tosa.RESHAPE(%params) {shape={N,K,C}}
%op2_reshape_indices = tosa.RESHAPE(%indices) {shape={W,ND}}
%op3_mul_op2_flatten_coeff = tosa.MUL(%op2_reshape_indices, %const_flatten_coeff)
%op4_rsum_op3 = tosa.REDUCE_SUM(%op3_mul_op2_flatten_coeff) {axis=1}
%op5_reshape_op4 = tosa.RESHAPE(%op4_rsum_op3) {shape={N,W}}
%op6_gather_op1_op5 = tosa.GATHER(%op1_reshape_params, %op5_reshape_op4)
%op7_reshape_op6 = tosa.RESHAPE(%op6_gather_op1_op5) {shape={N,W,C}}
return %op7_reshape_op6
}
Value lower_one_hot_op(Value %indices, Value %depth, Value %on_value, Value %off_value, int32 axis)
{
int32 N = W = C = 1
int32 K = %depth.as_constant()
int32 left_dim = right_dim = 1
for (int32 i = 0; i < %indices.rank; i++) {
int32 dim = %indices.shape[i]
N *= dim
if (i >= axis)
right_dim *= dim
else
left_dim *= dim
}
%perm_const = tosa.CONST() {value={0, 2, 1}}
%op1_reshape_on_value = tosa.RESHAPE(%on_value) {shape={1, 1, 1}}
%op2_tile_op1 = tosa.TILE(%op1_reshape_on_value) {multiples={N, W, C}}
%op3_reshape_off_value = tosa.RESHAPE(%off_value) {shape={1, 1, 1}}
%op4_tile_op3 = tosa.TILE(%op3_reshape_off_value) {multiples={N, K, C}}
%op5_reshape_indices = tosa.RESHAPE(%indices) {shape={N, W}}
%op6_scatter_op4_op5_op2 = tosa.SCATTER(%op4_tile_op3, %op5_reshape_indices, %op2_tile_op1)
%op7_reshape_op6 = tosa.RESHAPE(%op6_scatter_op4_op5_op2) {shape={left_dim, right_dim, K}}
%op8_transpose_op7 = tosa.TRANSPOSE(%op7_reshape_op6, %perm_const)
%op9_reshape_op8 = tosa.RESHAPE(%op8_transpose_op7) {shape=%output.shape}
return %op9_reshape_op8
}
## MLIR Passes Management
Legalization is built on multiple MLIR passes.
| MLIR Pass Name    | Input Dialect   | Output Dialect  | Description                                      |
| ----------------- | --------------- | --------------- | ------------------------------------------------ |
| legalize_tf       | TensorFlow      | TOSA            | Legalize TensorFlow dialect to TOSA dialect      |
| fuse_tf_bias      | TensorFlow      | TOSA            | Map tf.BiasAdd + tf.Conv2D to tosa.CONV2D        |
| legalize_tfl      | TensorFlow Lite | TOSA            | Legalize TensorFlow Lite dialect to TOSA dialect |
| convert_tfl_uint8 | TensorFlow Lite | TensorFlow Lite | Convert quantized uint8 graph to int8 graph      |
TF to TOSA legalization can be summarized by the following pseudocode:
void legalize_tf_to_tosa(mlir::Module module)
{
mlir::PassManager pm
// other MLIR passes to optimize TF
pm.addPass(fuse_tf_bias)
pm.addPass(legalize_tf)
// other MLIR passes to optimize TOSA
}
TFLite to TOSA legalization can be summarized by the following pseudocode:
void legalize_tfl_to_tosa(mlir::Module module)
{
mlir::PassManager pm
// other MLIR passes to optimize TFLite
pm.addPass(convert_tfl_uint8)
pm.addPass(legalize_tfl)
// other MLIR passes to optimize TOSA
}
Each of the passes is described in more detail in the subsequent chapters.
This operator is trivially lowered to tosa.ABS.
TensorFlow Dialect
%output = tf.AddN(%inputs)
TOSA Lowering
%output = tosa.ADD(%inputs:0, %inputs:1)
for (int32 i = 2; i < %inputs.size; i++) {
%output = tosa.ADD(%inputs:i, %output)
}
Element-wise addition.
TensorFlow Dialect
%output = tf.Add(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.ADD.
Element-wise addition.
TensorFlow Dialect
%output = tf.Addv2(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.ADD.
Computes the "logical and" of elements across dimensions of a tensor.
TensorFlow Dialect
%output = tf.all(%input, %reduction_indices) {keep_dims}
TOSA Lowering
%output = lower_reduce_op<tosa.REDUCE_ALL>(%input, %output.shape, %reduction_indices, keep_dims)
Computes the "logical or" of elements across dimensions of a tensor.
TensorFlow Dialect
%output = tf.any(%input, %reduction_indices) {keep_dims}
TOSA Lowering
%output = lower_reduce_op<tosa.REDUCE_ANY>(%input, %output.shape, %reduction_indices, keep_dims)
Returns the index with the largest value across the given axis of the input tensor.
TensorFlow Dialect
%output = tf.ArgMax(%input, %dimension)
TOSA Lowering
int64 axis = positive_axis(%dimension.as_constant(), %input.rank)
%output = tosa.ARGMAX(%input) {axis=axis}
Returns the index with the smallest value across the given axis of the input tensor.
TensorFlow Dialect
%output = tf.ArgMin(%input, %dimension)
TOSA Lowering
No TOSA lowering defined.
Asserts that the given condition is true.
TensorFlow Dialect
%output = tf.Assert(%condition, %summarize)
TOSA Lowering
No TOSA lowering defined.
Adds a value to the current value of a variable.
TensorFlow Dialect
%output = tf.AssignAddVariableOp(%resource, %value, %dtype)
TOSA Lowering
No TOSA lowering defined.
Subtracts a value from the current value of a variable.
TensorFlow Dialect
%output = tf.AssignSubVariableOp(%resource, %value, %dtype)
TOSA Lowering
No TOSA lowering defined.
Assigns a new value to a variable.
TensorFlow Dialect
%output = tf.AssignVariableOp(%resource, %value, %dtype)
TOSA Lowering
No TOSA lowering defined.
Performs average pooling on the input.
TensorFlow Dialect
%output = tf.AvgPool(%value) {ksize, strides, padding, data_format}
TOSA Lowering
assert(data_format == "NHWC")
tosa_padding =
get_padding_values_from_pad_type(%input, ksize, padding, data_format,
FORMAT_OHWI, strides, {1, 1, 1, 1})
%output = tosa.AVG_POOL2D(%value) {ksize=ksize, strides=strides, padding=tosa_padding}
Multiplies slices of two tensors in batches.
TensorFlow Dialect
%output = tf.BatchMatMul(%x, %y, %adj_x, %adj_y)
TOSA Lowering
No TOSA lowering defined.
Multiplies slices of two tensors in batches.
TensorFlow Dialect
%output = tf.BatchMatMulV2(%x, %y, %adj_x, %adj_y)
TOSA Lowering
No TOSA lowering defined.
✗ Deprecated operator.
BatchToSpaceND for N-D tensors of type T.
TensorFlow Dialect
%output = tf.BatchToSpaceND(%input, %block_shape, %crops)
TOSA Lowering
%output = lower_batch_to_space_nd_op(%input, %block_shape, %crops, %output.shape)
Training profile: TOSA lowering not yet defined.
Add bias to value.
TensorFlow Dialect
%output = tf.BiasAdd(%bias, %value) {data_format}
TOSA Lowering
assert(data_format == 'NHWC')
%output = tosa.ADD(%value, %bias)
Bitcasts a tensor from one type to another without copying data.
TensorFlow Dialect
%output = tf.BitCast(%input, %dtype)
TOSA Lowering
No TOSA lowering defined.
This operator is trivially lowered to tosa.BITWISE_AND.
This operator is trivially lowered to tosa.BITWISE_OR.
Training profile: TOSA lowering not yet defined.
No TOSA lowering defined.
This operator is trivially lowered to tosa.CAST.
This operator is trivially lowered to tosa.CEIL.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined. Training profile: TOSA lowering not yet defined.
No TOSA lowering defined.
Concatenates tensors along one dimension.
TensorFlow Dialect
%output = tf.ConcatV2(%values, %axis)
TOSA Lowering
%output = lower_concatv2_op(%output.type, %values, %axis.as_constant())
No TOSA lowering defined.
This operator is trivially lowered to tosa.CONST.
No TOSA lowering defined.
Computes the gradients of convolution with respect to the input.
TensorFlow Dialect
%output = tf.Conv2DBackpropInput(%input_sizes, %filter, %out_backprop) {strides, use_cudnn_on_gpu, padding, explicit_paddings, data_format, dilations}
TOSA Lowering
// Transpose filter from HWIO to OHWI
%tosa_filter = tosa.TRANSPOSE(%filter) {perms={2, 0, 1, 3}}
vector output_shape
for (int32 i = 0; i < %input_sizes.num_elements; i++) {
output_shape.push_back(%input_sizes.as_constant()[i])
}
if (%padding == "EXPLICIT") {
tosa_padding =
get_padding_values_from_explicit_pad_attr(explicit_padding, data_format)
} else {
tosa_padding =
get_transpose_conv2d_padding_values_from_pad_type(%input_sizes, %filter, output_shape, padding, data_format, FORMAT_HWIO, strides, dilations)
}
// Create a zero bias tensor
%zero_bias = tosa.CONST() {value={0}}
%output = tosa.TRANSPOSE_CONV2D(%out_backprop) {weight=%tosa_filter, bias=%zero_bias, outpad=tosa_padding, stride=strides, dilation=dilations, out_shape=output_shape}
Computes a 2-D convolution given 4-D input and filter tensors.
TensorFlow Dialect
%output = tf.Conv2D(%input, %filter) {strides, padding, explicit_paddings, data_format, dilations}
TOSA Lowering
assert(data_format == "NHWC")
// Transpose filter from HWIO to OHWI
%filter_transpose = tosa.TRANSPOSE(%filter) {perms={3, 0, 1, 2}}
if (padding == "EXPLICIT") {
tosa_padding =
get_padding_values_from_explicit_pad_attr(explicit_padding, data_format)
} else {
tosa_padding =
get_padding_values_from_pad_type(%input, %filter.shape, padding, data_format,
FORMAT_HWIO, strides, dilations)
}
// Create a zero bias tensor
%zero_bias = tosa.CONST() {value={0}}
%output = tosa.CONV2D(%input, %filter_transpose, %zero_bias) {padding=tosa_padding, stride=strides, dilation=dilations}
TOSA lowering to tosa.CONV3D to be defined.
No TOSA lowering defined.
No TOSA lowering defined.
DepthToSpace for tensors of type T.
TensorFlow Dialect
%output = tf.DepthToSpace(%input) {block_size, data_format}
TOSA Lowering
%output = lower_depth_to_space_op(%input, block_size, data_format)
Computes a 2-D depthwise convolution given 4-D input and filter tensors.
TensorFlow Dialect
%output = tf.DepthwiseConv2dNative(%input, %filter) {strides, padding, data_format, dilations}
TOSA Lowering
if (padding == "EXPLICIT") {
tosa_padding =
get_padding_values_from_explicit_pad_attr(explicit_padding, data_format)
} else {
tosa_padding =
get_padding_values_from_pad_type(%input, %filter.shape, padding, data_format,
FORMAT_HWIO, strides, dilations)
}
bias_dim = %filter.shape[2] * %filter.shape[3]
// Create a zero-bias tensor
%zero_bias = tosa.CONST() {value={0} * bias_dim}
%output = tosa.DEPTHWISE_CONV2D(%input, %filter, %zero_bias) {stride=strides, dilation=dilations, padding=tosa_padding}
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Computes exponential linear: exp(features) - 1 if features < 0, features otherwise.
TensorFlow Dialect
%output = tf.Elu(%features)
TOSA Lowering
%output = lower_elu_op(%features)
No TOSA lowering defined.
Returns the truth value of (x == y) element-wise with broadcasting.
TensorFlow Dialect
%output = tf.Equal(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.EQUAL.
This operator is trivially lowered to tosa.EXP.
Inserts a dimension of 1 into a tensor’s shape
TensorFlow Dialect
%output = tf.ExpandDims(%input, %axis)
TOSA Lowering
%output = lower_expand_dims(%input, %axis.as_constant())
Fake-quantize the 'inputs' tensor, type float to 'outputs' tensor of same type.
TensorFlow Dialect
%output = tf.FakeQuantWithMinMaxArgs(%inputs) {min, max, num_bits, narrow_range}
TOSA Lowering
%output = lower_fake_quant_op(%inputs, %output.type, min, max, num_bits, narrow_range)
Fake-quantize the 'inputs' tensor of type float via global float scalars min and max.
TensorFlow Dialect
%output = tf.FakeQuantWithMinMaxVars(%inputs, %min, %max) {num_bits, narrow_range}
TOSA Lowering
%output = lower_fake_quant_op(%inputs, %output.type, %min.to_constant(), %max.to_constant(), num_bits, narrow_range)
Fake-quantize the 'inputs' tensor of type float via per-channel floats min and max of shape [d].
TensorFlow Dialect
%output = tf.FakeQuantWithMinMaxVarsPerChannel(%inputs, %min, %max) {num_bits, narrow_range}
No TOSA lowering defined.
Creates a tensor filled with a scalar value
TensorFlow Dialect
%output = tf.Fill(%dims, %value)
TOSA Lowering
int64 total_size = 1
for (int32 i = 0; i < %dims.shape[0]; i++) {
total_size *= %dims.as_constant()[i]
}
vector<%value.dtype> fill_arr(total_size, %value)
%output = tosa.CONST() {value={fill_arr}}
Returns x // y element-wise.
TensorFlow Dialect
%output = tf.FloorDiv(%x, %y)
TOSA Lowering
%output = lower_floor_div(%x, %y)
Returns the element-wise remainder of division. When x < 0 xor y < 0 is true, this follows Python semantics in that the result is consistent with a flooring divide.
TensorFlow Dialect
%output = tf.FloorMod(%x, %y)
TOSA Lowering
%output = lower_floor_mod(%x, %y)
This operator is trivially lowered to tosa.FLOOR.
Training profile: TOSA lowering not yet defined.
Training profile: TOSA lowering not yet defined.
Training profile: TOSA lowering not yet defined.
Batch normalization.
TensorFlow Dialect
%output = tf.FusedBatchNorm(%x, %scale, %offset, %mean, %variance) {epsilon, data_format, is_training}
TOSA Lowering
assert(data_format == 'NHWC')
assert(is_training == false)
%epsilon_const = tosa.CONST() {value={epsilon}}
%op1 = tosa.SUB(%x, %mean)
%op2 = tosa.ADD(%variance, %epsilon_const)
%op3 = tosa.RSQRT(%op2)
%op4 = tosa.MUL(%op1, %op3)
%op5 = tosa.MUL(%op4, %scale)
%output = tosa.ADD(%op5, %offset)
Batch normalization.
TensorFlow Dialect
%output = tf.FusedBatchNormV3(%x, %scale, %offset, %mean, %variance) {epsilon, data_format, is_training}
TOSA Lowering
assert(data_format == 'NHWC')
assert(is_training == false)
%epsilon_const = tosa.CONST() {value={epsilon}}
%op1 = tosa.SUB(%x, %mean)
%op2 = tosa.ADD(%variance, %epsilon_const)
%op3 = tosa.RSQRT(%op2)
%op4 = tosa.MUL(%op1, %op3)
%op5 = tosa.MUL(%op4, %scale)
%output = tosa.ADD(%op5, %offset)
Gather slices from params into a Tensor with shape specified by indices.
TensorFlow Dialect
%output = tf.GatherNd(%params, %indices)
TOSA Lowering
%output = lower_gather_nd_op(%params, %indices)
Gathers slices from params according to indices.
TensorFlow Dialect
%output = tf.Gather(%params, %indices)
TOSA Lowering
%output = lower_gather_op(%params, %indices, 0, 0)
Gathers slices from params axis according to indices.
TensorFlow Dialect
%output = tf.GatherV2(%params, %indices, %axis) {batch_dims}
TOSA Lowering
%output = lower_gather_op(%params, %indices, batch_dims, %axis.to_constant())
Returns the truth value of (x >= y) element-wise with broadcasting.
TensorFlow Dialect
%output = tf.GreaterEqual(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.GREATER_EQUAL.
Returns the truth value of (x > y) element-wise with broadcasting.
TensorFlow Dialect
%output = tf.Greater(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.GREATER.
No TOSA lowering defined.
Returns a list of tensors with the same shapes and contents as the input.
TensorFlow Dialect
%output = tf.IdentityN(%input)
TOSA Lowering
%output = tosa.IDENTITYN(%input)
Returns a tensor with the same shape and contents as the input.
TensorFlow Dialect
%output = tf.Identity(%input)
TOSA Lowering
%output = tosa.IDENTITY(%input)
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
This operator is trivially lowered to tosa.BITWISE_NOT.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Training profile: TOSA lowering not yet defined.
No TOSA lowering defined.
Computes rectified linear: max(features, features * alpha).
TensorFlow Dialect
%output = tf.LeakyRelu(%features) {alpha}
TOSA Lowering
%alpha_tensor = tosa.CONST() {value={alpha}}
%features_alpha = tosa.MUL(%features, %alpha_tensor)
%greater = tosa.GREATER(%features, %features_alpha)
%output = tosa.SELECT(%greater, %features, %features_alpha)
Computes the bitwise left-shift of x by y bits, element-wise.
TensorFlow Dialect
%output = tf.LeftShift(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.LOGICAL_LEFT_SHIFT.
No TOSA lowering defined.
Returns the truth value of (x <= y) element-wise with broadcasting.
TensorFlow Dialect
%output = tf.LessEqual(%x, %y)
TOSA Lowering
%output_greater = tosa.GREATER(%x, %y)
%output = tosa.LOGICAL_NOT(%output_greater)
Returns the truth value of (x < y) element-wise with broadcasting.
TensorFlow Dialect
%output = tf.Less(%x, %y)
TOSA Lowering
%output_greater_equal = tosa.GREATER_EQUAL(%x, %y)
%output = tosa.LOGICAL_NOT(%output_greater_equal)
No TOSA lowering defined.
No TOSA lowering defined.
This operator is trivially lowered to tosa.LOG.
Computes log softmax activations.
TensorFlow Dialect
%output = tf.LogSoftmax(%logits)
TOSA Lowering
%output = lower_log_softmax_op(%logits)
Returns the truth value of x AND y, element-wise.
TensorFlow Dialect
%output = tf.LogicalAnd(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.LOGICAL_AND.
This operator is trivially lowered to tosa.LOGICAL_NOT.
Returns the truth value of x OR y, element-wise.
TensorFlow Dialect
%output = tf.LogicalOr(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.LOGICAL_OR.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Multiply the matrix a by the matrix b
TensorFlow Dialect
%output = tf.MatMul(%a, %b)
TOSA Lowering
%output = tosa.MATMUL(%a, %b)
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Computes the maximum of elements across dimensions of a tensor.
TensorFlow Dialect
%output = tf.Max(%input, %reduction_indices) {keep_dims}
TOSA Lowering
%output = lower_reduce_op<tosa.REDUCE_MAX>(%input, %output.shape, %reduction_indices, keep_dims)
Training profile: TOSA lowering not yet defined.
Performs max pooling on the input.
TensorFlow Dialect
%output = tf.MaxPool(%input) {ksize, strides, padding, data_format}
TOSA Lowering
assert(data_format == "NHWC")
tosa_padding =
get_padding_values_from_pad_type(%input, ksize, padding, data_format,
FORMAT_OHWI, strides, {1, 1, 1, 1})
%output = tosa.MAX_POOL2D(%input) {ksize=ksize, strides=strides, padding=tosa_padding}
This operator is trivially lowered to tosa.MAXIMUM.
Computes the mean of elements across dimensions of a tensor.
TensorFlow Dialect
%output = tf.Mean(%input, %reduction_indices) {keep_dims}
TOSA Lowering
int32 num_elements_on_axis = 1
for (int32 axis : %reduction_indices) {
num_elements_on_axis *= %input.shape[axis]
}
float32 div_scale = 1.0 / num_elements_on_axis
%cst_div_scale = tosa.CONST() {value={div_scale}}
%op1_rsum_in = lower_reduce_op<tosa.REDUCE_SUM>(%input, %output.shape, %reduction_indices, keep_dims)
%output = tosa.MUL(%op1_rsum_in, %cst_div_scale)
Computes the minimum of elements across dimensions of a tensor.
TensorFlow Dialect
%output = tf.Min(%input, %reduction_indices) {keep_dims}
TOSA Lowering
%output = lower_reduce_op<tosa.REDUCE_MIN>(%input, %output.shape, %reduction_indices, keep_dims)
This operator is trivially lowered to tosa.MINIMUM.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Returns the product of x and y, element-wise.
TensorFlow Dialect
%output = tf.Mul(%x, %y)
TOSA Lowering This operator is trivially lowered to tosa.MUL.
This operator is trivially lowered to tosa.NEGATE.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Returns the truth value of (x != y) element-wise with broadcasting.
TensorFlow Dialect
%output = tf.NotEqual(%x, %y)
TOSA Lowering
%equal = tosa.EQUAL(%x, %y)
%output = tosa.LOGICAL_NOT(%equal)
OneHot operator.
TensorFlow Lite Dialect
%output = tf.OneHot(%indices, %depth, %on_value, %off_value) {axis}
TOSA Lowering
%output = lower_one_hot_op(%indices, %depth, %on_value, %off_value, axis)
No TOSA lowering defined.
Packs a list of N rank-R tensors into one rank-(R+1) tensor.
TensorFlow Dialect
%output = tf.Pack(%values) {axis}
TOSA Lowering
%output = lower_pack_op(%values, axis)
This operator is trivially lowered to tosa.PAD.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Not seen in practice. No lowering needed.
Not seen in practice. No lowering needed.
This operator is trivially lowered to tosa.POW.
Training profile: TOSA lowering not yet defined.
Computes the product of elements across dimensions of a tensor.
TensorFlow Dialect
%output = tf.Prod(%input, %reduction_indices) {keep_dims}
TOSA Lowering
%output = lower_reduce_op<tosa.REDUCE_PRODUCT>(%input, %output.shape, %reduction_indices, keep_dims)
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Returns the rank of the tensor.
TensorFlow Dialect
%output = tf.Rank(%input)
TOSA Lowering
%output = tosa.CONST() {value={%input.rank}}
No TOSA lowering defined.
Returns x / y element-wise for real types.
TensorFlow Dialect
%output = tf.RealDiv(%x, %y)
TOSA Lowering
%recip = tosa.RECIPROCAL(%y)
%output = tosa.MUL(%x, %recip)
No TOSA lowering defined.
This operator is trivially lowered to tosa.RECIPROCAL.
Computes rectified linear 6: min(max(features, 0), 6).
TensorFlow Dialect
%output = tf.Relu6(%features)
TOSA Lowering
%output = tosa.RELUN(%features) {max_val=6}
Training profile: TOSA lowering not yet defined.
Computes rectified linear: max(features, 0)
TensorFlow Dialect
%output = tf.Relu(%features)
TOSA Lowering
%output = tosa.RELUN(%features) {max_val=numeric_limits<float32>::max()}
Reshapes a tensor.
TensorFlow Dialect
%output = tf.Reshape(%tensor, %shape)
TOSA Lowering
%output = tosa.RESHAPE(%tensor) {new_shape=%shape.as_constant()}
Resizes images to size using bilinear interpolation.
TensorFlow Dialect
%output = tf.ResizeBilinear(%images, %size) {align_corners, half_pixel_centers}
The resize parameters are inferred from the output shape.
TOSA Lowering
%output = lower_resize_op(%images, %size, %output.shape, float32, "BILINEAR")
Resizes images to size using nearest neighbor interpolation.
TensorFlow Dialect
%output = tf.ResizeNearestNeighbor(%images, %size) {align_corners, half_pixel_centers}
The resize parameters are inferred from the output shape.
TOSA Lowering
%output = lower_resize_op(%images, %size, %output.shape, float32, "NEAREST_NEIGHBOR")
Training profile: TOSA lowering not yet defined.
Training profile: TOSA lowering not yet defined.
Training profile: TOSA lowering not yet defined.
Training profile: TOSA lowering not yet defined.
Training profile: TOSA lowering not yet defined.
No TOSA lowering defined.
Reverses specific dimensions of a tensor.
TensorFlow Dialect
%output = tf.ReverseV2(%tensor, %axis)
TOSA Lowering
%output = lower_reverse_v2_op(%tensor, %axis)
Computes the bitwise right-shift of x by y bits, element-wise.
TensorFlow Dialect
%output = tf.RightShift(%x, %y)
TOSA Lowering
if (is_unsigned(%x.dtype)) {
%output = tosa.LOGICAL_RIGHT_SHIFT(%x, %y)
} else {
%output = tosa.ARITHMETIC_RIGHT_SHIFT(%x, %y)
}
Rounds the values of a tensor to the nearest integer, element-wise.
TensorFlow Dialect
%output = tf.Round(%x)
TOSA Lowering
%output = lower_round_op(%x)
Training profile: TOSA lowering not yet defined.
This operator is trivially lowered to tosa.RSQRT.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Selects elements from t or e depending on condition.
TensorFlow Dialect
%output = tf.SelectV2(%condition, %t, %e)
TOSA Lowering
%output = lower_selectv2_op(%condition, %t, %e, %output.shape)
No TOSA lowering defined.
Returns the shape of a tensor.
TensorFlow Dialect
%output = tf.Shape(%input)
TOSA Lowering
%output = lower_shape_op(%input)
This operator is trivially lowered to tosa.SIGMOID.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Returns a slice from input.
TensorFlow Dialect
%output = tf.Slice(%input, %begin, %size)
TOSA Lowering
vector <size_t> output_size
try {
output_size = %size.as_constant()
} except(ConversionFailed) {
output_size = %output.shape
}
%output = tosa.SLICE(%input) {start=%begin.as_constant(), size=output_size}
No TOSA lowering defined.
Training profile: TOSA lowering not yet defined.
Computes softmax activations
TensorFlow Dialect
%output = tf.Softmax(%logits)
TOSA Lowering
%op1 = tosa.EXP(%logits)
%op2 = tosa.REDUCE_SUM(%op1) {axis=(%logits.rank - 1)}
%op3 = tosa.RECIPROCAL(%op2)
%output = tosa.MUL(%op1, %op3)
No TOSA lowering defined.
SpaceToBatch for N-D tensors of type T.
TensorFlow Dialect
%output = tf.SpaceToBatchND(%input, %block_shape, %paddings)
TOSA Lowering
%output = lower_space_to_batch_nd_op(%input, %block_shape, %paddings)
SpaceToDepth for tensors of type T.
TensorFlow Dialect
%output = tf.SpaceToDepth(%input) {block_size, data_format}
TOSA Lowering
%output = lower_space_to_depth_op(%input, block_size, data_format)
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Splits a tensor into num_split tensors along one dimension
TensorFlow Dialect
%output = tf.Split(%split_dim, %value) {num_split}
TOSA Lowering
%output = lower_split_op(%value, %split_dim.as_constant(), num_split)
Splits a tensor into num_split tensors along one dimension
TensorFlow Dialect
%output = tf.SplitV(%value, %size_splits, %split_dim) {num_split}
TOSA Lowering
%output = lower_splitv_op(%value, %size_splits.as_constant(), %split_dim.as_constant())
No TOSA lowering defined.
Computes the square of x, element-wise.
TensorFlow Dialect
%output = tf.Square(%x)
TOSA Lowering
%output = tosa.MUL(%x, %x)
Computes (x-y)*(x-y) element-wise
TensorFlow Dialect
%output = tf.SquaredDifference(%x, %y)
TOSA Lowering
%diff = tosa.SUB(%x, %y)
%output = tosa.MUL(%diff, %diff)
Removes dimensions of size 1 from the shape of a tensor.
TensorFlow Dialect
%output = tf.Squeeze(%input) {squeeze_dims}
TOSA Lowering
%output = lower_squeeze_op(%input, squeeze_dims)
No TOSA lowering defined.
Training profile: TOSA lowering not yet defined.
Training profile: TOSA lowering not yet defined.
Return a strided slice from input.
TensorFlow Dialect
%output = tf.StridedSlice(%input, %begin, %end, %strides) {begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask}
TOSA Lowering
%output = lower_strided_slice_op(%input, %begin, %end, %strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask)
This operator is trivially lowered to tosa.SUB.
Computes the sum of elements across dimensions of a tensor.
TensorFlow Dialect
%output = tf.Sum(%input, %reduction_indices) {keep_dims}
TOSA Lowering
%output = lower_reduce_op<tosa.REDUCE_SUM>(%input, %output.shape, %reduction_indices, keep_dims)
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Training profile: TOSA lowering not yet defined.
This operator is trivially lowered to tosa.TANH.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Constructs a tensor by tiling a given tensor.
TensorFlow Dialect
%output = tf.Tile(%input, %multiples)
TOSA Lowering
%output = tosa.TILE(%input) {multiples=%multiples.as_constant()}
No TOSA lowering defined.
No TOSA lowering defined.
Shuffle dimensions of x according to a permutation.
TensorFlow Dialect
%output = tf.Transpose(%x, %perm)
TOSA Lowering
%output = tosa.TRANSPOSE(%x) {perm=%perm.as_constant()}
No TOSA lowering defined.
No TOSA lowering defined.
Unpacks a given dimension of a rank-R tensor into num rank-(R-1) tensors.
TensorFlow Dialect
%output = tf.Unpack(%value) {axis, num}
TOSA Lowering
%output = lower_unpack_op(%value, axis, num)
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Returns a tensor of zeros with the same shape and type as x.
TensorFlow Dialect
%output = tf.ZerosLike(%x)
TOSA Lowering
%output = tosa.CONST() {value={0} * %x.num_elements}
This operator is trivially lowered to tosa.ABS.
add_n operator.
TensorFlow Lite Dialect
%sum = tfl.add_n(%inputs)
TOSA Lowering
%output = tosa.ADD(%inputs:0, %inputs:1)
for (int32 i = 2; i < %inputs.size; i++) {
%output = tosa.ADD(%inputs:i, %output)
}
Element-wise addition operation.
TensorFlow Lite Dialect
%output = tfl.add(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%result = tosa.ADD(%lhs, %rhs)
If input/output tensors are all quantized typed,
Prepare:
int32 input_shift = 20 // headroom bits used when rescaling into the i32 domain
float64 max_scale_2x = 2.0 * max(%lhs.scale, %rhs.scale)
float64 lhs_scale = float64(1 << input_shift) * %lhs.scale / max_scale_2x
float64 rhs_scale = float64(1 << input_shift) * %rhs.scale / max_scale_2x
float64 output_scale = max_scale_2x / (%output.scale * float64(1 << input_shift))
Legalization:
%op1_rescale_lhs = tosa.RESCALE(%lhs) {scale=lhs_scale, input_zp=%lhs.zp, output_zp=0} // %lhs.dtype->i32
%op2_rescale_rhs = tosa.RESCALE(%rhs) {scale=rhs_scale, input_zp=%rhs.zp, output_zp=0} // %rhs.dtype->i32
%op3_add_op1_op2 = tosa.ADD(%op1_rescale_lhs, %op2_rescale_rhs)
%op4_rescale_op3 = tosa.RESCALE(%op3_add_op1_op2) {scale=output_scale, input_zp=0, output_zp=%output.zp} // i32->%output.dtype
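As a worked example with hypothetical quantization parameters (not taken from any particular model), suppose %lhs.scale = 0.02, %rhs.scale = 0.01 and %output.scale = 0.03, with input_shift = 20 as above:
float64 max_scale_2x = 2.0 * 0.02 // 0.04
float64 lhs_scale = float64(1 << 20) * 0.02 / 0.04 // 524288.0
float64 rhs_scale = float64(1 << 20) * 0.01 / 0.04 // 262144.0
float64 output_scale = 0.04 / (0.03 * float64(1 << 20)) // ~1.27e-6
// Net effect: lhs_scale * output_scale == %lhs.scale / %output.scale and
// rhs_scale * output_scale == %rhs.scale / %output.scale, so both operands are
// brought onto a shared high-precision grid before the i32 ADD.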
ArgMax operator.
TensorFlow Lite Dialect
%output = tfl.arg_max(%input, %dim)
TOSA Lowering
%result = tosa.ARGMAX(%input) {axis=positive_axis(%dim.as_constant(), %input.rank)}
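positive_axis() is not expanded elsewhere in this document; a minimal sketch of the assumed helper, which wraps a negative axis into the valid range, is:
int32 positive_axis(int32 axis, int32 rank)
{
    return (axis >= 0) ? axis : axis + rank
}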
No TOSA lowering defined.
Average_pool_2d operator.
TensorFlow Lite Dialect
%output = tfl.average_pool_2d(%input) {filter_height, filter_width, padding, stride_h, stride_w, fused_activation_function}
TOSA Lowering
Prepare:
tosa_padding =
get_padding_values_from_pad_type(padding, NHWC, 1,
%input.type, tensor<{filter_height, filter_width}, tosa.int32>,
{1, stride_h, stride_w, 1}, {1, 1, 1, 1})
If input/output tensors are all non-quantized typed,
Legalization:
%avgpool2d = tosa.AVG_POOL2D(%input) {kernel={filter_height, filter_width}, stride={stride_h, stride_w}, padding=tosa_padding}
if(fused_activation != NONE) {
%result = convert_fused_activation(%avgpool2d, fused_activation)
}
else {
%result = %avgpool2d
}
If input/output tensors are all quantized typed,
Legalization:
%avgpool2d = tosa.AVG_POOL2D(%input) {kernel={filter_height, filter_width}, stride={stride_h, stride_w}, padding=tosa_padding, quantization_info={input_zp=%input.zp, output_zp=%output.zp}}
if(fused_activation != NONE) {
%result = convert_fused_activation(%avgpool2d, fused_activation)
}
else {
%result = %avgpool2d
}
No TOSA lowering defined.
BatchToSpaceNd operator.
TensorFlow Lite Dialect
%output = tfl.batch_to_space_nd(%input, %block_shape, %indices)
TOSA Lowering
%result = convert_batch_to_space_nd_op(%input, %block_shape, %indices)
This operator is trivially lowered to tosa.CAST
Ceil operator.
TensorFlow Lite Dialect
%y = tfl.ceil(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
%result = tosa.CEIL(%x)
Concatenation operator.
TensorFlow Lite Dialect
%output = tfl.concatenation(%values) {axis}
TOSA Lowering
%result = lower_concatv2_op(%values, axis)
This operator is trivially lowered to tosa.CONST
Convolution operator.
TensorFlow Lite Dialect
%output = tfl.conv_2d(%input, %filter, %bias) {dilation_h_factor, dilation_w_factor, fused_activation_function, padding, stride_h, stride_w}
TOSA Lowering
If input/output tensors are all non-quantized typed,
Prepare:
tosa_padding =
get_padding_values_from_pad_type(padding, NHWC, 1,
%input.type, %filter.type,
{1, stride_h, stride_w, 1}, {1, dilation_h_factor, dilation_w_factor, 1})
Legalization:
%conv2d = tosa.CONV2D(%input, %filter, %bias) {padding=tosa_padding, stride={stride_h, stride_w}, dilation={dilation_h_factor, dilation_w_factor}}
if(fused_activation != NONE) {
%result = convert_fused_activation(%conv2d, fused_activation_function)
}
else {
%result = %conv2d
}
If input/output tensors are all quantized typed,
Prepare:
float64 output_rescale_scale = (%input.scale * %filter.scale) / %output.scale
tosa_padding =
get_padding_values_from_pad_type(padding, NHWC, 1,
%input.type, %filter.type,
{1, stride_h, stride_w, 1}, {1, dilation_h_factor, dilation_w_factor, 1})
Legalization:
%conv2d = tosa.CONV2D(%input, %filter, %bias) {padding=tosa_padding, stride={stride_h, stride_w}, dilation={dilation_h_factor, dilation_w_factor}, quantization_info={input_zp=%input.zp, weight_zp=%filter.zp}}
%rescale = tosa.RESCALE(%conv2d) {scale=output_rescale_scale, input_zp=0, output_zp=%output.zp} // %conv2d.dtype->%output.dtype
if(fused_activation != NONE) {
%result = convert_fused_activation(%rescale, fused_activation_function)
}
else {
%result = %rescale
}
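As a quick numeric check of the rescale, with hypothetical scales chosen for round numbers (%input.scale = 0.5, %filter.scale = 0.25, %output.scale = 0.125):
float64 output_rescale_scale = (0.5 * 0.25) / 0.125 // 1.0
// i.e. one unit of the i32 accumulator maps to exactly one unit of %output.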
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
DepthToSpace operator.
TensorFlow Lite Dialect
%output = tfl.depth_to_space(%input) {block_size}
TOSA Lowering
%output = lower_depth_to_space_op(%input, block_size, "NHWC")
Depthwise-separable convolution operator.
TensorFlow Lite Dialect
%output = tfl.depthwise_conv_2d(%input, %filter, %bias) {dilation_h_factor, dilation_w_factor, fused_activation_function, padding, stride_h, stride_w, depth_multiplier}
TOSA Lowering
If input/output tensors are all non-quantized typed,
Prepare:
tosa_padding =
get_padding_values_from_pad_type(padding, NHWC, 1,
%input.type, %filter.type,
{1, stride_h, stride_w, 1}, {1, dilation_h_factor, dilation_w_factor, 1})
Legalization:
%depthwise_conv2d = tosa.DEPTHWISE_CONV2D(%input, %filter, %bias) {padding=tosa_padding, stride={stride_h, stride_w}, dilation={dilation_h_factor, dilation_w_factor}}
if(fused_activation != NONE) {
%result = convert_fused_activation(%depthwise_conv2d, fused_activation_function)
}
else {
%result = %depthwise_conv2d
}
If input/output tensors are all quantized typed,
Prepare:
float64 output_rescale_scale = (%input.scale * %filter.scale) / %output.scale
tosa_padding =
get_padding_values_from_pad_type(padding, NHWC, 1,
%input.type, %filter.type,
{1, stride_h, stride_w, 1}, {1, dilation_h_factor, dilation_w_factor, 1})
Legalization:
%depthwise_conv2d = tosa.DEPTHWISE_CONV2D(%input, %filter, %bias) {padding=tosa_padding, stride={stride_h, stride_w}, dilation={dilation_h_factor, dilation_w_factor}, quantization_info={input_zp=%input.zp, weight_zp=%filter.zp}}
%rescale = tosa.RESCALE(%depthwise_conv2d) {scale=output_rescale_scale, input_zp=0, output_zp=%output.zp} // %depthwise_conv2d.dtype->%output.dtype
if(fused_activation != NONE) {
%result = convert_fused_activation(%rescale, fused_activation_function)
}
else {
%result = %rescale
}
Dequantize operator.
TensorFlow Lite Dialect
%output = tfl.dequantize(%input)
TOSA Lowering
%result = lower_dequantize_op(%input, %input.scale, %input.zp)
Division operator.
TensorFlow Lite Dialect
%output = tfl.div(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
%rcp = tosa.RECIPROCAL(%rhs)
%result = tosa.MUL(%lhs, %rcp)
Exponential Linear Unit operator.
TensorFlow Lite Dialect
%y = tfl.elu(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
%result = lower_elu_op(%x)
Embedding lookup operator.
TensorFlow Lite Dialect
%output = tfl.embedding_lookup(%lookup, %value)
This operator is trivially lowered to tosa.EQUAL
Natural exponentiation operator.
TensorFlow Lite Dialect
%y = tfl.exp(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
%result = tosa.EXP(%x)
Inserts a dimension of 1 into a tensor’s shape.
TensorFlow Lite Dialect
%output = tfl.expand_dims(%input, %dim)
TOSA Lowering
%result = lower_expand_dims(%input, %dim.as_constant())
No TOSA lowering defined.
FakeQuant operator
TensorFlow Lite Dialect
%output = tfl.fake_quant(%input) {min, max, num_bits, narrow_range}
TOSA Lowering
%result = convert_fake_quant_op(%input, min, max, num_bits, narrow_range)
Fill the tensor with given value.
TensorFlow Lite Dialect
%res = tfl.fill(%dims, %value)
TOSA Lowering
Prepare:
total_size = 1
dim_vec = %dims.as_constant()
for(int32 i = 0; i < dim_vec.size(); i++) {
total_size *= dim_vec[i]
}
filled_val = %value.as_constant()[0]
output_type = tensor<dim_vec, filled_val.dtype>
Legalization:
%result = tosa.CONST() {value={filled_val} * total_size}
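For instance, if %dims holds {2, 3} and %value holds 4 (hypothetical inputs), the prepare step computes total_size = 6 and the lowering emits:
%result = tosa.CONST() {value={4, 4, 4, 4, 4, 4}} // output type tensor<{2, 3}, int32>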
Floor div operator.
TensorFlow Lite Dialect
%output = tfl.floor_div(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
%recip = tosa.RECIPROCAL(%rhs)
%mul = tosa.MUL(%lhs, %recip)
%result = tosa.FLOOR(%mul)
Division remainder.
TensorFlow Lite Dialect
%output = tfl.floor_mod(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
%recip = tosa.RECIPROCAL(%rhs)
%mul = tosa.MUL(%lhs, %recip)
%floor = tosa.FLOOR(%mul)
%result = tosa.SUB(%mul, %floor)
This operator is trivially lowered to tosa.FLOOR
Fully connected op.
TensorFlow Lite Dialect
%output = tfl.fully_connected(%input, %filter, %bias) {fused_activation_function}
TOSA Lowering
If input/output tensors are all non-quantized typed,
Prepare:
// input[N, IC] x filter[OC, IC] + bias[OC] -> output[N, OC]
auto input_reshape_shape = {%input.num_elements / %filter.shape[1], %filter.shape[1]}
Legalization:
if(!(%bias)) {
%bias_val = tosa.CONST() {value={0} * %filter.shape[0]}
}
else {
%bias_val = %bias
}
if(%input.rank != 2) {
%input_val = tosa.RESHAPE(%input) {shape=input_reshape_shape}
}
else {
%input_val = %input
}
%fc = tosa.FULLY_CONNECTED(%input_val, %filter, %bias_val)
if(fused_activation != NONE) {
%result = convert_fused_activation(%fc, fused_activation_function)
}
else {
%result = %fc
}
If input/output tensors are all quantized typed,
Prepare:
auto input_reshape_shape = {%input.num_elements / %filter.shape[1], %filter.shape[1]}
float64 output_rescale_scale = (%input.scale * %filter.scale) / %output.scale
Legalization:
if(!(%bias)) {
%bias_val = tosa.CONST() {value={0} * %filter.shape[0]}
}
else {
%bias_val = %bias
}
if(%input.rank != 2) {
%input_val = tosa.RESHAPE(%input) {shape=input_reshape_shape}
}
else {
%input_val = %input
}
%fc = tosa.FULLY_CONNECTED(%input_val, %filter, %bias_val)
%rescale = tosa.RESCALE(%fc) {scale=output_rescale_scale, input_zp=0, output_zp=%output.zp} // %fc.dtype->%output.dtype
if(fused_activation != NONE) {
%result = convert_fused_activation(%rescale, fused_activation_function)
}
else {
%result = %rescale
}
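To illustrate the rank-2 reshape with hypothetical shapes: for %input of shape {2, 3, 4} and %filter of shape {8, 4} (i.e. OC=8, IC=4),
auto input_reshape_shape = {%input.num_elements / %filter.shape[1], %filter.shape[1]} // {24 / 4, 4} = {6, 4}
so the input is flattened to a {6, 4} matrix and tosa.FULLY_CONNECTED produces a {6, 8} result.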
Gather_nd operator.
TensorFlow Lite Dialect
%output = tfl.gather_nd(%params, %indices)
TOSA Lowering
%output = lower_gather_nd_op(%params, %indices)
Gather operator.
TensorFlow Lite Dialect
%output = tfl.gather(%params, %indices) {axis}
TOSA Lowering
%output = lower_gather_op(%params, %indices, 0, axis)
This operator is trivially lowered to tosa.GREATER_EQUAL
This operator is trivially lowered to tosa.GREATER
Hardswish activation function.
TensorFlow Lite Dialect
%output = tfl.hard_swish(%input)
TOSA Lowering
If input/output tensors are all non-quantized typed,
%const_3 = tosa.CONST() {value={3.0}}
%const_rcp6 = tosa.CONST() {value={1.0 / 6.0}}
%op1_add_in_3 = tosa.ADD(%input, %const_3)
%op2_relun_op1 = tosa.RELUN(%op1_add_in_3) {max=6.0}
%op3_mul_in_op2 = tosa.MUL(%input, %op2_relun_op1)
%op4_mul_op3_rcp6 = tosa.MUL(%op3_mul_in_op2, %const_rcp6)
If input/output tensors are all quantized typed,
Prepare:
float64 input_sample_grain = 1.0 / 64.0
auto hardswish_func = [input_sample_grain](int32 x) -> int32 {
float64 v = (float64)x * input_sample_grain
float64 w = v + 3.0
w = (w < 0.0) ? 0.0 : ((w > 6.0) ? 6.0 : w)
v = (v * w) / 6.0
return std::lround(32768.0 * v)
}
float64 input_rescale_scale = (%input.scale * 128.0) / input_sample_grain
float64 output_rescale_scale = 1.0 / (128.0 * 32768.0 * %output.scale)
int32 quantized_3 = (int32)(std::ceil(3.0 / %input.scale)) + %input.zp
Legalization:
%table_const = get_table_const_tensor(hardswish_func)
%const_3 = tosa.CONST() {value={quantized_3}}
%op1_rescale_in = tosa.RESCALE(%input) {scale=input_rescale_scale, input_zp=%input.zp, output_zp=0} // %input.dtype->i16
%op2_table_op1 = tosa.TABLE(%op1_rescale_in, %table_const)
%op3_rescale_op2 = tosa.RESCALE(%op2_table_op1) {scale=output_rescale_scale, input_zp=0, output_zp=%output.zp} // i32->%output.dtype
%op4_rescale_in = tosa.RESCALE(%input) {scale=1.0, input_zp=0, output_zp=0} // %input.dtype->i32
%op5_ge_op4 = tosa.GREATER_EQUAL(%op4_rescale_in, %const_3)
%op6_select_op5_in_op3 = tosa.SELECT(%op5_ge_op4, %input, %op3_rescale_op2)
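To see what get_table_const_tensor() captures here, hardswish_func can be spot-checked at a few sample points (input_sample_grain = 1/64):
// x = -256: v = -4.0, w clamps to 0.0, v = 0.0       -> returns 0
// x = 0:    v = 0.0,  w = 3.0, v = 0.0               -> returns 0
// x = 64:   v = 1.0,  w = 4.0, v = (1.0 * 4.0) / 6.0 -> returns lround(32768.0 * 0.6667) = 21845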
No TOSA lowering defined.
No TOSA lowering defined.
Leaky Relu Operator.
TensorFlow Lite Dialect
%output = tfl.leaky_relu(%input) {alpha}
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%const_0 = tosa.CONST() {value={0.0}}
%const_alpha = tosa.CONST() {value={alpha}}
%op1_mul_in_alpha = tosa.MUL(%input, %const_alpha)
%op2_ge_in_0 = tosa.GREATER_EQUAL(%input, %const_0)
%op3_select_op2_in_op1 = tosa.SELECT(%op2_ge_in_0, %input, %op1_mul_in_alpha)
If input/output tensors are all quantized typed,
Prepare:
float32 scaled_alpha = (%input.scale * alpha) / %output.scale
float32 scaled_identity = %input.scale / %output.scale
Legalization:
%const_0 = tosa.CONST() {value={0}}
%op1_rescale_in = tosa.RESCALE(%input) {scale=1.0, input_zp=%input.zp, output_zp=0} // %input.dtype->i32
%op2_ge_in_0 = tosa.GREATER_EQUAL(%op1_rescale_in, %const_0)
%op3_rescale_in_alpha = tosa.RESCALE(%input) {scale=scaled_alpha, input_zp=%input.zp, output_zp=%output.zp} // %input.dtype->%output.dtype
%op4_rescale_in_identity = tosa.RESCALE(%input) {scale=scaled_identity, input_zp=%input.zp, output_zp=%output.zp} // %input.dtype->%output.dtype
%op5_select_op2_op3_op4 = tosa.SELECT(%op2_ge_in_0, %op4_rescale_in_identity, %op3_rescale_in_alpha)
Less_equal operator.
TensorFlow Lite Dialect
%output = tfl.less_equal(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_greater_lhs_rhs = tosa.GREATER(%lhs, %rhs)
%op2_not_op1 = tosa.LOGICAL_NOT(%op1_greater_lhs_rhs)
If input/output tensors are all quantized typed,
Legalization:
assert (%lhs.scale == %rhs.scale) && (%lhs.zp == %rhs.zp)
%op1_rescale_lhs = tosa.RESCALE(%lhs) {scale=1.0, input_zp=%lhs.zp, output_zp=0} // %lhs.dtype->i32
%op2_rescale_rhs = tosa.RESCALE(%rhs) {scale=1.0, input_zp=%rhs.zp, output_zp=0} // %rhs.dtype->i32
%op3_greater_op1_op2 = tosa.GREATER(%op1_rescale_lhs, %op2_rescale_rhs)
%op4_not_op3 = tosa.LOGICAL_NOT(%op3_greater_op1_op2)
Less operator.
TensorFlow Lite Dialect
%output = tfl.less(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_ge_lhs_rhs = tosa.GREATER_EQUAL(%lhs, %rhs)
%op2_not_op1 = tosa.LOGICAL_NOT(%op1_ge_lhs_rhs)
If input/output tensors are all quantized typed,
Legalization:
assert (%lhs.scale == %rhs.scale) && (%lhs.zp == %rhs.zp)
%op1_rescale_lhs = tosa.RESCALE(%lhs) {scale=1.0, input_zp=%lhs.zp, output_zp=0} // %lhs.dtype->i32
%op2_rescale_rhs = tosa.RESCALE(%rhs) {scale=1.0, input_zp=%rhs.zp, output_zp=0} // %rhs.dtype->i32
%op3_ge_op1_op2 = tosa.GREATER_EQUAL(%op1_rescale_lhs, %op2_rescale_rhs)
%op4_not_op3 = tosa.LOGICAL_NOT(%op3_ge_op1_op2)
No TOSA lowering defined.
No TOSA lowering defined.
Log softmax operator.
TensorFlow Lite Dialect
%output = tfl.log_softmax(%input)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%output = lower_log_softmax_op(%input)
No TOSA lowering defined if input/output tensors are all quantized typed.
This operator is trivially lowered to tosa.LOGICAL_AND
This operator is trivially lowered to tosa.LOGICAL_NOT
This operator is trivially lowered to tosa.LOGICAL_OR
Logistic operator.
TensorFlow Lite Dialect
%y = tfl.logistic(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_sigmoid_in = tosa.SIGMOID(%x)
If input/output tensors are all quantized typed,
Prepare:
float64 input_sample_grain = 1.0 / 16.0
auto sigmoid_func = [input_sample_grain](int32 x) -> int32 {
float64 v = static_cast<float64>(x) * input_sample_grain
v = 1.0 / (1.0 + std::exp(-v))
return std::lround(32768.0 * v)
}
float32 input_rescale_scale = (%x.scale * 128.0) / input_sample_grain
float32 output_rescale_scale = 1.0 / (%y.scale * 32768.0 * 128.0)
Legalization:
%table_const = get_table_const_tensor(sigmoid_func)
%op1_rescale_in = tosa.RESCALE(%x) {scale=input_rescale_scale, input_zp=%x.zp, output_zp=0} // %x.dtype->i16
%op2_table_op1 = tosa.TABLE(%op1_rescale_in, %table_const)
%op3_rescale_op2 = tosa.RESCALE(%op2_table_op1) {scale=output_rescale_scale, input_zp=0, output_zp=%y.zp} // i32->%y.dtype
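As a worked example with common (but here hypothetical) int8 parameters %x.scale = 1.0/128 and %y.scale = 1.0/256:
float32 input_rescale_scale = ((1.0 / 128.0) * 128.0) / (1.0 / 16.0) // 16.0
float32 output_rescale_scale = 1.0 / ((1.0 / 256.0) * 32768.0 * 128.0) // 1.0 / 16384.0
// The first RESCALE widens int8 into the i16 table domain; the second maps the
// 32768-scaled table output back onto the int8 output grid.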
No TOSA lowering defined.
No TOSA lowering defined.
Max Pool 2d op.
TensorFlow Lite Dialect
%output = tfl.max_pool_2d(%input) {filter_height, filter_width, padding, stride_h, stride_w, fused_activation_function}
TOSA Lowering
Prepare:
tosa_padding =
get_padding_values_from_pad_type(padding, NHWC, 1,
%input.type, tensor<{filter_height, filter_width}, tosa.int32>,
{1, stride_h, stride_w, 1}, {1, 1, 1, 1})
If input/output tensors are all non-quantized typed,
Legalization:
%maxpool2d = tosa.MAX_POOL2D(%input) {kernel={filter_height, filter_width}, stride={stride_h, stride_w}, padding=tosa_padding}
if(fused_activation != NONE) {
%result = convert_fused_activation(%maxpool2d, fused_activation)
}
else {
%result = %maxpool2d
}
If input/output tensors are all quantized typed,
Legalization:
%maxpool2d = tosa.MAX_POOL2D(%input) {kernel={filter_height, filter_width}, stride={stride_h, stride_w}, padding=tosa_padding, quantization_info={input_zp=%input.zp, output_zp=%output.zp}}
if(fused_activation != NONE) {
%result = convert_fused_activation(%maxpool2d, fused_activation)
}
else {
%result = %maxpool2d
}
No TOSA lowering defined.
No TOSA lowering defined.
This operator is trivially lowered to tosa.MAXIMUM
Mean operator.
TensorFlow Lite Dialect
%output = tfl.mean(%input, %axis) {keep_dims}
TOSA Lowering
Prepare:
int32 num_elements_on_axis = 1
for (int32 dim : %axis.as_constant()) {
num_elements_on_axis *= %input.shape[dim]
}
float32 div_scale = 1.0 / num_elements_on_axis
If input/output tensors are all non-quantized typed,
Legalization:
%cst_div_scale = tosa.CONST() {value={div_scale}}
%op1_rsum_in = lower_reduce_op<tosa.REDUCE_SUM>(%input, %output.shape, %axis, keep_dims)
%op2_mul_op1 = tosa.MUL(%op1_rsum_in, %cst_div_scale)
If input/output tensors are all quantized typed,
Legalization:
%rsum = lower_reduce_op<tosa.REDUCE_SUM>(%input, %output.shape, %axis, keep_dims, 1.0f, %input.zp, div_scale * %input.scale / %output.scale, %output.zp)
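For example, with a hypothetical %input of shape {4, 10} reduced over axis 1:
int32 num_elements_on_axis = 10 // = %input.shape[1]
float32 div_scale = 1.0 / 10 // 0.1, so the REDUCE_SUM result is scaled by 1/10 to form the mean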
This operator is trivially lowered to tosa.MINIMUM
No TOSA lowering defined.
Mul operator.
TensorFlow Lite Dialect
%output = tfl.mul(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_mul_in = tosa.MUL(%lhs, %rhs)
If input/output tensors are all quantized typed,
Legalization:
%op1_rescale_lhs = tosa.RESCALE(%lhs) {scale=1.0f, input_zp=%lhs.zp, output_zp=0} // %lhs.dtype->i32
%op2_rescale_rhs = tosa.RESCALE(%rhs) {scale=1.0f, input_zp=%rhs.zp, output_zp=0} // %rhs.dtype->i32
%op3_mul_op1_op2 = tosa.MUL(%op1_rescale_lhs, %op2_rescale_rhs)
%op4_rescale_op3 = tosa.RESCALE(%op3_mul_op1_op2) {scale=%lhs.scale * %rhs.scale / %output.scale, input_zp=0, output_zp=%output.zp} // i32->%output.dtype
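The composed scale follows the usual quantized-multiply identity: since real_lhs = %lhs.scale * (q_lhs - %lhs.zp) and real_rhs = %rhs.scale * (q_rhs - %rhs.zp), the product must be divided by %output.scale to land on the output grid. With hypothetical scales %lhs.scale = 0.02, %rhs.scale = 0.5 and %output.scale = 0.005:
float64 final_scale = (0.02 * 0.5) / 0.005 // 2.0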
This operator is trivially lowered to tosa.NEGATE
No TOSA lowering defined.
No TOSA lowering defined.
Not_equal operator.
TensorFlow Lite Dialect
%output = tfl.not_equal(%lhs, %rhs)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_equal_lhs_rhs = tosa.EQUAL(%lhs, %rhs)
%op2_not_op1 = tosa.LOGICAL_NOT(%op1_equal_lhs_rhs)
If input/output tensors are all quantized typed,
Legalization:
assert (%lhs.scale == %rhs.scale) && (%lhs.zp == %rhs.zp)
%op1_rescale_lhs = tosa.RESCALE(%lhs) {scale=1.0f, input_zp=%lhs.zp, output_zp=0} // %lhs.dtype->i32
%op2_rescale_rhs = tosa.RESCALE(%rhs) {scale=1.0f, input_zp=%rhs.zp, output_zp=0} // %rhs.dtype->i32
%op3_equal_op1_op2 = tosa.EQUAL(%op1_rescale_lhs, %op2_rescale_rhs)
%op4_not_op3 = tosa.LOGICAL_NOT(%op3_equal_op1_op2)
No TOSA lowering defined.
OneHot operator.
TensorFlow Lite Dialect
%output = tfl.one_hot(%indices, %depth, %on_value, %off_value) {axis}
TOSA Lowering
%output = lower_one_hot_op(%indices, %depth, %on_value, %off_value, axis)
No TOSA lowering defined.
Packs a list of tensors along a dimension into one tensor.
TensorFlow Lite Dialect
%output = tfl.pack(%values) {axis}
TOSA Lowering
%output = lower_pack_op(%values, axis)
This operator is trivially lowered to tosa.PAD
No TOSA lowering defined.
No TOSA lowering defined.
This operator is trivially lowered to tosa.CONST
Quantize operator
TensorFlow Lite Dialect
%output = tfl.quantize(%input)
TOSA Lowering
Legalization:
if (isa<QuantizedType>(%input.dtype)) {
%op1_rescale_in = tosa.RESCALE(%input) {scale=%input.scale / %output.scale, input_zp=%input.zp, output_zp=%output.zp}
}
else {
%output = lower_quantize_op(%output.dtype, %input, %output.zp, %output.scale)
}
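lower_quantize_op() is not expanded in this document; a minimal sketch of the assumed float-to-quantized behavior (an affine quantize onto the output grid, names hypothetical) is:
%const_inv_scale = tosa.CONST() {value={1.0 / %output.scale}}
%const_zp = tosa.CONST() {value={float32(%output.zp)}}
%op1_mul_in = tosa.MUL(%input, %const_inv_scale)
%op2_add_op1 = tosa.ADD(%op1_mul_in, %const_zp)
%output = tosa.CAST(%op2_add_op1) // float32->%output.dtype with round-to-nearest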
No TOSA lowering defined.
Rank operator
TensorFlow Lite Dialect
%output = tfl.rank(%input)
TOSA Lowering
Legalization:
%const = tosa.CONST() {value={%input.rank}}
Computes the "logical or" of elements across dimensions of a tensor.
TensorFlow Lite Dialect
%output = tfl.reduce_any(%input, %reduction_indices) {keep_dims}
TOSA Lowering
Legalization:
%op1_rsum_in = lower_reduce_op<tosa.REDUCE_ANY>(%input, %output.shape, %reduction_indices, keep_dims)
Max-reduction operator.
TensorFlow Lite Dialect
%output = tfl.reduce_max(%input, %axes) {keep_dims}
TOSA Lowering
Legalization:
%op1_rsum_in = lower_reduce_op<tosa.REDUCE_MAX>(%input, %output.shape, %axes, keep_dims)
Computes the min reduction along the specified axes.
TensorFlow Lite Dialect
%output = tfl.reduce_min(%input, %axes) {keep_dims}
TOSA Lowering
Legalization:
%op1_rsum_in = lower_reduce_op<tosa.REDUCE_MIN>(%input, %output.shape, %axes, keep_dims)
Prod-reduction operator.
TensorFlow Lite Dialect
%output = tfl.reduce_prod(%input, %axes) {keep_dims}
TOSA Lowering
If input/output tensors are all float typed,
Legalization:
%op1_rsum_in = lower_reduce_op<tosa.REDUCE_PROD>(%input, %output.shape, %axes, keep_dims)
No TOSA lowering defined.
Relu6 operator.
TensorFlow Lite Dialect
%y = tfl.relu6(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_relun_in = tosa.RELUN(%x) {max_int=0, max_fp=6.0}
If input/output tensors are all quantized typed,
Legalization:
%op1_rescale_in = tosa.RESCALE(%x) {scale=%x.scale / %y.scale, input_zp=%x.zp, output_zp=0} // %x.dtype->i32
%op2_relun_op1 = tosa.RELUN(%op1_rescale_in) {max_int=(6.0 / %y.scale), max_fp=0.0}
%op3_rescale_op2 = tosa.RESCALE(%op2_relun_op1) {scale=1.0, input_zp=0, output_zp=%y.zp} // i32->%y.dtype
Relu operator.
TensorFlow Lite Dialect
%y = tfl.relu(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_relun_in = tosa.RELUN(%x) {max_int=0, max_fp=std::numeric_limits<float>::max()}
If input/output tensors are all quantized typed,
Legalization:
%op1_rescale_in = tosa.RESCALE(%x) {scale=%x.scale / %y.scale, input_zp=%x.zp, output_zp=0} // %x.dtype->i32
%op2_relun_op1 = tosa.RELUN(%op1_rescale_in) {max_int=std::numeric_limits<int32>::max(), max_fp=0.0}
%op3_rescale_op2 = tosa.RESCALE(%op2_relun_op1) {scale=1.0, input_zp=0, output_zp=%y.zp} // i32->%y.dtype
This operator is trivially lowered to tosa.RESHAPE
ResizeBilinear Op.
TensorFlow Lite Dialect
%output = tfl.resize_bilinear(%input, %size) {aligned_corners, half_pixel_centers}
TOSA Lowering
%output = lower_resize_op(%input, %size, %input.dtype, "BILINEAR")
ResizeNearestNeighbor Op.
TensorFlow Lite Dialect
%output = tfl.resize_nearest_neighbor(%input, %size) {aligned_corners, half_pixel_centers}
TOSA Lowering
%output = lower_resize_op(%input, %size, %input.dtype, "NEAREST_NEIGHBOR")
No TOSA lowering defined.
ReverseV2 Operator.
TensorFlow Lite Dialect
%output = tfl.reverse_v2(%input, %axis)
TOSA Lowering
%output = lower_reversev2_op(%input, %axis)
Round operator.
TensorFlow Lite Dialect
%output = tfl.round(%input)
TOSA Lowering
%const_half = tosa.CONST() {value={0.5}}
%op1_add_in_half = tosa.ADD(%input, %const_half)
%op2_floor_op1 = tosa.FLOOR(%op1_add_in_half)
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
This operator is trivially lowered to tosa.SELECT
This operator is trivially lowered to tosa.SELECT
Shape operator
TensorFlow Lite Dialect
%output = tfl.shape(%input)
TOSA Lowering
Legalization:
%const = tosa.CONST() {value=%input.shape}
No TOSA lowering defined.
This operator is trivially lowered to tosa.SLICE
Softmax operator.
TensorFlow Lite Dialect
%output = tfl.softmax(%input)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_exp_in = tosa.EXP(%input)
%op2_rsum_op1 = tosa.REDUCE_SUM(%op1_exp_in) {axis=(%input.rank-1)}
%op3_rcp_op2 = tosa.RECIPROCAL(%op2_rsum_op1)
%op4_mul_op1_op3 = tosa.MUL(%op1_exp_in, %op3_rcp_op2)
If input/output tensors are all quantized typed,
Prepare:
float64 exp_sample_grain = 1.0 / 16.0
auto exp_func = [exp_sample_grain](int32 x) -> int32 {
float64 v = static_cast<float64>(x) * exp_sample_grain
v = v < 0.0 ? std::exp(v) : 1.0
return std::lround(32768.0 * v)
}
float64 one_over_one_plus_x_sample_grain = 1.0 / 256.0
auto one_over_one_plus_x_func = [one_over_one_plus_x_sample_grain](int32 x) -> int32 {
float64 v = static_cast<float64>(x) * one_over_one_plus_x_sample_grain
v = v < 0.0 ? 1.0 : 1.0 / (1.0 + v)
return std::lround(32768.0 * v)
}
float64 op4_rescale_scale = (%input.scale * 128.0) / exp_sample_grain
float64 op19_rescale_scale = 1.0 / (%output.scale * 256.0)
Legalization:
%const_exp_table = get_table_const_tensor(exp_func)
%const_one_over_one_plus_x_table = get_table_const_tensor(one_over_one_plus_x_func)
%const_3 = tosa.CONST() {value={3}}
%const_34 = tosa.CONST() {value={12+20-8}}
%const_2_to_31 = tosa.CONST() {value={1<<31}}
%const_16 = tosa.CONST() {value={16}}
%op1_rescale_in = tosa.RESCALE(%input) {scale=1.0f, input_zp=%input.zp, output_zp=0} // %input.dtype->i32
%op2_rmax_op1 = tosa.REDUCE_MAX(%op1_rescale_in) {axis=(%input.rank-1)}
%op3_sub_op1_op2 = tosa.SUB(%op1_rescale_in, %op2_rmax_op1)
%op4_rescale_op3 = tosa.RESCALE(%op3_sub_op1_op2) {scale=op4_rescale_scale, input_zp=0, output_zp=0} // i32->i16
%op5_table_op4 = tosa.TABLE(%op4_rescale_op3, %const_exp_table)
%op6_rshift_op5_3 = tosa.ARITHMETIC_RIGHT_SHIFT(%op5_table_op4, %const_3)
%op7_rsum_op6 = tosa.REDUCE_SUM(%op6_rshift_op5_3) {axis=(%input.rank-1)}
%op8_clz_op7 = tosa.CLZ(%op7_rsum_op6)
%op9_sub_34_op8 = tosa.SUB(%const_34, %op8_clz_op7)
%op10_lshift_op7_op8 = tosa.LOGICAL_LEFT_SHIFT(%op7_rsum_op6, %op8_clz_op7)
%op11_sub_op10 = tosa.SUB(%op10_lshift_op7_op8, %const_2_to_31)
%op12_rshift_op11_16 = tosa.ARITHMETIC_RIGHT_SHIFT(%op11_sub_op10, %const_16)
%op13_cast_op12 = tosa.CAST(%op12_rshift_op11_16) // i32->i16
%op14_table_op13 = tosa.TABLE(%op13_cast_op12, %const_one_over_one_plus_x_table)
%op15_rescale_op14 = tosa.RESCALE(%op14_table_op13) {scale=1.0/128.0, input_zp=0, output_zp=0} // i32->i16
%op16_rescale_op5 = tosa.RESCALE(%op5_table_op4) {scale=1.0/128.0, input_zp=0, output_zp=0} // i32->i16
%op17_mul_op16_op15 = tosa.MUL(%op15_rescale_op14, %op16_rescale_op5)
%op18_rshift_op17_op9 = tosa.ARITHMETIC_RIGHT_SHIFT(%op17_mul_op16_op15, %op9_sub_34_op8)
%op19_rescale_op18 = tosa.RESCALE(%op18_rshift_op17_op9) {scale=op19_rescale_scale, input_zp=0, output_zp=%output.zp}
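An informal reading of the fixed-point reciprocal sequence above (ops 8-18), offered as interpretation rather than specification:
// %op8/%op10 normalize the positive sum S so its leading 1 bit reaches bit 31,
// i.e. S = m * 2^e with m in [1.0, 2.0) and e = 31 - clz(S).
// %op11/%op12 strip the leading 1 and keep the top fraction bits of m as an i16
// table index, so the one_over_one_plus_x table returns 1/m (as 1/(1+x), x = m-1).
// %op17 multiplies each exponential by 1/m, and %op18's right shift by
// (%const_34 - clz(S)) reapplies the exponent e, so each output approximates
// exp(x_i - max) / sum(exp(x_j - max)).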
SpaceToBatchNd operator.
TensorFlow Lite Dialect
%output = tfl.space_to_batch_nd(%input, %block_shape, %paddings)
TOSA Lowering
%output = lower_space_to_batch_nd_op(%input, %block_shape, %paddings)
SpaceToDepth operator.
TensorFlow Lite Dialect
%output = tfl.space_to_depth(%input) {block_size}
TOSA Lowering
%output = lower_space_to_depth_op(%input, block_size, "NHWC")
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Splits a tensor into num_split tensors along one dimension.
TensorFlow Lite Dialect
%output = tfl.split(%split_dim, %value) {num_split}
TOSA Lowering
%output = lower_split_op(%value, %split_dim.as_constant(), num_split)
Splits a tensor into num_split tensors along one dimension.
TensorFlow Lite Dialect
%output = tfl.split_v(%value, %size_splits, %split_dim) {num_splits}
TOSA Lowering
%output = lower_splitv_op(%value, %size_splits.as_constant(), %split_dim.as_constant())
No TOSA lowering defined.
Square operator.
TensorFlow Lite Dialect
%y = tfl.square(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_mul_in = tosa.MUL(%x, %x)
If input/output tensors are all quantized typed,
Legalization:
%op1_rescale_x = tosa.RESCALE(%x) {scale=1.0f, input_zp=%x.zp, output_zp=0} // %x.dtype->i32
%op2_mul_op1_op1 = tosa.MUL(%op1_rescale_x, %op1_rescale_x)
%op3_rescale_op2 = tosa.RESCALE(%op2_mul_op1_op1) {scale=(%x.scale * %x.scale) / %y.scale, input_zp=0, output_zp=%y.zp} // i32->%y.dtype
Squared difference operator.
TensorFlow Lite Dialect
%output = tfl.squared_difference(%lhs, %rhs)
TOSA Lowering
Legalization:
%op1_sub_in = tosa.SUB(%lhs, %rhs)
%op2_mul_op1 = tosa.MUL(%op1_sub_in, %op1_sub_in)
Removes dimensions of size 1 from the shape of a tensor.
TensorFlow Lite Dialect
%output = tfl.squeeze(%input) {squeeze_dims}
TOSA Lowering
%output = lower_squeeze_op(%input, squeeze_dims)
StridedSlice Op.
TensorFlow Lite Dialect
%output = tfl.strided_slice(%input, %begin, %end, %strides) {begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask}
TOSA Lowering
%output = lower_strided_slice_op(%input, %begin, %end, %strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask)
This operator is trivially lowered to tosa.SUB
Sum operator.
TensorFlow Lite Dialect
%output = tfl.sum(%input, %axis) {keep_dims}
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_rsum_in = lower_reduce_op<tosa.REDUCE_SUM>(%input, %output.shape, %axis, keep_dims)
If input/output tensors are all quantized typed,
Legalization:
%rsum = lower_reduce_op<tosa.REDUCE_SUM>(%input, %output.shape, %axis, keep_dims, 1.0f, %input.zp, (%input.scale / %output.scale), %output.zp)
Hyperbolic tangent operator.
TensorFlow Lite Dialect
%y = tfl.tanh(%x)
TOSA Lowering
If input/output tensors are all non-quantized typed,
Legalization:
%op1_tanh_in = tosa.TANH(%x)
If input/output tensors are all quantized typed,
Prepare:
float64 input_sample_grain = 1.0 / 32.0
auto tanh_func = [input_sample_grain](int32 x) -> int32 {
float64 v = static_cast<float64>(x) * input_sample_grain
v = std::exp(-2.0 * v)
v = (1.0 - v) / (1.0 + v)
return std::lround(32768.0 * v)
}
float32 input_rescale_scale = (%x.scale * 128.0) / input_sample_grain
float32 output_rescale_scale = 1.0 / (%y.scale * 32768.0 * 128.0)
Legalization:
%table_const = get_table_const_tensor(tanh_func)
%op1_rescale_in = tosa.RESCALE(%x) {scale=input_rescale_scale, input_zp=%x.zp, output_zp=0} // %x.dtype->i16
%op2_table_op1 = tosa.TABLE(%op1_rescale_in, %table_const)
%op3_rescale_op2 = tosa.RESCALE(%op2_table_op1) {scale=output_rescale_scale, input_zp=0, output_zp=%y.zp} // %int32->%y.dtype
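As with the other table-based lowerings, tanh_func can be spot-checked at a sample point (input_sample_grain = 1/32):
// x = 0:  v = 0.0 -> returns 0
// x = 32: v = 1.0 -> exp(-2.0) ≈ 0.13534, (1 - 0.13534) / (1 + 0.13534) ≈ 0.76159
//         -> returns lround(32768.0 * 0.76159) = 24956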
This operator is trivially lowered to tosa.TILE
No TOSA lowering defined.
Transpose convolution operator.
TensorFlow Lite Dialect
%output = tfl.transpose_conv(%output_shape, %weights, %input) {padding, stride_h, stride_w}
TOSA Lowering
Prepare:
tosa_padding =
get_transpose_conv2d_padding_values_from_pad_type(%input.type, %weights.type, %output_shape, padding, "NHWC", FORMAT_HWIO, {stride_h, stride_w}, {1, 1})
If input/output tensors are all non-quantized typed,
Legalization:
%bias = tosa.CONST() {value={0.0} * %output.shape[3]}
%conv2d = tosa.TRANSPOSE_CONV2D(%input, %weights, %bias) {padding=tosa_padding, stride={stride_h, stride_w}, dilation={1, 1}}
If input/output tensors are all quantized typed,
Prepare:
float64 output_rescale_scale = (%input.scale * %weights.scale) / %output.scale
Legalization:
%bias = tosa.CONST() {value={0} * %output.shape[3]}
%conv2d = tosa.TRANSPOSE_CONV2D(%input, %weights, %bias) {padding=tosa_padding, stride={stride_h, stride_w}, dilation={1, 1}, quantization_info={input_zp=%input.zp, weight_zp=%weights.zp}}
%rescale = tosa.RESCALE(%conv2d) {scale=output_rescale_scale, input_zp=0, output_zp=%output.zp} // %conv2d.dtype->%output.dtype
This operator is trivially lowered to tosa.TRANSPOSE
No TOSA lowering defined.
No TOSA lowering defined.
No TOSA lowering defined.
Unpacks a tensor along a dimension into multiple tensors.
TensorFlow Lite Dialect
%output = tfl.unpack(%input) {num, axis}
TOSA Lowering
%output = lower_unpack_op(%input, axis, num)
No TOSA lowering defined.
No TOSA lowering defined.
This operator is trivially lowered to tosa.YIELD
ZerosLike operator.
TensorFlow Lite Dialect
%output = tfl.zeros_like(%input)
TOSA Lowering
%output = tosa.CONST() {value={0} * %input.num_elements}
Legalize (tf.Conv2D + tf.BiasAdd) to tosa.CONV2D. This is currently the only N:1 mapping in TOSA legalization.
From:
%conv2d = tf.Conv2D(%input, %filter) {...}
%bias_add = tf.BiasAdd(%conv2d, %bias)
To:
%conv2d = tosa.CONV2D(%input, %filter, %bias)
This pass does three things: