skipLayerNormPlugin

Description

Adds a residual tensor to the input and applies layer normalization, i.e., normalizes the sum and then shifts and scales it so that its mean and standard deviation become beta and gamma, respectively. Optionally, a bias vector is added before layer normalization.
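As an illustration of the computation described above, here is a minimal pure-Python sketch for a single hidden vector of length E. It is not the plugin's CUDA implementation; the function name `skip_layer_norm` and the default `eps` value are assumptions made for this example only.

```python
import math

def skip_layer_norm(x, skip, gamma, beta, bias=None, eps=1e-5):
    """Reference skip + layer norm over one hidden vector of length E.

    Hypothetical helper for illustration; eps is an assumed value.
    """
    if bias is None:
        bias = [0.0] * len(x)
    # Residual add (plus the optional bias) happens before normalization.
    h = [xi + si + bi for xi, si, bi in zip(x, skip, bias)]
    mean = sum(h) / len(h)
    var = sum((v - mean) ** 2 for v in h) / len(h)
    # eps keeps the division finite when the variance is zero.
    inv_std = 1.0 / math.sqrt(var + eps)
    # Scale by gamma and shift by beta, element-wise.
    return [g * (v - mean) * inv_std + b for v, g, b in zip(h, gamma, beta)]

out = skip_layer_norm([1.0, 2.0, 3.0], [0.5, 0.5, 0.5],
                      gamma=[1.0] * 3, beta=[0.0] * 3)
```

With gamma set to ones and beta set to zeros, the result has approximately zero mean and unit standard deviation. A constant input collapses to beta, which is the case the epsilon term protects.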

Structure

The skipLayerNormPlugin takes two inputs: input and skip.

input: For V1, V2, V5, and V6, input is a tensor with shape [S, B, E, 1, 1], where S is the sequence length, B is the batch size, E is the hidden size, and the last two dimensions are of size 1. For V3 and V4, input is a tensor with shape [1, E, S', 1], where S' is the accumulated sequence length, E is the hidden size, and the first and last dimensions are of size 1.

skip: skip has the same dimensions as input. Its purpose is to introduce skip (a.k.a. residual) connections to previously computed tensors.

The skipLayerNormPlugin generates the following output:

output: output is a tensor with the same shape as input.
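The shape rules above can be sketched as a small helper. The function name `expected_input_shape` is hypothetical, and treating V7 and V8 like V3 and V4 is an assumption drawn from the changelog, which describes them as duplicating those versions.

```python
STANDARD_VERSIONS = (1, 2, 5, 6)      # documented layout: [S, B, E, 1, 1]
INTERLEAVED_VERSIONS = (3, 4, 7, 8)   # [1, E, S', 1]; 7 and 8 are assumed

def expected_input_shape(version, hidden_size, seq_len, batch_size=None):
    """Return the documented input shape for a plugin version.

    seq_len is S for the standard layout and the accumulated
    sequence length S' for the interleaved layout.
    """
    if version in STANDARD_VERSIONS:
        if batch_size is None:
            raise ValueError("batch_size is required for the [S, B, E, 1, 1] layout")
        return (seq_len, batch_size, hidden_size, 1, 1)
    if version in INTERLEAVED_VERSIONS:
        return (1, hidden_size, seq_len, 1)
    raise ValueError(f"unknown plugin version: {version}")

input_shape = expected_input_shape(1, hidden_size=768, seq_len=128, batch_size=4)
skip_shape = input_shape    # skip must match input
output_shape = input_shape  # output has the same shape as input
```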

Parameters

skipLayerNormPlugin has plugin creator class SkipLayerNormPluginDynamicCreator and plugin class CustomSkipLayerNormPluginDynamic.

The parameters are defined below and consist of the following attributes:

| Type | Parameter | Version | Description |
|---|---|---|---|
| int | type_id | 1, 2, 5, 6 | Integer encoding the DataType (0: FP32, 1: FP16, 2: INT8) |
| int | ld | 1, 5 | The leading dimension of the input tensor, corresponding to the hidden size, denoted by E above. |
| Weights | beta | 1, 2, 3, 4, 5, 6, 7, 8 | The mean to normalize to. Shape: [1, 1, E] |
| Weights | gamma | 1, 2, 3, 4, 5, 6, 7, 8 | The standard deviation to normalize to. Shape: [1, 1, E] |
| Weights | bias | 1, 2, 5, 6 | An optional bias vector to add before normalization. Shape: [1, 1, E] |
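To make the version column easier to query, the table can be encoded as a lookup. This is only a sketch of the table's contents, not part of the plugin or the TensorRT API; the names `PARAMETER_VERSIONS`, `TYPE_ID`, and `parameters_for` are hypothetical.

```python
# Parameter name -> plugin versions that accept it, copied from the table.
PARAMETER_VERSIONS = {
    "type_id": {1, 2, 5, 6},
    "ld": {1, 5},
    "beta": {1, 2, 3, 4, 5, 6, 7, 8},
    "gamma": {1, 2, 3, 4, 5, 6, 7, 8},
    "bias": {1, 2, 5, 6},
}

# Integer encoding of the DataType used by type_id.
TYPE_ID = {"FP32": 0, "FP16": 1, "INT8": 2}

def parameters_for(version):
    """List the parameters that the table associates with a plugin version."""
    return [name for name, versions in PARAMETER_VERSIONS.items()
            if version in versions]

v1_params = parameters_for(1)
v3_params = parameters_for(3)
```

Here `v1_params` contains all five parameters, while `v3_params` contains only beta and gamma, matching the table.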

Additional resources

License

For terms and conditions for use, reproduction, and distribution, see the TensorRT Software License Agreement documentation.

Changelog

July 2024 Add v5, v6, v7 and v8 plugins that duplicate the behavior of the v1, v2, v3 and v4 plugins respectively, but implement the IPluginV3 interface instead of the deprecated IPluginV2DynamicExt interface.

February 2024 Add epsilon to avoid divide by zero.

October 2020 Add V2 plugin that supports variable sequence length. Add v3 plugin that supports int8 interleaved variable sequence length.

November 2019 This is the first release of this README.md file.

Known issues

This plugin only supports GPUs with compute capability >= 7.0. For more information, see the CUDA GPU Compute Capability Support Matrix.
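A guard for this requirement could look like the sketch below. The function name `is_supported` is hypothetical, and obtaining the device's major and minor compute capability (for example from a CUDA device query) is left outside the example.

```python
MIN_COMPUTE_CAPABILITY = (7, 0)

def is_supported(major, minor):
    """Return True if the compute capability meets the plugin's minimum."""
    # Tuples compare element by element, so (7, 5) >= (7, 0) and (6, 1) < (7, 0).
    return (major, minor) >= MIN_COMPUTE_CAPABILITY
```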