docs/layouts/shortcodes/generated/model_triton_advanced_section.html
| Key | Default | Type | Description |
|---|---|---|---|
| | (none) | String | Authentication token for secured Triton servers. |
| | (none) | String | Compression algorithm for the request body. Currently only gzip is supported. When set, the request body is compressed to reduce network bandwidth. |
| | (none) | Map | Custom HTTP headers as key-value pairs. Example: 'X-Custom-Header:value,X-Another:value2' |
| | false | Boolean | Whether to flatten the batch dimension for array inputs. When true, shape [1,N] becomes [N]. |
| | (none) | Integer | Request priority level (0-255). Higher values indicate higher priority. |
| | false | Boolean | Whether this request marks the end of a sequence for stateful models. When true, Triton releases the model's state after processing this request. See Triton Stateful Models for more details. |
| | (none) | String | Sequence ID for stateful models. A sequence is a series of inference requests that must be routed to the same model instance to maintain state across requests (e.g., for RNN/LSTM models). See Triton Stateful Models for more details. |
| | false | Boolean | Whether this request marks the start of a new sequence for stateful models. When true, Triton initializes the model's state before processing this request. See Triton Stateful Models for more details. |
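
The options above could plausibly translate into an HTTP inference request along the following lines. This is a minimal sketch and not the connector's actual implementation: the helper names (`parse_headers`, `flatten_batch_dim`, `build_request`) are hypothetical, and placing `sequence_id`, `sequence_start`, `sequence_end`, and `priority` in the request's `parameters` object is an assumption based on Triton's HTTP/REST protocol extensions.

```python
import gzip
import json


def parse_headers(spec):
    """Parse 'Name:value,Other:value2' into a dict (hypothetical helper)."""
    headers = {}
    for pair in spec.split(","):
        if not pair.strip():
            continue  # skip empty segments, e.g. from a trailing comma
        name, _, value = pair.partition(":")
        headers[name.strip()] = value.strip()
    return headers


def flatten_batch_dim(shape):
    """Drop a leading batch dimension of size 1, so [1, N] becomes [N]."""
    if len(shape) > 1 and shape[0] == 1:
        return shape[1:]
    return shape


def build_request(inputs, header_spec="", compression=None, priority=None,
                  sequence_id=None, sequence_start=False, sequence_end=False):
    """Assemble headers and body for one request (illustrative only)."""
    if priority is not None and not 0 <= priority <= 255:
        raise ValueError("priority must be in the range 0-255")

    # Assumption: these options travel in the request-level "parameters" object.
    parameters = {}
    if priority is not None:
        parameters["priority"] = priority
    if sequence_id is not None:
        parameters["sequence_id"] = sequence_id
        parameters["sequence_start"] = sequence_start
        parameters["sequence_end"] = sequence_end

    payload = {"inputs": inputs}
    if parameters:
        payload["parameters"] = parameters

    headers = parse_headers(header_spec)
    headers["Content-Type"] = "application/json"
    body = json.dumps(payload).encode("utf-8")
    if compression == "gzip":
        # Compress the body and tell the server how to decode it.
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return headers, body
```

For a stateful model, the first request of a sequence would pass `sequence_start=True`, the last would pass `sequence_end=True`, and every request in between would reuse the same `sequence_id` so that all of them reach the same model instance.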