OpenAI Configuration

docs/layouts/shortcodes/generated/openai_configuration.html

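These options are typically supplied when registering the model in Flink SQL. The sketch below is an assumption about a typical setup — the model name, columns, and the `'provider'` key are illustrative, and the exact DDL varies by Flink version:

```sql
-- Illustrative sketch: registering an OpenAI chat model in Flink SQL.
-- The 'provider' key and column definitions here are assumptions.
CREATE MODEL chat_model
INPUT (prompt STRING)
OUTPUT (response STRING)
WITH (
  'provider' = 'openai',
  'endpoint' = 'https://api.openai.com/v1/chat/completions',
  'api-key' = '<your-api-key>',
  'model' = 'gpt-3.5-turbo',
  'system-prompt' = 'You are a helpful assistant.'
);
```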
api-key

  Default: (none)
  Type: String
  Description: OpenAI API key for authentication.

context-overflow-action

  Default: truncated-tail
  Type: Enum
  Description: Action to take when the context overflows.

  Possible values:

  • "truncated-tail": Truncates the excess tokens from the tail of the context.
  • "truncated-tail-log": Truncates the excess tokens from the tail of the context and logs the truncation.
  • "truncated-head": Truncates the excess tokens from the head of the context.
  • "truncated-head-log": Truncates the excess tokens from the head of the context and logs the truncation.
  • "skipped": Skips the input row.
  • "skipped-log": Skips the input row and logs the skip.
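The possible values above can be summarized as: drop tokens from one end of the context, or drop the row, optionally logging the decision. A minimal sketch of that behavior (not Flink's implementation — token counting is faked with a plain list; real systems use a tokenizer):

```python
# Illustrative sketch of the context-overflow-action semantics.
# apply_overflow_action is a hypothetical name, not a Flink API.

def apply_overflow_action(tokens, max_context_size, action):
    """Return the tokens to send to the model, or None if the row is skipped."""
    if len(tokens) <= max_context_size:
        return tokens                        # no overflow, nothing to do
    if action.startswith("truncated-tail"):
        kept = tokens[:max_context_size]     # drop the excess from the tail
    elif action.startswith("truncated-head"):
        kept = tokens[-max_context_size:]    # drop the excess from the head
    elif action.startswith("skipped"):
        kept = None                          # skip the input row entirely
    else:
        raise ValueError(f"unknown action: {action}")
    if action.endswith("-log"):              # the *-log variants also record it
        print(f"context overflow: {len(tokens)} > {max_context_size}")
    return kept

tokens = ["a", "b", "c", "d", "e", "f"]
print(apply_overflow_action(tokens, 4, "truncated-tail"))   # ['a', 'b', 'c', 'd']
print(apply_overflow_action(tokens, 4, "truncated-head"))   # ['c', 'd', 'e', 'f']
print(apply_overflow_action(tokens, 4, "skipped"))          # None
```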

dimension

  Default: (none)
  Type: Long
  Description: The size of the embedding result array.

endpoint

  Default: (none)
  Type: String
  Description: Full URL of the OpenAI API endpoint, e.g., https://api.openai.com/v1/chat/completions or https://api.openai.com/v1/embeddings.

error-handling-strategy

  Default: RETRY
  Type: Enum
  Description: Strategy for handling errors during model requests.

  Possible values:

  • "RETRY": Retry sending the request.
  • "FAILOVER": Throw an exception and fail the Flink job.
  • "IGNORE": Ignore the input that caused the error and continue; the error itself is recorded in the log.

max-context-size

  Default: (none)
  Type: Integer
  Description: Maximum number of tokens allowed in the context. The configured context-overflow-action is triggered when this threshold is exceeded.

max-tokens

  Default: (none)
  Type: Long
  Description: The maximum number of tokens that can be generated in the chat completion.

model

  Default: (none)
  Type: String
  Description: Model name, e.g., gpt-3.5-turbo or text-embedding-ada-002.

n

  Default: (none)
  Type: Long
  Description: How many chat completion choices to generate for each input message. Note that you are charged based on the number of tokens generated across all choices; keep n at 1 to minimize costs.

presence-penalty

  Default: (none)
  Type: Double
  Description: Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they already appear in the text so far, increasing the model's likelihood to talk about new topics.

response-format

  Default: (none)
  Type: Enum
  Description: The format of the response, e.g., 'text' or 'json_object'.

  Possible values:

  • "text"
  • "json_object"

retry-fallback-strategy

  Default: FAILOVER
  Type: Enum
  Description: Fallback strategy to apply once the retry attempts are exhausted. Only takes effect when error-handling-strategy is set to RETRY.

  Possible values:

  • "FAILOVER": Throw an exception and fail the Flink job.
  • "IGNORE": Ignore the input that caused the error and continue; the error itself is recorded in the log.

retry-num

  Default: 100
  Type: Integer
  Description: Number of retries for OpenAI client requests.
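Taken together, error-handling-strategy, retry-num, and retry-fallback-strategy describe a retry loop whose exhaustion behavior is itself configurable. A sketch of that interaction under assumed semantics (not Flink's code; the function name is hypothetical):

```python
# Illustrative sketch of RETRY + retry-num + retry-fallback-strategy.

def call_with_retry(request_fn, retry_num, fallback_strategy):
    """Attempt request_fn once plus up to retry_num retries.
    On exhaustion: FAILOVER re-raises the last error (failing the job),
    IGNORE logs it and drops the input (returns None)."""
    last_error = None
    for _ in range(retry_num + 1):          # initial attempt + retries
        try:
            return request_fn()
        except Exception as e:               # sketch only; real code is narrower
            last_error = e
    if fallback_strategy == "FAILOVER":
        raise last_error
    # IGNORE: record the error and skip this input
    print(f"ignored failed request: {last_error}")
    return None

attempts = {"n": 0}
def flaky():
    """Fails twice, then succeeds — a stand-in for a transient API error."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

print(call_with_retry(flaky, retry_num=5, fallback_strategy="FAILOVER"))  # ok
```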

seed

  Default: (none)
  Type: Long
  Description: If specified, the model platform makes a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.

stop

  Default: (none)
  Type: String
  Description: A comma-separated list of strings to pass as stop sequences to the model.

system-prompt

  Default: "You are a helpful assistant."
  Type: String
  Description: The system message of the chat.

temperature

  Default: (none)
  Type: Double
  Description: Controls the randomness or “creativity” of the output. Typical values are between 0.0 and 1.0.

top-p

  Default: (none)
  Type: Double
  Description: The probability cutoff for token selection. Usually either temperature or top-p is specified, but not both.
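Most of the sampling options above correspond directly to fields in the public OpenAI chat-completions request body. The sketch below shows that plausible mapping — the snake_case field names follow the public OpenAI API, but the exact translation performed inside the connector is an assumption, and `build_request` is a hypothetical helper:

```python
# Illustrative sketch: mapping the configuration keys onto an
# OpenAI chat-completions request body.

def build_request(options, user_message):
    body = {
        "model": options["model"],
        "messages": [
            {"role": "system",
             "content": options.get("system-prompt", "You are a helpful assistant.")},
            {"role": "user", "content": user_message},
        ],
    }
    # Optional sampling/output controls, sent only when configured.
    optional = {
        "temperature": "temperature",
        "top_p": "top-p",
        "max_tokens": "max-tokens",
        "n": "n",
        "presence_penalty": "presence-penalty",
        "seed": "seed",
    }
    for api_field, option_key in optional.items():
        if option_key in options:
            body[api_field] = options[option_key]
    if "stop" in options:                    # comma-separated list -> list of strings
        body["stop"] = [s.strip() for s in options["stop"].split(",")]
    if "response-format" in options:
        body["response_format"] = {"type": options["response-format"]}
    return body

req = build_request(
    {"model": "gpt-3.5-turbo", "temperature": 0.2, "stop": "END,STOP"},
    "Hello",
)
print(req["stop"])          # ['END', 'STOP']
```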