| Key | Default | Type | Description |
| --- | --- | --- | --- |
| api-key | (none) | String | OpenAI API key for authentication. |
| context-overflow-action | truncated-tail | Enum | Action to take when the context overflows.<br>Possible values:<br>- "truncated-tail": Truncates overflowing tokens from the tail of the context.<br>- "truncated-tail-log": Truncates overflowing tokens from the tail of the context and records the truncation in the log.<br>- "truncated-head": Truncates overflowing tokens from the head of the context.<br>- "truncated-head-log": Truncates overflowing tokens from the head of the context and records the truncation in the log.<br>- "skipped": Skips the input row.<br>- "skipped-log": Skips the input row and records the skip in the log. |
| dimension | (none) | Long | The size of the embedding result array. |
| endpoint | (none) | String | Full URL of the OpenAI API endpoint, e.g., https://api.openai.com/v1/chat/completions or https://api.openai.com/v1/embeddings. |
| error-handling-strategy | RETRY | Enum | Strategy for handling errors during model requests.<br>Possible values:<br>- "RETRY": Retry sending the request.<br>- "FAILOVER": Throw an exception and fail the Flink job.<br>- "IGNORE": Ignore the input that caused the error and continue; the error itself is recorded in the log. |
| max-context-size | (none) | Integer | Maximum number of tokens allowed in the context. context-overflow-action is triggered when this threshold is exceeded. |
| max-tokens | (none) | Long | The maximum number of tokens that can be generated in the chat completion. |
| model | (none) | String | Model name, e.g., gpt-3.5-turbo, text-embedding-ada-002. |
| n | (none) | Long | How many chat completion choices to generate for each input message. Note that you are charged based on the number of generated tokens across all of the choices. Keep n at 1 to minimize costs. |
| presence-penalty | (none) | Double | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood of talking about new topics. |
| response-format | (none) | Enum | The format of the response.<br>Possible values:<br>- "text"<br>- "json_object" |
| retry-fallback-strategy | FAILOVER | Enum | Fallback strategy to apply when retry attempts are exhausted. Only takes effect when error-handling-strategy is set to RETRY.<br>Possible values:<br>- "FAILOVER": Throw an exception and fail the Flink job.<br>- "IGNORE": Ignore the input that caused the error and continue; the error itself is recorded in the log. |
| retry-num | 100 | Integer | Number of retries for OpenAI client requests. |
| seed | (none) | Long | If specified, the model platform will make a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed. |
| stop | (none) | String | A comma-separated (CSV) list of strings to pass as stop sequences to the model. |
| system-prompt | "You are a helpful assistant." | String | The system message of a chat. |
| temperature | (none) | Double | Controls the randomness or "creativity" of the output. Typical values are between 0.0 and 1.0. |
| top-p | (none) | Double | The probability cutoff for token selection. Usually either temperature or top-p is specified, but not both. |