ChatCompletionsProperties

A list of chat completions property names and values, in key/value pair format.

ChatCompletionsProperties Properties

Property           Type      Required  Default
FrequencyPenalty   number    No        0
MaxTokens          integer   No
Messages           object[]  Yes
Model              string    Yes
ParallelToolCalls  boolean   No        true
PresencePenalty    number    No        0
ResponseFormat     object    No
Temperature        number    No        0
Tool               array     No
ToolChoice         object    No
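Putting the table together, a minimal property set might look like the sketch below. The model name and message content are hypothetical placeholders, and the property names follow the table above; check your deployment for the exact casing it expects.

```python
# Minimal sketch of a chat completions property set.
# "my-model" and the message content are illustrative placeholders.
request = {
    "Model": "my-model",                # required
    "Messages": [                       # required
        {"role": "user", "content": "Hello!"},
    ],
    "MaxTokens": 256,                   # optional; upper limit is model-dependent
    "Temperature": 0,                   # default
    "FrequencyPenalty": 0,              # default
    "PresencePenalty": 0,               # default
    "ParallelToolCalls": True,          # default
}
```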

FrequencyPenalty

Parameter to discourage the model from repeating the same words or phrases too frequently within the generated text. A higher frequency_penalty value will result in the model being more conservative in its use of repeated tokens. Valid range of values depends on the model doing the inference.

  • type: number
  • default: 0

MaxTokens

Parameter to control the maximum number of tokens that can be generated in the chat completion. The upper limit depends on the model (e.g. ~4k for GPT-3.5, ~32k for GPT-4-32k).

  • type: integer

Messages

Message history to be sent to the LLM as the prompt.

  • type: object[]
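Each entry in the array is an object pairing a role with content. A hypothetical history might look like this (the roles shown are the conventional ones; the content is invented for illustration):

```python
# Hypothetical Messages array: an ordered chat history sent as the prompt.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And of Italy?"},
]
```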

Model

Name of the LLM. Ignored if the deployment supports only one model.

  • type: string

ParallelToolCalls

Whether to enable parallel function calling during tool use.

  • type: boolean
  • default: true

PresencePenalty

Parameter to encourage the model to include a diverse range of tokens in the generated text. A higher presence_penalty value will result in the model being more likely to generate tokens that have not yet been included in the generated text. Valid range of values depends on the model doing the inference.

  • type: number
  • default: 0

ResponseFormat

An object specifying the format that the model must output.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema.

Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.

  • type: object
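As a sketch of the two formats described above — the `weather_report` schema is a hypothetical example, not part of this document:

```python
# Structured Outputs: the model's reply must match the supplied JSON schema.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",   # hypothetical schema name
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temperature_c": {"type": "number"},
            },
            "required": ["city", "temperature_c"],
        },
    },
}

# Older JSON mode: the reply is guaranteed to be valid JSON,
# but is not checked against any particular schema.
json_mode = {"type": "json_object"}
```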

Temperature

Value affecting the randomness of token generation. Higher values like 1.8 will make the output more random, while lower values like 0.2 will make the output more focused and deterministic. Valid range of values depends on the model doing the inference.

  • type: number
  • default: 0

Tool

A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A maximum of 128 functions is supported.

  • type: array
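A function tool pairs a name and description with a JSON-schema description of its parameters. The `get_weather` function below is a hypothetical example:

```python
# Hypothetical function tool definition; "get_weather" and its
# parameters are illustrative, not part of this document.
tools = [
    {
        "type": "function",  # only "function" tools are currently supported
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
]
assert len(tools) <= 128  # at most 128 functions are supported
```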

ToolChoice

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

  • type: object
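The forced-tool form from the description above can be built as follows; `my_function` is the hypothetical tool name used in that example, and the string modes are listed as comments for contrast:

```python
# "none"     -> the model never calls a tool and generates a message instead
# "auto"     -> the model chooses between a message and one or more tool calls
# "required" -> the model must call one or more tools

# Forcing a specific tool ("my_function" is a hypothetical name):
tool_choice = {"type": "function", "function": {"name": "my_function"}}
```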