Chat Completions
Generate chat completions using the OpenAI API.
The chat endpoint generates model responses for a conversation supplied in the chat format. The model is fine-tuned on a mixture of conversations from the internet, along with a smaller set of conversations created in-house. It generates replies in a conversational style and can also respond in a specific persona.
Body
max_tokens: The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return will be (4096 - prompt tokens).
messages: The messages to generate chat completions for, in the chat format.
model: The model to use for generating completions.
stream: If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
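A minimal sketch of consuming the data-only server-sent events described above, assuming each event arrives as a line of the form `data: <json>`; the helper name and the sample delta payloads are illustrative, not part of the API:

```python
import json

def iter_stream_events(lines):
    """Yield parsed JSON payloads from data-only SSE lines, stopping at [DONE]."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # the server signals the end of the stream
        yield json.loads(payload)

# Illustrative raw lines as they might arrive over the wire
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    '',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: [DONE]',
]
text = "".join(e["choices"][0]["delta"]["content"] for e in iter_stream_events(raw))
print(text)  # Hello
```

Each delta carries only the newly generated tokens, so the client concatenates them to rebuild the full reply.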
temperature: What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.
top_p: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
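Putting the body parameters together, a request body might look like the following sketch; the model name and message contents are placeholder values, and only one of temperature or top_p is set, per the recommendation above:

```python
import json

# Illustrative request body for the chat endpoint; values are placeholders.
request_body = {
    "model": "gpt-4o-mini",  # model to use for generating the completion
    "messages": [            # the conversation, in the chat format
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    "max_tokens": 64,        # cap on tokens in the generated answer
    "temperature": 0.2,      # low value: more focused and deterministic
    "stream": False,         # set True to receive server-sent event deltas
}
# temperature and top_p are alternatives; this body sets only temperature.
print(json.dumps(request_body, indent=2))
```

The body is sent as JSON; omitted fields such as max_tokens fall back to their defaults.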
Response
id: The ID of the completion.
model: The model used for generating the completion.
object: The type of the completion object.
created: The Unix timestamp (in seconds) of when the completion was created.
choices: The generated completions.
usage: The token usage statistics for the completion.
system_fingerprint: The system fingerprint of the completion.
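The response fields above can be seen in a sketch of a decoded response; the field values here are made up for demonstration:

```python
import json

# Illustrative response body; the IDs, counts, and content are fabricated.
raw_response = '''{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o-mini",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello!"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11},
  "system_fingerprint": "fp_abc"
}'''

completion = json.loads(raw_response)
answer = completion["choices"][0]["message"]["content"]   # generated text
tokens_used = completion["usage"]["total_tokens"]         # usage statistics
print(answer, tokens_used)  # Hello! 11
```

Clients typically read the first choice's message content and, for billing or budgeting, the usage totals.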