Create chat completions
Creates a model response for the given chat conversation.
Authorizations
Use the following format for authentication: Bearer <your api key>
Body
Corresponding Model Name. To better enhance service quality, we will make periodic changes to the models provided by this service, including but not limited to model on/offlining and adjustments to model service capabilities. We will notify you of such changes through appropriate means such as announcements or message pushes where feasible.
THUDM/GLM-Z1-32B-0414
, THUDM/GLM-4-32B-0414
, THUDM/GLM-4-9B-0414
, THUDM/GLM-4-9B-0414
, deepseek-ai/DeepSeek-R1
, deepseek-ai/DeepSeek-V3
, deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
, deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
, deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
, deepseek-ai/DeepSeek-R1-Distill-Llama-70B
, deepseek-ai/DeepSeek-R1-Distill-Llama-8B
, Qwen/Qwen2.5-72B-Instruct-128K
, Qwen/Qwen2.5-72B-Instruct
, Qwen/Qwen2.5-32B-Instruct
, Qwen/Qwen2.5-14B-Instruct
, Qwen/Qwen2.5-7B-Instruct
, Qwen/Qwen2.5-Coder-32B-Instruct
, Qwen/QwQ-32B
"deepseek-ai/DeepSeek-V3"
A list of messages comprising the conversation so far.
If set, tokens are returned as Server-Sent Events as they are made available. Stream terminates with data: [DONE]
false
The maximum number of tokens to generate.
1 <= x <= 16384
512
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
null
Determines the degree of randomness in the response.
0.7
The top_p
(nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.
0.7
50
0.5
Number of generations to return
1
An object specifying the format that the model must output.
A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
Corresponding Model Name. To better enhance service quality, we will make periodic changes to the models provided by this service, including but not limited to model on/offlining and adjustments to model service capabilities. We will notify you of such changes through appropriate means such as announcements or message pushes where feasible.
THUDM/GLM-Z1-32B-0414
, THUDM/GLM-4-32B-0414
, THUDM/GLM-4-9B-0414
, THUDM/GLM-4-9B-0414
, deepseek-ai/DeepSeek-R1
, deepseek-ai/DeepSeek-V3
, deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
, deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
, deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
, deepseek-ai/DeepSeek-R1-Distill-Llama-70B
, deepseek-ai/DeepSeek-R1-Distill-Llama-8B
, Qwen/Qwen2.5-72B-Instruct-128K
, Qwen/Qwen2.5-72B-Instruct
, Qwen/Qwen2.5-32B-Instruct
, Qwen/Qwen2.5-14B-Instruct
, Qwen/Qwen2.5-7B-Instruct
, Qwen/Qwen2.5-Coder-32B-Instruct
, Qwen/QwQ-32B
"deepseek-ai/DeepSeek-V3"
A list of messages comprising the conversation so far.
If set, tokens are returned as Server-Sent Events as they are made available. Stream terminates with data: [DONE]
false
The maximum number of tokens to generate.
1 <= x <= 16384
512
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
null
Determines the degree of randomness in the response.
0.7
The top_p
(nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.
0.7
50
0.5
Number of generations to return
1
An object specifying the format that the model must output.
A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
Corresponding Model Name. To better enhance service quality, we will make periodic changes to the models provided by this service, including but not limited to model on/offlining and adjustments to model service capabilities. We will notify you of such changes through appropriate means such as announcements or message pushes where feasible.
Qwen/Qwen2.5-VL-32B-Instruct
, Qwen/Qwen2.5-VL-72B-Instruct
, deepseek-ai/deepseek-vl2
"Qwen2.5-VL-32B-Instruct"
A list of messages comprising the conversation so far.
If set, tokens are returned as Server-Sent Events as they are made available. Stream terminates with data: [DONE]
false
The maximum number of tokens to generate.
1 <= x <= 4096
512
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
Determines the degree of randomness in the response.
0.7
The top_p
(nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.
0.7
50
0.5
Number of generations to return
1
An object specifying the format that the model must output.