Create a text chat request
Creates a model response for the given chat conversation.
Authorizations
Body
- LLM
- VLM
Name of the model to use. To maintain service quality, the models offered by this service change periodically; such changes include, but are not limited to, bringing models online or offline and adjusting model service capabilities. Where feasible, we will notify you of these changes through appropriate means such as announcements or message pushes.
deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V3.1, deepseek-ai/DeepSeek-V3.1-Terminus, deepseek-ai/DeepSeek-V3.2-Exp, deepseek-ai/DeepSeek-V3.2, deepseek-ai/deepseek-vl2, deepseek-ai/DeepSeek-V4-Flash, deepseek-ai/DeepSeek-V4-Pro, nex-agi/DeepSeek-V3.1-Nex-N1, baidu/ERNIE-4.5-300B-A47B, THUDM/GLM-4-32B-0414, THUDM/GLM-4-9B-0414, zai-org/GLM-4.5, zai-org/GLM-4.5-Air, zai-org/GLM-4.5V, zai-org/GLM-5, zai-org/GLM-5.1, zai-org/GLM-4.7, zai-org/GLM-4.6, zai-org/GLM-4.6V, zai-org/GLM-5V-Turbo, THUDM/GLM-Z1-32B-0414, THUDM/GLM-Z1-9B-0414, tencent/Hunyuan-A13B-Instruct, tencent/Hunyuan-MT-7B, tencent/Hy3-preview, moonshotai/Kimi-K2.5, moonshotai/Kimi-K2.6, moonshotai/Kimi-K2-Instruct, moonshotai/Kimi-K2-Instruct-0905, moonshotai/Kimi-K2-Thinking, inclusionAI/Ling-flash-2.0, inclusionAI/Ling-mini-2.0, inclusionAI/Ring-flash-2.0, meta-llama/Meta-Llama-3.1-8B-Instruct, MiniMaxAI/MiniMax-M2.5, MiniMaxAI/MiniMax-M2.1, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-72B-Instruct, Qwen/Qwen2.5-72B-Instruct-128K, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen3-14B, Qwen/Qwen3-235B-A22B, Qwen/Qwen3-235B-A22B-Instruct-2507, Qwen/Qwen3-235B-A22B-Thinking-2507, Qwen/Qwen3-30B-A3B-Instruct-2507, Qwen/Qwen3-30B-A3B-Thinking-2507, Qwen/Qwen3-32B, Qwen/Qwen3-8B, Qwen/Qwen3-Coder-30B-A3B-Instruct, Qwen/Qwen3-Coder-480B-A35B-Instruct, Qwen/Qwen3-Next-80B-A3B-Instruct, Qwen/Qwen3-Next-80B-A3B-Thinking, Qwen/Qwen3-Omni-30B-A3B-Captioner, Qwen/Qwen3-Omni-30B-A3B-Instruct, Qwen/Qwen3-Omni-30B-A3B-Thinking, Qwen/Qwen3.5-122B-A10B, Qwen/Qwen3.5-27B, Qwen/Qwen3.5-35B-A3B, Qwen/Qwen3.5-397B-A17B, Qwen/Qwen3.5-9B, Qwen/Qwen3.6-27B, Qwen/Qwen3.6-35B-A3B, ByteDance-Seed/Seed-OSS-36B-Instruct, google/gemma-4-26B-A4B-it, google/gemma-4-31B-it, openai/gpt-oss-120b, openai/gpt-oss-20b
"Qwen/Qwen3-32B"
A list of messages comprising the conversation so far.
1 - 10 elements
If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE]
false
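A minimal sketch of consuming such a stream, assuming each event line carries a JSON chunk after the `data:` prefix. The chunk layout used here (`choices[0].delta.content`) is an assumption modeled on OpenAI-compatible APIs and is not confirmed by this page; `iter_sse_chunks` is a hypothetical helper name.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield decoded JSON payloads from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # the stream terminates with data: [DONE]
        yield json.loads(data)


# Hand-written sample lines; the chunk shape is assumed, not taken from this page.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_chunks(sample)
)
print(text)  # prints: Hello
```

In practice the lines would come from an HTTP client that iterates over the response body; the parsing step stays the same.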
The maximum number of tokens to generate. Ensure that input tokens + max_tokens does not exceed the model’s context window. Because some services are still being updated, avoid setting max_tokens to the window’s upper bound; reserve about 10k tokens as a buffer for input and system overhead. See Models (https://cloud.siliconflow.cn/models) for details.
4096
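The budgeting rule above can be sketched as a small helper. This is one conservative reading of the guidance: it subtracts both the measured input tokens and a 10k buffer from the context window. The function name `safe_max_tokens` and its parameters are hypothetical, and the context window size must be looked up per model.

```python
def safe_max_tokens(
    context_window: int,
    input_tokens: int,
    requested: int = 4096,
    buffer: int = 10_000,
) -> int:
    """Cap a requested max_tokens so input + output + buffer fits the window."""
    available = context_window - input_tokens - buffer
    if available <= 0:
        raise ValueError("input leaves no room for generation within the context window")
    return min(requested, available)


# Example with an assumed 32,768-token window.
print(safe_max_tokens(32_768, 2_000))   # prints: 4096
print(safe_max_tokens(32_768, 20_000))  # prints: 2768
```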
Switches between thinking and non-thinking modes. Default is True. This field supports the following models:
- Qwen/Qwen3-8B
- Qwen/Qwen3-14B
- Qwen/Qwen3-32B
- Qwen/Qwen3-30B-A3B
- Qwen/Qwen3-235B-A22B
- tencent/Hunyuan-A13B-Instruct
- zai-org/GLM-5V-Turbo
- zai-org/GLM-4.6V
- zai-org/GLM-4.5V
- deepseek-ai/DeepSeek-V3.1
- deepseek-ai/DeepSeek-V3.1-Terminus
- deepseek-ai/DeepSeek-V3.2-Exp
- deepseek-ai/DeepSeek-V3.2
If you want to use the function call feature for deepseek-ai/DeepSeek-V3.1, you need to set enable_thinking to false.
false
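The following sketch shows one way to guard the enable_thinking field when assembling a request body. The payload keys `model`, `messages`, and `tools` are assumed from the parameter descriptions on this page, and `build_payload` is a hypothetical helper, not part of any SDK.

```python
from typing import Dict, List, Optional

# Models listed on this page as supporting the enable_thinking switch.
THINKING_SWITCH_MODELS = {
    "Qwen/Qwen3-8B", "Qwen/Qwen3-14B", "Qwen/Qwen3-32B",
    "Qwen/Qwen3-30B-A3B", "Qwen/Qwen3-235B-A22B",
    "tencent/Hunyuan-A13B-Instruct",
    "zai-org/GLM-5V-Turbo", "zai-org/GLM-4.6V", "zai-org/GLM-4.5V",
    "deepseek-ai/DeepSeek-V3.1", "deepseek-ai/DeepSeek-V3.1-Terminus",
    "deepseek-ai/DeepSeek-V3.2-Exp", "deepseek-ai/DeepSeek-V3.2",
}


def build_payload(
    model: str,
    messages: List[Dict],
    enable_thinking: Optional[bool] = None,
    tools: Optional[List[Dict]] = None,
) -> Dict:
    """Assemble a request body, applying the enable_thinking rules described above."""
    payload: Dict = {"model": model, "messages": messages}
    if tools:
        payload["tools"] = tools
    if enable_thinking is not None:
        if model not in THINKING_SWITCH_MODELS:
            raise ValueError(f"{model} is not listed as supporting enable_thinking")
        payload["enable_thinking"] = enable_thinking
    # Function calling with DeepSeek-V3.1 requires thinking mode to be off.
    if tools and model == "deepseek-ai/DeepSeek-V3.1":
        payload["enable_thinking"] = False
    return payload
```

Leaving `enable_thinking` as `None` omits the field entirely, so the service default applies.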
Maximum number of tokens for chain-of-thought output. This field applies to all Reasoning models.
128 <= x <= 32768
4096
Dynamic filtering threshold that adapts based on token probabilities. This field only applies to Qwen3.
0 <= x <= 1
0.05
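To illustrate what a probability-adaptive threshold can look like, here is a sketch of the commonly described min-p scheme: a token is kept only if its probability is at least the threshold times the probability of the most likely token. This is an assumption about the general technique, not a statement of how the service implements this field, and `min_p_filter` is a hypothetical name.

```python
from typing import Dict


def min_p_filter(probs: Dict[str, float], min_p: float = 0.05) -> Dict[str, float]:
    """Keep tokens with p >= min_p * max(p), then renormalize the survivors."""
    if not 0 <= min_p <= 1:
        raise ValueError("min_p must satisfy 0 <= x <= 1")
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


probs = {"a": 0.6, "b": 0.3, "c": 0.08, "d": 0.02}
print(sorted(min_p_filter(probs, 0.05)))  # prints: ['a', 'b', 'c']
print(sorted(min_p_filter(probs, 0.2)))   # prints: ['a', 'b']
```

Because the cutoff scales with the top probability, a confident distribution prunes more candidates than a flat one at the same setting.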
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
null
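The stop behavior described above can be mimicked client-side as follows. This is only an illustration of the semantics (generation ends at the earliest stop sequence, which is excluded from the result); `apply_stop` is a hypothetical helper, and the real truncation happens on the server.

```python
from typing import List, Optional


def apply_stop(text: str, stop: Optional[List[str]]) -> str:
    """Truncate text at the earliest stop sequence, excluding the sequence itself."""
    if not stop:
        return text  # null or empty list: nothing to cut
    if len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    cut = len(text)
    for seq in stop:
        if not seq:
            continue  # ignore empty strings, which would match at index 0
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


print(repr(apply_stop("Hello world. END more", ["END", "\n\n"])))  # prints: 'Hello world. '
```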
Determines the degree of randomness in the response.
0.7
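As background on how this kind of randomness control usually works, the sketch below applies the standard temperature-scaled softmax to a list of logits. It illustrates the general technique only; the service's internal sampling code is not described on this page. This sketch rejects a temperature of 0 because it divides by the value.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float = 0.7) -> List[float]:
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, 0.5)
hot = softmax_with_temperature(logits, 1.5)
print(cold[0] > hot[0])  # prints: True
```

A lower value concentrates probability on the top token, giving more deterministic output; a higher value spreads it out.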
The top_p (nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.
0.7
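The cumulative-probability selection can be sketched with the usual nucleus sampling filter: rank tokens by probability and keep the smallest prefix whose cumulative mass reaches top_p. This shows the general technique under that assumption; `top_p_filter` is a hypothetical name and not the service's implementation.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float = 0.7) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(sorted(top_p_filter(probs, 0.7)))  # prints: ['a', 'b']
```

The number of surviving candidates therefore varies per position: a peaked distribution may keep a single token, while a flat one keeps many.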
50
0.5
Number of generations to return
1
An object specifying the format that the model must output.
A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
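A sketch of assembling and checking a tools list follows. The tool object layout (`type`, and a `function` object with `name`, `description`, `parameters` as JSON Schema) is an assumption modeled on OpenAI-compatible APIs and is not spelled out on this page; `make_function_tool`, `check_tools`, and the `get_weather` function are hypothetical.

```python
from typing import Dict, List

MAX_TOOLS = 128  # limit stated above


def make_function_tool(name: str, description: str, parameters: Dict) -> Dict:
    """Wrap a JSON Schema parameter spec in an assumed function-tool envelope."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
        },
    }


def check_tools(tools: List[Dict]) -> List[Dict]:
    """Reject lists that exceed the limit or contain non-function tools."""
    if len(tools) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} functions are supported")
    for tool in tools:
        if tool.get("type") != "function":
            raise ValueError("only function tools are supported")
    return tools


tools = check_tools([
    make_function_tool(
        "get_weather",  # hypothetical function name
        "Look up the current weather for a city.",
        {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    )
])
print(len(tools))  # prints: 1
```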