Reasoning
Overview
DeepSeek-R1 is a series of advanced language models developed by deepseek-ai, designed to improve the accuracy of final answers by outputting reasoning chain content (`reasoning_content`). This interface is compatible with the DeepSeek interface; when using this model, it is recommended to upgrade the OpenAI SDK to support the new parameters.
Supported Models:
- deepseek-ai/DeepSeek-R1
- deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
Installation and Upgrade
Before using DeepSeek-R1, ensure that the latest version of the OpenAI SDK is installed. You can upgrade it using the following command:
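A typical upgrade, assuming a pip-based Python environment (the exact command was not included in the source):

```shell
pip install -U openai
```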
API Parameters
- Input Parameters:
  - `max_tokens`: The maximum length of the answer (including reasoning chain output). The maximum value for `max_tokens` is 16K.
- Return Parameters:
  - `reasoning_content`: The reasoning chain content, at the same level as `content`.
  - `content`: The final answer content.
- Usage Recommendations:
  - Set `temperature` between 0.5 and 0.7 (0.6 is recommended) to prevent infinite loops or incoherent outputs.
  - Set `top_p` to 0.95.
  - Avoid adding a system prompt; include all instructions in the user prompt.
  - For mathematical problems, include an instruction in the prompt, such as: "Please reason step by step, and put your final answer within \boxed{}."
  - When evaluating model performance, conduct multiple tests and average the results.
  - The DeepSeek-R1 series tends to bypass reasoning mode (outputting "\n\n") for certain queries, which may affect model performance. To ensure adequate reasoning, force the model to start each output with "<think>\n".
Context Concatenation
During each round of conversation, the model outputs the reasoning chain content (`reasoning_content`) and the final answer (`content`). In the next round of conversation, the reasoning chain content from the previous round is not concatenated into the context.
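A minimal sketch of how a client can maintain the conversation history accordingly: only `content` is appended to the message list, while `reasoning_content` from the previous round is dropped. The helper name and the simulated response message are illustrative, not part of the SDK.

```python
from types import SimpleNamespace

def build_next_round(history, assistant_message, next_user_prompt):
    """Append the assistant's final answer and the next user turn.
    Only `content` goes back into the context; `reasoning_content`
    from the previous round is deliberately dropped."""
    history.append({"role": "assistant", "content": assistant_message.content})
    history.append({"role": "user", "content": next_user_prompt})
    return history

# Simulated first-round response message (hypothetical values):
first_reply = SimpleNamespace(
    reasoning_content="Compare 9.11 and 9.8 digit by digit...",
    content="9.8 is greater than 9.11.",
)
history = [{"role": "user", "content": "Which is greater, 9.11 or 9.8?"}]
history = build_next_round(history, first_reply, "Explain why in one sentence.")
# history now holds three turns, none of which contains reasoning_content
```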
OpenAI Request Example
Streaming Output Request
Non-Streaming Output Request
Notes
- API Key: Ensure you are using the correct API key for authentication.
- Streaming Output: Streaming output is suitable for scenarios where incremental responses are needed, while non-streaming output is better for retrieving the complete response at once.
FAQs
- How to obtain an API key?
  Visit SiliconFlow to register and obtain an API key.
- How to handle long texts?
  You can control the output length by adjusting the `max_tokens` parameter, but note that the maximum length is 16K.