Other Issues
1. Garbled Output from the Model
Currently, some models may produce garbled output when parameters are not set. In such cases, you can try setting parameters like temperature
, top_k
, top_p
, and frequency_penalty
.
Modify the corresponding payload as follows, adjusting for different languages as needed:
2. Explanation of max_tokens
For the LLM models provided by the platform:
-
Models with a
max_tokens
limit of16384
:- deepseek-ai/DeepSeek-R1
- deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
-
Models with a
max_tokens
limit of8192
:- Qwen/QwQ-32B-Preview
-
Models with a
max_tokens
limit of4096
:- All other LLM models not mentioned above
If you have special requirements, it is recommended to send feedback via email to contact@siliconflow.com.
3. Explanation of context_length
The context_length
varies across different LLM models. You can search for the specific model on the Models page to view detailed information about the model.
4. Explanation of 429
Error for DeepSeek-R1
and DeepSeek-V3
Models
-
Unverified Users: Limited to
100 requests
per day. If the daily request limit exceeds100
, a429
error will be returned with the message: “Details: RPD limit reached. Could only send 100 requests per day without real name verification.” You can unlock higher rate limits through real-name verification. -
Verified Users: Have higher rate limits. Refer to the Models page for specific values.
If the request limit is exceeded, a
429
error will also be returned.
5. Differences Between Pro and Non-Pro Models
-
For some models, the platform provides both free and paid versions. The free version retains its original name, while the paid version is prefixed with “Pro/” to distinguish them. The rate limits for the free version are fixed, while those for the paid version are variable. Refer to Rate Limits for specific rules.
-
For
DeepSeek R1
andDeepSeek V3
models, the naming differs based on the payment method. ThePro version
only supports payment through “recharge balance,” while thenon-Pro version
supports payment through both “gifted balance” and “recharge balance.”
6. Are There Any Time or Quality Requirements for Custom Voice Uploads in Speech Models?
- cosyvoice2: The uploaded voice must be less than 30 seconds.
- GPT-SoVITS: The uploaded voice should be between 3–10 seconds.
- fishaudio: No specific restrictions.
To ensure optimal voice generation results, it is recommended to upload a voice sample of around 8–10 seconds, with clear pronunciation and no noise or background sounds.
7. Troubleshooting Truncated Model Output
You can troubleshoot truncated output from the following perspectives:
- For API requests:
- max_tokens setting: Set
max_tokens
to an appropriate value. If the output exceeds themax_tokens
value, it will be truncated. The maximummax_tokens
value for the DeepSeek R1 series is 16384. - Enable streaming output requests: For non-streaming requests, long outputs may result in a
504 timeout
. - Set client timeout: Increase the client timeout to prevent truncation due to the timeout being reached before output completion.
- max_tokens setting: Set
- For third-party client requests:
- CherryStdio: The default
max_tokens
is 4096. Users can adjust this by enabling the “Message Length Limit” switch and settingmax_tokens
to an appropriate value.
- CherryStdio: The default