1. Using Stream Mode in Python

1.1 Stream Mode with the OpenAI Library

For most scenarios, the OpenAI library is the recommended way to use Stream Mode.

from openai import OpenAI

client = OpenAI(
    base_url='https://api.ap.siliconflow.com/v1',
    api_key='your-api-key'
)

# Send a request with Stream Mode
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V2.5",
    messages=[
        {"role": "user", "content": "What does the release of DeepSeek mean for the field of large AI models?"}
    ],
    stream=True  # Enable Stream Mode
)

# Receive and print the response incrementally
for chunk in response:
    chunk_message = chunk.choices[0].delta.content
    if chunk_message:  # the final chunk's delta may carry no content
        print(chunk_message, end='', flush=True)
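
If you also need the complete reply once streaming finishes, you can accumulate the deltas as they arrive. Below is a minimal variant of the loop above (it replaces that loop, since the stream can only be consumed once; full_reply and full_text are illustrative names):

# Collect streamed deltas while printing them
full_reply = []
for chunk in response:
    chunk_message = chunk.choices[0].delta.content
    if chunk_message:
        print(chunk_message, end='', flush=True)
        full_reply.append(chunk_message)

full_text = ''.join(full_reply)  # the complete response text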

1.2 Stream Mode with the Requests Library

If you are not using the OpenAI library, for example when calling the SiliconCloud API directly with the Requests library, note that streaming must be enabled in two places: set "stream": true in the request payload, and also pass stream=True to the request itself so the response body is read incrementally rather than buffered in full.

import requests

url = "https://api.ap.siliconflow.com/v1/chat/completions"

payload = {
    "model": "deepseek-ai/DeepSeek-V2.5",  # Replace with your model
    "messages": [
        {
            "role": "user",
            "content": "What does the release of DeepSeek mean for the field of large AI models?"
        }
    ],
    "stream": True  # Enable Stream Mode in the payload
}

headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "authorization": "Bearer your-api-key"
}

response = requests.post(url, json=payload, headers=headers, stream=True)  # And enable it in the request itself

# Print the raw streaming response line by line
# iter_lines avoids decode errors from multi-byte characters split across chunks
if response.status_code == 200:
    for line in response.iter_lines():
        if line:
            print(line.decode('utf-8'))
else:
    print('Request failed with status code:', response.status_code)
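
Printed this way, each line of the stream is a Server-Sent Events record: data:  followed by a JSON chunk in the same shape as the OpenAI chunks above, with data: [DONE] marking the end of the stream on OpenAI-compatible APIs. If you want only the generated text, you can parse those lines instead. Below is a minimal sketch that replaces the printing loop above, reusing the response object from the request (the [DONE] sentinel and the chunk field layout are assumptions based on the OpenAI-compatible format):

import json

if response.status_code == 200:
    for line in response.iter_lines():
        if not line:
            continue  # skip SSE keep-alive blank lines
        decoded = line.decode('utf-8')
        if not decoded.startswith('data: '):
            continue  # ignore any non-data lines
        data = decoded[len('data: '):]
        if data == '[DONE]':  # assumed end-of-stream sentinel (OpenAI-compatible)
            break
        chunk = json.loads(data)
        choices = chunk.get('choices') or []
        if choices:
            delta_content = choices[0].get('delta', {}).get('content')
            if delta_content:
                print(delta_content, end='', flush=True)
else:
    print('Request failed with status code:', response.status_code)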