1. About DB-GPT

DB-GPT is an open-source, AI-native data application development framework built around AWEL (Agentic Workflow Expression Language) and agents.

Its purpose is to build infrastructure for the large-model era by providing capabilities such as multi-model management (SMMF), Text2SQL optimization, a RAG framework with enhancements, multi-agent collaboration, and AWEL (agentic workflow orchestration), making it simpler and more convenient to build large-model applications centered around databases.

2. Obtain API Key

2.1 Open the SiliconCloud official website and register an account (or log in if you already have one).

2.2 After registration, navigate to API Key, create a new API Key, and copy it for later use.
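
If you want to confirm the key works before deploying, a quick call to the OpenAI-compatible chat completions endpoint is enough. The snippet below is a minimal sketch, not part of the deployment itself; it assumes the openai Python package is installed (pip install openai) and reuses the base URL and model configured later in Step 3.4.

# Minimal sanity check for the SiliconCloud API Key (assumes `pip install openai`).
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("SILICONFLOW_API_KEY"),       # the key copied in Step 2.2
    base_url="https://api.ap.siliconflow.com/v1",   # same base URL used in Step 3.4
)

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(resp.choices[0].message.content)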

3. Deploy DB-GPT

3.1 Clone the DB-GPT Source Code

git clone https://github.com/eosphoros-ai/DB-GPT.git

3.2 Create a Virtual Environment and Install Dependencies

# Navigate to the root directory of the DB-GPT source code
cd DB-GPT

# DB-GPT requires Python >= 3.10
conda create -n dbgpt_env python=3.10
conda activate dbgpt_env

# Install dependencies for proxy model support
pip install -e ".[proxy]"

3.3 Configure Basic Environment Variables

# Copy the template env file as .env
cp .env.template .env

3.4 Modify the .env Environment Variable File to Configure the SiliconCloud Model

# Use the proxy model from SiliconCloud
LLM_MODEL=siliconflow_proxyllm
# Specify the model name to use
SILICONFLOW_MODEL_VERSION=Qwen/Qwen2.5-Coder-32B-Instruct
SILICONFLOW_API_BASE=https://api.ap.siliconflow.com/v1
# Enter the API Key obtained in Step 2
SILICONFLOW_API_KEY={your-siliconflow-api-key}

# Configure the Embedding model from SiliconCloud
EMBEDDING_MODEL=proxy_http_openapi
PROXY_HTTP_OPENAPI_PROXY_SERVER_URL=https://api.ap.siliconflow.com/v1/embeddings
# Enter the API Key obtained in Step 2
PROXY_HTTP_OPENAPI_PROXY_API_KEY={your-siliconflow-api-key}
# Specify the Embedding model name
PROXY_HTTP_OPENAPI_PROXY_BACKEND=BAAI/bge-large-zh-v1.5

# Configure the rerank model from SiliconCloud
RERANK_MODEL=rerank_proxy_siliconflow
RERANK_PROXY_SILICONFLOW_PROXY_SERVER_URL=https://api.ap.siliconflow.com/v1/rerank
# Enter the API Key obtained in Step 2
RERANK_PROXY_SILICONFLOW_PROXY_API_KEY={your-siliconflow-api-key}
# Specify the rerank model name
RERANK_PROXY_SILICONFLOW_PROXY_BACKEND=BAAI/bge-reranker-v2-m3

Note that the SILICONFLOW_API_KEY, PROXY_HTTP_OPENAPI_PROXY_API_KEY, and RERANK_PROXY_SILICONFLOW_PROXY_API_KEY environment variables all take the SiliconCloud API Key obtained in Step 2. The available language models (SILICONFLOW_MODEL_VERSION), embedding models (PROXY_HTTP_OPENAPI_PROXY_BACKEND), and rerank models (RERANK_PROXY_SILICONFLOW_PROXY_BACKEND) can be found in the SiliconFlow model list.
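
Before starting the service, you can optionally verify that the embedding endpoint and key from the .env file are reachable. The snippet below is a minimal sketch, assuming the endpoint follows the OpenAI-compatible embeddings request format and that the requests package is installed.

# Optional sanity check for the embedding endpoint configured above
# (assumes `pip install requests`; the endpoint is OpenAI-compatible).
import os
import requests

resp = requests.post(
    "https://api.ap.siliconflow.com/v1/embeddings",
    headers={"Authorization": f"Bearer {os.getenv('SILICONFLOW_API_KEY')}"},
    json={"model": "BAAI/bge-large-zh-v1.5", "input": ["Hello, world!"]},
    timeout=30,
)
print(resp.status_code, len(resp.json()["data"][0]["embedding"]))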

3.5 Start the DB-GPT Service

dbgpt start webserver --port 5670

Open a browser and navigate to http://127.0.0.1:5670/ to access the deployed DB-GPT.
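
To confirm the service is up without opening a browser, you can simply poll the web UI's root URL. This is a minimal sketch using the requests package; the port matches the --port 5670 flag above.

# Quick reachability check for the DB-GPT webserver (assumes `pip install requests`).
import requests

resp = requests.get("http://127.0.0.1:5670/", timeout=5)
print(resp.status_code)  # 200 means the web UI is being served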

4. Use SiliconCloud Models through the DB-GPT Python SDK

4.1 Install the DB-GPT Python Package

pip install "dbgpt>=0.6.3rc2" openai requests numpy

The openai, requests, and numpy packages are additional dependencies used by the examples below.

4.2 Use SiliconCloud’s Large Language Model

import asyncio
import os
from dbgpt.core import ModelRequest
from dbgpt.model.proxy import SiliconFlowLLMClient

model = "Qwen/Qwen2.5-Coder-32B-Instruct"
client = SiliconFlowLLMClient(
    api_key=os.getenv("SILICONFLOW_API_KEY"),
    model_alias=model
)

res = asyncio.run(
    client.generate(
        ModelRequest(
            model=model,
            messages=[
                {"role": "system", "content": "You are a helpful AI assistant."},
                {"role": "human", "content": "Hello"},
            ]
        )
    )
)
print(res)

4.3 Use SiliconCloud’s Embedding Model

import os
from dbgpt.rag.embedding import OpenAPIEmbeddings

openai_embeddings = OpenAPIEmbeddings(
    api_url="https://api.ap.siliconflow.com/v1/embeddings",
    api_key=os.getenv("SILICONFLOW_API_KEY"),
    model_name="BAAI/bge-large-zh-v1.5",
)

texts = ["Hello, world!", "How are you?"]
res = openai_embeddings.embed_documents(texts)
print(res)
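
Continuing from the example above, a single query string can be embedded the same way. This is a brief sketch assuming OpenAPIEmbeddings exposes the standard embed_query method of DB-GPT's embedding interface.

# Embed a single query string (assumes the standard embed_query method is available).
query_vector = openai_embeddings.embed_query("What is DB-GPT?")
print(len(query_vector))  # dimensionality of the embedding vector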

4.4 Use SiliconCloud’s Rerank Model

import os
from dbgpt.rag.embedding import SiliconFlowRerankEmbeddings

embedding = SiliconFlowRerankEmbeddings(
    api_key=os.getenv("SILICONFLOW_API_KEY"),
    model_name="BAAI/bge-reranker-v2-m3",
)
# Rank Chinese candidates ("apple", "banana", "fruit", "vegetable") against the English
# query "Apple"; bge-reranker-v2-m3 is multilingual, so cross-lingual scoring works.
res = embedding.predict("Apple", candidates=["苹果", "香蕉", "水果", "蔬菜"])
print(res)

5. Getting Started Guide

Take the data dialogue feature as an example. Data dialogue lets you converse with your data in natural language; it primarily supports structured and semi-structured data and assists with data analysis and insights. The specific steps are as follows:

1. Add a Data Source

First, select “Data Source” on the left to add a database. DB-GPT currently supports multiple database types; choose the one that matches your setup. MySQL is used here for demonstration, and the test data for this demo can be found in the test examples. A hypothetical sketch for seeding a small demo table is shown below.
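
If you do not have test data at hand, any small table works for trying out ChatData. The snippet below is a purely illustrative sketch: the database name, table, and rows are made up for this example, and it assumes a reachable MySQL server and the pymysql package (pip install pymysql). Adjust the connection parameters to your environment.

# Hypothetical example data for the ChatData demo (assumes `pip install pymysql`;
# all names and rows below are made up for illustration).
import pymysql

conn = pymysql.connect(host="127.0.0.1", user="root", password="your_password")
with conn.cursor() as cur:
    cur.execute("CREATE DATABASE IF NOT EXISTS dbgpt_demo")
    cur.execute("USE dbgpt_demo")
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS orders (
            id INT PRIMARY KEY,
            customer VARCHAR(64),
            amount DECIMAL(10, 2),
            created_at DATE
        )
        """
    )
    cur.execute(
        "REPLACE INTO orders VALUES (1, 'Alice', 120.50, '2024-01-05'), "
        "(2, 'Bob', 80.00, '2024-01-06'), (3, 'Carol', 230.10, '2024-01-07')"
    )
conn.commit()
conn.close()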

2. Select Dialogue Type

Choose the ChatData dialogue type.

3. Start Data Dialogue

Note: During the conversation, select the corresponding model and database. DB-GPT also provides both preview and edit modes.