Already connected SiliconFlow to Hermes Agent? This guide takes you further, from choosing the right model on SiliconFlow to building your daily assistant on Discord. Plus, a set of practical tips and best practices to help you get more out of every session. New to the setup? Start with the v1 guide first to connect your Hermes Agent with SiliconFlow.

What’s New

The v1 guide covered the step-by-step process of integrating SiliconFlow APIs to Hermes Agent. Since then, Hermes Agent has grown significantly — now ranked #1 in OpenRouter’s Coding Agents Rankings — and introduced more powerful new features worth exploring. This guide will further focus on:

Connecting Hermes Agent to Discord as a daily assistant
Tips & best practices for getting the most out of your Hermes Agent
Choosing the right SiliconFlow model for your use case

Connect Hermes Agent to Discord

With SiliconFlow configured in Hermes Agent, you can deploy it as a Discord bot and start chatting with your AI assistant directly in Discord via DMs or server channels.

Step 1: Create a Discord Application

Go to the Discord Developer Portal and sign in
Click New Application
Enter a name (e.g., Hermes Agent (SiliconFlow API)) and click Create

Step 2: Configure the Bot

In the left sidebar, click Bot
Reset and Copy Bot Token
1. Click Reset Token.
2. ⚠️Copy the token and store it securely. You will need it during the onboarding process.
Under Authorization Flow,
1. Set Public Bot to ON
2. Leave Require OAuth2 Code Grant set to OFF
Scroll down to Privileged Gateway Intents and enable:
1. Server Members Intent
2. Message Content Intent
3. Presence Intent (optional)
Click Save Changes

Step 3: Generate the Invite URL

In the left sidebar, click Installation
Under Installation Contexts, enable Guild Install
Set Install Link to Discord Provided Link
Under Default Install Settings → Guild Install, select:
1. Scopes: bot and applications.commands
2. Permissions: View Channels, Send Messages, Read Message History, Attach Files, etc.

Step 4: Invite the Bot to Your Server

Open the invite URL in your browser.
In the Add to Server dropdown, select your server.
Click Continue, then Authorize.
Complete the CAPTCHA if prompted.

Step 5: Find Your Discord User ID

Hermes Agent uses your User ID to control who can interact with the bot.

In Discord, go to Settings and toggle Developer Mode ON
Right-click your username → Copy User ID

Step 6: Start the Gateway

Run the setup in your terminal:

hermes gateway setup

Select Discord when prompted, then paste your Bot Token and User ID.

Once complete, send the bot a message in Discord to verify your SiliconFlow-powered assistant is live.

For advanced configuration options (free-response channels, session isolation, slash commands, etc.), refer to the Hermes Agent Discord documentation.

Use case: Daily News Report

Tell Hermes bot which website you don’t want to miss, and it’ll check them on a schedule and deliver updates right to your Discord channel. Example:

“Follow the latest news on siliconflow.com every day and report the updates”

Hermes visits the site daily and posts a summary — no manual checking needed.

Tips & Best Practices

Starting Out: Be Specific, Not Vague

Tell Hermes exactly what you need — the more detail, the better the result.

❌ help me with models selection ✅ I need to call a text generation API on SiliconFlow with at least 205K context, cache support, and under $0.5 per million input tokens — what are my options?

When you spell out your requirements — context length, pricing budget, feature needs — Hermes narrows down the options for you in one shot. A vague prompt means back-and-forth clarifications; a specific one means you get what you need right away.

Leveling Up: Run Tasks in Parallel, Stop Waiting in Line

Most people use Hermes one task at a time. But Hermes can spin up multiple sub-agents that work simultaneously — each with its own context — and only the final summaries come back to you. For example, instead of looking things up one by one:

“Check these three things in parallel: (1) which text generation models are available with json mode support on SiliconFlow, (2) how the image generation API request format works for SiliconFlow, (3) what the rate limits are for my SiliconFlow API tier.”

One message, three independent lookups, one consolidated answer. It’s faster and saves tokens because only the key findings are returned to your main conversation.

Going Pro: Keep Your Cache Warm

Here’s something most users don’t realize: LLM providers cache your system prompt prefix. If your context stays stable (sameAGENTS.md, same SOUL.md, same memory), every follow-up message in the session hits the cache — making it significantly cheaper and faster. This matters even more on SiliconFlow. SiliconFlow consistently delivers high cache hit rates — for example, according to OpenRouter’s provider performance data, SiliconFlow achieves the highest cache hit rate among all GLM-5.1 providers at 88.6%, significantly ahead of the competition. For developers, this translates directly to faster response times and lower inference costs on repeated contexts.

What this means in practice:

Don’t change models or rewrite your system prompt mid-session
Use /compress when conversations get long, it trims the token count without losing key context
Pick the right model upfront: use a larger model for complex reasoning and architecture decisions, switch to a lighter one for formatting and boilerplate generation

Run /usage periodically to see where your tokens are going. Small habits, real savings.

Want More Tips?

This covers the essentials to get you productive quickly. Hermes Agent has a rich set of best practices beyond what’s listed here — including CLI power-user shortcuts, context file patterns (AGENTS.md, SOUL.md, .cursorrules), memory management, security best practices, and more. Check out the official documentation: https://hermes-agent.nousresearch.com/docs/guides/tips

Choose the Right Model

Not sure which model to use? Here’s a quick reference by use case.

Model	Best for	Key highlights
DeepSeek-V4-Flash	Fast, cost-effective coding chat / large codebase	1M context · 3 reasoning modes · best value in the V4 series
DeepSeek-V4-Pro	Complex reasoning / large codebase	1M context · #1 open-source on math, STEM & competitive coding · approaches Opus 4.6
GLM-5.1	Long-horizon agentic tasks	58.4 on SWE-Bench Pro · long-horizon execution · iterative self-improvement
Kimi-K2.6	Long-horizon tasks / Frontend generation / multi-agent	Agent swarm architecture · long-horizon coding · prompt-to-frontend generation
MiniMax M2.5	Coding / agentic workflow / office work	SOTA Coding Tied Claude & agentic tool use · trained across 200k+ real-world environments
Qwen3.6-27B	efficient, context-aware coding experiences.	Flagship-level agentic coding performance

Browse the full model library: siliconflow.com/models

Already Using OpenRouter?

Bring Your Own Key

Supports BYOK (Bring Your Own Key) on OpenRouter — your requests draw from your SiliconFlow balance, with OpenRouter waiving platform fees on your first 1M BYOK requests/month Browse the full model library → siliconflow.com/models

Resources

Hermes Agent

Website: https://hermes-agent.nousresearch.com/
Docs: https://hermes-agent.nousresearch.com/docs
Discord: https://discord.gg/NousResearch
Github: https://github.com/NousResearch/hermes-agent

OpenRouter

BYOK Setting: https://openrouter.ai/workspaces/default/byok
SiliconFlow on OpenRouter: https://openrouter.ai/provider/siliconflow

SiliconFlow

Website: https://siliconflow.com
API Documentation: https://docs.siliconflow.com
Model Library: https://siliconflow.com/models
Discord: https://discord.com/invite/7Ey3dVNFpT
X: https://x.com/SiliconFlowAI

​What’s New

​Connect Hermes Agent to Discord

​Step 1: Create a Discord Application

​Step 2: Configure the Bot

​Step 3: Generate the Invite URL

​Step 4: Invite the Bot to Your Server

​Step 5: Find Your Discord User ID

​Step 6: Start the Gateway

​Use case: Daily News Report

​Tips & Best Practices

​Starting Out: Be Specific, Not Vague

​Leveling Up: Run Tasks in Parallel, Stop Waiting in Line

​Going Pro: Keep Your Cache Warm

​Want More Tips?

​Choose the Right Model

​Already Using OpenRouter?

​Bring Your Own Key

​Resources

​Hermes Agent

​OpenRouter

​SiliconFlow