How to Use Hermes Agent with Trinity Large Thinking

Learn how to set up Hermes Agent powered by Trinity-Large-Thinking. This guide covers installation, tool configuration, and launching your AI assistant.

Running a high-reasoning AI assistant on your own hardware has never been more accessible. This walkthrough covers setting up the Hermes agent and getting your AI assistant running with Trinity Large Thinking as the model. Hermes runs on Linux, macOS, and Windows, much like OpenClaw. The ideal setup is typically a VPS or dedicated machine, but for this walkthrough we're running it locally on a Mac, and it performs smoothly.

First, here's a quick look at what Trinity Large Thinking can actually do inside Hermes.

You can install the Hermes agent with a single command:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

This creates a virtual environment for you and installs all the Python and Node.js dependencies you need.
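If piping a remote script straight into bash feels uncomfortable, a common alternative is to download the installer first, review it, and then run it. Same URL as above:

```shell
# Download the installer, inspect it, then run it.
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh -o install.sh
less install.sh   # review what the script will do before executing it
bash install.sh
```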

The first screen you'll encounter is the setup wizard, which provides a straightforward five-step roadmap. To begin, you select your Model and Provider, then configure the Terminal Backend, tweak Agent Settings, and optionally link Messaging Platforms like Telegram or Discord so you can interact with your agent from your phone. Lastly, you set up Tools such as web search or image generation. When you're ready, simply press Enter and work through each step.

Next, we select the inference provider, which is the model backend powering the agent. Hermes supports multiple options, and the overall setup flow is the same for each. In this case, we configure it to use trinity-large-thinking, which is available through both OpenRouter and chat.arcee.ai.
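Before pointing Hermes at the model, it can be worth confirming the endpoint responds to your API key at all. Here's a minimal sketch against OpenRouter's chat completions API, assuming an `OPENROUTER_API_KEY` environment variable is set; the exact model slug is an assumption on my part, so verify it on openrouter.ai first:

```shell
# Hypothetical sanity check -- confirm the model slug on openrouter.ai
# before relying on it; "arcee-ai/trinity-large-thinking" is a guess.
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "arcee-ai/trinity-large-thinking",
        "messages": [{"role": "user", "content": "Say hello in one word."}]
      }'
```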

Once the provider is chosen, we enter the model details. We set the model to trinity-large-thinking, keep the remaining provider configuration as required for that endpoint, and let Hermes manage the context settings. In Agent Settings, we set max iterations to 60 so the agent can make up to 60 tool calls per task, switch Tool Progress Display to all so we can monitor everything it's doing in real time, and leave Context Compression at 0.5 so older messages get summarized once we hit half the memory limit.
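To make the Context Compression setting concrete: with a 0.5 threshold and the 262,144-token window reported in the status bar, Hermes would start summarizing older messages around the halfway point. A quick back-of-the-envelope check:

```shell
# Context Compression at 0.5 means older messages get summarized
# once usage reaches half of the context window.
WINDOW=262144        # context window in tokens (from the status bar)
THRESHOLD=$(awk -v w="$WINDOW" 'BEGIN { printf "%d", w * 0.5 }')
echo "compression starts around ${THRESHOLD} tokens"   # 131072
```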

For Session Reset Policy, Hermes can automatically reset long or inactive conversations, preserving important information before it does, and you can always type /reset manually at any point. For this setup, we go with the recommended policy: inactivity plus a daily reset, whichever comes first. We stick with the defaults of 1440 minutes (24 hours) of inactivity and a daily reset at 4 AM, which keeps sessions tidy without any extra effort.
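As a worked example of "whichever comes first", suppose the last message arrived at 10:00 PM (a made-up time, purely for illustration). The 4 AM daily reset is then 6 hours away, well inside the 1440-minute inactivity window, so for an idle session the daily reset fires first:

```shell
# Worked example with a hypothetical clock time, using
# minutes-since-midnight arithmetic modulo one day (1440 minutes).
NOW_MIN=$((22 * 60))          # 10:00 PM
DAILY_RESET_MIN=$((4 * 60))   # 4:00 AM
INACTIVITY_MIN=1440           # default: 24 hours of inactivity
TO_DAILY=$(( (DAILY_RESET_MIN - NOW_MIN + 1440) % 1440 ))
echo "daily reset in ${TO_DAILY} min, inactivity reset in ${INACTIVITY_MIN} min"
```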

For messaging platforms, we skip everything for now. Telegram, Discord, Slack, Matrix, and WhatsApp are all set to no, since they can always be connected later if we want to chat with the agent from a phone.

Then we move to tools configuration, where the agent really comes to life. Web search, browser control, terminal access, file handling, code execution, vision, memory, and more are already enabled by default. We leave Mixture of Agents, RL training, and Home Assistant disabled since those require extra setup, and keep everything else on so the agent is ready to go.

For browser automation, we choose Local Browser, which gives us a free headless Chromium instance running locally with no additional setup. For text-to-speech, we keep the default Microsoft Edge TTS since it's free and works out of the box. For web search, we skip paid options because Hermes already includes built-in DuckDuckGo search.

And that's it — installation complete. We type hermes to launch it, and the dashboard shows all our tools and skills ready to go.

If Hermes opens with the wrong default model selected, the fix is simple. Exit, run hermes model, choose the correct saved provider, and set the model to trinity-large-thinking. Once that's done, launching Hermes again will show the correct model in the status bar.
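In shell terms, that recovery sequence looks like this (the model picker itself is interactive):

```shell
# Fix a wrong default model: reopen the model picker, then relaunch.
hermes model   # choose the saved provider, set the model to trinity-large-thinking
hermes         # relaunch; the status bar should now show trinity-large-thinking
```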

Now, when we open Hermes, the status bar shows trinity-large-thinking, which confirms the agent is running with the correct model configuration.

To test it, we type a simple message like "hey, how are you doing today?" and the agent responds immediately with a normal greeting.

The status bar now shows:

⚕ trinity-large-thinking │ 11.1K/262.144K │ [░░░░░░░░░░] 4% │ 6m

That confirms Hermes is using trinity-large-thinking, with its 262,144-token context window and the current session usage visible directly in the interface.
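The 4% figure is easy to verify from the numbers in the bar: roughly 11,100 tokens in use out of a 262,144-token window.

```shell
# Sanity-check the status bar's usage percentage.
USED=11100           # 11.1K tokens currently in use
WINDOW=262144        # total context window
PCT=$(awk -v u="$USED" -v w="$WINDOW" 'BEGIN { printf "%d", u / w * 100 }')
echo "${PCT}% of context used"   # 4%
```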

Here's a video walkthrough that you can pause and reference throughout the setup: