High-level shape

hrns is a small composition of six packages:
  • main: wires everything together
  • openai: OpenAI-compatible request and streaming client
  • loop: agent loop and tool execution
  • tools: bundled tool implementations
  • skills: skill discovery and the load_skill tool
  • tui: terminal UI and one-shot exec runner

Boot sequence

At startup, main.go does the following:
  1. creates a background context
  2. loads skills from the default global and local roots
  3. creates the load_skill tool from the discovered skills
  4. creates the built-in agent list and loads file-system agents
  5. creates tui.TUIApp with the built-in tools, agents, and skill metadata
  6. starts the bundled runner
Inside tui.Run, startup continues like this:
  1. load ~/.config/hrns/config.json
  2. if the file is missing or has no providers, run onboarding
  3. select the saved current agent if it is registered; otherwise pick one registered agent and save it as current
  4. compose the system message from the selected agent or base prompt plus skill metadata
  5. pick currentProvider
  6. build openai.Client from that provider’s url, key, and skipVerify
  7. create loop.Loop with the current client and tools
  8. choose interactive mode by default, or exec mode when the first CLI argument is exec
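Steps 1, 2, 5, and 6 above can be pictured with a minimal sketch. It is not the hrns source: the JSON key names, the Config and Provider types, and the helpers parseConfig and httpClientFor are all assumptions made for illustration. Only the three provider fields (url, key, skipVerify) and the "no providers means onboarding" rule come from the text.

```go
package main

import (
	"crypto/tls"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// Provider holds the three fields the text mentions.
// The JSON key names are assumed, not taken from hrns.
type Provider struct {
	URL        string `json:"url"`
	Key        string `json:"key"`
	SkipVerify bool   `json:"skipVerify"`
}

// Config is a hypothetical shape for ~/.config/hrns/config.json.
type Config struct {
	CurrentProvider string              `json:"currentProvider"`
	Providers       map[string]Provider `json:"providers"`
}

// errNeedsOnboarding signals that no usable provider is configured.
var errNeedsOnboarding = errors.New("no providers configured")

// parseConfig decodes the config and reports when onboarding is needed.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	if len(cfg.Providers) == 0 {
		return Config{}, errNeedsOnboarding
	}
	return cfg, nil
}

// httpClientFor builds an HTTP client that skips TLS verification
// only when the provider asks for it.
func httpClientFor(p Provider) *http.Client {
	tr := &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: p.SkipVerify},
	}
	return &http.Client{Transport: tr}
}

func main() {
	raw := []byte(`{"currentProvider":"local","providers":{"local":{"url":"http://localhost:8080/v1","key":"sk-test","skipVerify":true}}}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("onboarding needed:", err)
		return
	}
	p := cfg.Providers[cfg.CurrentProvider]
	_ = httpClientFor(p) // would be handed to the API client
	fmt.Println("provider url:", p.URL, "skipVerify:", p.SkipVerify)
}
```

Keeping skipVerify per provider means a self-signed local endpoint can be used without weakening TLS checks for every other provider.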

Request lifecycle

For each interactive user turn in the TUI:
  1. the TUI appends a user message to the current session
  2. it starts RunLoop with the current message history and chosen model
  3. RunLoop converts registered tools into OpenAI-style function schemas
  4. openai.Client.StreamChatCompletion streams SSE events from /chat/completions
  5. loop emits chunks for assistant text, reasoning, and tool call events
  6. if the model called tools, loop executes them and appends tool messages
  7. the loop repeats until a streamed response finishes without tool calls
  8. the TUI updates its in-memory conversation from agent.Messages()
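The repeat-until-no-tool-calls cycle in steps 2 through 7 can be sketched as follows. This is a simplified illustration, not the actual RunLoop: the Message, ToolCall, Completer, and Tool types are hypothetical, streaming is replaced by a function that returns a finished assistant message, and the way tool errors are turned into strings is an assumption.

```go
package main

import "fmt"

// ToolCall and Message are simplified, hypothetical stand-ins
// for the OpenAI-style chat types.
type ToolCall struct {
	ID, Name, Arguments string
}

type Message struct {
	Role       string
	Content    string
	ToolCalls  []ToolCall
	ToolCallID string
}

// Completer stands in for the streaming client: it returns the
// fully assembled assistant message for a given history.
type Completer func(history []Message) (Message, error)

// Tool takes the raw JSON argument string and returns a string result.
type Tool func(args string) (string, error)

// runLoop keeps calling the model until a reply has no tool calls.
func runLoop(complete Completer, tools map[string]Tool, history []Message) ([]Message, error) {
	for {
		reply, err := complete(history)
		if err != nil {
			return history, err
		}
		history = append(history, reply)
		if len(reply.ToolCalls) == 0 {
			return history, nil // final answer, stop looping
		}
		for _, call := range reply.ToolCalls {
			result := "unknown tool: " + call.Name
			if tool, ok := tools[call.Name]; ok {
				out, err := tool(call.Arguments)
				if err != nil {
					result = "error: " + err.Error() // assumed error format
				} else {
					result = out
				}
			}
			history = append(history, Message{Role: "tool", Content: result, ToolCallID: call.ID})
		}
	}
}

func main() {
	// Fake model: asks for the echo tool once, then answers.
	complete := func(h []Message) (Message, error) {
		last := h[len(h)-1]
		if last.Role == "tool" {
			return Message{Role: "assistant", Content: "The tool said: " + last.Content}, nil
		}
		return Message{Role: "assistant", ToolCalls: []ToolCall{
			{ID: "call_1", Name: "echo", Arguments: `{"text":"hi"}`},
		}}, nil
	}
	tools := map[string]Tool{
		"echo": func(args string) (string, error) { return args, nil },
	}
	history, err := runLoop(complete, tools, []Message{{Role: "user", Content: "say hi"}})
	if err != nil {
		fmt.Println("loop failed:", err)
		return
	}
	for _, m := range history {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```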

Stream accumulation

The openai.ChatCompletionAccumulator is a key part of the runtime. It merges partial streamed deltas into complete choices by:
  • concatenating text content
  • preserving structured content when text concatenation does not apply
  • stitching together fragmented tool-call arguments
  • preserving extra provider-specific fields
That is what lets RunLoop wait until the stream ends and then execute fully assembled tool calls.
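To make the merging concrete, here is a minimal sketch of two of the behaviors listed above: concatenating text content and stitching fragmented tool-call arguments. It is not the real openai.ChatCompletionAccumulator; the Accumulator, Delta, and ToolCallDelta types are hypothetical and cover only those two cases. It relies on the OpenAI streaming convention that each tool-call fragment carries an index, with the id and name arriving once and the arguments arriving in pieces.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ToolCallDelta is one streamed fragment of a tool call (hypothetical type).
type ToolCallDelta struct {
	Index     int
	ID        string
	Name      string
	Arguments string
}

// Delta is one streamed chunk of an assistant message (hypothetical type).
type Delta struct {
	Content   string
	ToolCalls []ToolCallDelta
}

// ToolCall is a fully assembled tool call.
type ToolCall struct {
	ID, Name, Arguments string
}

// Accumulator merges deltas into complete content and tool calls.
type Accumulator struct {
	content strings.Builder
	calls   map[int]*ToolCall
}

// Add folds one delta into the accumulated state.
func (a *Accumulator) Add(d Delta) {
	a.content.WriteString(d.Content)
	for _, tc := range d.ToolCalls {
		if a.calls == nil {
			a.calls = map[int]*ToolCall{}
		}
		cur, ok := a.calls[tc.Index]
		if !ok {
			cur = &ToolCall{}
			a.calls[tc.Index] = cur
		}
		if tc.ID != "" {
			cur.ID = tc.ID
		}
		if tc.Name != "" {
			cur.Name = tc.Name
		}
		cur.Arguments += tc.Arguments // fragments arrive in order
	}
}

// Result returns the merged text and the tool calls ordered by index.
func (a *Accumulator) Result() (string, []ToolCall) {
	idx := make([]int, 0, len(a.calls))
	for i := range a.calls {
		idx = append(idx, i)
	}
	sort.Ints(idx)
	out := make([]ToolCall, 0, len(idx))
	for _, i := range idx {
		out = append(out, *a.calls[i])
	}
	return a.content.String(), out
}

func main() {
	var acc Accumulator
	acc.Add(Delta{Content: "Reading "})
	acc.Add(Delta{Content: "the file.", ToolCalls: []ToolCallDelta{{Index: 0, ID: "call_1", Name: "read_file", Arguments: `{"pa`}}})
	acc.Add(Delta{ToolCalls: []ToolCallDelta{{Index: 0, Arguments: `th":"a.txt"}`}}})
	text, calls := acc.Result()
	fmt.Println(text)
	fmt.Printf("%+v\n", calls)
}
```

Because the argument string is only valid JSON once the last fragment has arrived, parsing it before the stream ends would fail, which is why execution waits for the assembled call.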

Tool execution model

Tool execution is synchronous inside the loop:
  • tools are looked up by name in a map
  • arguments are parsed from the tool call’s JSON string
  • the tool returns a string result
  • the result is appended as a tool message
There is no retry layer, timeout layer, or per-tool sandboxing in the loop package itself.
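The four steps above amount to a short function. The sketch below is an illustration under assumptions rather than the loop package's code: the Tool signature, the execute helper, and the error wording are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tool receives decoded arguments and returns a plain string result.
// This signature is an assumption for illustration.
type Tool func(args map[string]any) (string, error)

// execute looks a tool up by name, parses the JSON argument string,
// and runs the tool synchronously.
func execute(tools map[string]Tool, name, rawArgs string) (string, error) {
	tool, ok := tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	var args map[string]any
	if err := json.Unmarshal([]byte(rawArgs), &args); err != nil {
		return "", fmt.Errorf("parse arguments for %q: %w", name, err)
	}
	return tool(args)
}

func main() {
	tools := map[string]Tool{
		// "echo" is a made-up tool used only for this example.
		"echo": func(args map[string]any) (string, error) {
			text, ok := args["text"].(string)
			if !ok {
				return "", fmt.Errorf("text must be a string")
			}
			return text, nil
		},
	}
	out, err := execute(tools, "echo", `{"text":"hello"}`)
	fmt.Println(out, err)
}
```

Since the call is a plain synchronous function call, an embedding application that needs timeouts or retries would have to wrap the tool function itself before registering it.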

State model

Two kinds of state matter:

Loop state

loop.Loop stores:
  • the client
  • the tool map
  • the last completed message history
  • a chunk channel

TUI state

The TUI stores:
  • the system prompt
  • the tool map
  • the registered slash commands
  • the registered agents
  • the discovered skill metadata used for system prompt context
  • the current conversation history
  • the loaded config
  • the active model string for the session
The TUI is where command registration, onboarding, model persistence, agent selection, and conversation reset happen today.

In exec mode, the same startup path is reused, but the app creates a fresh message list containing the active system prompt plus one user message taken from -message, runs the loop once, and exits. If -provider is passed without -model, the selected provider’s saved default model is used.
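The exec-mode flag rule can be sketched with the standard flag package. Only the flag names -message, -provider, and -model and the "provider without model uses the saved default" rule come from the text. The ExecOptions type, the parseExec helper, the defaultModels map, and the fallback to the current provider and active model when neither flag is given are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// ExecOptions is a hypothetical result type for exec-mode arguments.
type ExecOptions struct {
	Provider, Model, Message string
}

// parseExec resolves the provider and model for a single exec run.
// defaultModels maps provider name to its saved default model.
func parseExec(args []string, currentProvider, activeModel string, defaultModels map[string]string) (ExecOptions, error) {
	fs := flag.NewFlagSet("exec", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	provider := fs.String("provider", "", "provider name")
	model := fs.String("model", "", "model name")
	message := fs.String("message", "", "user message")
	if err := fs.Parse(args); err != nil {
		return ExecOptions{}, err
	}
	// Assumed fallback: with no flags, use the session's current values.
	opts := ExecOptions{Provider: currentProvider, Model: activeModel, Message: *message}
	if *provider != "" {
		opts.Provider = *provider
		// -provider without -model: use that provider's saved default.
		opts.Model = defaultModels[*provider]
	}
	if *model != "" {
		opts.Model = *model // an explicit -model always wins
	}
	return opts, nil
}

func main() {
	defaults := map[string]string{"local": "small-model"}
	opts, err := parseExec([]string{"-provider", "local", "-message", "hi"}, "cloud", "big-model", defaults)
	if err != nil {
		fmt.Println("bad arguments:", err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```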

Current design limits

The architecture is small on purpose, but that comes with hard edges:
  • tool schemas are flat and fully required
  • tool results are plain strings
  • /connect updates saved config, but only /provider <name> rebuilds the live client
  • exec is single-shot only and does not persist conversation state
Those limits are acceptable for experiments and for library-style embedding, where you own the outer application.
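As an illustration of the first limit, a flat and fully required schema might be produced along these lines. The Param and ToolSpec types, the functionSchema helper, and the read_file example are hypothetical. The output layout follows the OpenAI function-calling format, where parameters is a JSON Schema object with properties and required.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Param describes one scalar tool parameter (hypothetical type).
type Param struct {
	Name, Type, Description string
}

// ToolSpec is a hypothetical description of a registered tool.
type ToolSpec struct {
	Name, Description string
	Params            []Param
}

// functionSchema builds an OpenAI-style function schema in which
// every parameter is top-level (flat) and listed as required.
func functionSchema(t ToolSpec) map[string]any {
	props := map[string]any{}
	required := make([]string, 0, len(t.Params))
	for _, p := range t.Params {
		props[p.Name] = map[string]any{
			"type":        p.Type,
			"description": p.Description,
		}
		required = append(required, p.Name)
	}
	return map[string]any{
		"type": "function",
		"function": map[string]any{
			"name":        t.Name,
			"description": t.Description,
			"parameters": map[string]any{
				"type":       "object",
				"properties": props,
				"required":   required,
			},
		},
	}
}

func main() {
	spec := ToolSpec{
		Name:        "read_file", // made-up tool for this example
		Description: "Read a file from disk",
		Params: []Param{
			{Name: "path", Type: "string", Description: "File path"},
			{Name: "limit", Type: "integer", Description: "Maximum number of lines"},
		},
	}
	out, err := json.MarshalIndent(functionSchema(spec), "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

With this shape there is no way to express an optional parameter or a nested object, so the model must supply every argument on every call.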