Grain Model

Every entity in Komand is an Orleans grain — a virtual actor with isolated state, single-threaded execution, and automatic lifecycle management. Grains are never explicitly created or destroyed; Orleans activates them on first access and deactivates them when idle, persisting state to PostgreSQL automatically.

This page describes each grain type, its key, state, interface, and behaviour.

| Grain | Key Pattern | Responsibility |
| --- | --- | --- |
| AgentGrain | {agentId} | AI personality, conversation history, memory |
| SessionGrain | {Channel}:{ChannelAccountId}:{SenderId} | Conversation thread, agent binding |
| ToolGrain | {executionId} | Skill execution lifecycle |
| CronGrain | {AgentId}:{TaskId} | Scheduled task execution |
| SkillRegistryGrain | "global" (singleton) | Skill catalog and permission validation |

AgentGrain

The core grain that represents an AI agent. Each agent has its own configuration, conversation history, and long-term memory.

Key: agentId (e.g., "default", "sales-bot")

State:

  • AgentConfig: name, system prompt, model provider, model ID, enabled skills, max context tokens
  • SessionHistories: dictionary of session ID to conversation turns (capped at MaxTurnsPerSession turns per session, default 100)
  • Memories: dictionary of key to value for long-term recall (max 1,000 entries, 100,000 characters per value)

Interface:

public interface IAgentGrain : IGrainWithStringKey
{
    Task ConfigureAsync(AgentConfig config);
    Task<AgentConfig?> GetConfigAsync();
    Task<OutboundMessage> ProcessMessageAsync(InboundMessage message, string sessionId);
    Task<IReadOnlyList<ConversationTurn>> GetHistoryAsync(string sessionId, int maxTurns);
    Task ClearHistoryAsync(string sessionId);
    Task StoreMemoryAsync(string key, string value);
    Task<string?> RecallMemoryAsync(string key);
}

Default Configuration:

| Setting | Default |
| --- | --- |
| ModelProvider | "anthropic" |
| ModelId | "claude-sonnet-4-20250514" |
| MaxContextTokens | 100,000 |

Behaviour:

  • When ProcessMessageAsync is called, the agent appends the user turn to session history and returns a response
  • When session history exceeds MaxTurnsPerSession, the oldest turns are trimmed before adding new ones
  • When active sessions exceed MaxSessionsPerAgent, the oldest session (by LastMessageAt) is evicted
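The trimming and eviction rules above can be sketched in plain C#. This is a minimal illustration, not Komand's implementation: the type names `SessionHistorySketch` and `AgentHistorySketch` are hypothetical, turns are reduced to strings, and the sketch assumes the limits are enforced so that neither cap is ever exceeded after an append.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one session's history inside the AgentGrain state.
public sealed class SessionHistorySketch
{
    public List<string> Turns { get; } = new();
    public DateTimeOffset LastMessageAt { get; set; }
}

// Hypothetical model of the per-agent limits; defaults mirror the documented values.
public sealed class AgentHistorySketch
{
    private readonly int _maxTurnsPerSession;
    private readonly int _maxSessionsPerAgent;

    public Dictionary<string, SessionHistorySketch> Sessions { get; } = new();

    public AgentHistorySketch(int maxTurnsPerSession = 100, int maxSessionsPerAgent = 100)
    {
        if (maxTurnsPerSession < 1) throw new ArgumentOutOfRangeException(nameof(maxTurnsPerSession));
        if (maxSessionsPerAgent < 1) throw new ArgumentOutOfRangeException(nameof(maxSessionsPerAgent));
        _maxTurnsPerSession = maxTurnsPerSession;
        _maxSessionsPerAgent = maxSessionsPerAgent;
    }

    public void AppendTurn(string sessionId, string turn, DateTimeOffset now)
    {
        if (!Sessions.TryGetValue(sessionId, out var history))
        {
            // Make room for the new session by evicting the one with the oldest LastMessageAt.
            while (Sessions.Count >= _maxSessionsPerAgent)
            {
                var oldestId = Sessions.MinBy(pair => pair.Value.LastMessageAt).Key;
                Sessions.Remove(oldestId);
            }
            history = new SessionHistorySketch();
            Sessions[sessionId] = history;
        }

        // Trim the oldest turns first so the list stays within the cap after the add.
        var excess = history.Turns.Count + 1 - _maxTurnsPerSession;
        if (excess > 0) history.Turns.RemoveRange(0, excess);

        history.Turns.Add(turn);
        history.LastMessageAt = now;
    }
}
```

With a cap of three turns, appending five turns to one session leaves only the last three. With a cap of two sessions, opening a third session drops the one that has been idle longest.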

SessionGrain

Manages a conversation thread between a user and an agent on a specific channel.

Key: Compound key {Channel}:{ChannelAccountId}:{SenderId}

This key structure ensures that the same user on different channels gets separate sessions, and the same channel with different bot accounts gets separate sessions.
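A small helper makes the compound key concrete. The sketch below is illustrative only: `SessionKeySketch` is a hypothetical type, the channel and account names in the usage note are made up, and it assumes that Channel and ChannelAccountId never contain a colon (a SenderId may).

```csharp
using System;

// Hypothetical value type for the compound SessionGrain key.
public readonly record struct SessionKeySketch(string Channel, string ChannelAccountId, string SenderId)
{
    // Produces the {Channel}:{ChannelAccountId}:{SenderId} form used as the grain key.
    public override string ToString() => $"{Channel}:{ChannelAccountId}:{SenderId}";

    public static SessionKeySketch Parse(string key)
    {
        // Limit the split to three parts so a SenderId that contains ':' stays intact.
        var parts = key.Split(':', 3);
        if (parts.Length != 3 || parts[0].Length == 0 || parts[1].Length == 0 || parts[2].Length == 0)
            throw new FormatException($"Expected Channel:ChannelAccountId:SenderId but got '{key}'.");
        return new SessionKeySketch(parts[0], parts[1], parts[2]);
    }
}
```

For example, `new SessionKeySketch("telegram", "bot-1", "user-42")` and `new SessionKeySketch("slack", "bot-1", "user-42")` format to different strings, so the same sender reaches two distinct session grains.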

State:

  • SessionId — unique identifier
  • Channel, ChannelAccountId, SenderId — routing metadata
  • BoundAgentId — which agent handles this session (default: "default")
  • MessageCount, StartedAt, LastMessageAt — metrics

Interface:

public interface ISessionGrain : IGrainWithStringKey
{
    Task BindAgentAsync(string agentId);
    Task<OutboundMessage> HandleMessageAsync(InboundMessage message);
    Task<SessionInfo> GetInfoAsync();
    Task EndSessionAsync();
}

Behaviour:

  • Lazy initialization: On the first message, the session initializes its state from the inbound message metadata (channel, sender, timestamps)
  • Default binding: New sessions automatically bind to the "default" agent
  • Channel mismatch detection: If a message arrives with a different channel than the session was initialized with, it is rejected
  • Routing: HandleMessageAsync delegates to the bound AgentGrain.ProcessMessageAsync
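The four behaviours above can be modelled without Orleans to show how they interact. This is a sketch under stated assumptions: `SessionStateSketch` and its `Accept` method are hypothetical, the inbound message is flattened into plain parameters, and rejection is signalled with an exception because the source does not say how Komand reports it.

```csharp
using System;

// Hypothetical in-memory model of the SessionGrain state and message handling.
public sealed class SessionStateSketch
{
    public string? Channel { get; private set; }
    public string? ChannelAccountId { get; private set; }
    public string? SenderId { get; private set; }
    public string BoundAgentId { get; private set; } = "default"; // default binding
    public int MessageCount { get; private set; }
    public DateTimeOffset? StartedAt { get; private set; }
    public DateTimeOffset? LastMessageAt { get; private set; }

    public void BindAgent(string agentId) => BoundAgentId = agentId;

    // Returns the ID of the agent that should process the message (the routing target).
    public string Accept(string channel, string channelAccountId, string senderId, DateTimeOffset timestamp)
    {
        if (Channel is null)
        {
            // Lazy initialization: the first message supplies the routing metadata.
            Channel = channel;
            ChannelAccountId = channelAccountId;
            SenderId = senderId;
            StartedAt = timestamp;
        }
        else if (!string.Equals(Channel, channel, StringComparison.Ordinal))
        {
            // Channel mismatch detection; throwing here is an assumption of this sketch.
            throw new InvalidOperationException(
                $"Session was initialized for channel '{Channel}' but received '{channel}'.");
        }

        MessageCount++;
        LastMessageAt = timestamp;
        return BoundAgentId;
    }
}
```

In the real grain, the returned agent ID would be used to obtain the bound AgentGrain and call ProcessMessageAsync on it.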

ToolGrain

Executes a skill action with lifecycle tracking. Each execution gets its own grain, which isolates its state: a misbehaving skill cannot corrupt the state of other executions.

Key: executionId (string)

State:

  • ToolExecutionRequest — tool name, agent ID, session ID, parameters, timeout, requested timestamp
  • ToolExecutionStatus — status enum tracking the execution lifecycle
  • ToolExecutionResult — output, error, completion timestamp, duration

Interface:

public interface IToolGrain : IGrainWithStringKey
{
    Task<ToolExecutionResult> ExecuteAsync(ToolExecutionRequest request);
    Task<ToolExecutionStatus> GetStatusAsync();
    Task CancelAsync();
}

Status Lifecycle:

Pending → Running → Completed
                  → Failed
                  → TimedOut
                  → Cancelled

Behaviour:

  • Timeout is enforced per-execution and capped at MaxToolExecutionTimeoutMinutes (default: 30)
  • If the execution exceeds its timeout, the status transitions to TimedOut
  • CancelAsync aborts an execution that is still running; the status then transitions to Cancelled
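The lifecycle and the timeout cap can be expressed as two small pure functions. This is a hedged sketch, not the grain's code: `ToolStatusSketch` mirrors the documented status names, `ToolLifecycleSketch` is a hypothetical helper, and treating a non-positive requested timeout as "use the cap" is an assumption.

```csharp
using System;

// Mirrors the documented status values; the type name is hypothetical.
public enum ToolStatusSketch { Pending, Running, Completed, Failed, TimedOut, Cancelled }

public static class ToolLifecycleSketch
{
    // Caps a requested timeout at MaxToolExecutionTimeoutMinutes (default 30).
    public static TimeSpan ClampTimeout(TimeSpan requested, int maxMinutes = 30)
    {
        var cap = TimeSpan.FromMinutes(maxMinutes);
        if (requested <= TimeSpan.Zero) return cap; // assumption: no usable timeout means the cap applies
        return requested < cap ? requested : cap;
    }

    // Allows only the transitions shown in the status lifecycle.
    public static bool CanTransition(ToolStatusSketch from, ToolStatusSketch to) => from switch
    {
        ToolStatusSketch.Pending => to == ToolStatusSketch.Running,
        ToolStatusSketch.Running => to is ToolStatusSketch.Completed
            or ToolStatusSketch.Failed
            or ToolStatusSketch.TimedOut
            or ToolStatusSketch.Cancelled,
        _ => false // Completed, Failed, TimedOut and Cancelled are terminal
    };
}
```

A request for 45 minutes is therefore reduced to 30 under the default limit, while a 5-minute request is left unchanged.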

CronGrain

Manages scheduled recurring tasks using Orleans reminders, a durable scheduling mechanism that survives silo restarts and grain deactivation.

Key: {AgentId}:{TaskId}

State:

  • CronTaskDefinition — name, description, action type, action parameters, interval, stagger offset, paused flag, created timestamp
  • ExecutionCount, LastExecutedAt

Interface:

public interface ICronGrain : IGrainWithStringKey, IRemindable
{
    Task ScheduleAsync(CronTaskDefinition definition);
    Task<CronTaskDefinition?> GetDefinitionAsync();
    Task PauseAsync();
    Task ResumeAsync();
    Task UnscheduleAsync();
}

Behaviour:

  • Implements IRemindable to receive Orleans reminder callbacks (reminder name: "cron-tick")
  • Stagger support: A configurable offset prevents all cron tasks from firing at the same instant
  • Pause/Resume: Tasks can be paused without losing their schedule; resuming re-registers the reminder
  • Unschedule: Removes the reminder entirely and resets state
  • Outside development mode, Orleans reminders are persisted to PostgreSQL, so scheduled tasks survive silo restarts
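One plausible reading of the stagger offset is that it delays only the first tick, after which ticks follow the interval. The sketch below encodes that reading. It is an assumption, as are the names `ReminderPlanSketch` and `CronStaggerSketch`. The resulting values correspond to the reminderName, dueTime and period arguments of Orleans' RegisterOrUpdateReminder.

```csharp
using System;

// Hypothetical container for the values a CronGrain would hand to Orleans.
public sealed record ReminderPlanSketch(string ReminderName, TimeSpan DueTime, TimeSpan Period);

public static class CronStaggerSketch
{
    // Reminder name as documented for CronGrain.
    public const string ReminderName = "cron-tick";

    public static ReminderPlanSketch Plan(TimeSpan interval, TimeSpan staggerOffset)
    {
        if (interval <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(interval));
        if (staggerOffset < TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(staggerOffset));

        // Assumption: the offset shifts the first tick; later ticks repeat every interval.
        return new ReminderPlanSketch(ReminderName, staggerOffset, interval);
    }
}
```

Two tasks with the same 15-minute interval but offsets of 0 and 90 seconds would then fire 90 seconds apart on every cycle instead of at the same instant.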

SkillRegistryGrain

Singleton grain managing the global skill catalog. There is exactly one instance of this grain in the entire cluster.

Key: "global" (always a single instance)

State:

  • Dictionary of skillId → SkillDefinition
  • Capacity: max 10,000 skills

Interface:

public interface ISkillRegistryGrain : IGrainWithStringKey
{
    Task RegisterSkillAsync(SkillDefinition skill);
    Task<SkillDefinition?> GetSkillAsync(string skillId);
    Task<IReadOnlyList<SkillDefinition>> ListSkillsAsync(
        string? publisherId = null, bool? verifiedOnly = null);
    Task UnregisterSkillAsync(string skillId);
    Task<bool> ValidateSkillPermissionsAsync(string skillId,
        IReadOnlyList<SkillPermission> grantedPermissions);
}

Behaviour:

  • RegisterSkillAsync rejects registration if the registry is at capacity
  • ListSkillsAsync supports filtering by publisher ID and verified status (pagination is handled at the API layer)
  • ValidateSkillPermissionsAsync checks whether a set of granted permissions satisfies all of a skill’s required permissions — returns true only if every required permission is covered

Permissions are granular, not just string labels:

public record SkillPermission(
    string Resource, // e.g., "network", "filesystem", "api:salesforce"
    string Access,   // "read", "write", "execute"
    string? Scope    // optional: specific URLs, paths, or resource IDs
);
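The "every required permission is covered" rule can be illustrated with a short check. The matching semantics here are assumptions, since the source does not define them: a grant covers a requirement when Resource and Access are equal and the grant's Scope is either null (treated as unrestricted) or exactly equal to the required Scope. `PermissionCheckSketch` is a hypothetical helper, and the record is repeated only to keep the sketch self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

// Same shape as the record above, repeated so the sketch compiles on its own.
public record SkillPermission(string Resource, string Access, string? Scope);

public static class PermissionCheckSketch
{
    // Assumption: a null Scope on the grant means "any scope"; otherwise scopes must match exactly.
    public static bool Covers(SkillPermission granted, SkillPermission required) =>
        granted.Resource == required.Resource
        && granted.Access == required.Access
        && (granted.Scope is null || granted.Scope == required.Scope);

    // True only if every required permission is covered by at least one grant.
    public static bool Satisfies(
        IReadOnlyList<SkillPermission> required,
        IReadOnlyList<SkillPermission> granted) =>
        required.All(r => granted.Any(g => Covers(g, r)));
}
```

Under these assumptions, a skill that requires `("network", "read", "https://example.com")` is satisfied by an unscoped `("network", "read", null)` grant, but not by a grant limited to a different URL or to "write" access.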

Grain Limits

All grain limits are configurable via the GrainLimits configuration section:

| Setting | Default | Description |
| --- | --- | --- |
| MaxTurnsPerSession | 100 | Max conversation turns per session before trimming |
| MaxSessionsPerAgent | 100 | Max active sessions per agent before eviction |
| MaxMemoryEntries | 1,000 | Max memory key-value pairs per agent |
| MaxMemoryValueLength | 100,000 | Max characters per memory value |
| MaxSkills | 10,000 | Max skills in the global registry |
| MaxToolExecutionTimeoutMinutes | 30 | Max tool execution timeout, in minutes |

Override in appsettings.json:

{
  "GrainLimits": {
    "MaxTurnsPerSession": 200,
    "MaxSessionsPerAgent": 500
  }
}
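To show how a partial override combines with the defaults, the sketch below reads the GrainLimits section into a plain options class. Komand presumably binds this section through the standard .NET configuration system; this example uses System.Text.Json instead, purely to stay dependency-free. `GrainLimitsSketch` and `FromAppSettingsJson` are hypothetical names.

```csharp
using System.Text.Json;

// Hypothetical options class; property names and defaults follow the table above.
public sealed class GrainLimitsSketch
{
    public int MaxTurnsPerSession { get; set; } = 100;
    public int MaxSessionsPerAgent { get; set; } = 100;
    public int MaxMemoryEntries { get; set; } = 1_000;
    public int MaxMemoryValueLength { get; set; } = 100_000;
    public int MaxSkills { get; set; } = 10_000;
    public int MaxToolExecutionTimeoutMinutes { get; set; } = 30;

    // Reads the "GrainLimits" section; settings missing from the JSON keep their defaults.
    public static GrainLimitsSketch FromAppSettingsJson(string json)
    {
        using var document = JsonDocument.Parse(json);
        if (!document.RootElement.TryGetProperty("GrainLimits", out var section))
            return new GrainLimitsSketch();
        return section.Deserialize<GrainLimitsSketch>() ?? new GrainLimitsSketch();
    }
}
```

Applied to the JSON above, MaxTurnsPerSession becomes 200 and MaxSessionsPerAgent becomes 500, while the other four limits keep their defaults.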

Persistence

All grain state is persisted to PostgreSQL via Orleans ADO.NET providers. The storage provider is named "komandStore".

| Concern | Provider |
| --- | --- |
| Grain state | OrleansSqlUtils with AdoNetGrainStorage |
| Clustering | AdoNetClustering (silo membership) |
| Reminders | AdoNetReminderTable (for CronGrain) |

In development mode, all three use in-memory providers — no PostgreSQL required.

Why Orleans

Orleans was chosen over alternatives because its grain model maps directly to Komand’s domain:

| Domain Concept | Orleans Primitive |
| --- | --- |
| An AI agent with memory | AgentGrain with persistent state |
| A conversation thread | SessionGrain keyed by channel + user |
| A running skill | ToolGrain with timeout and cancellation |
| A scheduled task | CronGrain with Orleans reminders |
| A singleton registry | SkillRegistryGrain with "global" key |

Each grain processes messages one at a time (single-threaded), so a grain's state is never accessed concurrently and needs no locking. State is persisted through the configured storage provider and restored when the grain activates. The silo handles activation, deactivation, and distribution across cluster nodes.