# ArkAgent

ArkAgent is the core resource in `ark-operator`. It is analogous to a Kubernetes Deployment: it manages a pool of agent instances that all run the same model, system prompt, and tool configuration. The operator creates and maintains a backing Deployment for each ArkAgent.
## What an agent is
An agent is a long-running process that:
- Reads its configuration from environment variables injected by the operator
- Connects to configured MCP tool servers at startup
- Polls a Redis Streams task queue for work
- Calls the LLM provider with the task prompt and available tools
- Runs the tool-use loop until the model stops invoking tools
- Returns the result to the queue
The agent process (`ark-runtime`) has no Kubernetes dependencies; the same binary runs in in-cluster pods and locally via `ark run`.
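The tool-use loop described above can be sketched as follows. This is a simplified illustration, not ark-runtime's actual code — the `run_task` function, message shapes, and the `llm`/`tools` callables are hypothetical stand-ins:

```python
def run_task(task, llm, tools):
    """Tool-use loop sketch: call the model, execute any tools it
    requests, feed results back, and stop when no tools are invoked."""
    messages = [{"role": "user", "content": task["prompt"]}]
    while True:
        reply = llm(messages, tools)       # one LLM call with tools attached
        if not reply.get("tool_calls"):    # model finished: return its answer
            return reply["content"]
        for call in reply["tool_calls"]:   # execute each requested tool
            result = tools[call["name"]](**call["args"])
            messages.append(
                {"role": "tool", "name": call["name"], "content": result}
            )
```

The real runtime layers queue polling, limits, and timeouts around this loop, but the core control flow is the same: iterate until the model stops invoking tools.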
## Example

```yaml
apiVersion: arkonis.dev/v1alpha1
kind: ArkAgent
metadata:
  name: research-agent
  namespace: default
spec:
  replicas: 2
  model: llama3.2
  systemPrompt: |
    You are a research agent. Gather and summarize information
    accurately and concisely. Always cite your sources.
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse
  limits:
    maxTokensPerCall: 8000
    maxConcurrentTasks: 5
    timeoutSeconds: 120
  livenessProbe:
    type: semantic
    intervalSeconds: 60
```
## Spec walkthrough

### model

The model identifier passed to your configured LLM provider. Provider auto-detection reads this field:

- `llama3.2`, `mistral`, or any custom name → OpenAI-compatible (set `OPENAI_BASE_URL` for Ollama)
- `gpt-4o`, `gpt-4-turbo`, `o1-*`, `o3-*` → OpenAI
- `claude-*` → Anthropic
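The detection rules above amount to simple prefix matching. A minimal sketch (the function name and return values are illustrative, not part of ark-runtime):

```python
def detect_provider(model: str) -> str:
    """Sketch of the model-name prefix rules described above."""
    if model.startswith("claude-"):
        return "anthropic"
    if model.startswith(("gpt-", "o1-", "o3-")):
        return "openai"
    # Anything else (llama3.2, mistral, custom names) falls through
    # to the OpenAI-compatible path.
    return "openai-compatible"
```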
### systemPrompt and systemPromptRef

Inline system prompt:

```yaml
spec:
  systemPrompt: |
    You are a specialist. Be precise and cite sources.
```

Reference from a ConfigMap (recommended for long prompts):

```yaml
spec:
  systemPromptRef:
    configMapKeyRef:
      name: my-prompt
      key: system.txt
```
When the ConfigMap changes, the operator detects the update and restarts agent pods automatically.
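For reference, a ConfigMap matching the example above might look like this (a standard Kubernetes ConfigMap — the `name` and key must match the `configMapKeyRef`):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-prompt
data:
  system.txt: |
    You are a specialist. Be precise and cite sources.
```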
### replicas

Number of agent pod instances. The operator reconciles the backing Deployment to match. Scale manually:

```shell
kubectl patch arkagent research-agent --type=merge -p '{"spec":{"replicas":5}}'
```

Range: 0–50. Set `replicas: 0` to drain the agent pool without deleting the resource.
### mcpServers

List of MCP (Model Context Protocol) tool servers to connect at pod startup. The agent runtime establishes SSE connections and registers the tools with the LLM provider. Tool names are prefixed with the server name to prevent collisions (`web-search__search`, `web-search__fetch_page`).
Connection failures are non-fatal — the agent starts with a reduced toolset.
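The prefixing scheme described above can be sketched as a small mapping from prefixed name back to the server-local tool name (illustrative only — the runtime's internal representation may differ):

```python
def register_tools(server_name: str, tool_names: list) -> dict:
    """Map collision-safe prefixed names (server__tool) back to the
    server-local tool name, mirroring the naming scheme above."""
    return {f"{server_name}__{name}": name for name in tool_names}
```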
### tools (inline webhook tools)

Define simple HTTP tools without a full MCP server:

```yaml
spec:
  tools:
    - name: fetch_news
      description: "Fetch the latest news headlines for a given topic."
      url: "http://news-api.internal/headlines"
      method: POST
      inputSchema: |
        {"type":"object","properties":{"topic":{"type":"string"}},"required":["topic"]}
```
The agent runtime calls the URL when the LLM invokes the tool and returns the response body to the model.
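A rough sketch of the request the runtime would build when the LLM invokes the `fetch_news` tool above. The request shape (JSON body of tool arguments) is an assumption for illustration; only the URL and method come from the spec:

```python
import json

def build_webhook_call(tool: dict, arguments: dict) -> dict:
    """Assemble an HTTP request for an inline webhook tool invocation
    (illustrative sketch, not the runtime's actual wire format)."""
    return {
        "method": tool["method"],
        "url": tool["url"],
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(arguments),  # LLM-provided tool arguments
    }
```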
### limits

Per-agent resource controls enforced by the agent runtime:

| Field | Default | Description |
|---|---|---|
| `maxTokensPerCall` | 8000 | Token budget (input + output) per LLM call |
| `maxConcurrentTasks` | 5 | Max tasks a single pod processes simultaneously |
| `timeoutSeconds` | 120 | Per-task timeout; the task is abandoned and an error returned after this duration |

These limits are injected as environment variables and enforced by the runtime regardless of what the agent code does.
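On the runtime side, reading those injected values might look like the sketch below. The environment variable names (`ARK_*`) are assumptions for illustration — the doc doesn't specify them; only the defaults come from the table above:

```python
import os

def load_limits() -> dict:
    """Read limit env vars with the documented defaults.
    Variable names are illustrative, not confirmed ark-runtime names."""
    return {
        "max_tokens_per_call": int(os.environ.get("ARK_MAX_TOKENS_PER_CALL", "8000")),
        "max_concurrent_tasks": int(os.environ.get("ARK_MAX_CONCURRENT_TASKS", "5")),
        "timeout_seconds": int(os.environ.get("ARK_TIMEOUT_SECONDS", "120")),
    }
```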
### configRef

Reference an ArkSettings resource to inherit shared prompt fragments and model settings:

```yaml
spec:
  configRef: analyst-base
```

See ArkSettings for details.

### memoryRef

Reference an ArkMemory resource to give the agent persistent memory across tasks:

```yaml
spec:
  memoryRef:
    name: research-memory
```

See ArkMemory for details.
## Semantic health checks

Standard Kubernetes probes check process liveness; they cannot tell you whether the LLM is producing useful output.

ArkAgent supports semantic health checks via `livenessProbe.type: semantic`. When enabled, each agent pod exposes a `/readyz` endpoint that calls the LLM with a validation prompt and returns 503 if the output fails validation.

When `/readyz` returns 503, Kubernetes marks the pod `NotReady` and ArkService stops routing tasks to it. The pod recovers automatically once the underlying issue resolves.
```yaml
spec:
  livenessProbe:
    type: semantic
    intervalSeconds: 60
    validatorPrompt: "Reply with exactly one word: HEALTHY"
```
| Field | Default | Description |
|---|---|---|
| `type` | `ping` | `ping` — HTTP reachability only. `semantic` — enables `/readyz` LLM validation. |
| `intervalSeconds` | 60 | Interval between semantic validation checks. |
| `validatorPrompt` | (built-in) | Custom prompt for the `/readyz` check. |
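The validation step itself reduces to comparing the model's reply against the expected answer. A minimal sketch of the status logic (the function and its exact-match rule are illustrative — the real built-in validator may be more lenient):

```python
def readyz_status(llm_reply: str, expected: str = "HEALTHY") -> int:
    """Return the HTTP status /readyz would report: 200 when the
    validation reply matches, 503 otherwise (illustrative sketch)."""
    return 200 if llm_reply.strip() == expected else 503
```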
## Built-in tools

Every agent pod has access to these built-in tools regardless of MCP or webhook tool configuration:

| Tool | Description |
|---|---|
| `submit_subtask` | Enqueue a new agent task asynchronously and return the task ID. Enables supervisor/worker patterns where a running agent spawns sub-tasks at runtime. |
| `delegate` | (injected in ArkTeam context) Route a task to a specific team role. |
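The supervisor/worker pattern enabled by `submit_subtask` can be sketched as fan-out then collect. Everything here is hypothetical — the doc doesn't specify the tool's argument schema or a result-retrieval mechanism, so `submit_subtask` and `wait_for` below are stand-in callables:

```python
def supervise(submit_subtask, wait_for, topics):
    """Supervisor sketch: enqueue one sub-task per topic, then gather
    results by task ID. Both callables are hypothetical stand-ins."""
    task_ids = [submit_subtask(prompt=f"Research {t}") for t in topics]
    return [wait_for(tid) for tid in task_ids]
```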
## Status

```shell
kubectl describe arkagent research-agent
# Status:
#   Ready Replicas:  2
#   Replicas:        2
#   Conditions:
#     Type:    Available
#     Status:  True
```
| Field | Type | Description |
|---|---|---|
| `replicas` | int32 | Total pods managed by this agent |
| `readyReplicas` | int32 | Pods passing both liveness and readiness checks |
| `conditions` | []Condition | `Available`, `Progressing`, `Degraded` |
## See also
- ArkAgent spec reference — complete field reference
- ArkSettings — shared prompt fragments
- ArkMemory — persistent memory backends
- Scaling Agents guide — manual and queue-depth scaling