ArkAgent

ArkAgent is the core resource in ark-operator. It is analogous to a Kubernetes Deployment — it manages a pool of agent instances all running the same model, system prompt, and tool configuration. The operator creates and maintains a backing Deployment for each ArkAgent.


What an agent is

An agent is a long-running process that:

  1. Reads its configuration from environment variables injected by the operator
  2. Connects to configured MCP tool servers at startup
  3. Polls a Redis Streams task queue for work
  4. Calls the LLM provider with the task prompt and available tools
  5. Runs the tool-use loop until the model stops invoking tools
  6. Returns the result to the queue

The agent process (ark-runtime) has no Kubernetes dependencies. The same binary runs both in in-cluster pods and locally via ark run.
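
The task loop in steps 4–6 might look roughly like the sketch below. The call_model callable, the message shape, and the tool registry are illustrative assumptions, not the actual ark-runtime API:

```python
def run_tool_loop(call_model, tools, task_prompt, max_rounds=10):
    """Drive the model until it stops requesting tools (steps 4-5).

    call_model(messages, tools) returns either
      {"tool": name, "args": {...}}  (a tool invocation request), or
      {"content": text}              (a final answer).
    tools is a name -> callable registry built from the MCP/webhook config.
    """
    messages = [{"role": "user", "content": task_prompt}]
    for _ in range(max_rounds):
        reply = call_model(messages, tools)
        if "tool" not in reply:
            # The model produced a final answer -- step 6 returns it.
            return reply["content"]
        result = tools[reply["tool"]](**reply["args"])
        # Feed the tool result back so the model can continue reasoning.
        messages.append({"role": "tool", "name": reply["tool"],
                        "content": str(result)})
    raise RuntimeError("tool loop exceeded max_rounds")
```

The max_rounds guard is a common safety valve in loops like this: without it, a model that keeps invoking tools would spin forever.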


Example

apiVersion: arkonis.dev/v1alpha1
kind: ArkAgent
metadata:
  name: research-agent
  namespace: default
spec:
  replicas: 2
  model: llama3.2
  systemPrompt: |
    You are a research agent. Gather and summarize information
    accurately and concisely. Always cite your sources.
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse
  limits:
    maxTokensPerCall: 8000
    maxConcurrentTasks: 5
    timeoutSeconds: 120
  livenessProbe:
    type: semantic
    intervalSeconds: 60

Spec walkthrough

model

The model identifier passed to your configured LLM provider. Provider auto-detection reads this field:

  • llama3.2, mistral, or any custom name → OpenAI-compatible (set OPENAI_BASE_URL for Ollama)
  • gpt-4o, gpt-4-turbo, o1-*, o3-* → OpenAI
  • claude-* → Anthropic
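
The rules above can be pictured as a small dispatch function. This is an illustrative sketch, not the operator's actual detection code:

```python
def detect_provider(model: str) -> str:
    # Sketch of the auto-detection rules; the real operator logic may differ.
    if model.startswith("claude-"):
        return "anthropic"
    if model.startswith(("gpt-4", "o1-", "o3-")):
        return "openai"
    # Everything else is treated as OpenAI-compatible
    # (e.g. Ollama reached via OPENAI_BASE_URL).
    return "openai-compatible"
```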

systemPrompt and systemPromptRef

Inline system prompt:

spec:
  systemPrompt: |
    You are a specialist. Be precise and cite sources.

Reference from a ConfigMap (recommended for long prompts):

spec:
  systemPromptRef:
    configMapKeyRef:
      name: my-prompt
      key: system.txt

When the ConfigMap changes, the operator detects the update and restarts agent pods automatically.
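
For example, a ConfigMap matching the reference above can be created from a local prompt file with kubectl:

```shell
# Creates a ConfigMap named my-prompt whose key is the file name (system.txt)
kubectl create configmap my-prompt --from-file=system.txt
```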

replicas

Number of agent pod instances. The operator reconciles the backing Deployment to match. Scale manually:

kubectl patch arkagent research-agent --type=merge -p '{"spec":{"replicas":5}}'

Range: 0–50. Set to 0 to drain the agent pool without deleting the resource.

mcpServers

List of MCP (Model Context Protocol) tool servers to connect at pod startup. The agent runtime establishes SSE connections and registers the tools with the LLM provider. Tool names are prefixed with the server name to prevent collisions (web-search__search, web-search__fetch_page).

Connection failures are non-fatal — the agent starts with a reduced toolset.
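
The prefixing scheme can be pictured with a small sketch; the registry shape here is hypothetical:

```python
def register_tools(servers):
    """Flatten tools from multiple MCP servers into one registry,
    prefixing each tool name with its server name to avoid collisions."""
    registry = {}
    for server, tool_names in servers.items():
        for tool in tool_names:
            registry[f"{server}__{tool}"] = (server, tool)
    return registry
```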

tools (inline webhook tools)

Define simple HTTP tools without a full MCP server:

spec:
  tools:
    - name: fetch_news
      description: "Fetch the latest news headlines for a given topic."
      url: "http://news-api.internal/headlines"
      method: POST
      inputSchema: |
        {"type":"object","properties":{"topic":{"type":"string"}},"required":["topic"]}

The agent runtime calls the URL when the LLM invokes the tool and returns the response body to the model.
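
A rough sketch of that call path; http_post is a stand-in for whatever HTTP client the runtime actually uses, and the error handling shown is an assumption:

```python
import json

def invoke_webhook_tool(tool, args, http_post):
    # Serialize the model's arguments and POST them to the tool URL.
    body = json.dumps(args)
    status, response_body = http_post(
        tool["url"], body, headers={"Content-Type": "application/json"})
    if status >= 400:
        # Hand the failure back to the model instead of crashing the task.
        return f"tool error: HTTP {status}"
    # The raw response body becomes the tool result seen by the model.
    return response_body
```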

limits

Per-agent resource controls enforced by the agent runtime:

Field               Default  Description
maxTokensPerCall    8000     Token budget (input + output) per LLM call
maxConcurrentTasks  5        Maximum number of tasks a single pod processes simultaneously
timeoutSeconds      120      Per-task timeout; after this duration the task is abandoned and an error is returned

These limits are injected as environment variables and enforced by the runtime regardless of what the agent code does.
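
One plausible enforcement sketch using asyncio; the mechanism and the environment-variable names below are assumptions for illustration, not documented internals:

```python
import asyncio
import os

# Assumed env var names -- the operator injects the actual limits.
MAX_CONCURRENT = int(os.environ.get("ARK_MAX_CONCURRENT_TASKS", "5"))
TIMEOUT_SECONDS = float(os.environ.get("ARK_TIMEOUT_SECONDS", "120"))

async def run_with_limits(task_fn, sem, timeout):
    # Bound concurrency with a semaphore; abandon tasks that overrun.
    async with sem:
        try:
            return await asyncio.wait_for(task_fn(), timeout=timeout)
        except asyncio.TimeoutError:
            return {"error": f"task abandoned after {timeout}s"}
```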

configRef

Reference an ArkSettings resource to inherit shared prompt fragments and model settings:

spec:
  configRef: analyst-base

See ArkSettings for details.

memoryRef

Reference an ArkMemory resource to give the agent persistent memory across tasks:

spec:
  memoryRef:
    name: research-memory

See ArkMemory for details.


Semantic health checks

Standard Kubernetes probes check process liveness. They cannot tell you whether the LLM is producing useful output.

ArkAgent supports semantic health checks via the livenessProbe.type: semantic field. When enabled, each agent pod exposes a /readyz endpoint that calls the LLM with a validation prompt and returns 503 if the output fails validation.

When /readyz returns 503, Kubernetes marks the pod NotReady and ArkService stops routing tasks to it. The pod recovers automatically if the underlying issue resolves.

spec:
  livenessProbe:
    type: semantic
    intervalSeconds: 60
    validatorPrompt: "Reply with exactly one word: HEALTHY"

Field            Default     Description
type             ping        ping: HTTP reachability only. semantic: enables /readyz LLM validation.
intervalSeconds  60          Interval between semantic validation checks.
validatorPrompt  (built-in)  Custom prompt for the /readyz check.
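
The /readyz logic can be sketched as follows. The equality check against HEALTHY and the function shape are assumptions about how validation might work, not the runtime's actual implementation:

```python
DEFAULT_VALIDATOR = "Reply with exactly one word: HEALTHY"

def readyz(call_model, validator_prompt=DEFAULT_VALIDATOR, expected="HEALTHY"):
    # Return the HTTP status /readyz would serve: 200 when the model's
    # reply passes validation, 503 otherwise (including provider errors).
    try:
        reply = call_model(validator_prompt)
    except Exception:
        return 503
    return 200 if reply.strip() == expected else 503
```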

Built-in tools

Every agent pod has access to these built-in tools regardless of MCP or webhook tool configuration:

Tool            Description
submit_subtask  Enqueue a new agent task asynchronously and return the task ID. Enables supervisor/worker patterns where a running agent spawns sub-tasks at runtime.
delegate        Injected only in an ArkTeam context; routes a task to a specific team role.

Status

kubectl describe arkagent research-agent
# Status:
#   Ready Replicas:  2
#   Replicas:        2
#   Conditions:
#     Type:    Available
#     Status:  True

Field          Type         Description
replicas       int32        Total pods managed by this agent
readyReplicas  int32        Pods passing both liveness and readiness checks
conditions     []Condition  Available, Progressing, Degraded

See also

  • ArkSettings
  • ArkMemory
  • ArkTeam
  • ArkService

Apache 2.0 · ARKONIS