v1alpha1 · Apache 2.0

Kubernetes-native
AI agent infrastructure

Deploy, scale, and manage agentic workloads the same way you manage any other workload. kubectl apply and you're done.

terminal
$ kubectl apply -f research-agent.yaml
arkonisdeployment.arkonis.dev/research-agent created

$ kubectl get aodep
NAME              MODEL                      REPLICAS   READY   AGE
research-agent    claude-sonnet-4-20250514   5          5       2m

What ARKONIS stands for
A · Agentic: LLM-powered agents as first-class resources
R · Reconciler: Kubernetes control loop at the core
K · Kubernetes: Runs anywhere k8s runs
O · Operator: Extends the Kubernetes API
N · Native: kubectl, RBAC, namespaces — no wrappers
I · Inference: Model calls, tool use, MCP servers
S · System: System prompts + semantic health checks

Agents are going to production.
The tooling hasn't caught up.

Running agents as one-off scripts with no lifecycle management

Manually scaling agent processes when load increases

Hardcoding prompts and tool configs inside application code

Restarting stuck agents by hand when they silently produce bad output

Containers had the same problem before Kubernetes. The answer was a new abstraction layer, not a better script.

First-class primitives for agents

ARKONIS extends the Kubernetes API with five new resource types.

ArkonisDeployment
like Deployment

Manages a pool of agent instances running the same model, prompt, and tool configuration. Handles scheduling, scaling, and health checking.

spec:
  replicas: 5
  model: claude-sonnet-4-20250514
  systemPrompt: |
    You are a research analyst...
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse

ArkonisService
like Service

Routes incoming tasks to available agent instances. Decouples task producers from the agent pool with configurable load balancing strategies.

spec:
  selector:
    arkonisDeployment: research-agent
  routing:
    strategy: least-busy
  ports:
    - protocol: A2A
      port: 8080

ArkonisConfig
like ConfigMap

Reusable prompt fragments and model settings. Reference it from multiple ArkonisDeployments to keep configuration in one place.

spec:
  temperature: "0.3"
  outputFormat: structured-json
  promptFragments:
    persona: "Expert analyst..."
    outputRules: "Always cite sources."
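
As a sketch, a deployment could pull in a shared config by name; the `configRef` field is an assumption for illustration, not a documented part of the schema:

```yaml
# Hypothetical: referencing a shared ArkonisConfig from an ArkonisDeployment.
# The configRef field name is illustrative; check the CRD schema for the real one.
spec:
  replicas: 5
  model: claude-sonnet-4-20250514
  configRef:
    name: analyst-defaults   # an ArkonisConfig in the same namespace
```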

ArkonisPipeline
novel primitive

Chains agents into directed acyclic graphs where outputs feed into inputs. The primitive for orchestrating multi-agent workflows declaratively.

spec:
  steps:
    - name: research
      arkonisDeployment: research-agent
    - name: writer
      dependsOn: [research]
      arkonisDeployment: writer-agent

ArkonisMemory
like PersistentVolumeClaim

Defines the persistent memory backend for agent instances. Agents remember context across many tasks, not just within a single conversation. Supports Redis and vector stores.

spec:
  backend: vector-store
  vectorStore:
    provider: qdrant
    endpoint: http://qdrant:6333
    collection: research-memories
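
A sketch of how a deployment might attach this memory, analogous to a pod mounting a PersistentVolumeClaim; the `memoryRef` field name is an assumption:

```yaml
# Hypothetical: attaching an ArkonisMemory to an ArkonisDeployment.
# The memoryRef field is illustrative, not a documented schema field.
spec:
  replicas: 5
  model: claude-sonnet-4-20250514
  memoryRef:
    name: research-memory   # an ArkonisMemory in the same namespace
```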

How it works

Everything fits into the Kubernetes model you already know.

1

Declare your agent in YAML

Describe the model, system prompt, MCP tool servers, and replica count. Same format as any other Kubernetes resource.
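
Putting the fragments from above together, a complete manifest looks like this; the apiVersion is inferred from the group and version shown elsewhere on this page (arkonis.dev, v1alpha1):

```yaml
# Complete ArkonisDeployment manifest, combining the spec fragments above.
# apiVersion inferred from arkonisdeployment.arkonis.dev and v1alpha1.
apiVersion: arkonis.dev/v1alpha1
kind: ArkonisDeployment
metadata:
  name: research-agent
spec:
  replicas: 5
  model: claude-sonnet-4-20250514
  systemPrompt: |
    You are a research analyst...
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse
```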

2

The operator reconciles

ARKONIS watches for changes and ensures the actual state matches your desired state by creating pods, injecting config, and wiring the task queue.

3

Agents pick up tasks and run

Each agent pod pulls tasks from the queue, calls the configured model with its tools, and writes results back. The operator monitors output quality and restarts agents that degrade.

4

ArgoCD and Flux work out of the box

Promoting a new system prompt is a pull request. Your existing GitOps tooling (ArgoCD, Flux, or anything that speaks kubectl) syncs it to the cluster exactly like any other resource change.
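
For instance, an Argo CD Application can own a directory of ARKONIS manifests like any other path in a repo; the repoURL and path below are placeholders for your own layout:

```yaml
# Argo CD Application syncing a directory of ARKONIS manifests.
# repoURL and path are placeholders; replace with your repository layout.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: research-agents
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/agents.git
    targetRevision: main
    path: agents/research
  destination:
    server: https://kubernetes.default.svc
    namespace: agents
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```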

Semantic health checks

Standard liveness probes check whether a process is running. That's not enough for agents. An agent can be running fine while consistently producing wrong output.

ARKONIS introduces semantic readiness probes that periodically send a validation prompt to the agent and verify the response. Kubernetes stops routing tasks to agents that fail the check.

readinessProbe:
  type: semantic
  intervalSeconds: 30

GET /healthz: Process is up
GET /readyz: Agent produces correct output
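
A fuller probe sketch; only `type` and `intervalSeconds` appear in the snippet above, so `validationPrompt` and `failureThreshold` are assumed names for how the check might be configured:

```yaml
# Sketch of a semantic readiness probe. validationPrompt and failureThreshold
# are assumptions, not documented fields.
readinessProbe:
  type: semantic
  intervalSeconds: 30
  validationPrompt: "Summarize this sentence in one word: testing."
  failureThreshold: 3   # stop routing after three consecutive failed checks
```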

Ready to deploy your first agent?

Install the operator, apply your first ArkonisDeployment, and start routing tasks. No custom orchestration code required.

Read the docs →