Kubernetes-native
AI agent infrastructure
Deploy, scale, and manage agentic workloads the same way you manage any other workload. kubectl apply and you're done.
$ kubectl apply -f research-agent.yaml
arkonisdeployment.arkonis.dev/research-agent created
$ kubectl get aodep
NAME             MODEL                      REPLICAS   READY   AGE
research-agent   claude-sonnet-4-20250514   5          5       2m
Agents are going to production.
The tooling hasn't caught up.
Running agents as one-off scripts with no lifecycle management
Manually scaling agent processes when load increases
Hardcoding prompts and tool configs inside application code
Restarting stuck agents by hand when they silently produce bad output
Containers had the same problem before Kubernetes. The answer was a new abstraction layer, not a better script.
First-class primitives for agents
ARKONIS extends the Kubernetes API with five new resource types.
Deployment
Manages a pool of agent instances running the same model, prompt, and tool configuration. Handles scheduling, scaling, and health checking.
spec:
  replicas: 5
  model: claude-sonnet-4-20250514
  systemPrompt: |
    You are a research analyst...
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse
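A complete manifest wraps that spec in standard Kubernetes object metadata. A minimal sketch — the group `arkonis.dev` appears in the kubectl output above, but the `v1` version is an assumption:

```yaml
# Sketch of a full manifest; apiVersion "arkonis.dev/v1" is assumed.
apiVersion: arkonis.dev/v1
kind: ArkonisDeployment
metadata:
  name: research-agent
spec:
  replicas: 5
  model: claude-sonnet-4-20250514
  systemPrompt: |
    You are a research analyst...
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse
```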
Service
Routes incoming tasks to available agent instances. Decouples task producers from the agent pool with configurable load balancing strategies.
spec:
  selector:
    arkonisDeployment: research-agent
  routing:
    strategy: least-busy
  ports:
    - protocol: A2A
      port: 8080
ConfigMap
Reusable prompt fragments and model settings. Reference them from multiple ArkonisDeployments to keep configuration in one place.
spec:
  temperature: "0.3"
  outputFormat: structured-json
  promptFragments:
    persona: "Expert analyst..."
    outputRules: "Always cite sources."
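One plausible way a deployment could pull in a shared ConfigMap — note that the `configRef` field name here is an illustrative assumption, not a documented part of the spec:

```yaml
spec:
  replicas: 5
  model: claude-sonnet-4-20250514
  # Hypothetical field: reuse shared settings and prompt fragments
  # from an Arkonis ConfigMap named "analyst-defaults".
  configRef:
    name: analyst-defaults
```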
Chains agents into directed acyclic graphs where outputs feed into inputs. The primitive for orchestrating multi-agent workflows declaratively.
spec:
  steps:
    - name: research
      arkonisDeployment: research-agent
    - name: writer
      dependsOn: [research]
      arkonisDeployment: writer-agent
PersistentVolumeClaim
Defines the persistent memory backend for agent instances. Agents remember context across many tasks, not just within a single conversation. Supports Redis and vector stores.
spec:
  backend: vector-store
  vectorStore:
    provider: qdrant
    endpoint: http://qdrant:6333
    collection: research-memories
How it works
Everything fits into the Kubernetes model you already know.
Declare your agent in YAML
Describe the model, system prompt, MCP tool servers, and replica count. Same format as any other Kubernetes resource.
The operator reconciles
ARKONIS watches for changes and ensures the actual state matches your desired state by creating pods, injecting config, and wiring the task queue.
Agents pick up tasks and run
Each agent pod pulls tasks from the queue, calls the configured model with its tools, and writes results back. The operator monitors output quality and restarts agents that degrade.
ArgoCD and Flux work out of the box
Promoting a new system prompt is a pull request. Your existing GitOps tooling (ArgoCD, Flux, or anything that speaks kubectl) syncs it to the cluster exactly like any other resource change.
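Concretely, promoting a prompt change looks like any other config diff in the repository your GitOps tool watches — an illustrative example:

```diff
 spec:
   replicas: 5
   model: claude-sonnet-4-20250514
   systemPrompt: |
-    You are a research analyst...
+    You are a senior research analyst. Verify every claim
+    against at least two independent sources...
```

Merge the pull request, and the sync controller rolls the new prompt out to the agent pool.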
Semantic health checks
Standard liveness probes check whether a process is running. That's not enough for agents. An agent can be running fine while consistently producing wrong output.
ARKONIS introduces semantic readiness probes that periodically send a validation prompt to the agent and verify the response. Kubernetes stops routing tasks to agents that fail the check.
readinessProbe:
  type: semantic
  intervalSeconds: 30
GET /healthz   →   Process is up
GET /readyz    →   Agent produces correct output
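A semantic probe presumably needs a validation prompt and an expected answer to check against. A sketch of what that configuration might look like — the `prompt` and `expect` fields are illustrative assumptions, not documented spec:

```yaml
readinessProbe:
  type: semantic
  intervalSeconds: 30
  # Hypothetical fields: the validation prompt sent to the agent,
  # and a substring its response must contain to pass the check.
  prompt: "What is 2 + 2? Answer with the number only."
  expect: "4"
  # Mark the agent NotReady after three consecutive failures.
  failureThreshold: 3
```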
Ready to deploy your first agent?
Install the operator, apply your first ArkonisDeployment, and start routing tasks. No custom orchestration code required.
Read the docs →