
kubeagent

Read-only Kubernetes troubleshooting, explained.

kubeagent scans a cluster, finds unhealthy pods, and explains why they're failing. It talks to the cluster through the official Kubernetes Go client (client-go) and is strictly read-only.

What it catches

  • CrashLoopBackOff — containers stuck restarting
  • ImagePullBackOff — bad image or registry auth
  • OOMKilled — hit the memory limit (shown with the container's requests/limits)
  • Pending / Unschedulable — no node can place the pod
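
To make the four categories above concrete, here is a minimal Go sketch of how such a classification could work. It is not kubeagent's actual code: `ContainerStatus`, `PodStatus`, and `Classify` are hypothetical, simplified stand-ins for the fields the Kubernetes API exposes on a pod's status (the waiting reason, the last termination reason, and the PodScheduled condition). The order in which findings take priority is also an assumption.

```go
package main

import "fmt"

// ContainerStatus is a hypothetical, simplified stand-in for the parts of
// a Kubernetes container status that matter for this sketch.
type ContainerStatus struct {
	Name           string
	WaitingReason  string // mirrors state.waiting.reason
	LastTermReason string // mirrors lastState.terminated.reason
}

// PodStatus is a hypothetical, simplified stand-in for a pod's status.
type PodStatus struct {
	Phase         string // "Pending", "Running", ...
	Unschedulable bool   // PodScheduled condition is False with reason Unschedulable
	Containers    []ContainerStatus
}

// Classify returns one of the four finding labels, or "" if none applies.
// Assumption: OOMKilled is reported ahead of CrashLoopBackOff, because an
// OOM-killed container usually ends up in a restart back-off as well.
func Classify(p PodStatus) string {
	for _, c := range p.Containers {
		if c.LastTermReason == "OOMKilled" {
			return "OOMKilled"
		}
		switch c.WaitingReason {
		case "CrashLoopBackOff":
			return "CrashLoopBackOff"
		case "ImagePullBackOff", "ErrImagePull":
			return "ImagePullBackOff"
		}
	}
	if p.Phase == "Pending" && p.Unschedulable {
		return "Unschedulable"
	}
	return ""
}

func main() {
	pod := PodStatus{
		Phase: "Running",
		Containers: []ContainerStatus{
			{Name: "checkout", WaitingReason: "CrashLoopBackOff"},
		},
	}
	fmt.Println(Classify(pod))
}
```

A real implementation would read these fields from the pod objects returned by client-go's read-only list and get calls instead of building them by hand.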

Beyond pods

  • Service health — Services with no ready endpoints and LoadBalancers with no address, backing-aware
  • NetworkPolicy hints — which policies select a stuck pod
  • Connectivity diagnostics — actionable "API server unreachable" messages
  • Credential lint — opt-in scan for secrets stored in the clear
  • Resource context — cluster CPU/memory plus per-OOMKill limits
  • Platform facts — CNI, ingress, storage, distro, runtime, cloud
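
The Service health check in the list above can be sketched in Go as follows. This is an illustration under stated assumptions, not kubeagent's implementation: the `Service` struct and `serviceIssues` function are hypothetical, and the ready-endpoint count and load balancer addresses are assumed to have been collected from the cluster beforehand. ExternalName Services are skipped because they resolve to a DNS name and have no endpoints by design.

```go
package main

import "fmt"

// Service is a hypothetical, simplified view of a Kubernetes Service plus
// the facts needed to judge its health.
type Service struct {
	Namespace, Name string
	Type            string   // "ClusterIP", "NodePort", "LoadBalancer", "ExternalName"
	ReadyEndpoints  int      // number of ready backend addresses
	LBAddresses     []string // IPs or hostnames assigned to a LoadBalancer
}

// serviceIssues returns human-readable problems for one Service.
func serviceIssues(s Service) []string {
	var issues []string
	if s.Type == "ExternalName" {
		return issues // no endpoints expected for this type
	}
	if s.ReadyEndpoints == 0 {
		issues = append(issues, "no ready endpoints")
	}
	if s.Type == "LoadBalancer" && len(s.LBAddresses) == 0 {
		issues = append(issues, "no address")
	}
	return issues
}

func main() {
	svc := Service{Namespace: "shop", Name: "checkout", Type: "ClusterIP"}
	for _, issue := range serviceIssues(svc) {
		fmt.Printf("%s/%s  %s  %s\n", svc.Namespace, svc.Name, svc.Type, issue)
	}
}
```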

See it

$ kubeagent scan
Cluster: Healthy — 3/3 nodes Ready
Platform: Cilium CNI · Traefik ingress · Kubernetes v1.30 · containerd

Resources (cluster):
  CPU     24.0 cores · req 6.2 (25%) · lim 18.0 (75%) · used 3.1 (12%)
  Memory  96Gi · req 22Gi (22%) · lim 70Gi (72%) · used 18Gi (18%)

⚠ shop/checkout  Deployment  1/3 Degraded  · 12 restarts, last 2m ago
    ⚠ CrashLoopBackOff: Container repeatedly crashes after starting
Service issues:
  ⚠ shop/checkout  ClusterIP  no ready endpoints
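
The percentages in the Resources lines are each value divided by cluster capacity. The figures in the sample (for example 6.2 of 24.0 cores shown as 25%) are consistent with truncating to a whole number rather than rounding; that is an inference from the sample, not a documented behavior. A minimal Go sketch, with `percentOf` as a hypothetical helper:

```go
package main

import "fmt"

// percentOf returns part/total as a whole-number percentage.
// The fractional part is truncated, not rounded (an assumption inferred
// from the sample output). A non-positive total yields 0.
func percentOf(part, total float64) int {
	if total <= 0 {
		return 0
	}
	return int(part / total * 100)
}

func main() {
	const cores = 24.0 // cluster CPU capacity from the sample
	fmt.Printf("req %d%% · lim %d%% · used %d%%\n",
		percentOf(6.2, cores), percentOf(18.0, cores), percentOf(3.1, cores))
}
```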

The optional --explain flag makes a single Claude API call to summarize the findings in plain English. The deterministic core still works fully offline.


Open source on GitHub · Releases