Kubescape¶
Kubescape (GitHub) is a CNCF Sandbox Kubernetes security platform that performs continuous posture assessment across configuration, vulnerabilities, and runtime behavior. Unlike one-shot CLI scanners (Trivy, kube-bench) that produce ephemeral reports, Kubescape operates as an in-cluster operator that writes findings back as Kubernetes Custom Resources — VulnerabilityManifest and WorkloadConfigurationScan CRDs — making security posture queryable through the standard Kubernetes API.
The operator architecture deploys four cooperating components: the kubescape engine (configuration scanning against NSA/MITRE frameworks), kubevuln (container image vulnerability analysis), a central operator (orchestrates scan scheduling and reacts to resource changes), and a storage backend (persists scan results as CRDs). This separation allows each component to scale and fail independently while sharing a common CRD-based data plane.
What distinguishes Kubescape from purely external scanning tools (Snyk Container, Prisma Cloud) is that it runs entirely in-cluster with no external SaaS dependency, publishes machine-readable CRDs rather than proprietary dashboards, and supports event-driven continuous scanning — rescanning workloads as they change rather than on a fixed schedule.
Overview¶
| Property | Value |
|---|---|
| Namespace | kubescape |
| Type | HelmRelease (chart: kubescape-operator v1.30.4) |
| Layer | Security and cost observability |
| Chart | kubescape-operator v1.30.4 |
| Status | Enabled |
| Source | apps/base/kubescape/ |
Dependencies¶
Upstream — required before Kubescape starts¶
| Service | Reason | Status |
|---|---|---|
metrics-server |
Flux dependsOn |
Active |
kube-prometheus-stack |
Flux dependsOn |
Active |
Downstream — services that depend on Kubescape¶
No known downstream Flux dependencies.
Purpose¶
Kubescape is the platform's security data producer. It continuously scans all workloads for configuration drift (against NSA and MITRE ATT&CK frameworks) and known CVEs, publishing structured findings as CRDs in the kubescape namespace. The security-agent in the kagent namespace queries these CRDs via Kubernetes API tools to answer questions about cluster security posture, surface critical vulnerabilities, and recommend remediations — without needing direct access to container registries or scan engines.
The continuous scanning mode means findings are always current: when a new Deployment is created or an image tag changes, Kubescape rescans automatically rather than waiting for a scheduled interval. This keeps the security-agent's responses accurate to the live cluster state.
Why Kubescape over Trivy Operator or external SaaS scanners: The primary selection criterion was CRD-based output that the kagent security-agent can query through standard kubectl-style tools. Trivy Operator also publishes CRDs (VulnerabilityReport), but Kubescape adds configuration scanning (NSA/MITRE frameworks) alongside vulnerability scanning in a single operator — reducing operational surface area. External SaaS scanners (Snyk, Prisma) would require API credentials, egress network policies, and introduce a dependency on external availability for what should be a cluster-internal capability. Kubescape's CNCF Sandbox status and active community governance reduce vendor lock-in risk.
Features¶
| Feature | Detail |
|---|---|
| Continuous scanning | Operator watches for resource changes (create/update/delete) and triggers immediate rescan rather than relying solely on periodic intervals |
| Vulnerability scanning (kubevuln) | Dedicated kubevuln component analyzes container images for known CVEs and publishes VulnerabilityManifest CRDs queryable by downstream agents |
| Configuration scanning | Evaluates workload configurations against NSA and MITRE ATT&CK frameworks, producing WorkloadConfigurationScan CRDs with per-control pass/fail results |
| Node scanning | Assesses node-level security posture using resource metrics from the metrics-server dependency |
| CRD-based data plane | All findings persisted as native Kubernetes Custom Resources, enabling standard RBAC-gated access from any in-cluster consumer without proprietary APIs |
| Install/upgrade remediation | Helm release configured with 3 retries on both install and upgrade failures, with 10-minute timeout per attempt to handle CRD registration delays |
Architecture¶
Kubescape Operator Deployment Topology¶
graph TD
subgraph flux-system["flux-system namespace"]
HR[HelmRelease: kubescape]
REPO[HelmRepository: kubescape]
end
subgraph kubescape-ns["kubescape namespace"]
OP[operator]
KS[kubescape engine]
KV[kubevuln]
ST[storage]
CRD_VM[CRD: VulnerabilityManifest]
CRD_WCS[CRD: WorkloadConfigurationScan]
end
subgraph kagent-ns["kagent namespace"]
SA[security-agent]
end
subgraph deps["Dependencies"]
MS[metrics-server]
PROM[kube-prometheus-stack]
end
HR -->|"chart: kubescape-operator"| REPO
HR -->|"deploys to"| kubescape-ns
OP -->|"triggers scans"| KS
OP -->|"triggers scans"| KV
KS -->|"writes"| CRD_WCS
KV -->|"writes"| CRD_VM
KS -->|"persists results"| ST
KV -->|"persists results"| ST
SA -->|"queries CRDs via K8s API"| CRD_VM
SA -->|"queries CRDs via K8s API"| CRD_WCS
KS -->|"resource metrics"| MS
PROM -->|"scrapes metrics"| kubescape-ns
Continuous Scan Flow¶
sequenceDiagram
participant K8s as Kubernetes API
participant Op as operator
participant KS as kubescape engine
participant KV as kubevuln
participant ST as storage
participant SA as security-agent (kagent)
K8s->>Op: Resource change event (watch)
Op->>KS: Trigger configuration scan
Op->>KV: Trigger vulnerability scan
KS->>ST: Persist WorkloadConfigurationScan CRD
KV->>ST: Persist VulnerabilityManifest CRD
SA->>K8s: Query VulnerabilityManifest CRDs
K8s-->>SA: Return scan findings
Configuration¶
All values sourced from base/services/environment.env
(base); per-environment overrides in clusters/stages/dev/.../environment.env.
| Parameter | Dev | Prod |
|---|---|---|
KUBESCAPE_CHART_VERSION |
1.30.4 |
1.30.4 |
KUBESCAPE_CPU_LIMIT |
500m |
500m |
KUBESCAPE_CPU_REQUEST |
100m |
100m |
KUBESCAPE_MEMORY_LIMIT |
512Mi |
512Mi |
KUBESCAPE_MEMORY_REQUEST |
128Mi |
128Mi |
KUBEVULN_MEMORY_LIMIT |
1Gi |
1Gi |
Operations¶
kubevuln OOMKilled during large image scan¶
Symptoms: kubectl get pods -n kubescape shows kubevuln pod in CrashLoopBackOff with reason OOMKilled. VulnerabilityManifest CRDs stop updating for newly deployed images. Prometheus alert KubePodCrashLooping fires for kubevuln.
kubectl describe pod -n kubescape -l app.kubernetes.io/component=kubevuln | grep -A5 'Last State'
kubectl top pod -n kubescape -l app.kubernetes.io/component=kubevuln
kubectl get vulnerabilitymanifests -A --sort-by=.metadata.creationTimestamp | tail -5
kubectl logs -n kubescape -l app.kubernetes.io/component=kubevuln --previous --tail=100 | grep -i 'memory\|oom\|killed'
# If OOM on specific large images, check which image triggered it:
kubectl get vulnerabilitymanifestsummaries -A -o json | jq '.items[] | select(.spec.vulnerabilitiesRef.all.status=="incomplete") | .metadata.name'
Continuous scan not triggering on resource changes¶
Symptoms: New deployments or image updates do not produce updated WorkloadConfigurationScan CRDs. kubectl get workloadconfigurationscans -A shows stale timestamps. No errors visible in operator logs.
kubectl get pods -n kubescape -l app.kubernetes.io/component=operator -o wide
kubectl logs -n kubescape -l app.kubernetes.io/component=operator --tail=200 | grep -i 'watch\|trigger\|scan'
# Verify the operator has RBAC to watch resources across namespaces:
kubectl auth can-i watch deployments --as=system:serviceaccount:kubescape:kubescape-operator -A
# Check if scan is queued but storage is backlogged:
kubectl logs -n kubescape -l app.kubernetes.io/component=storage --tail=100 | grep -i 'error\|full\|queue'
# Force a manual rescan to test the pipeline:
kubectl annotate workloadconfigurationscan -n default --all kubescape.io/rescan=$(date +%s) --overwrite
HelmRelease reconciliation failing¶
Symptoms: kubectl get helmrelease kubescape -n flux-system shows status False with message about chart fetch or install timeout. Flux kustomization kubescape shows Ready: False.
kubectl get helmrelease kubescape -n flux-system -o yaml | grep -A10 'status:'
kubectl get helmrepository kubescape -n flux-system -o yaml | grep -A5 'status:'
# Check if chart registry is reachable from the cluster:
kubectl run curl-test --rm -it --image=curlimages/curl --restart=Never -- curl -s https://kubescape.github.io/helm-charts/index.yaml | head -20
# Check if dependsOn services are ready:
kubectl get kustomization metrics-server kube-prometheus-stack -n flux-system -o custom-columns=NAME:.metadata.name,READY:.status.conditions[0].status
# Force reconciliation:
flux reconcile kustomization kubescape --with-source
Configuration scan results missing for specific namespaces¶
Symptoms: kubectl get workloadconfigurationscans -n <target-namespace> returns no resources, but scans exist in other namespaces. No errors in scanner logs.
# Check if namespace is excluded from scanning:
kubectl get configmap -n kubescape -l app.kubernetes.io/component=kubescape -o yaml | grep -A20 'excludeNamespaces\|includeNamespaces'
# Verify scanner RBAC covers the target namespace:
kubectl auth can-i list pods --as=system:serviceaccount:kubescape:kubescape-scanner --namespace=default
# Check scanner logs for skip reasons:
kubectl logs -n kubescape -l app.kubernetes.io/component=kubescape --tail=200 | grep -i 'skip\|exclude\|namespace'
# List all scanned namespaces to identify the gap:
kubectl get workloadconfigurationscans -A --no-headers | awk '{print $1}' | sort -u
Storage component disk pressure or CRD accumulation¶
Symptoms: Storage pod shows high memory usage or restarts. Older VulnerabilityManifest CRDs accumulate without garbage collection. kubectl get vulnerabilitymanifests -A | wc -l shows thousands of stale entries.
kubectl top pod -n kubescape -l app.kubernetes.io/component=storage
kubectl get vulnerabilitymanifests -A --no-headers | wc -l
kubectl get workloadconfigurationscans -A --no-headers | wc -l
# Check for orphaned CRDs (workload no longer exists):
kubectl get vulnerabilitymanifests -A -o json | jq '[.items[] | select(.metadata.ownerReferences == null)] | length'
# Check storage pod logs for pressure signals:
kubectl logs -n kubescape -l app.kubernetes.io/component=storage --tail=100 | grep -i 'pressure\|evict\|memory\|full'
Related¶
apps/base/kubescape/— Kubernetes manifestsbase/services/kubescape.yaml— Flux Kustomizationbase/services/environment.env— environment variables
Generated from service-catalog.json at commit 165b485 · catalog sha 4d088b0b3a67b4c4