Promtail¶
Promtail (GitHub) is the log collection agent purpose-built for Grafana Loki. Unlike general-purpose log shippers (Fluentd, Filebeat, Vector) that serialize logs into structured formats and speak multiple output protocols, Promtail is laser-focused on one job: tailing log files on a node, attaching Kubernetes metadata labels, and pushing them to Loki's HTTP ingest API in its native format. This tight coupling to Loki's label-indexed storage model means Promtail produces exactly the label set Loki expects — no intermediate transformation layer, no schema negotiation.
What distinguishes Promtail from other agents in a Loki-centric stack: it understands the CRI log format natively, can parse the kubelet's pod log directory structure to extract namespace/pod/container without relying on the Kubernetes API, and maintains a positions file to survive restarts without re-reading entire log files. It operates as a pull-based file tailer rather than requiring applications to push logs — meaning it captures output from any container regardless of whether that container is instrumented.
In stacks that also run OpenTelemetry Collector (as this one does), Promtail and OTel Collector are complementary rather than redundant: OTel Collector receives application-emitted OTLP log signals, while Promtail handles the node-level filesystem scrape of stdout/stderr that the kubelet writes to disk.
Overview¶
| Property | Value |
|---|---|
| Namespace | monitoring |
| Type | HelmRelease (chart: promtail v6.17.0) |
| Layer | Logging stack services |
| Status | Enabled |
| Source | apps/base/promtail/ |
Dependencies¶
Upstream — required before Promtail starts¶
| Service | Reason | Status |
|---|---|---|
| loki | Flux dependsOn | Active |
Downstream — services that depend on Promtail¶
No known downstream Flux dependencies.
Purpose¶
Promtail is the node-level log scraper that ensures every container's stdout/stderr reaches Loki without requiring application-side instrumentation. It runs as a DaemonSet on all nodes — including control-plane — tailing the kubelet's pod log directory and enriching each line with namespace, pod, and container labels extracted directly from the filesystem path. This provides baseline log visibility for every workload in the cluster, independent of whether that workload emits structured OTLP telemetry.
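The path-based label extraction described above can be illustrated with a short Python sketch. It is not Promtail's actual pipeline configuration: the regex, the `labels_from_path` helper, and the example path (including its UID) are hypothetical, and the sketch assumes the `/var/log/pods/<namespace>_<pod>_<uid>/<container>/<n>.log` layout noted elsewhere on this page.

```python
import re

# Assumed layout: /var/log/pods/<namespace>_<pod>_<uid>/<container>/<n>.log
# Namespace and pod names cannot contain underscores, so "_" is a safe separator.
POD_LOG_PATH = re.compile(
    r"^/var/log/pods/"
    r"(?P<namespace>[^_/]+)_(?P<pod>[^_/]+)_(?P<uid>[^/]+)/"
    r"(?P<container>[^/]+)/\d+\.log$"
)


def labels_from_path(path):
    """Return namespace/pod/container labels, or None if the path does not match."""
    match = POD_LOG_PATH.match(path)
    if match is None:
        return None
    return {key: match.group(key) for key in ("namespace", "pod", "container")}


# Hypothetical example path; the UID is made up for illustration.
example = (
    "/var/log/pods/"
    "monitoring_promtail-abcde_0f1e2d3c-1111-2222-3333-444455556666/"
    "promtail/0.log"
)
print(labels_from_path(example))
```

Because everything needed is in the path itself, no Kubernetes API call is involved, which is the property the paragraph above relies on.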
Why Promtail over Fluent Bit or Vector: Promtail's native understanding of Loki's label model eliminates the impedance mismatch that general-purpose shippers introduce. Fluent Bit or Vector would require explicit output plugin configuration, label mapping rules, and tenant header injection — all of which Promtail handles implicitly. The trade-off is single-backend lock-in (Promtail only speaks Loki), but since this platform already commits to the Grafana observability stack and uses OTel Collector for multi-backend routing of application telemetry, Promtail's simplicity for the node-scrape use case outweighs flexibility concerns.
Why not rely solely on OTel Collector for logs: OTel Collector's filelog receiver could replace Promtail in theory, but it requires explicit file path configuration, manual label extraction rules, and doesn't understand Loki's push API natively (it exports via otlphttp/loki). Promtail's single-purpose design means fewer configuration failure modes for the critical path of "every pod's logs reach Loki."
Features¶
| Feature | Detail |
|---|---|
| DaemonSet on all nodes including control-plane | Tolerates node-role.kubernetes.io/control-plane:NoSchedule, ensuring control-plane component logs (etcd, kube-apiserver, scheduler) are collected alongside workload logs. |
| CRI log parsing with filesystem-based label extraction | Uses the cri pipeline stage to parse containerd's log format, then applies a regex against the resolved filename path (/var/log/pods/<namespace>_<pod>_<uid>/<container>/<n>.log) to extract namespace, pod, and container labels without querying the Kubernetes API. |
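| Inotify limit tuning via privileged init container | An init container runs sysctl -w fs.inotify.max_user_instances=8192 before Promtail starts, raising the per-user cap on inotify instances so that Promtail does not hit "too many open files" errors on nodes with high pod density. The number of watches per user (one per tailed log file) is governed separately by fs.inotify.max_user_watches, which the command above does not change. |
| Position tracking for crash-resilient log tailing | Maintains a positions file at /run/promtail/positions.yaml (on an emptyDir volume) to track read offsets per log file, allowing Promtail to resume from where it left off after a container restart without re-shipping already-ingested lines. Because an emptyDir is deleted together with its pod, the offsets are lost when the pod is evicted or rescheduled. |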
| Hardened container security context | Runs with readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, and drops all Linux capabilities. Log directories are mounted read-only; only the positions emptyDir is writable. |
| ServiceMonitor for self-monitoring | Exposes metrics on port 3101 with a ServiceMonitor (30s interval, prometheus: kube-prometheus label selector), enabling Prometheus to scrape Promtail's own operational metrics (bytes read, lines processed, push failures). |
| Single-tenant Loki push | Configured with tenant_id: fake for single-tenant mode, pushing to http://monitoring-loki:3100/loki/api/v1/push. This matches Loki's auth_enabled: false configuration where all logs share a single tenant namespace. |
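
As a rough illustration of the CRI log parsing listed in the table above, the following Python sketch splits one containerd log line into its fields. It assumes the common `<timestamp> <stream> <flag> <content>` layout with `P` (partial) and `F` (full) flags; the `CriLine` type and `parse_cri_line` function are hypothetical names, and unlike Promtail's cri stage the sketch does not reassemble partial lines.

```python
from typing import NamedTuple, Optional


class CriLine(NamedTuple):
    timestamp: str   # RFC 3339 timestamp written by the runtime
    stream: str      # "stdout" or "stderr"
    partial: bool    # True when the runtime split a long line (flag "P")
    content: str     # the application's own log text


def parse_cri_line(raw: str) -> Optional[CriLine]:
    """Parse one CRI-format line; return None if it does not look like CRI output."""
    parts = raw.rstrip("\n").split(" ", 3)
    if len(parts) < 3:
        return None
    timestamp, stream, flag = parts[0], parts[1], parts[2]
    if stream not in ("stdout", "stderr") or flag not in ("P", "F"):
        return None
    content = parts[3] if len(parts) == 4 else ""
    return CriLine(timestamp, stream, flag == "P", content)


sample = '2024-05-01T12:00:00.123456789Z stdout F level=info msg="ready"'
print(parse_cri_line(sample))
```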
Architecture¶
Promtail DaemonSet Topology¶
graph TD
subgraph "Every Node"
INIT["init-inotify<br/>(busybox — sysctl)"]
PT["Promtail<br/>:3101 metrics"]
LOGS["/var/log/pods<br/>(hostPath, readOnly)"]
POS["/run/promtail<br/>(emptyDir)"]
end
subgraph "monitoring namespace"
LOKI["monitoring-loki<br/>:3100"]
PROM["Prometheus"]
end
INIT -->|"sysctl fs.inotify.max_user_instances=8192"| PT
LOGS -->|"read log files"| PT
PT -->|"write positions.yaml"| POS
PT -->|"HTTP POST /loki/api/v1/push :3100"| LOKI
PROM -->|"scrape /metrics :3101<br/>every 30s"| PT
Log Collection Flow¶
sequenceDiagram
participant App as Application Pod
participant CRI as containerd (CRI)
participant FS as /var/log/pods
participant PT as Promtail
participant Loki as monitoring-loki:3100
App->>CRI: stdout/stderr
CRI->>FS: Write CRI-format log file
PT->>FS: Tail log files (inotify)
PT->>PT: Parse CRI format
PT->>PT: Extract namespace/pod/container from path
PT->>Loki: POST /loki/api/v1/push (tenant_id: fake)
PT->>PT: Update positions.yaml
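The push step in the flow above can be sketched in Python as the JSON form of a Loki push request. This is an approximation for illustration: Promtail itself sends snappy-compressed protobuf by default, while the same endpoint also accepts JSON. The `build_push_request` helper is a hypothetical name, and the request is only built, not sent.

```python
import json
import time


def build_push_request(labels, lines, tenant_id="fake", now_ns=None):
    """Build (headers, body) for POST /loki/api/v1/push without sending it."""
    # Loki expects each entry as [<unix epoch in nanoseconds, as a string>, <line>].
    timestamp = str(now_ns if now_ns is not None else time.time_ns())
    body = {
        "streams": [
            {
                "stream": dict(labels),
                "values": [[timestamp, line] for line in lines],
            }
        ]
    }
    headers = {
        "Content-Type": "application/json",
        # Tenant header; "fake" mirrors the tenant_id documented on this page.
        "X-Scope-OrgID": tenant_id,
    }
    return headers, json.dumps(body)


headers, body = build_push_request(
    {"namespace": "monitoring", "pod": "promtail-abcde", "container": "promtail"},
    ["level=info msg=ready"],
    now_ns=1714564800000000000,
)
print(headers)
print(body)
```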
Configuration¶
All values are sourced from base/services/environment.env (base); per-environment overrides live in clusters/stages/dev/.../environment.env.
| Parameter | Dev | Prod |
|---|---|---|
| PROMTAIL_CHART_VERSION | 6.17.0 | 6.17.0 |
| PROMTAIL_CPU_LIMIT | 100m | 500m |
| PROMTAIL_CPU_REQUEST | 100m | 100m |
| PROMTAIL_MEMORY_LIMIT | 128Mi | 512Mi |
| PROMTAIL_MEMORY_REQUEST | 128Mi | 256Mi |
Operations¶
Promtail cannot reach Loki¶
Symptoms: Promtail logs show repeated msg="error sending batch" status=503 or connection refused errors. Grafana shows log gaps for all namespaces. Promtail metrics show promtail_sent_entries_total stalled while promtail_targets_active_total remains normal.
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o wide
kubectl -n monitoring logs ds/promtail --tail=50 | grep -i 'error\|retry\|connection'
kubectl -n monitoring run curl-test --rm -it --image=curlimages/curl -- curl -s -o /dev/null -w '%{http_code}' http://monitoring-loki:3100/ready
kubectl -n flux-system get kustomization loki -o jsonpath='{.status.conditions[*].message}'
kubectl -n monitoring get pods -l app.kubernetes.io/name=loki -o jsonpath='{.items[*].status.phase}'
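To confirm the "stalled counter" symptom without eyeballing raw metrics, two scrapes of Promtail's /metrics output can be compared. The Python sketch below is a minimal, assumption-laden helper: `metric_total` and `is_stalled` are hypothetical names, the sample text is made up, and the parser handles only the plain `name{labels} value` lines of the Prometheus text format.

```python
def metric_total(exposition, name):
    """Sum every sample of `name` found in Prometheus text exposition output."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        if "{" in line:
            series_name = line.split("{", 1)[0]
            rest = line.rsplit("}", 1)[1]
        else:
            series_name, _, rest = line.partition(" ")
        if series_name != name:
            continue
        total += float(rest.split()[0])
    return total


def is_stalled(before, after, name="promtail_sent_entries_total"):
    """True when the counter did not increase between two scrapes."""
    return metric_total(after, name) <= metric_total(before, name)


# Made-up scrape output for illustration only.
scrape_1 = 'promtail_sent_entries_total{host="monitoring-loki:3100"} 100\n'
scrape_2 = 'promtail_sent_entries_total{host="monitoring-loki:3100"} 100\n'
print(is_stalled(scrape_1, scrape_2))
```

The two inputs would typically come from `curl -s localhost:3101/metrics` taken a minute or so apart through a port-forward.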
Inotify limit exhaustion despite init container¶
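Symptoms: Promtail logs show too many open files (the inotify instance limit, fs.inotify.max_user_instances) or inotify_add_watch: no space left on device (the watch limit, fs.inotify.max_user_watches, which the init container's sysctl does not raise). New pod logs are not being collected while existing tails continue working. The init container completed successfully, but other inotify consumers running as the same user count against the same limits.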
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o wide
kubectl -n monitoring logs ds/promtail -c init-inotify
kubectl -n monitoring debug $(kubectl -n monitoring get pod -l app.kubernetes.io/name=promtail --field-selector spec.nodeName=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') -o name | head -1) --image=busybox -- cat /proc/sys/fs/inotify/max_user_instances
kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml | wc -l
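Since the instance and watch limits are separate kernel settings, it can help to read them together. The following Python sketch reads the inotify sysctls from procfs; `read_inotify_limits` is a hypothetical helper, and it assumes it runs somewhere that sees the node's /proc (for example a debug container), which this page does not prescribe.

```python
from pathlib import Path

# Standard Linux inotify sysctls exposed under /proc/sys/fs/inotify.
INOTIFY_KEYS = ("max_user_instances", "max_user_watches", "max_queued_events")


def read_inotify_limits(proc_root="/proc"):
    """Return the inotify limits found under <proc_root>/sys/fs/inotify."""
    base = Path(proc_root) / "sys" / "fs" / "inotify"
    limits = {}
    for key in INOTIFY_KEYS:
        entry = base / key
        if entry.is_file():
            limits[key] = int(entry.read_text().strip())
    return limits


# Prints an empty dict on systems without these procfs entries.
print(read_inotify_limits())
```

Comparing max_user_watches with the number of tailed files (roughly the line count of the positions file) gives a first indication of which limit is being hit.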
Positions file reset causing log re-ingestion¶
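Symptoms: Loki shows duplicate log entries after a node drain or pod eviction. promtail_read_bytes_total spikes without a corresponding increase in application log output. Positions file shows all entries starting from byte offset 0. This is expected when the pod is replaced: the positions file lives on an emptyDir, which is deleted with the pod, so the replacement pod starts with no recorded offsets.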
kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o jsonpath='{.items[*].status.containerStatuses[*].restartCount}'
kubectl -n monitoring get events --field-selector involvedObject.kind=DaemonSet,involvedObject.name=promtail --sort-by=.lastTimestamp
kubectl -n monitoring describe pod -l app.kubernetes.io/name=promtail | grep -A5 'Last State'
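Checking whether every offset is zero is easier with a small script than by reading the file. The Python sketch below assumes the positions file is a flat `positions:` map of file path to quoted byte offset; `parse_positions` and `looks_reset` are hypothetical helpers, the sample content is made up, and the parser is deliberately minimal rather than a full YAML reader.

```python
def parse_positions(text):
    """Parse a positions file of the assumed form 'positions:\\n  <path>: "<offset>"'."""
    offsets = {}
    in_map = False
    for raw in text.splitlines():
        if raw.strip() == "positions:":
            in_map = True
            continue
        if not in_map or not raw.startswith(" ") or not raw.strip():
            continue
        path, sep, value = raw.strip().rpartition(": ")
        if not sep:
            continue
        try:
            offsets[path.strip("'\"")] = int(value.strip().strip("'\""))
        except ValueError:
            continue  # skip entries whose value is not a byte offset
    return offsets


def looks_reset(offsets):
    """True when there is at least one entry and every offset is 0."""
    return bool(offsets) and all(value == 0 for value in offsets.values())


# Made-up file content for illustration only.
sample_positions = (
    "positions:\n"
    '  /var/log/pods/default_web-1_uid-1/app/0.log: "0"\n'
    '  /var/log/pods/default_web-2_uid-2/app/0.log: "0"\n'
)
print(looks_reset(parse_positions(sample_positions)))
```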
Labels not extracted — logs appear without namespace/pod metadata¶
Symptoms: Loki queries filtering by {namespace="..."} return no results, but {job="pod-logs"} shows entries. Promtail metrics show promtail_custom_regex_errors_total incrementing. Log lines in Loki carry only the job and filename labels (internal labels such as __path__ are dropped before the push).
kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml | head -20
kubectl -n monitoring logs ds/promtail --tail=100 | grep -i 'regex\|pipeline\|label'
kubectl -n monitoring exec ds/promtail -- ls /var/log/pods/ | head -10
kubectl -n monitoring exec ds/promtail -- promtail --dry-run --config.file=/etc/promtail/promtail.yaml 2>&1 | head -30
DaemonSet not scheduled on new node¶
Symptoms: A newly joined node has no Promtail pod. kubectl get ds promtail -n monitoring shows DESIRED count lower than total node count, or a pod stuck in Pending.
kubectl -n monitoring get ds promtail -o wide
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
kubectl -n monitoring describe ds promtail | grep -A10 'Tolerations'
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o wide --sort-by=.spec.nodeName
kubectl -n monitoring describe pod $(kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail --field-selector status.phase!=Running -o name 2>/dev/null | head -1) 2>/dev/null | grep -A5 Events
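When comparing node taints with the DaemonSet's tolerations by hand, the matching rules are easy to misread. The Python sketch below is a simplified model of those rules, not the scheduler's implementation: `tolerates` and `untolerated_taints` are hypothetical helpers, and the `dedicated=gpu` taint is an invented example.

```python
def tolerates(toleration, taint):
    """Simplified Kubernetes rule: does this toleration match this taint?"""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False  # an empty toleration effect matches every effect
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")
    if not key:
        return operator == "Exists"  # empty key + Exists tolerates everything
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def untolerated_taints(tolerations, taints):
    """Return the taints that would keep a pod off the node."""
    blocking = ("NoSchedule", "NoExecute")  # PreferNoSchedule is only a preference
    return [
        taint
        for taint in taints
        if taint.get("effect") in blocking
        and not any(tolerates(toleration, taint) for toleration in tolerations)
    ]


tolerations = [
    {"key": "node-role.kubernetes.io/control-plane", "operator": "Exists", "effect": "NoSchedule"},
]
node_taints = [
    {"key": "node-role.kubernetes.io/control-plane", "effect": "NoSchedule"},
    {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"},  # invented example
]
print(untolerated_taints(tolerations, node_taints))
```

Any taint returned here is a candidate explanation for a missing Promtail pod on that node.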
High memory usage causing OOMKill¶
Symptoms: Promtail pods restarting with OOMKilled exit reason. kubectl top pod shows memory approaching the configured limit. Node has high pod count or pods producing extremely verbose logs.
kubectl -n monitoring top pods -l app.kubernetes.io/name=promtail --sort-by=memory
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o jsonpath='{range .items[*]}{.metadata.name} restarts={.status.containerStatuses[0].restartCount} reason={.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml | wc -l
kubectl -n monitoring port-forward ds/promtail 3101:3101 &
curl -s localhost:3101/metrics | grep promtail_targets_active_total
curl -s localhost:3101/metrics | grep process_resident_memory_bytes
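To put the process_resident_memory_bytes reading in context, it can be compared with the configured memory limit. The Python sketch below is an illustrative calculation only: `parse_memory` and `memory_headroom` are hypothetical helpers, only the binary suffixes used on this page are handled, and the 120 MiB reading is a made-up value.

```python
# Binary suffixes used by the memory values in the configuration table.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory(quantity):
    """Convert a quantity such as '128Mi' to bytes (binary suffixes only)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


def memory_headroom(rss_bytes, limit):
    """Fraction of the limit still unused; values near 0 suggest an OOMKill risk."""
    return 1.0 - rss_bytes / parse_memory(limit)


# Made-up reading: 120 MiB resident against the 128Mi dev limit.
print(memory_headroom(120 * 2**20, "128Mi"))
```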
Related¶
- apps/base/promtail/: Kubernetes manifests
- base/services/promtail.yaml: Flux Kustomization
- base/services/environment.env: environment variables
Generated from service-catalog.json at commit 165b485 · catalog sha 4d088b0b3a67b4c4