Skip to content

Promtail

Promtail (GitHub) is the log collection agent purpose-built for Grafana Loki. Unlike general-purpose log shippers (Fluentd, Filebeat, Vector) that serialize logs into structured formats and speak multiple output protocols, Promtail is laser-focused on one job: tailing log files on a node, attaching Kubernetes metadata labels, and pushing them to Loki's HTTP ingest API in its native format. This tight coupling to Loki's label-indexed storage model means Promtail produces exactly the label set Loki expects — no intermediate transformation layer, no schema negotiation.

What distinguishes Promtail from other agents in a Loki-centric stack: it understands the CRI log format natively, can parse the kubelet's pod log directory structure to extract namespace/pod/container without relying on the Kubernetes API, and maintains a positions file to survive restarts without re-reading entire log files. It operates as a pull-based file tailer rather than requiring applications to push logs — meaning it captures output from any container regardless of whether that container is instrumented.

In stacks that also run OpenTelemetry Collector (as this one does), Promtail and OTel Collector are complementary rather than redundant: OTel Collector receives application-emitted OTLP log signals, while Promtail handles the node-level filesystem scrape of stdout/stderr that the kubelet writes to disk.

Overview

Property Value
Namespace monitoring
Type HelmRelease (chart: promtail v6.17.0)
Layer Logging stack services
Status Enabled
Source apps/base/promtail/

Dependencies

Upstream — required before Promtail starts

Service Reason Status
loki Flux dependsOn Active

Downstream — services that depend on Promtail

No known downstream Flux dependencies.

Purpose

Promtail is the node-level log scraper that ensures every container's stdout/stderr reaches Loki without requiring application-side instrumentation. It runs as a DaemonSet on all nodes — including control-plane — tailing the kubelet's pod log directory and enriching each line with namespace, pod, and container labels extracted directly from the filesystem path. This provides baseline log visibility for every workload in the cluster, independent of whether that workload emits structured OTLP telemetry.

Why Promtail over Fluent Bit or Vector: Promtail's native understanding of Loki's label model eliminates the impedance mismatch that general-purpose shippers introduce. Fluent Bit or Vector would require explicit output plugin configuration, label mapping rules, and tenant header injection — all of which Promtail handles implicitly. The trade-off is single-backend lock-in (Promtail only speaks Loki), but since this platform already commits to the Grafana observability stack and uses OTel Collector for multi-backend routing of application telemetry, Promtail's simplicity for the node-scrape use case outweighs flexibility concerns.

Why not rely solely on OTel Collector for logs: OTel Collector's filelog receiver could replace Promtail in theory, but it requires explicit file path configuration, manual label extraction rules, and doesn't understand Loki's push API natively (it exports via otlphttp/loki). Promtail's single-purpose design means fewer configuration failure modes for the critical path of "every pod's logs reach Loki."

Features

Feature Detail
DaemonSet on all nodes including control-plane Tolerates node-role.kubernetes.io/control-plane:NoSchedule, ensuring control-plane component logs (etcd, kube-apiserver, scheduler) are collected alongside workload logs.
CRI log parsing with filesystem-based label extraction Uses the cri pipeline stage to parse containerd's log format, then applies a regex against the resolved filename path (/var/log/pods/<namespace>_<pod>_<uid>/<container>/<n>.log) to extract namespace, pod, and container labels without querying the Kubernetes API.
Inotify limit tuning via privileged init container An init container runs sysctl -w fs.inotify.max_user_instances=8192 before Promtail starts, preventing "too many open files" errors on nodes with high pod density where each log file requires an inotify watch.
Position tracking for crash-resilient log tailing Maintains a positions file at /run/promtail/positions.yaml (on an emptyDir volume) to track read offsets per log file, allowing Promtail to resume from where it left off after pod restarts without re-shipping already-ingested lines.
Hardened container security context Runs with readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, and drops all Linux capabilities. Log directories are mounted read-only; only the positions emptyDir is writable.
ServiceMonitor for self-monitoring Exposes metrics on port 3101 with a ServiceMonitor (30s interval, prometheus: kube-prometheus label selector), enabling Prometheus to scrape Promtail's own operational metrics (bytes read, lines processed, push failures).
Single-tenant Loki push Configured with tenant_id: fake for single-tenant mode, pushing to http://monitoring-loki:3100/loki/api/v1/push. This matches Loki's auth_enabled: false configuration where all logs share a single tenant namespace.

Architecture

Promtail DaemonSet Topology

graph TD
    subgraph "Every Node"
        INIT["init-inotify<br/>(busybox — sysctl)"]
        PT["Promtail<br/>:3101 metrics"]
        LOGS["/var/log/pods<br/>(hostPath, readOnly)"]
        POS["/run/promtail<br/>(emptyDir)"]
    end

    subgraph "monitoring namespace"
        LOKI["monitoring-loki<br/>:3100"]
        PROM["Prometheus"]
    end

    INIT -->|"sysctl fs.inotify.max_user_instances=8192"| PT
    LOGS -->|"read log files"| PT
    PT -->|"write positions.yaml"| POS
    PT -->|"HTTP POST /loki/api/v1/push :3100"| LOKI
    PROM -->|"scrape /metrics :3101<br/>every 30s"| PT

Log Collection Flow

sequenceDiagram
    participant App as Application Pod
    participant CRI as containerd (CRI)
    participant FS as /var/log/pods
    participant PT as Promtail
    participant Loki as monitoring-loki:3100

    App->>CRI: stdout/stderr
    CRI->>FS: Write CRI-format log file
    PT->>FS: Tail log files (inotify)
    PT->>PT: Parse CRI format
    PT->>PT: Extract namespace/pod/container from path
    PT->>Loki: POST /loki/api/v1/push (tenant_id: fake)
    PT->>PT: Update positions.yaml

Configuration

All values sourced from base/services/environment.env (base); per-environment overrides in clusters/stages/dev/.../environment.env.

Parameter Dev Prod
PROMTAIL_CHART_VERSION 6.17.0 6.17.0
PROMTAIL_CPU_LIMIT 100m 500m
PROMTAIL_CPU_REQUEST 100m 100m
PROMTAIL_MEMORY_LIMIT 128Mi 512Mi
PROMTAIL_MEMORY_REQUEST 128Mi 256Mi

Operations

Promtail cannot reach Loki

Symptoms: Promtail logs show repeated msg="error sending batch" status=503 or connection refused errors. Grafana shows log gaps for all namespaces. Promtail metrics show promtail_sent_entries_total stalled while promtail_targets_active_total remains normal.

kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o wide
kubectl -n monitoring logs ds/promtail --tail=50 | grep -i 'error\|retry\|connection'
kubectl -n monitoring run curl-test --rm -it --image=curlimages/curl -- curl -s -o /dev/null -w '%{http_code}' http://monitoring-loki:3100/ready
kubectl -n flux-system get kustomization loki -o jsonpath='{.status.conditions[*].message}'
kubectl -n monitoring get pods -l app.kubernetes.io/name=loki -o jsonpath='{.items[*].status.phase}'

Inotify limit exhaustion despite init container

Symptoms: Promtail logs show too many open files or inotify_add_watch: no space left on device. New pod logs are not being collected while existing tails continue working. The init container completed successfully but the node has other inotify consumers.

kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o wide
kubectl -n monitoring logs ds/promtail -c init-inotify
kubectl -n monitoring debug $(kubectl -n monitoring get pod -l app.kubernetes.io/name=promtail --field-selector spec.nodeName=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') -o name | head -1) --image=busybox -- cat /proc/sys/fs/inotify/max_user_instances
kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml | wc -l

Positions file reset causing log re-ingestion

Symptoms: Loki shows duplicate log entries after a node drain or pod eviction. promtail_read_bytes_total spikes without a corresponding increase in application log output. Positions file shows all entries starting from byte offset 0.

kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o jsonpath='{.items[*].status.containerStatuses[*].restartCount}'
kubectl -n monitoring get events --field-selector involvedObject.kind=DaemonSet,involvedObject.name=promtail --sort-by=.lastTimestamp
kubectl -n monitoring describe pod -l app.kubernetes.io/name=promtail | grep -A5 'Last State'

Labels not extracted — logs appear without namespace/pod metadata

Symptoms: Loki queries filtering by {namespace="..."} return no results, but {job="pod-logs"} shows entries. Promtail metrics show promtail_custom_regex_errors_total incrementing. Log lines in Loki only have job and __path__ labels.

kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml | head -20
kubectl -n monitoring logs ds/promtail --tail=100 | grep -i 'regex\|pipeline\|label'
kubectl -n monitoring exec ds/promtail -- ls /var/log/pods/ | head -10
kubectl -n monitoring exec ds/promtail -- promtail --dry-run --config.file=/etc/promtail/promtail.yaml 2>&1 | head -30

DaemonSet not scheduled on new node

Symptoms: A newly joined node has no Promtail pod. kubectl get ds promtail -n monitoring shows DESIRED count lower than total node count, or a pod stuck in Pending.

kubectl -n monitoring get ds promtail -o wide
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
kubectl -n monitoring describe ds promtail | grep -A10 'Tolerations'
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o wide --sort-by=.spec.nodeName
kubectl -n monitoring describe pod $(kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail --field-selector status.phase!=Running -o name 2>/dev/null | head -1) 2>/dev/null | grep -A5 Events

High memory usage causing OOMKill

Symptoms: Promtail pods restarting with OOMKilled exit reason. kubectl top pod shows memory approaching the configured limit. Node has high pod count or pods producing extremely verbose logs.

kubectl -n monitoring top pods -l app.kubernetes.io/name=promtail --sort-by=memory
kubectl -n monitoring get pods -l app.kubernetes.io/name=promtail -o jsonpath='{range .items[*]}{.metadata.name} restarts={.status.containerStatuses[0].restartCount} reason={.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
kubectl -n monitoring exec ds/promtail -- cat /run/promtail/positions.yaml | wc -l
kubectl -n monitoring port-forward ds/promtail 3101:3101 &
curl -s localhost:3101/metrics | grep promtail_targets_active_total
curl -s localhost:3101/metrics | grep process_resident_memory_bytes
See also: docs/adr/010-opentelemetry-collector.md



Generated from service-catalog.json at commit 165b485 · catalog sha 4d088b0b3a67b4c4