Fluent Bit is not flushing or removing old node_exporter device metrics whose devices are no longer part of the system; it keeps exposing them. Only restarting the node exporter (i.e. the Fluent Bit pod) resolves this. I am able to reproduce the issue locally.
Steps:
1. Note the number of metrics exposed, for example for node_network_transmit_colls_total (example commands below).
2. Delete some pods and create some new pods on the node whose Fluent Bit pod is port-forwarded.
3. Notice the increase in the number of metrics exposed at http://localhost:2021/metrics for node_network_transmit_colls_total.
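A minimal sketch of steps 1 and 3, assuming the Fluent Bit pod is named fluent-bit-xxxxx and runs in the logging namespace (both names are placeholders); the port and metric name are taken from the report above:

kubectl -n logging port-forward pod/fluent-bit-xxxxx 2021:2021 &

# Count the series exposed for one per-device metric.
curl -s http://localhost:2021/metrics | grep -c '^node_network_transmit_colls_total'

# After deleting and recreating pods on that node, count again:
# the count keeps growing because series for removed interfaces are never dropped.
curl -s http://localhost:2021/metrics | grep -c '^node_network_transmit_colls_total'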
Expected Result:
The metrics for deleted pods should have been removed.
Actual Result:
The metrics for deleted pods persist and are still exposed. We need to restart the Fluent Bit pods to get rid of the stale metrics.
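As a temporary workaround, a rolling restart of the DaemonSet clears the stale series until pods churn again (a sketch; the DaemonSet name fluent-bit and namespace logging are assumptions):

kubectl -n logging rollout restart daemonset/fluent-bit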
Please note that this issue is causing memory pressure, and the pods are getting OOMKilled.
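The memory growth is visible with kubectl top, assuming metrics-server is installed; the namespace and label selector here are assumptions:

kubectl -n logging top pod -l app.kubernetes.io/name=fluent-bit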
Below are the details of the environment used:
Kind K8s version: 1.29.2
Cilium version: v1.15.3
FluentBit version: 3.0.2
Below is the Fluent Bit ConfigMap:
custom_parsers.conf: |
    [PARSER]
        Name docker_no_time
        Format json
        Time_Keep Off
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
fluent-bit.conf: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level debug
        Parsers_File /fluent-bit/etc/parsers.conf
        Parsers_File /fluent-bit/etc/conf/custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On
    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        Tag kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On
    [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Read_From_Tail On
    [INPUT]
        name node_exporter_metrics
        tag node_metrics
        scrape_interval 5
        path.procfs /proc
        path.sysfs /sys
        Mem_Buf_Limit 50MB
        filesystem.ignore_mount_point_regex ^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
        filesystem.ignore_filesystem_type_regex ^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude On
    [OUTPUT]
        name prometheus_exporter
        match node_metrics
        host 0.0.0.0
        port 2021