Collecting and Analyzing Kubernetes Logs

Tutorial 2 of 5

1. Introduction

The goal of this tutorial is to help you understand how to collect and analyze logs from your Kubernetes cluster. By the end of this tutorial, you will have a clear understanding of how to use Fluentd to aggregate logs from various sources and use Grafana to visualize the log data.

What you will learn:

  • How to set up and configure Fluentd in a Kubernetes cluster.
  • How to configure Fluentd to collect logs from different sources.
  • How to visualize log data with Grafana.

Prerequisites:

  • Basic knowledge of Kubernetes and its components.
  • A running Kubernetes cluster.
  • Familiarity with Docker containers.
  • Basic understanding of Grafana.

2. Step-by-Step Guide

Setting up Fluentd

Fluentd is an open-source data collector that can gather, transform, and route log data from many different sources. We will use it to collect logs from our Kubernetes nodes.

  1. Deploy Fluentd on your Kubernetes cluster. You can use a DaemonSet so Fluentd gets deployed on every node.
kubectl apply -f https://raw.githubusercontent.com/fluent/fluentd-kubernetes-daemonset/master/fluentd-daemonset-elasticsearch.yaml

This manifest creates a DaemonSet, so one Fluentd pod runs on every node in your cluster. Each pod collects logs from the containers on its node and forwards them to Elasticsearch (your cluster needs a reachable Elasticsearch service for forwarding to succeed).
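The DaemonSet can read container logs because it mounts the node's log directories into the Fluentd pod. A minimal sketch of the relevant part of such a manifest (an illustrative excerpt, not the full official manifest):

```yaml
# Illustrative excerpt: host paths a Fluentd DaemonSet typically mounts
# so it can tail the container logs written by the node's runtime.
spec:
  template:
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: dockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: dockercontainers
          hostPath:
            path: /var/lib/docker/containers
```

Without these hostPath mounts, the `tail` input shown later in this tutorial would have nothing to read.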

Configuring Fluentd

You can configure Fluentd through its configuration file. If your deployment mounts that file from a ConfigMap (here assumed to be named fluentd, in the kube-system namespace where the DaemonSet above runs), you can edit it in place:

kubectl edit configmap fluentd -n kube-system

In the configuration you specify the sources Fluentd should collect logs from and the destinations it should forward them to. (The official DaemonSet images can also be tuned through environment variables such as FLUENT_ELASTICSEARCH_HOST and FLUENT_ELASTICSEARCH_PORT.)
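As a concrete example, here is a sketch of a filter section that keeps only logs from a single namespace. It assumes each record carries a kubernetes.namespace_name field, which the fluent-plugin-kubernetes_metadata_filter plugin (bundled in the official DaemonSet images) adds; the namespace name is illustrative.

```
# Sketch: drop everything except logs from the "production" namespace.
# Assumes records have a kubernetes.namespace_name field.
<filter kubernetes.**>
  @type grep
  <regexp>
    key $.kubernetes.namespace_name
    pattern /^production$/
  </regexp>
</filter>
```

This is also a starting point for Exercise 2 below.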

Setting up Grafana

Grafana is a popular open-source tool for visualizing data. We will use it to visualize the collected log data.

  1. Deploy Grafana on your Kubernetes cluster.
kubectl apply -f https://raw.githubusercontent.com/grafana/grafana/master/packaging/k8s/grafana-deployment.yaml
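After deployment, Grafana still needs a data source pointing at Elasticsearch before it can query your logs. One option is to provision it declaratively; here is a sketch of a provisioning file (the service URL, index pattern, and file path are assumptions about your setup):

```yaml
# Sketch of a Grafana data source provisioning file, e.g. mounted at
# /etc/grafana/provisioning/datasources/elasticsearch.yaml.
# URL and index pattern are assumptions about your setup.
apiVersion: 1
datasources:
  - name: Elasticsearch
    type: elasticsearch
    access: proxy
    url: http://elasticsearch:9200
    jsonData:
      index: "logstash-*"
      timeField: "@timestamp"
```

Once the Grafana pod is running, you can reach the UI by port-forwarding to port 3000 (Grafana's default) or by exposing the deployment with a Service.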

3. Code Examples

Example 1: Fluentd Configuration

Here is an example of a Fluentd configuration. It collects logs from all containers and forwards them to Elasticsearch.

<source>
  # Tail the container log files that the node's runtime writes.
  @type tail
  path /var/log/containers/*.log
  # Remember how far each file has been read, so restarts don't duplicate logs.
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    # Container runtimes using the JSON log driver emit one JSON object per line.
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

<match kubernetes.**>
  # Requires the fluent-plugin-elasticsearch plugin (bundled in the official images).
  @type elasticsearch
  host elasticsearch
  port 9200
  # Write to daily, Logstash-style indices (logstash-YYYY.MM.DD).
  logstash_format true
  <buffer>
    # Buffer to disk so queued logs survive a Fluentd restart.
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
  </buffer>
</match>
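In practice you will usually also want each record enriched with pod metadata (namespace, pod name, labels) before it reaches Elasticsearch. The official DaemonSet images ship the fluent-plugin-kubernetes_metadata_filter plugin for this; a minimal filter section looks like:

```
# Enrich each record with Kubernetes metadata (namespace, pod name, labels)
# by querying the Kubernetes API. Plugin: fluent-plugin-kubernetes_metadata_filter.
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
```

Place it between the source and match sections so records are enriched before they are forwarded.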

Example 2: Grafana Dashboard

Here's a simplified excerpt from a Grafana dashboard that visualizes the log data: a panel that counts log documents per service. The target below assumes an Elasticsearch data source and that each log document carries a service_name field; adjust the field names to match your actual log schema.

{
  "panels": [
    {
      "title": "Logs Count",
      "type": "graph",
      "datasource": "Elasticsearch",
      "targets": [
        {
          "query": "*",
          "metrics": [{ "id": "1", "type": "count" }],
          "bucketAggs": [
            { "id": "2", "type": "terms", "field": "service_name" },
            { "id": "3", "type": "date_histogram", "field": "@timestamp" }
          ]
        }
      ]
    }
  ]
}

4. Summary

In this tutorial, we have learned how to use Fluentd to aggregate logs from a Kubernetes cluster and how to use Grafana to visualize those logs.

Next steps for learning:

  • Learn more about Fluentd's and Grafana's advanced features.
  • Learn how to set up alerts in Grafana based on your log data.

Additional resources:

  • Fluentd documentation: https://docs.fluentd.org
  • Grafana documentation: https://grafana.com/docs

5. Practice Exercises

  1. Exercise 1: Deploy Fluentd and Grafana in your own Kubernetes cluster. Collect logs from all containers and try to visualize them in Grafana.

  2. Exercise 2: Modify the Fluentd configuration to only collect logs from a specific namespace. Verify the changes in Grafana.

  3. Exercise 3: Set up an alert in Grafana that triggers when the rate of logs from a particular service exceeds a certain threshold.

Note: The solutions to these exercises are subjective and depend on your specific environment and requirements. Keep exploring and trying out new configurations and setups.