The goal of this tutorial is to show you how to collect and analyze logs from your Kubernetes cluster. By the end, you will know how to use Fluentd to aggregate logs from the nodes in your cluster, ship them to Elasticsearch, and visualize them with Grafana.
Fluentd is a powerful open-source data collector that lets you process data from different sources. We will use it to collect logs from our Kubernetes nodes.
kubectl apply -f https://raw.githubusercontent.com/fluent/fluentd-kubernetes-daemonset/master/fluentd-daemonset-elasticsearch.yaml
This manifest creates a DaemonSet in the kube-system namespace, so Kubernetes schedules one Fluentd pod on every node in your cluster. Each pod tails the container log files on its node and forwards the records to Elasticsearch; the Elasticsearch endpoint is set through environment variables on the DaemonSet such as FLUENT_ELASTICSEARCH_HOST and FLUENT_ELASTICSEARCH_PORT.
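Before moving on, it is worth confirming that a Fluentd pod is actually running on each node. A quick check, assuming the default names from that manifest (deployed into kube-system with the label k8s-app: fluentd-logging; adjust if yours differ):

kubectl get daemonset fluentd -n kube-system
kubectl get pods -n kube-system -l k8s-app=fluentd-logging -o wide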
The stock image used by this DaemonSet ships with a built-in configuration that is tuned mainly through environment variables. For full control, you can supply your own configuration through a ConfigMap mounted into the Fluentd pods. Assuming you have created a ConfigMap named fluentd in the kube-system namespace, you can edit it with:

kubectl edit configmap fluentd -n kube-system

In this configuration you specify where Fluentd should read logs from (<source> blocks) and where to forward them (<match> blocks).
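As a minimal sketch, assuming you keep your configuration in a local file named fluent.conf and that the image reads its main configuration from /fluentd/etc/fluent.conf (the default location in the fluentd-kubernetes-daemonset images), you could create the ConfigMap and mount it over that file. The volume name config-volume is illustrative:

kubectl create configmap fluentd --from-file=fluent.conf -n kube-system

# Fragment to add to the DaemonSet pod spec
# (kubectl edit daemonset fluentd -n kube-system)
volumes:
  - name: config-volume
    configMap:
      name: fluentd
containers:
  - name: fluentd
    volumeMounts:
      - name: config-volume
        mountPath: /fluentd/etc/fluent.conf
        subPath: fluent.conf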
Grafana is a popular open-source tool for visualizing data. We will use it to chart the log data that Fluentd ships to Elasticsearch.
kubectl apply -f https://raw.githubusercontent.com/grafana/grafana/master/packaging/k8s/grafana-deployment.yaml
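This path points at a manifest in the Grafana repository's master branch; if it has moved, installing Grafana with its official Helm chart is an equivalent alternative. Once the pod is running, you can reach the UI with a port-forward and then add Elasticsearch as a data source (Configuration → Data sources). A minimal sketch, assuming the Deployment is named grafana in the default namespace:

kubectl port-forward deployment/grafana 3000:3000

Instead of clicking through the UI, you can also provision the data source from a file. The sketch below assumes Elasticsearch is reachable in-cluster at http://elasticsearch:9200 and that Fluentd writes daily logstash-* indices (the default when logstash_format is true); depending on your Grafana version, the index pattern may instead belong under jsonData.index rather than database:

# grafana-datasource.yaml – data source provisioning file (sketch);
# mount it under /etc/grafana/provisioning/datasources/ in the Grafana pod,
# or configure the same settings by hand in the UI.
apiVersion: 1
datasources:
  - name: Elasticsearch
    type: elasticsearch
    access: proxy
    url: http://elasticsearch:9200
    database: "logstash-*"
    jsonData:
      timeField: "@timestamp"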
Here is an example of a Fluentd configuration. It tails the log files of every container on the node and forwards the parsed records to Elasticsearch. Note that the json parser below assumes the Docker json-file log driver; on clusters that use containerd or CRI-O, container logs are written in the CRI format, so you would swap in a CRI-aware parser (for example @type cri from the fluent-plugin-parser-cri plugin).
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

<match kubernetes.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
  </buffer>
</match>
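Whenever you change the configuration, whether through the ConfigMap or the DaemonSet itself, restart the Fluentd pods so they reload it; assuming the DaemonSet name and namespace used earlier:

kubectl rollout restart daemonset/fluentd -n kube-system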
Here's an example of a Grafana dashboard panel that counts log entries per service. The exact query depends on your data source; the sketch below uses Grafana's Elasticsearch data source and assumes your records carry a kubernetes.labels.app field with a .keyword sub-field (the kubernetes_metadata filter in the Fluentd DaemonSet image adds the label fields, and the default Elasticsearch mapping adds the keyword sub-field). You can paste a panel like this into a dashboard's JSON model, or build the equivalent query in the panel editor.
{
  "panels": [
    {
      "title": "Log Count by Service",
      "type": "graph",
      "datasource": "Elasticsearch",
      "targets": [
        {
          "query": "*",
          "metrics": [{ "id": "1", "type": "count" }],
          "bucketAggs": [
            { "id": "2", "type": "terms", "field": "kubernetes.labels.app.keyword" },
            { "id": "3", "type": "date_histogram", "field": "@timestamp" }
          ],
          "alias": "{{term kubernetes.labels.app.keyword}}"
        }
      ]
    }
  ]
}
In this tutorial, we have learned how to use Fluentd to aggregate logs from a Kubernetes cluster, ship them to Elasticsearch, and visualize them with Grafana.
Exercise 1: Deploy Fluentd and Grafana in your own Kubernetes cluster. Collect logs from all containers and try to visualize them in Grafana.
Exercise 2: Modify the Fluentd configuration to collect logs only from a specific namespace. Verify the changes in Grafana. (A hint follows after the note below.)
Exercise 3: Set up an alert in Grafana that triggers when the rate of logs from a particular service exceeds a certain threshold.
Note: The solutions to these exercises are subjective and depend on your specific environment and requirements. Keep exploring and trying out new configurations and setups.
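Hint for Exercise 2: kubelet names the per-container log files under /var/log/containers as <pod>_<namespace>_<container>-<id>.log, so one option is to narrow the tail path to a single namespace; another is to keep collecting everything and drop records with a grep filter on the namespace field that the kubernetes_metadata filter adds. Both sketches below assume an illustrative namespace called my-namespace:

# Option 1: tail only the log files of pods in my-namespace
# (change the path in the <source> block)
path /var/log/containers/*_my-namespace_*.log

# Option 2: collect everything, then keep only records from my-namespace
<filter kubernetes.**>
  @type grep
  <regexp>
    key $.kubernetes.namespace_name
    pattern /^my-namespace$/
  </regexp>
</filter>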