Configuring fluentd for logging in Kubernetes in under 15 minutes

Hi there! If you have landed on my blog, you either have some problems with your existing logging mechanism in Kubernetes or you would like to explore fluentd as a logging option, as I did.

Well, let's talk about the elephant in the room. Logging!!

As application developers, we focus more on developing and deploying our application in containers and are happy to see the pod status Running or Containers Ready (2/2). Sometimes we even push ourselves a level up and add a readiness probe to make sure that our containers are actually ready to serve traffic.
Up to that point, we are fine checking the status of our containers with the commands below to look for any possible errors with the deployment.

kubectl get events --field-selector involvedObject.name=<podname> OR kubectl describe po <podname>

Right after that, we are curious to see the application logs to check whether our k8s service has routed the incoming request correctly and the app works as intended. In Kubernetes with Docker as the container runtime, a container’s stdout and stderr are written to a file on the host system under /var/lib/docker/containers, and they can be viewed with the command kubectl logs <pod-name>
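To make the file format a bit more concrete: with Docker's default json-file logging driver, each line in those files is a small JSON object with log, stream and time fields. The snippet below is only an illustration. The sample line is made up, and sed stands in for a real JSON parser.

```shell
# Made-up sample line in the shape written by Docker's json-file logging driver
# (illustrative only, not taken from a real container).
sample='{"log":"GET / HTTP/1.1 200\n","stream":"stdout","time":"2020-01-01T00:00:00.000000000Z"}'

# Extract the "stream" field; sed is a stand-in here, a real pipeline would use a JSON parser.
stream=$(printf '%s\n' "$sample" | sed -E 's/.*"stream":"([^"]*)".*/\1/')
echo "$stream"
```

These three fields are what a log collector has to parse for every line it picks up from the node.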

The problem with this approach is that when the node or host system is shut down unexpectedly, or is taken down for planned maintenance, these logs become inaccessible. So we need a separate log collector, processor and exporter. Kubernetes has no native solution for this, and the most common existing approach is the ELK stack (Elasticsearch, Logstash, Kibana).

Logstash primarily routes all data into a single stream and then uses routing logic to send it to the correct destination, which can be annoying when I want to filter out the system logs and only dig into application logs within the cluster. This is when I stumbled upon fluentd and familiarized myself with its config file. The best part about the filter directives is that they give you the flexibility to play around with the rules and configure the output stream of your choice. You can find my config file here:

nginx pod logs via “kubectl logs nginx” command

To begin my exploration of fluentd, I deployed a simple nginx service on port 80 in the cluster, fetched the service endpoint a few times with curl <svc-IP>:80, and viewed the logs as shown above.
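If you want to reproduce this step, a minimal sketch could look like the following. The manifest is not the one used in this post; the pod name, label and file name are assumptions chosen to match the kubectl logs nginx command.

```shell
# Write a minimal Pod + Service manifest (names and labels are assumptions for this sketch).
cat > nginx-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
EOF

# With cluster access, the remaining steps would be:
#   kubectl apply -f nginx-demo.yaml
#   kubectl get svc nginx        # note the CLUSTER-IP
#   curl <svc-IP>:80             # run from a node or a pod inside the cluster
#   kubectl logs nginx
```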

Next, I wanted to deploy fluentd on my cluster and collect my app logs to an output stream (I used stdout for my experiment). For this, I found fluentd-daemonset-forward.yaml very useful to begin with and created a DaemonSet, as this runs a fluentd pod on every node and so lets it watch the pods deployed across all the nodes in the cluster. In the Docker image, fluentd looks for its config file in the default location /fluentd/etc/, so I had to mount my fluentd config file there, and voila, the fluentd pod was up in the kube-system namespace.
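As a rough sketch of that setup (not the exact manifest from this post), the DaemonSet below mounts a config file at /fluentd/etc and the node's log directories into the pod. The image tag, the ConfigMap name fluentd-config and the ServiceAccount name fluentd are assumptions.

```shell
# Write a trimmed-down DaemonSet manifest; adjust names and image tag to your cluster.
cat > fluentd-daemonset.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-logging
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
    spec:
      serviceAccountName: fluentd          # assumed to exist already
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-forward   # assumed tag
          volumeMounts:
            - name: config
              mountPath: /fluentd/etc      # default config location in the image
            - name: varlog
              mountPath: /var/log
            - name: dockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: fluentd-config           # hypothetical ConfigMap holding fluent.conf
        - name: varlog
          hostPath:
            path: /var/log
        - name: dockercontainers
          hostPath:
            path: /var/lib/docker/containers
EOF

# With cluster access, the remaining steps would be:
#   kubectl create configmap fluentd-config -n kube-system --from-file=fluent.conf
#   kubectl apply -f fluentd-daemonset.yaml
```

Mounting a ConfigMap over /fluentd/etc hides the config files shipped in the image, so in this sketch the mounted fluent.conf has to be complete on its own.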

A word of caution at this point: for fluentd to read the container logs on the node and look up pod metadata from the API server, the log directories need to be mounted into the pod and a ServiceAccount, ClusterRole and ClusterRoleBinding need to be configured additionally, like here.
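For orientation, RBAC objects for a fluentd DaemonSet commonly look roughly like the sketch below. The rules grant read access to pods and namespaces, which is what metadata lookups against the API server need. The name fluentd used for all three objects is an assumption.

```shell
# Write ServiceAccount, ClusterRole and ClusterRoleBinding into one manifest file.
cat > fluentd-rbac.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system
EOF

# With cluster access: kubectl apply -f fluentd-rbac.yaml
```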

Once the configurations were in place, I tried hitting my nginx service endpoint again via curl and checked the fluentd pod logs with this:

kubectl logs -n kube-system <fluentd-podname> | grep nginx

fluentd pod containing nginx application logs

With the list of available directives in a fluentd config file, it's really fun to customize the format of the logs and/or extract only the part of the logs we are interested in, using the match or filter sections of the config file. But more on that later. :)
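To give a taste of what source, filter and match sections can look like, here is a small sketch. It is not the config used in this post: the tag, the pos_file path and the kube-probe filter are assumptions, and it expects Docker's json-file log format.

```shell
# Write an illustrative fluent.conf: tail container logs, drop probe noise, print to stdout.
cat > fluent.conf <<'EOF'
# Tail container log files on the node (Docker json-file format assumed).
<source>
  @type tail
  path /var/log/containers/*.log
  # Skip system pods; file names follow <pod>_<namespace>_<container>-<id>.log
  exclude_path ["/var/log/containers/*_kube-system_*.log"]
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_key time
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

# Drop lines produced by kubelet health probes (user agent "kube-probe").
<filter kubernetes.**>
  @type grep
  <exclude>
    key log
    pattern /kube-probe/
  </exclude>
</filter>

# Print everything that is left to fluentd's own stdout.
<match kubernetes.**>
  @type stdout
</match>
EOF
```

Because the tag is built from the log file path, it contains the pod name, which is why piping the fluentd pod logs through grep nginx finds the nginx entries. Excluding the kube-system files also keeps fluentd from re-reading its own stdout output when it runs in that namespace.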

For the complete sample configuration with the Kubernetes DaemonSet, view my GitHub.

Happy exploring!!

