
Kubernetes Log Analysis With Fluentd, Elasticsearch, and Kibana

Logging is vital in distributed systems of any complexity, and the combination of Fluentd, Elasticsearch, and Kibana is a popular way to handle it on Kubernetes. Find out how to set it up here.



Logging is one of the major challenges with any large deployment on platforms such as Kubernetes, but configuring and maintaining a central repository for log collection can ease the day-to-day operations. For that purpose, the combination of Fluentd, Elasticsearch, and Kibana can create a powerful logging layer on top of Kubernetes clusters.

In this article, we will describe how to set up logging for Kubernetes using dedicated Fluentd, Elasticsearch, and Kibana pods. We will define our own pods so that, once the cluster is created, the standard output and standard error of each container are ingested into Elasticsearch by a Fluentd agent and can be explored in Kibana.



If you are using Google Cloud Platform, a section at the end describes how to use its default logging option.

Collecting Logs With Fluentd

Fluentd is an open source data collector for building a unified logging layer. In this article, we will be using Fluentd pods to gather all of the logs that are stored on the individual nodes in our Kubernetes cluster (these logs can be found under the /var/log/containers directory on each node).
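To get a sense of what Fluentd will be tailing, here is an illustrative sketch (the file names and the log line below are made up for the example): each file under /var/log/containers is a symlink to the corresponding Docker json-file log, and each line in it is a JSON record with log, stream, and time fields.


$ ls /var/log/containers
my-app-1234_default_my-app-8d7e9a1b2c3d.log
kube-dns-v9-bo86j_kube-system_etcd-45f3dc6f7a8b.log

$ tail -n 1 /var/log/containers/my-app-1234_default_my-app-8d7e9a1b2c3d.log
{"log":"GET /healthz 200\n","stream":"stdout","time":"2016-06-07T12:34:56.789Z"}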

Installing and Configuring Fluentd

First, we will need to install Fluentd (for instructions, use these installation guides).

Next, we will install the fluent-plugin-kubernetes_metadata_filter plugin, which enriches each log record with Kubernetes metadata such as the pod name, namespace, and labels. (Kubernetes itself creates the symlinks in /var/log/containers/*.log that point to the Docker log files Fluentd will tail.)
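As a hedged example, on a Ruby-gem-based Fluentd installation the plugin can be installed with gem (or with td-agent-gem if you installed the td-agent package):


$ gem install fluent-plugin-kubernetes_metadata_filter

# For td-agent installations:
$ sudo td-agent-gem install fluent-plugin-kubernetes_metadata_filter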

Now, to view the logs from the entire cluster, we have to launch a single instance of the Fluentd agent on each of the nodes. Below is the Fluentd configuration file, which provides the path for the logging directory:


<source>
    type tail
    path /var/log/containers/*.log
    pos_file Fluentd-docker.pos
    time_format %Y-%m-%dT%H:%M:%S
    tag kubernetes.*
    format json
    read_from_head true
</source>

<filter kubernetes.var.lib.docker.containers.*.*.log>
    type kubernetes_metadata
</filter>

<match **>
    type stdout
</match>


Running Fluentd

Our next step is to run Fluentd on each of our nodes.

The kubelet is the primary "node agent" that runs on each node and launches pods described by a PodSpec written in YAML or JSON. We need to place the pod's definition in the Kubernetes manifests directory at /etc/kubernetes/manifests (learn how to set this one up in case you don't have it).
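If you are not sure whether your kubelet is watching /etc/kubernetes/manifests, a quick (illustrative) check is to inspect the kubelet's startup flags; the flag name varies by version, so treat the example below as a sketch:


$ ps aux | grep [k]ubelet
# Look for a manifest-directory flag such as:
#   --pod-manifest-path=/etc/kubernetes/manifests   (newer kubelets)
#   --config=/etc/kubernetes/manifests              (older kubelets)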

To create the Fluentd pod, use the following YAML configuration (fluentd-pod.yaml):


apiVersion: v1
kind: Pod
metadata:
    name: fluentd-elasticsearch
    namespace: kube-system
    labels:
        k8s-app: fluentd-logging
spec:
    containers:
    - name: fluentd-elasticsearch
      image: <your-fluentd-image>
      resources:
          limits:
              memory: 200Mi
          requests:
              cpu: 100m
              memory: 200Mi
      volumeMounts:
      - name: varlog
        mountPath: /var/log
      - name: varlibdockercontainers
        mountPath: /var/lib/docker/containers
        readOnly: true
    terminationGracePeriodSeconds: 30
    volumes:
    - name: varlog
      hostPath:
          path: /var/log
    - name: varlibdockercontainers
      hostPath:
          path: /var/lib/docker/containers


In this configuration file, you have to provide the Docker image from which the Fluentd container will be launched.
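For example (the registry, image name, and tag below are placeholders, not part of the original setup), you might build a Fluentd image that bundles your configuration and plugins, push it to a registry your nodes can reach, and reference it in the image field of fluentd-pod.yaml:


# Illustrative only; adjust the registry, image name, and tag to your environment.
$ docker build -t my-registry.example.com/fluentd-elasticsearch:v1 .
$ docker push my-registry.example.com/fluentd-elasticsearch:v1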

Once you have the manifest in place, restart the cluster so that the kubelets pick up the new definition. The nodes will then start to download and run the container defined in the manifest. First, bring the cluster down:


$ kubernetes/cluster/kube-down.sh


Then, bring it up again using kube-up.sh as shown:


$ kubernetes/cluster/kube-up.sh


You will then need to adjust the resource requests and limits to suit your system. If you wish to change the application label name, make sure that you also change it in the matching service definition.

Launching Our Pods

Now, we have to launch the pod manually. We can do this by passing the fluentd-pod.yaml file to the kubectl create command as follows:


$ kubectl create -f /path/to/fluentd-pod.yaml


You can now check that your pod is up and running:


$ kubectl get --namespace=kube-system pod


We can check that the log files are visible inside the Fluentd container by executing the following commands:


$ docker ps

$ docker exec -it <container id> ls /var/log/containers


Use your container ID to see the logs inside the container:


$ docker exec -it d74372231d743 ls /var/log/containers



Streaming Logs From Fluentd Into Elasticsearch

Now that we have our Fluentd pods up and running, it's time to set up the pipeline into Elasticsearch (see our complete guide to the ELK Stack to learn how to install and use Elasticsearch).
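The <match **> section in the Fluentd configuration shown earlier simply writes to stdout. To actually ship the logs to Elasticsearch, you would typically swap it for the fluent-plugin-elasticsearch output plugin. The snippet below is a minimal sketch, assuming that the plugin is installed (gem install fluent-plugin-elasticsearch) and that the Elasticsearch service defined in the next section is reachable as elasticsearch-logging in the kube-system namespace:


# Replaces the stdout match section; requires fluent-plugin-elasticsearch.
<match **>
    type elasticsearch
    host elasticsearch-logging.kube-system
    port 9200
    logstash_format true
    flush_interval 10s
</match>


With logstash_format enabled, the plugin writes the logs into daily logstash-YYYY.MM.DD indices, which is the index naming that Kibana expects by default.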

Configuring and Launching Elasticsearch as a Replication Controller

Since there is no need to have Elasticsearch running on each and every node, we will first launch a single instance and run it as a replication controller.

Here is the .yaml file (elasticsearch-rc.yaml) in which our Elasticsearch instance is defined as a replication controller:


apiVersion: v1
kind: ReplicationController
metadata:
    name: elasticsearch-logging
    namespace: kube-system
    labels:
        k8s-app: elasticsearch-logging
        version: v1
        kubernetes.io/cluster-service: "true"
spec:
    replicas: 1
    selector:
        k8s-app: elasticsearch-logging
        version: v1
    template:
        metadata:
            labels:
                k8s-app: elasticsearch-logging
                version: v1
                kubernetes.io/cluster-service: "true"
        spec:
            containers:
            - name: elasticsearch-logging
              image: <your-elasticsearch-image>
              ports:
              - containerPort: 9200
                name: es-port
                protocol: TCP
              - containerPort: 9300
                name: transport-port
                protocol: TCP
              volumeMounts:
              - name: es-persistent-storage
                mountPath: /data
            volumes:
            - name: es-persistent-storage
              emptyDir: {}


To launch the Elasticsearch replication controller, execute the following command:


$ kubectl create -f /path/to/elasticsearch-rc.yaml


We can check that the replication controller has been created and is running as expected using this:


$ kubectl get --namespace=kube-system rc



Make sure that the pod is also up and running with this command:


$ kubectl get --namespace=kube-system pod



Creating an Elasticsearch Service for Communicating With Fluentd

To gather the logs from the nodes in our Kubernetes cluster, we need to launch a service that will establish the communication between our Elasticsearch pod and the Fluentd pods running on the nodes.

To ensure that our Fluentd pods will be able to locate the Elasticsearch instance, we are first going to use a Kubernetes service to expose a stable, discoverable endpoint. A Kubernetes service gets a single cluster IP address, and the SkyDNS add-on (which is launched automatically in the kube-system namespace when the cluster is brought up) registers a DNS name for it.

We can then use the pod label in the Elasticsearch service definition to point to the Elasticsearch pod (see the selector section in the service definition below). The service will be registered in DNS, allowing Fluentd to reach the Elasticsearch instance by name.
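Once the service below is created, you can optionally verify that its name resolves from inside the cluster. This is a hedged sanity check: substitute a pod name from your own cluster, and note that the pod's image must include nslookup:


$ kubectl exec <some-pod> --namespace=kube-system -- nslookup elasticsearch-logging.kube-system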

The following .yaml file contains the definition for the Elasticsearch service (elasticsearch-svc.yaml):


apiVersion: v1
kind: Service
metadata:
    name: elasticsearch-logging
    namespace: kube-system
    labels:
        k8s-app: elasticsearch-logging
        kubernetes.io/cluster-service: "true"
        kubernetes.io/name: "Elasticsearch"
spec:
    ports:
    - port: 9200
      protocol: TCP
      targetPort: es-port
    selector:
        k8s-app: elasticsearch-logging


The kubernetes/fluentd-elasticsearch container continuously tails the containers' log files (/var/lib/docker/containers/*) and sends the data, in Logstash format, to Elasticsearch on port 9200.

In our case, the service specification groups together all the pods that carry the label k8s-app=elasticsearch-logging. It also maps the service's port 9200 to the containers' es-port.

Launching the Elasticsearch Service

To launch the Elasticsearch service, use this command:


$ kubectl create -f /path/to/elasticsearch-svc.yaml


We can check that the service is up and running this way:


$ kubectl get --namespace=kube-system services



Querying the Logs With Elasticsearch

With the help of the Elasticsearch service that we have launched, we can now view and query container logs.
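Before running free-text searches, you can optionally list the indices that Fluentd has created (an illustrative check; the IP address is the Elasticsearch service's cluster IP used in the examples below):


$ curl "192.168.3.220:9200/_cat/indices?v"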

We will use the service's cluster IP address to access Elasticsearch and use the q=*&pretty query string to return all documents in a readable format:


$ curl "192.168.3.220:9200/_search?q=*&pretty"


The output will list the log documents that have been indexed from your containers.


To search for warning messages, use the following query:


$ curl 192.168.3.220:9200/_search?q=*warning


The output will show the matching warning entries.


Analyzing Kubernetes Logs in Kibana

Now that we have our logs stored in Elasticsearch, the next step is to display them in Kibana. To do this, we will need to run Kibana in our cluster. Just as with Elasticsearch, we need only one Kibana instance.

Let's set that up.

Configuring and Launching Kibana as a Replication Controller

Launching Kibana as a replication controller ensures that at least one instance is always running in the cluster.

In our .yaml file (kibana-rc.yaml), we need to specify the Kibana image to use for the pod:


apiVersion: v1
kind: ReplicationController
metadata:
    name: kibana-logging
    namespace: kube-system
    labels:
        k8s-app: kibana-logging
        kubernetes.io/cluster-service: "true"
spec:
    replicas: 1
    selector:
        k8s-app: kibana-logging
    template:
        metadata:
            labels:
                k8s-app: kibana-logging
                kubernetes.io/cluster-service: "true"
        spec:
            containers:
            - name: kibana-logging
              image: <your-kibana-image>
              ports:
              - name: kibana-port
                containerPort: 5601
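
Depending on the Kibana image you use, you may also need to tell Kibana where Elasticsearch lives. As a hedged sketch (the ELASTICSEARCH_URL environment variable is what the official Kibana Docker images read; other images may expect elasticsearch.url in kibana.yml instead), you could add an env section under the Kibana container, aligned with the other container fields:


              env:
              - name: ELASTICSEARCH_URL
                value: http://elasticsearch-logging.kube-system:9200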


To launch the Kibana replication controller, execute the following command:


$ kubectl create -f /path/to/kibana-rc.yaml


To make sure that the Kibana replication controller is up and running, use this:


$ kubectl get --namespace=kube-system rc



Check that the pod is running with the following command:


$ kubectl get --namespace=kube-system pod


Creating a Kibana Service to Communicate With Elasticsearch

To allow our pod to retrieve logs from Elasticsearch, we will need to configure a Kibana service.

In the configuration file (kibana-svc.yaml), use the default port 5601 for both the service and Kibana pod:


apiVersion: v1
kind: Service
metadata:
    name: kibana-logging
    namespace: kube-system
    labels:
        k8s-app: kibana-logging
        kubernetes.io/cluster-service: "true"
        kubernetes.io/name: "Kibana"
spec:
    ports:
    - port: 5601
      protocol: TCP
      targetPort: kibana-port
    selector:
        k8s-app: kibana-logging


This service selects our Kibana pod with the help of the k8s-app: kibana-logging label that we provided in the pod definition.

Launching and Accessing Kibana

To launch the Kibana service, use this command:


$ kubectl create -f /path/to/kibana-svc.yaml


To verify that the service is up and running, use this:


$ kubectl get --namespace=kube-system services



To view and access the Kibana interface, install an NGINX web server (skip to the next step if one is installed already):


$ apt-get install nginx


Change the NGINX configuration files in /etc/nginx/sites-available/default and /etc/nginx/sites-enabled/default as shown here:


server {
    listen 80;

    root /usr/share/nginx;
    index index.html index.htm;
    server_name localhost;

    location / {
        proxy_pass http://<Service-IP>:5601;
    }
}


Specify your Kibana service IP address and the port on which NGINX should listen. In this case, the service IP address is 192.168.3.193 and the listen port is 80 (the default).
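Before starting NGINX, you can optionally confirm that Kibana is responding on the service IP and port (an illustrative check using the address from this example):


$ curl -I http://192.168.3.193:5601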

Now, you can start the NGINX service:


$ service nginx start


Go to your browser and enter http://localhost (NGINX is listening on port 80). You should be able to see the Kibana interface, including the Discover and Visualize views.


Logging Kubernetes Using Google Cloud

Since Kubernetes integrates natively with Google Cloud, users can enable cluster-level logging easily. To route the logs through Fluentd and Elasticsearch, set the following environment variables in your cluster's configuration file (this works only if you use GCP as your Kubernetes provider):


KUBE_ENABLE_NODE_LOGGING=true
KUBE_LOGGING_DESTINATION=Elasticsearch


Then, bring up the cluster with the following command:


$ cluster/kube-up.sh


Once the cluster is up and running, Kubernetes will launch the Fluentd and Elasticsearch pods automatically.
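As a hedged way to confirm this, list the pods in the kube-system namespace once the cluster is up; you should see fluentd, elasticsearch-logging, and kibana-logging pods, although the exact names depend on the Kubernetes version:


$ kubectl get pods --namespace=kube-system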



Published at DZone with permission of Roi Ravhon, DZone MVB. See the original article here.

