4 Challenges In Kubernetes Log Transport
If you are working with Kubernetes for logging, take a look at some of the issues and challenges you might face that differ from VM or bare-metal environments.
For the past three months, I have been working on PKS observability features. Right now, it's mostly about Kubernetes logging.
Collect logs and send them to the log server. That sounds straightforward; simple and common, isn't it? Agreed, but only partially. I have noticed some new challenges in container logging compared to VM or bare-metal environments.
Here is the summary. Check it out and see how much of it may apply to your Kubernetes projects. (BTW, our PKS project is hiring.)
The motivation for this post is to illustrate the problems and technical difficulties, not to explain how to solve them. If anything here conflicts with company policy, changes will be made on demand.
Normally the log transport workflow is either active or passive.
- Active Way: the process actively sends log messages to a remote syslog server, usually encoded in the RFC 5424 format (a minimal sketch follows this list).
- Passive Way: for each process, specify the log paths or file patterns. The log agent periodically scans them and sends the captured log messages to the log server.
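To make the active way concrete, here is a minimal Python sketch that pushes one RFC 5424-style message to a remote syslog server over UDP. The server address and the app/host names are placeholder assumptions, not part of any real setup.

import socket
from datetime import datetime, timezone

# Placeholder endpoint for the remote syslog server (assumption).
SYSLOG_HOST, SYSLOG_PORT = "logs.example.com", 514

def send_syslog(message, app_name="my-app", hostname="my-host"):
    # Minimal RFC 5424 frame: <PRI>VERSION TIMESTAMP HOST APP PROCID MSGID SD MSG
    pri = 14  # facility "user" (1) * 8 + severity "info" (6)
    timestamp = datetime.now(timezone.utc).isoformat()
    frame = f"<{pri}>1 {timestamp} {hostname} {app_name} - - - {message}"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(frame.encode("utf-8"), (SYSLOG_HOST, SYSLOG_PORT))

send_syslog("user login succeeded")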
So you may think the problem has been solved. Not yet, my friends.
Running services in containers is different from running them on VMs or bare metal. The new trends are:
- Processes are more ephemeral.
- Process deployment is more distributed.
What does that mean for container logging?
Challenge I: Failing To Collect All Critical Logs
When something is wrong, pods may get deleted or recreated quickly. Consequently, the log file associated with that pod/container will be deleted/created quickly.
However, log agents like Fluentd or Logstash detect new log files by scanning folders or file patterns periodically, and the default scan interval is 60 seconds. That interval may be too slow to capture short-lived pods. What if we shorten it to, say, 1 second? The performance overhead would be much higher.
This wasn't a problem in the VM world. When a process gets restarted, its log file may be rotated but won't be deleted. Users may experience some delay in receiving logs, but nothing like this: missing critical logs for problematic processes.
How can we solve this? I'm not sure about the best practice, since here in PKS we are still exploring. Maybe we can start a Kubernetes controller that subscribes to pod events: whenever a pod creation event fires, notify the log agent immediately. honeycomb-kubernetes-agent is an interesting GitHub repo implementing this idea. Please leave us a comment if you have a better solution.
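As a rough sketch of that idea (not how honeycomb-kubernetes-agent actually does it), a watcher built on the official Kubernetes Python client could subscribe to pod events like this; notify_log_agent is a hypothetical hook you would wire to your agent's rescan or reload mechanism.

from kubernetes import client, config, watch

def notify_log_agent(pod):
    # Hypothetical hook: tell the log agent to start tailing this pod's
    # log file right away instead of waiting for the next periodic scan.
    print(f"new pod: {pod.metadata.namespace}/{pod.metadata.name}")

def main():
    config.load_incluster_config()  # use load_kube_config() when running outside the cluster
    v1 = client.CoreV1Api()
    w = watch.Watch()
    # Subscribe to pod events across all namespaces.
    for event in w.stream(v1.list_pod_for_all_namespaces):
        if event["type"] == "ADDED":
            notify_log_agent(event["object"])

if __name__ == "__main__":
    main()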
- Not all logs are redirected to stdout/stderr. If the process inside a pod writes its log to a local file instead of stdout/stderr, the log agent won't get it.
Why? The agent only monitors the log file associated with the pod, like the ones below, and that file only captures the container's stdout/stderr.
# ls -1 /var/lib/docker/containers/*/*-json.log
/var/lib/docker/containers/0470.../0470...-json.log
/var/lib/docker/containers/0645.../0645...-json.log
/var/lib/docker/containers/12d2.../12d2...-json.log
...
Yes, this logging behavior is an anti-pattern in the Kubernetes world. However, the cloud-native movement definitely takes time; not everyone has caught up yet. This is especially true for DB services.
Also, compared to the VM world, pods may move across worker nodes quite often, but you don't want the log agent to be reloaded or restarted every time one pod in the K8s cluster changes. New challenges, right?
Challenge II: Multi-Tenancy For Namespace Logging
Kubernetes workloads usually run on shared worker VMs, with workloads from different projects divided by namespaces.
Different projects may have different preferences for logging: where the logs go, what tools manage them, and so on. We need to provide an easy way to configure this without extra security compromises.
It turns out Kubernetes CRD (CustomResourceDefinition) is a good fit.
- All you need to learn is the standard kubectl command. (See the kubectl cheatsheet.)
- RBAC can be applied to this custom resource, so security can be easily enforced.
In PKS, we call this feature the sink resource. Note: this idea has been proposed to the Kubernetes community. Hopefully, it will be merged upstream soon.
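To give a feel for how a namespace owner might use such a resource, here is a hedged Python sketch that creates a namespaced sink object through the Kubernetes client; the group, version, kind, and spec fields are illustrative placeholders, not the exact schema shipped with PKS.

from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# Illustrative sink object; group/version/kind and spec fields are placeholders.
sink = {
    "apiVersion": "example.com/v1alpha1",
    "kind": "Sink",
    "metadata": {"name": "team-a-logs", "namespace": "team-a"},
    "spec": {"type": "syslog", "host": "logs.example.com", "port": 514},
}

api.create_namespaced_custom_object(
    group="example.com",
    version="v1alpha1",
    namespace="team-a",
    plural="sinks",
    body=sink,
)

Because the sink is just another namespaced resource, standard RBAC rules decide who may create or modify it.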
Challenge III: Supporting Logging SLAs For Different Namespaces
For simplicity, people usually deploy just one log agent as a Kubernetes daemonset, which means one agent pod per worker node. If this pod needs to be reloaded or rescheduled, it will impact all the pods living on that worker node.
Starting from K8s v1.12, each node may run as many as 100 pods. You need to make sure your log agent is fast enough to collect logs from all of them.
Like any shared environment, you may experience noisy neighbor issues: the misbehavior of one pod can hurt all the other pods on the same worker node. Want to disable logging for one problematic namespace? You can easily stop emitting its logs downstream, but not stop collecting them.
A slow disk may create significant latency for log transport, and failure to handle back-pressure may DDoS your log agent.
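One common mitigation, sketched below in plain Python under assumed names and sizes, is to put a bounded buffer between collection and shipping so that a flooding pod gets its own lines dropped (and counted) instead of stalling the whole agent.

import queue
import threading

# Bounded buffer: when it is full, new lines are dropped instead of
# blocking collection or exhausting memory (illustrative size).
buffer = queue.Queue(maxsize=10_000)
dropped = 0

def collect(line):
    global dropped
    try:
        buffer.put_nowait(line)
    except queue.Full:
        dropped += 1  # record the loss so it can be surfaced as a metric

def send_downstream(line):
    pass  # placeholder for the real network send to the log backend

def ship():
    while True:
        send_downstream(buffer.get())

threading.Thread(target=ship, daemon=True).start()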
Challenge IV: Handling Logging From Different Layers
We have pod logs, K8s logs, and platform logs. Even within "pod logs," there are logs from standard workloads and from K8s add-ons.
As you may guess, different types of logs have different characteristics, and they may have different priorities: not only layer vs. layer, but also different SLAs within the same layer.
To provide a K8s solution, how can we address this? Help Ops/Dev find the root cause quickly while minimizing security compromises.
What is PKS? PKS is an enterprise Kubernetes solution from VMware and Pivotal.