Handling Core Dumps Across Clusters

Keep all your core dump processing in one place with the help of GCP.

By Jayant Chaudhury · Updated Aug. 23, 2019 · Tutorial

Centralized Core Dump Processing in Kubernetes Applications in GCP

Managing centralized information such as logs and core dumps can be a challenge. For logs there are well-established tools such as Logstash and Fluentd, but there is no comparable end-to-end solution for managing core dumps from our applications.

In the Kubernetes ecosystem, every node in a cluster can run one or more pods, and each pod can run one or more application containers.

There are many circumstances in which an application can fail and produce a core dump, such as an invalid memory access.

The definition of a core dump is as follows:

“Core dumps are generated when a process receives certain signals, such as SIGSEGV, which the kernel sends when the process accesses memory outside its address space. Typically, that happens because of errors in how pointers are used, which means there's a bug in the program. The core dump is useful for finding the bug.”

Hence, from the definition itself, it is clear how important it is to capture and debug core dumps in order to build a robust application.

In this blog, I will describe how to manage core dumps from a Kubernetes cluster deployed in GCP.

Managing Core Dump in Kubernetes Cluster in GCP

[Diagram: centralized core-dump handling across Kubernetes nodes]


Phase 1 Approach

The issue with managing core dumps inside each pod is that if the pod, container, or node restarts, we lose everything generated in the core-dump folder. This is important data that cannot be compromised. We would also like to manage the core dumps from every node in a centralized manner so that we can understand their patterns and make our application more robust.

As per the diagram above, we propose to introduce a Core Dump Collector Agent on each node. This agent will run as a daemon pod/sidecar container that collects the data from the location where the core dumps are created and uploads it to an external storage service.

Each application running inside a pod writes its core dumps to a given location in its file system. This location is backed by a mounted Persistent Volume within the Kubernetes cluster. The Core Dump Agent watches the mounted volume location, so as soon as a file is created in the Persistent Volume, it is uploaded to Google Storage outside the cluster. This gives us a single, centralized path where we can collect all core dumps across multiple clusters.

We could also mount an emptyDir volume instead of a Persistent Volume; a Persistent Volume is used here so that we do not lose any core dumps under unforeseen circumstances.

Later, we can create some shell scripts to do housekeeping on the Persistent Volume data as well.

The storage backend can be anything: Amazon S3, Google Cloud Storage, local file storage, etc. Here, we will discuss how to centralize core dumps with Google Storage.

We are assuming that your container generates core dumps in the path “/var/lib/systemd/coredump.” This path can vary depending on the core dump pattern defined in your container. The core dump location is specified in the file “/proc/sys/kernel/core_pattern” and can be set with something like the following.

$> mkdir -p /tmp/cores
$> chmod a+rwx /tmp/cores
$> echo "/tmp/cores/core.%e.%p.%h.%t" > /proc/sys/kernel/core_pattern


Here, the core dumps would be created in the “/tmp/cores” directory. In our exercise, however, we assume that core files are located in “/var/lib/systemd/coredump”.

Step 1: Create a Service Account to Access the Google Storage Bucket

Create a service account in the GCP console by clicking on IAM & Admin > Service Accounts > Create Service Account.

[Screenshot: Create Service Account dialog in the GCP console]


Under the service account permissions, select the “Storage Object Creator” role.

[Screenshot: selecting the Storage Object Creator role]


Create a JSON key for the service account and save it locally on your system. In our case, we saved the file as “projects-jayant-82544f23332b.json”.

[Screenshot: creating and downloading the JSON key]

Note: A service account can also be created from the command line with gcloud iam service-accounts create.
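
For reference, a rough command-line equivalent is sketched below; the service account name, project ID, and key file name are placeholders rather than the ones used in this article.

gcloud iam service-accounts create coredump-uploader \
    --display-name="Core dump uploader"

# Grant the Storage Object Creator role at the project level
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:coredump-uploader@my-project.iam.gserviceaccount.com" \
    --role="roles/storage.objectCreator"

# Download a JSON key for the service account
gcloud iam service-accounts keys create key.json \
    --iam-account=coredump-uploader@my-project.iam.gserviceaccount.com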

Step 2: Create the External Google Storage Bucket

Create the Google Storage bucket to which the core dumps from multiple pods, across multiple nodes and clusters, will be uploaded.

[Screenshot: creating the Google Storage bucket]

Once the Google Storage bucket is created, assign permissions by selecting “Add members.” Here, you will add the service account you created in Step 1.

[Screenshot: adding the service account as a bucket member]


Also assign the “Storage Object Creator” role to it.

[Screenshot: assigning the Storage Object Creator role on the bucket]

Note: The Google Storage bucket can also be created with GCP command-line scripts, as shown below. I went with the GUI-based approach so that it is accessible to a wider audience.

gsutil mb -p [PROJECT_NAME] -c [STORAGE_CLASS] -l [BUCKET_LOCATION] -b on gs://[BUCKET_NAME]/
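
Similarly, the bucket-level role assignment for the service account can be scripted; this is a sketch using the placeholder service account name from the earlier sketch.

gsutil iam ch \
    serviceAccount:coredump-uploader@my-project.iam.gserviceaccount.com:roles/storage.objectCreator \
    gs://[BUCKET_NAME]/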


Step 3: Mount a Persistent Volume on the Core Dump Location Inside the Application Container

If the Persistent Volume is already created, ignore the YAML script below. In this case, the Persistent Volume Claim is named “storage-pvc”.

[Screenshot: PersistentVolumeClaim YAML for storage-pvc]
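
For illustration, a minimal PersistentVolumeClaim along these lines would do; the access mode and size here are assumptions, not the values from the original screenshot.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: storage-pvc
spec:
  accessModes:
    - ReadWriteOnce   # assumed; use ReadWriteMany if your storage class supports it
  resources:
    requests:
      storage: 5Gi    # assumed size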


Reference the Persistent Volume Claim in your application pod/deployment YAML file, and mount the location where core dumps are created onto the Persistent Volume.

In the application pod below, we mount the core-dump path “/var/lib/systemd/coredump” and map it to the “storage-pvc” claim created above.

You can use your own image in place of “gcr.io/kubeapps-xyz/appshell:1.0” below.

[Screenshot: application deployment YAML mounting /var/lib/systemd/coredump to storage-pvc]
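
As an illustrative sketch of that manifest (the object and volume names here are assumptions; the image, mount path, and claim name come from the text above):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: appshell
spec:
  replicas: 1
  selector:
    matchLabels:
      app: appshell
  template:
    metadata:
      labels:
        app: appshell
    spec:
      containers:
        - name: appshell
          image: gcr.io/kubeapps-xyz/appshell:1.0
          volumeMounts:
            # Core dumps written here land on the persistent volume
            - name: coredump-storage
              mountPath: /var/lib/systemd/coredump
      volumes:
        - name: coredump-storage
          persistentVolumeClaim:
            claimName: storage-pvc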


Step 4: Core-Dump Agent Collector Dockerfile

To create the Core Dump Agent Collector, we use the Google Cloud SDK Alpine image and install inotify-tools so that the agent can watch the specified directory. The Cloud SDK image is used because we need the Google CLI commands to upload files when the watch notification fires.

In this Dockerfile, we copy the service account JSON key created in Step 1 into the Alpine image.

[Screenshot: Core Dump Agent Collector Dockerfile]
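
A sketch of such a Dockerfile is shown below, assuming the google/cloud-sdk:alpine base image and the key file name from Step 1; the /secrets path is an assumption, not necessarily the one from the screenshot.

FROM google/cloud-sdk:alpine

# inotify-tools provides inotifywait, used to watch the core-dump directory
RUN apk add --no-cache inotify-tools

# Service account JSON key created in Step 1
COPY projects-jayant-82544f23332b.json /secrets/projects-jayant-82544f23332b.json

# Watch-and-upload script described below, kept under /usr/sbin
COPY watchnotify.sh /usr/sbin/watchnotify.sh
RUN chmod +x /usr/sbin/watchnotify.sh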


As you can see in the screenshot above, we also created a “watchnotify.sh” script, placed in the “/usr/sbin” directory, which reacts to inotify events whenever a file is created in the core-dump persistent volume location.

[Screenshot: watchnotify.sh script]
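
Based on the description above, the script is essentially along the following lines (a sketch, not the verbatim file; the key path matches the Dockerfile sketch above):

#!/bin/sh
# Authenticate gcloud/gsutil with the service account key baked into the image
gcloud auth activate-service-account --key-file=/secrets/projects-jayant-82544f23332b.json

# Watch the mounted core-dump directory and copy each newly written file to the bucket
inotifywait -m /mnt -e close_write | while read path action file; do
    gsutil cp "${path}${file}" "gs://google-storage/"
done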

NOTE: If you want to sync the storage bucket with the core-dump directory instead of copying individual files, replace the “inotifywait -m …xxxx.. gsutil cp ..” line with the line below.

inotifywait -m /mnt -e close_write | while read path action file; do gsutil -m rsync -d -r "$path" "gs://google-storage/"; done

Later, we can use a similar gsutil command to copy from Google Storage to an Amazon S3 bucket as well, for example: gsutil -m rsync -d -r "gs://<your-storage>" "s3://<amazon-bucket>".

Just remember that the bucket name used in “watchnotify.sh” is the one we created in Step 2; in this case, it is “google-storage.” We also passed the service account JSON key that we copied into the Alpine image in the Dockerfile; here, it is “secrets/projects-jayant-82xxxx.json.” In your case, it will be a different JSON file.

Now, build the Docker image and push it to an image repository. In my case, I built it from the GCP console and pushed it to the gcr.io registry; your setup may vary.

docker build . -t gcr.io/projects-jayant/coredump/coredump0.1:1
docker push gcr.io/projects-jayant/coredump/coredump0.1:1


Step 5: Execute the Core-Dump Agent as a DaemonSet

We want to run the Core Dump Agent Collector alongside the application containers, so we create it as a DaemonSet; that way the collector container is up and running on every node as soon as the cluster comes alive. This DaemonSet uses the Core Dump Agent image created in Step 4 and runs the “watchnotify.sh” script to watch the core-dump file location. That location (“/mnt”) is mapped to the same Persistent Volume created in Step 3.

[Screenshot: DaemonSet YAML for the Core Dump Agent Collector]
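
A DaemonSet along these lines matches that description; the object and volume names are assumptions, while the image, script, and mount path come from the earlier steps.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: coredump-agent
spec:
  selector:
    matchLabels:
      app: coredump-agent
  template:
    metadata:
      labels:
        app: coredump-agent
    spec:
      containers:
        - name: coredump-agent
          image: gcr.io/projects-jayant/coredump/coredump0.1:1
          command: ["/bin/sh", "/usr/sbin/watchnotify.sh"]
          volumeMounts:
            # Same persistent volume the application writes its core dumps to
            - name: coredump-storage
              mountPath: /mnt
      volumes:
        - name: coredump-storage
          persistentVolumeClaim:
            claimName: storage-pvc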

Now, deploy both the DaemonSet and the application pod in your Kubernetes cluster.
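
For example (assuming the manifests are saved as app-deployment.yaml and coredump-daemonset.yaml, which are hypothetical file names):

kubectl apply -f app-deployment.yaml
kubectl apply -f coredump-daemonset.yaml

# Later, list the bucket to confirm that dumps are being uploaded
gsutil ls gs://google-storage/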

In your Google Storage bucket, you will start seeing the core-dump files appear.

In my case, I pushed some dummy files into my application container's core-dump path “/var/lib/systemd/coredump,” and they automatically appeared in “gs://google-storage.”

[Screenshot: core-dump files listed in the gs://google-storage bucket]


Phase 2 Approach

In this phase, we can plan how to consume the data captured in Google Storage and what benefits we can derive from it.

One option is to use Kafka streaming to capture the data live and then use it in multiple ways.

[Diagram: consuming core-dump data from Google Storage via Kafka streaming]


But we'll save that for another day.


Opinions expressed by DZone contributors are their own.
