Tracing and Profiling a .NET Core Application on Azure Kubernetes Service With a Sidecar Container
Imagine running a .NET Core application in Kubernetes, which suddenly starts being sluggish, and the telemetry fails to give you a complete picture of the issue. To remediate performance issues of applications, starting with .NET Core 3, Microsoft introduced several .NET Core runtime diagnostics tools to diagnose application issues:
- dotnet-counters to view performance counters
- dotnet-dump to capture and analyze dumps
- dotnet-trace to capture runtime events and sample CPU stacks
- dotnet-gcdump to collect garbage collector dumps of application
Let's try to understand how these tools work: the .NET Core (3.0+) runtime contains a Diagnostic Server that sends and receives application diagnostic data. The Diagnostic Server opens an IPC (Interprocess communication) channel through which a client (dotnet tool) can communicate. The channel used for communication varies with the platform: Unix Domain Sockets on *nix systems and Named Pipes on Windows. On Linux, the Unix Domain Socket is placed in the
/tmp directory by default. The following diagram presents the components involved in the communication between the Diagnostic Server and the client:
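You can observe this channel directly. The snippet below is a sketch of what you might run inside a container hosting a .NET Core 3.0+ application; the exact socket name (which embeds the process id and a disambiguator) will vary:

```
# The Diagnostic Server's Unix Domain Socket lives in /tmp by default.
ls /tmp/dotnet-diagnostic-*
# e.g. /tmp/dotnet-diagnostic-13-1-socket (name varies per process)
```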
Running the diagnostics tools on your local system or on an application server where you can install the tools is very easy. If you are running your application on containers, you can still use these tools by following the prescribed guidance in the Microsoft documentation.
There are two approaches to using the diagnostics tools with containerized .NET Core applications as follows:
- Install the tools in the same container as the application
- Install the tools in a sidecar container
If you install the tools with your application in the same container, you will bloat the size of the image. Also, every update of the tools will require an update of the application image. The limitations of the first approach make the second approach of using a sidecar the preferred option.
A sidecar is a container that runs on the same Pod as your application's container. Since it shares the same volume and network as the application's container, it can enhance how the application operates. The most common use cases of sidecar containers are for running log shippers and monitoring agents.
Armed with the understanding of the diagnostics tools, let's discuss the problem we will attempt to resolve.
We need to address the following three issues to successfully connect the diagnostics tools sidecar with the application container and collect the necessary data.
- Accessing the processes in the application container from the sidecar container.
- Sharing the /tmp directory between the sidecar and the application container.
- Storing the extracted data.
For this example, I will assume that you are running your application in Azure Kubernetes Service. However, most of the aspects of this solution will work on any Kubernetes installation.
You can download the source code of the sample application and the Kubernetes manifests used in this article from the accompanying GitHub repository.
To address the problem, we will add a sidecar containing the diagnostics tools to our application container. We will map the
/tmp directory on the two containers to an emptyDir volume. By default,
emptyDir volumes are stored on the medium that backs the node, such as SSD. However, you can configure it to use the RAM as well. For fast access to the domain socket, use either the SSD or the RAM as the backing medium.
The output of the diagnostics tools is generally quite large. Also, we need to persist the output beyond the lifetime of the Pod and the Node. Therefore, we will mount Azure Files as a Persistent Volume to our sidecar to reliably persist the output of the diagnostics.
Following is the high-level design diagram of the components involved in the solution that we discussed:
Let's build the individual components of the solution and integrate them.
Let's create a simple worker service. A .NET Core Worker Service is used to implement long-running background tasks. Create a folder for your project and execute the following command to create a new worker service.
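A minimal sketch of the commands, assuming the .NET SDK is installed; the folder name PrimeWorker is illustrative:

```
mkdir PrimeWorker && cd PrimeWorker
# Scaffold a new Worker Service project from the built-in template.
dotnet new worker
```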
Open the newly created project in VS Code. To simulate a CPU-intensive operation, we will program this service to find the prime number at a given position/index. For example, the prime number at position 0 is 2, at position 1 is 3, and so on.
Apply the following code in the
Worker.cs file of the project. You can try to debug this program a few times to understand how it works.
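The following is a minimal sketch of what Worker.cs could look like; the prime-search loop, delay, and log messages are illustrative rather than the repository's exact code, but they reproduce the deliberately CPU-intensive behavior the article relies on:

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public class Worker : BackgroundService
{
    private readonly ILogger<Worker> _logger;

    public Worker(ILogger<Worker> logger) => _logger = logger;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        var position = 0;
        while (!stoppingToken.IsCancellationRequested)
        {
            // Deliberately CPU intensive: naive search for the prime at 'position'.
            var prime = FindPrimeAt(position);
            _logger.LogInformation("Prime at position {Position} is {Prime}", position, prime);
            position++;
            await Task.Delay(1000, stoppingToken);
        }
    }

    private static long FindPrimeAt(int position)
    {
        // Position 0 -> 2, position 1 -> 3, and so on.
        long candidate = 1;
        for (var found = 0; found <= position;)
        {
            candidate++;
            if (IsPrime(candidate)) found++;
        }
        return candidate;
    }

    private static bool IsPrime(long n)
    {
        if (n < 2) return false;
        for (long i = 2; i * i <= n; i++)
            if (n % i == 0) return false;
        return true;
    }
}
```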
Let's now create a container image for this application and publish the image to Docker Hub. You can also choose to push the container image to Azure Container Registry by following the steps outlined in the Microsoft quickstart guide. I have published the image of the application in my Docker Hub repository. If you wish to use the published image, then skip the following instructions.
If you haven't already, install the VS Code Docker extension. Follow these instructions to use this utility to add a Dockerfile to your application. For your reference, here is the Dockerfile that the extension generated for the application.
You can use either the extension or the following commands to build the image and push the image to the container registry. Remember to change the name of the repository before you build and push the image.
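The commands below sketch the build-and-push step; the repository name yourrepo and the tag are placeholders for your own:

```
# Replace 'yourrepo' with your Docker Hub (or ACR) repository name.
docker build -t yourrepo/prime-worker:latest .
docker push yourrepo/prime-worker:latest
```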
Let's now build a container image for our diagnostics tools, which we will later deploy as a sidecar to our application container.
Diagnostics Tools in a Container
Create a Dockerfile for the sidecar container and name it
Dockerfile.tools. Populate the file with the following code, which instructs Docker to bundle the necessary tools inside an image.
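A sketch of such a Dockerfile, assuming the .NET Core 3.1 SDK base image; the image tag and tool versions are assumptions you should align with your application's runtime:

```dockerfile
FROM mcr.microsoft.com/dotnet/core/sdk:3.1

# Install the cross-platform diagnostics tools as global tools.
RUN dotnet tool install --global dotnet-counters \
 && dotnet tool install --global dotnet-dump \
 && dotnet tool install --global dotnet-trace \
 && dotnet tool install --global dotnet-gcdump

# Make the tools available on the PATH.
ENV PATH="/root/.dotnet/tools:${PATH}"
```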
We will publish this image to Docker Hub (or Azure Container Registry) with the following commands. As before, remember to change the name of the repository in the following commands before you execute them. I have published the image generated from the Dockerfile on Docker Hub. Feel free to use either your image or mine in the next step.
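As a sketch, again with a placeholder repository name:

```
docker build -t yourrepo/dotnet-tools:latest -f Dockerfile.tools .
docker push yourrepo/dotnet-tools:latest
```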
Let's now start writing the specification to deploy the various components to Kubernetes.
Deploying To Kubernetes
I assume that you are running an AKS cluster, and your Kubernetes CLI (
kubectl) context is configured to connect to your AKS cluster. Follow this quickstart guide to create a cluster and configure kubectl if that is not the case.
The final component that we need is a persistent volume to store the output produced by the diagnostics tools. AKS allows you to dynamically create an Azure Files based persistent volume within the same resource group as your cluster nodes. Creating a dynamic persistent volume requires specifying a Storage Class (or Persistent Volume if the storage account already exists) and a Persistent Volume Claim.
Create a file named
deployment.yaml and add the following specification to it, which, when applied, dynamically creates a storage account and makes an Azure Files share available as a volume. We will soon mount this volume to our sidecar.
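A sketch of the storage portion of the manifest; the Storage Class name, claim name, SKU, and requested size are assumptions to adapt to your environment:

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile-diagnostics
provisioner: kubernetes.io/azure-file
parameters:
  skuName: Standard_LRS
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: diagnostics-data
spec:
  accessModes:
    - ReadWriteMany          # Azure Files supports concurrent access
  storageClassName: azurefile-diagnostics
  resources:
    requests:
      storage: 5Gi
```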
Let's extend the manifest to specify a Deployment object. Please note the Pod specification in the Deployment, which defines that two containers, the application container, and the sidecar, will be created in each Pod.
There are a few things worth noting in the previous specification as follows:
- We mounted a shared emptyDir volume at the path /tmp in both containers.
- We mounted the Azure Files share backed persistent volume at the path /data in the sidecar container.
- To make the processes discoverable between the two containers, we set shareProcessNamespace to true. You can read more about this feature in the Kubernetes documentation.
- Note that the sidecar container doesn't specify a command to run, so it would shut down soon after starting. To keep the container interactive, we set two parameters to true: stdin, to keep stdin open for the container, and tty, to attach a TTY to the process. A TTY enables I/O in interactive containers.
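The points above can be sketched as the following Deployment; image names are placeholders, and the claimName assumes a Persistent Volume Claim created earlier in your manifest:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prime-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prime-worker
  template:
    metadata:
      labels:
        app: prime-worker
    spec:
      shareProcessNamespace: true    # processes visible across both containers
      containers:
        - name: app
          image: yourrepo/prime-worker:latest
          volumeMounts:
            - name: tmp
              mountPath: /tmp        # shared location of the diagnostic socket
        - name: diagnostics-sidecar
          image: yourrepo/dotnet-tools:latest
          stdin: true                # keep stdin open for an interactive shell
          tty: true                  # attach a TTY to the process
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: data
              mountPath: /data       # diagnostics output goes here
      volumes:
        - name: tmp
          emptyDir: {}
        - name: data
          persistentVolumeClaim:
            claimName: diagnostics-data   # match your PVC's name
```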
Let's now apply the specification with the following command:
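Assuming the manifest file is named deployment.yaml:

```
kubectl apply -f deployment.yaml
```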
After applying the specification, you will find that a new storage account containing a file share service is created in your subscription. Let's now test running a few diagnostics tools in our cluster.
Running dotnet-trace in Sidecar
Let's now request a shell in our sidecar container using the
kubectl exec command as follows:
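A sketch of the command; substitute your Pod's name, and note that the container name must match the one in your Deployment spec:

```
kubectl exec -it <pod-name> -c diagnostics-sidecar -- /bin/bash
```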
Tip: I recommend using k9s CLI to interact with your cluster. K9s has a simple UI, and it is very user friendly.
After executing the command, you will get a bash shell on the sidecar container. Let's try to find out the process id of the application (running in the application container) with the following command:
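Thanks to the shared process namespace, dotnet-trace can list the .NET processes it can see, which is a convenient way to find the pid:

```
# Lists .NET processes visible from the sidecar, with their pids.
dotnet-trace ps
```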
The following screenshot from my cluster shows that the
pid of my application is 13. I will run the
dotnet-trace tool on the process next.
Execute the following command to gather trace from the application and store it in
/data volume. Remember that the Azure file share service backs the
data volume. We will instruct the tool to generate the traces in the Chromium format. You can read more about the other available formats in this interesting blog from Scott Hanselman.
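A sketch of the collect command, assuming pid 13 from the earlier step and an illustrative output file name:

```
# Attach to the application process and write the trace, in Chromium
# format, to the Azure Files backed /data volume.
dotnet-trace collect -p 13 --format Chromium -o /data/trace.json
```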
The following screenshot presents the output of the command from my cluster.
Let's download the generated file from the Azure portal as follows:
You can inspect the trace output file in any Chromium-based browser by navigating to
edge://tracing in the new Edge, or to
chrome://tracing in Chrome, as follows:
Let's now collect some performance counter values using the dotnet-counters tool.
Running dotnet-counters in Sidecar
Execute the following command after replacing the
pid of the application.
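A sketch of the command, again assuming pid 13 and an illustrative output path:

```
# Collect counters from the application process and persist them,
# as JSON, to the Azure Files backed /data volume.
dotnet-counters collect -p 13 --format json -o /data/counters
```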
The following screenshot presents the output from my terminal.
As before, download the output file from the Azure file share and open it in VS Code. The following is a screenshot of the file I downloaded from the Azure file share.
You can use the same approach to collect the output of other diagnostics tools by persisting them in an external volume. Feel free to try this approach to gather more diagnostics information about the application.
The cross-platform diagnostics tools are still evolving and getting better. You might see more tools or features added to existing tools to make them even better. I hope this post gave you some pointers on collecting diagnostics data from applications running in Kubernetes.
Published at DZone with permission of Rahul Rai, DZone MVB. See the original article here.