Troubleshooting, Dynamic Logging, and Observability for .Net — A New Kid on the Block
Troubleshooting systems in production with distributed and cloud-deployed software, the challenges and options available today.
Join the DZone community and get the full member experience.Join For Free
A few years in software development do not equate to a linear amount of knowledge and information. So is true for the .NET ecosystem. The first release of .NET 1.0 saw the light of day in January 2002. The significance of the event was not the palindrome year. The new paradigm is offered to the traditional C++ MFC, VB, and classic ASP developers, supporting two main new languages, C#, and VB.NET. It started as a proprietary Windows technology, closed source language, primarily appealing to the companies on Microsoft Windows stack, tied to the sole IDE, Visual Studio. While .NET was a step forward, the pace and the options outside the designated use were abysmal.
The situation took a sharp turn, with a substantial change in 2016 with the introduction of .NET Core. Starting from the heart of the .NET ecosystem, ASP.NET, the change has spread through the entire platform, leading to a complete re-imagination and makeover of runtime and language. Open source, cross-platform, and free for commercial use, .NET and C# became viable options for many projects that traditionally would go with another platform and languages. From web to mobile, from desktop to backend systems, in any cloud environment, .NET is a solid and viable option with outstanding experience and rich offerings.
When Uncle Ben told Pete, "With great power comes great responsibility" in the 2002 Spiderman movie, he was not referring to the newly emerging .NET platform. This phrase could be applied to the challenges any developer on any platform using any language will eventually face. And .NET is not an exception. Here are just a few things that can go wrong.
Cross-platform software development is often taking place on a single platform. It could be Windows, Linux, or Mac operating systems. The software then needs to be deployed and executed on a platform that might not be a one-to-one match with the development or testing environment. And while technologies such as Docker containers have made it simpler to "ship your development environment to production," it is still not a bulletproof solution. The differences between environments still pose a risk of running into environmental discrepancies, leading to bugs.
Cloud environments pose even more significant challenges, with infrastructure and code executed remotely on the environment outside our reach and control. It could be Azure App Service, AWS Fargate, or GCP Cloud Functions. These services provide foundational troubleshooting but cannot cater to the specifics of the application and its use cases, usually involving additional intrusive services required for troubleshooting.
.NET developers are offered a few options to troubleshoot the problems faced in production running .NET-based systems. Here are a few of those:
Crash dumps. Analyzing crash dumps is a skill that only a few possess. A more significant challenge is that the analysis can pinpoint why the code was crushed but not what led to that critical moment.
Metric and counters. Emitting metrics and counter values from the code, collected and visualized, allows better insights into the remotely executing code. But it lacks the necessary dials and knobs to dynamically adjust the scope of focus. For example, emitting the value of a specific one or more variables within a particular method is not an option.
Logs. Logging is one of the oldest and the most common techniques to help the developer identify issues in the running code. The code is decorated with additional instructions that emit information into a sink, such as a file, a remote service, or any other destination where the logs are retained for a defined period. This option provides a lot of information, but it also has drawbacks. Unnecessary amounts of data irrelevant to the outage or the bug being investigated are stored and muddy the water. In a scaled-out solution, these logs are multiplied. And when consolidated into a single service, such as Application Insights, it requires accurate filtering to separate the wheat from the chaff. Not to mention the price tag associated with processing and storing those logs. But one of the most significant drawbacks of static logging is the inability to adjust what is logged and the duration of the logging session.
Snapshot debugging. Ability to debug remotely executed code within the IDE. Upon exceptions thrown, a debug snapshot is collected from the running application and sent to Application Insights. The stored information includes the stack trace. In addition, a minidump can be obtained in Visual Studio to enhance visibility into why an exception occurred in the first place. But this is still a reactive solution to a problem that requires a more proactive approach.
Dynamic Logging and Observability
Meet dynamic logging with Lightrun. With dynamic logging, the code does not have to be instrumented at every spot. Instead, we want to gain visibility at run time. So instead, the code is left as-is. The magic ingredient is the agent enabling dynamic logging and observability that is added to the solution in the form of a NuGet package. Once included, it is initialized once in your application starting code. And that is it. From there, the Lightrun agent takes care of everything else. And by that, it means connecting to your application at any time to retrieve logs on an ad-hoc basis, without the unnecessary code modification/instrumentation, going through the rigorous code changes approval process, testing, and deployment.
Logs, metrics, and snapshots can be collected and presented on demand, all without wracking a substantial bill otherwise incurred with static logging and metrics storage or leaving the comfort of the preferable .NET IDE of your choice — VS Code, Rider, or Visual Studio (coming soon). The results are immediately available in the IDE or streamed to your observability services such as Sentry, New Relic, DataDog, and many other services in the application performance monitoring space.
.NET offers a range of software development options with rich support for libraries and platforms. With extra help from Lightrun, a symbiotic relationship takes code troubleshooting to the next level, where investigating and resolving code deficiencies does not have to be a lengthy and costly saga.
Opinions expressed by DZone contributors are their own.
Design Patterns for Microservices: Ambassador, Anti-Corruption Layer, and Backends for Frontends
Mastering Time Series Analysis: Techniques, Models, and Strategies
How To Check IP Addresses for Known Threats and Tor Exit Node Servers in Java
Auditing Tools for Kubernetes