The Benefit of Having an Enterprise Logging Policy

Implementing a logging policy that's easy to follow makes sense. Yet many companies shy away from this approach because they perceive it to be burdensome and costly.

By Bob Reselman · Apr. 07, 2016 · Opinion

Lack of an enterprise logging policy is a common shortcoming when it comes to the organizational discipline of logging from within large, distributed applications. Just because you can get log data into a system, it does not necessarily follow that the data you are entering is useful. The old adage "garbage in, garbage out" holds true. If an enterprise allows anybody to enter log data in any way possible, anybody will. In the long run, without a proper logging policy and procedures to support that policy, an enterprise will spend an unnecessary amount of time and money getting meaningful information from its logs.

It's a pay-me-now-or-pay-me-later sort of thing. If your company allows developers to log data on a whim, the amount of badly or randomly structured log data in your system will grow at an increasing rate. And as time goes on, you are going to spend more time and money on the back end trying to make sense of logs that should have been easy to process from the front end in the first place.

You have a choice. You can have a simple policy that ensures structured, easy-to-process log data is always entered into the system. Or you can spend time and money cleaning up your log data on the back end. When it comes to log entry and subsequent log processing, you can pay me now or pay me later, but you will pay me.

Implementing a logging policy that is easy to follow makes sense. Yet many companies shy away from implementing such a policy under the misconception that it's burdensome and costly. It's not. In fact, I am going to show you how to do it. It's all about following one principle: use self-describing data formats.

The Case for Self-Describing Data Formats

Consider the following log entry shown below in Listing 1:

25 Mar 2016 16:39:03.305 info wXjpihBWuCwe1jkiNiV8YB

Listing 1: A cryptic log entry

What does it tell you? Well, we can infer that the data was entered on March 25, 2016, at a certain time. Also, we can infer that the entry had something to do with info. But what about wXjpihBWuCwe1jkiNiV8YB? You got me.

Now consider this log entry shown below in Listing 2:

25 Mar 2016 17:12:07.061 info {"applicationName":"GoodDogBadDog","token":"wXjpihBWuCwe1jkiNiV8YB"}

Listing 2: A log entry that structures data using JSON

The start of the entry in Listing 2 is the same as the previous entry in Listing 1: the entry was made on March 25, 2016, and it has something to do with info. But what follows tells you exactly what the entry is about. An application named GoodDogBadDog submitted a token, and that token has a value of wXjpihBWuCwe1jkiNiV8YB. This may seem trivially obvious, but it's not. The first entry tells you nothing. The second entry tells you everything. The second entry is self-describing: it conveys both the structure of the information and the information itself. The format of the entry, all the data between the first left curly bracket and the last right curly bracket, is JSON. JSON is a self-describing format. There are others: XML and key-value pairs, for example.

Listing 3 below shows the JSON entry above converted into a set of key-value pairs:

25 Mar 2016 17:12:07.061 info "applicationName":"GoodDogBadDog","token":"wXjpihBWuCwe1jkiNiV8YB"

Listing 3: A log entry that uses key-value pairs to structure information

Here is the JSON entry above converted into XML (please see Listing 4 below):

25 Mar 2016 17:12:07.061 info <entry><applicationName>GoodDogBadDog</applicationName><token>wXjpihBWuCwe1jkiNiV8YB</token></entry>

Listing 4: You can use XML to structure the information of a log entry

No matter which self-describing format you use, the important thing to know is this: logging data within the structure of a self-describing data format will save you time and money almost from the get-go. Structured data is easier to parse and easier to index. Indexed data is easier to query. In fact, a technology such as Logentries has the intelligence built in to parse and index JSON automatically. It's a win-win situation all the way around.
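
To see how little work a structured entry leaves for the consumer, here is a minimal sketch, assuming the Newtonsoft.Json library, that extracts the fields from the JSON payload of Listing 2. (Parsing the cryptic entry in Listing 1 would require a custom parser and a fair amount of guesswork.)

using System;
using Newtonsoft.Json.Linq;

public class ParseDemo
{
    public static void Main()
    {
        // The JSON payload portion of the log entry in Listing 2.
        string payload =
            "{\"applicationName\":\"GoodDogBadDog\",\"token\":\"wXjpihBWuCwe1jkiNiV8YB\"}";

        // Because the data is self-describing, extracting fields is trivial.
        JObject entry = JObject.Parse(payload);
        Console.WriteLine((string)entry["applicationName"]); // GoodDogBadDog
        Console.WriteLine((string)entry["token"]);           // wXjpihBWuCwe1jkiNiV8YB
    }
}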

Making It Happen

So, if we agree that using a structured data format such as JSON or key-value pairs is a good thing to do, how do we make it happen within the enterprise?

First, we need a policy. At the enterprise level, the best policies are the ones that are the simplest. Take this policy, for example:

All employees will display their ID badges in plain sight at all times when on the company’s premises.

It's a simple policy that is easy to verify. When an employee is in the building, either you can see his or her badge or you can't.

Because the policy is so simple, it is easy to create one or many procedures to support the policy. Thus, one can well imagine this set of procedures to facilitate the badge display policy:

  1. Upon issuing an employee an ID badge, provide the employee both a shirt clip and an ID badge necklace.
  2. Instruct the employee that he or she needs to wear his or her ID badge in plain sight using either the shirt clip or badge necklace when on the company’s premises.

The policy is simple and, as a result, the procedures put in place to support it are simple and easy to enforce. The more complex a policy gets, the more complex the procedures that follow become. And enforcement becomes complex, too.

So, when it comes to ensuring that developers use structured data formats when logging, the key is to keep it simple.

Consider this policy:

All developers will log data using JSON to name the field(s) of log data being entered as well as the value of each field.

Simple.

Next come the procedures. You have a few choices. One way is to enhance your logging component at the code level so that only structured data is sent out from an application to the log collector. If you are using a compiled language such as C# or Java, this is a viable approach (please see Listing 5 below).

// This class ensures that log entries
// are submitted as structured data.
public class Logger
{
    public static void Log(LogData logData)
    {
        // Convert the object to JSON.

        // Send the JSON to the log collector.
    }
}

var logData = new LogData();
.
.
.
Logger.Log(logData);

Listing 5: C# code that enforces JSON-structured log entries

You create a logging method that accepts a POCO (or a POJO in Java), and internally the method converts the object to a structured data format such as JSON. Thus, the only way you can log is through this method. If your code goes errant, logging will not happen; and if you fail to use a POCO or POJO, non-compliance will be picked up at compile time.
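
Filled in, the approach might look like the sketch below. The shape of LogData, the use of Newtonsoft.Json's JsonConvert, and the SendToCollector helper are all assumptions for illustration; substitute whatever transport your log collector actually uses.

using Newtonsoft.Json;

// A hypothetical POCO describing one log entry.
public class LogData
{
    public string ApplicationName { get; set; }
    public string Token { get; set; }
}

public class Logger
{
    public static void Log(LogData logData)
    {
        // Serialize the POCO to JSON (assuming Newtonsoft.Json).
        string json = JsonConvert.SerializeObject(logData);

        // SendToCollector is a placeholder for your actual transport.
        SendToCollector(json);
    }

    private static void SendToCollector(string json)
    {
        // For illustration only: write the entry to the console.
        System.Console.WriteLine(json);
    }
}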

When you are using runtime languages such as JavaScript or PHP, things get a little harder. Yes, you can throw an error at runtime should your code come across log data that is random. But with this approach, the horse has already left the barn. Of course, you can stop the code upon a "RandomLogEntryException," and this might be a good thing to do if you want to catch the problem in a development or QA environment. However, stopping the code in production is just bad business. If you have a good runtime testing environment, raising exceptions on random log data entry can work.

Log inspection and code reviews are alternative procedures, should ensuring policy compliance not be possible at the code level. These procedures must be very exact. How will log inspection happen? Who or what will do it? What is the frequency of inspection? How will the code review be conducted? How often? How will the activity be documented? How is non-compliance addressed? Answering these questions will bring about the clarity and detail required for procedures that produce reliable results.

Thus, we can have a policy that looks like this:

All developers will log data using JSON to name the piece(s) of log data being entered as well as the value of each piece.

With accompanying procedures to support the policy:

  1. The company will, if possible, use or enhance logging clients to ensure that only JSON is sent to the enterprise’s log collectors.
  2. Should a code solution not be possible, personnel will create automated, server-side tools that conduct daily analysis of log entries to ensure that structured JSON data is submitted for logging (a sketch of such a check follows this list). Sources that create log data that is not submitted in JSON will be notified by email that correction is required.
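
As a rough illustration of the second procedure, the sketch below, again assuming Newtonsoft.Json, scans a batch of raw log payloads and flags any that do not parse as JSON. The batch source and the follow-up email notification are left out, as they depend on your environment.

using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class LogAudit
{
    // Returns the payloads that are not valid JSON objects,
    // so they can be reported back to their sources.
    public static List<string> FindNonCompliant(IEnumerable<string> payloads)
    {
        var nonCompliant = new List<string>();
        foreach (string payload in payloads)
        {
            try
            {
                JObject.Parse(payload);
            }
            catch (JsonReaderException)
            {
                // Not JSON; flag the entry for follow-up.
                nonCompliant.Add(payload);
            }
        }
        return nonCompliant;
    }
}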

The key takeaway is that the policy is simple and the procedures to support it are relatively easy to implement. Also, the procedures are verifiable.

Why JSON?

If it seems that I am partial to using JSON for log entries, you are right. I find JSON to be efficient in terms of layout specification, self-describing in terms of format, and easily adaptable into runtime objects in environments such as browser-side JavaScript and server-side Node.js.

However, JSON is my preference. Preferences vary by developer. The key is to use a data format that is easy to adopt in your enterprise. If your developers like name-value pairs, use them.

Putting It All Together

Data formats that are self-describing are easy to work with. You don’t have to spend time and money trying to figure out what is going on; the format tells you what you need to know. In addition to being informative, self-describing formats such as JSON and name-value pairs index easily once parsed. And, typically, parsing these formats on the server side is easy, too.

When it comes time to implement a policy and supporting procedures to ensure the use of self-describing data formats in your enterprise, remember that it’s best to keep it simple. A simple-to-understand policy accompanied by easy-to-follow procedures will save you time and money. Remember, when it comes to your enterprise’s logging efforts, you might pay now, or you might pay later. But using self-describing data formats will reduce the price you pay, no matter what.



Published at DZone with permission of Bob Reselman, DZone MVB.
