Pretty Data All in Neat Rows

How to take "flat" (i.e., columnar) JSON data and turn it into row-based information using the jq utility and a little ingenuity.

By Leon Adato · Mar. 14, 23 · Tutorial

DBA Mary extraordinary,
What makes your tables grow?
When queries propel through my TSQL shell
data ingests into neat little rows.

This will likely be the last post in my series about monitoring the pi-hole DNS server. Part 1 is "Your Pi-Hole is a Rich Source of Data," and part 2 is "Mind the Gap." If you've been reading along, it should be clear that this is not so much about the pi-hole in particular as it is about the ways New Relic allows you to manipulate observability data. The truth is that the pi-hole provided me with a number of great examples of the different ways data can present itself out of various systems.

With that said, in this post, I'm going to address a situation that happens a lot with JSON output — data that should be recorded as sequential rows under a single field but instead ends up splitting across multiple fields.

I'm going to continue to leverage the pi-hole for this example. In the previous post, all of our attention was focused on the output of a single API endpoint: ?summaryRaw. But the pi-hole has many other endpoints that emit data, including:

  • topItems=xx: Show the xx top domains and top advertisers being requested.
  • topClients=xx: Show the top sources of DNS queries within your network.
  • getForwardDestinations: Show the external DNS servers where DNS queries are going once they bounce out of your network.
  • getQueryTypes: Show the volume of each type of DNS query (A, AAAA, PTR, SRV, etc.).

(You can read about all the possible API endpoints in this post.)

So, let's consider a Flex integration that is set up to gather some of the information I've identified above:

YAML
 
integrations:
  - name: nri-flex
    config:
      name: badpihole
      apis:
        - name: badpihole_querytypes
          url: http://pi.hole/admin/api.php?getQueryTypes&auth=abcdefg1234567890 #your auth key goes here
          headers:
            accept: application/json


(Note that I've purposely named the elements "bad" so you can find them because, ultimately, I don't think they are valuable in the current format.)

When you look at it in NRQL, you'll see a result like this, with each query type reported as its own column instead of as a row:
[Image: NRQL Result]

The issue becomes even worse when the results are highly variable. For example, topItems will return the top domains and advertisers for a given period. While that MIGHT remain somewhat consistent, in larger or more dynamic networks, that list can change drastically.

So with the YAML element of:

YAML
 
        - name: badpihole_topitems
          url: http://pi.hole/admin/api.php?topItems=10&auth=abcdefg1234567890 #your auth key goes here
          headers:
            accept: application/json


You could see your column count go up moment by moment:
[Image: Column Count]

What's needed is to transform the incoming data so that rather than appearing like this:

 
"top_ads.unity3d.com": 54,
"top_ads.display.ravm.tv": 90,
"top_ads.hbopenbid.pubmatic.com": 49,


it is instead reorganized into a format more like this:

 
Name: "top_ads.unity3d.com",
Count: 54,
Name: "top_ads.display.ravm.tv",
Count: 90,
Name: "top_ads.hbopenbid.pubmatic.com",
Count: 49,


The result of which looks like this when displayed in New Relic:
[Image: Result displayed in New Relic]
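
To see why the row layout keeps the column count under control, here is a minimal Python sketch (not part of the Flex integration itself). It compares the attribute names each layout produces across two polls. The first poll reuses the domains and counts shown above; the second poll, including the ads.example.net domain, is made up for illustration.

```python
def flat_attribute_names(polls):
    """Attribute names when every domain becomes its own column."""
    names = set()
    for poll in polls:
        names.update(f"top_ads.{domain}" for domain in poll)
    return sorted(names)


def row_attribute_names(polls):
    """Attribute names when every domain becomes a Name/Count row."""
    names = set()
    for poll in polls:
        for domain, count in poll.items():
            row = {"Name": f"top_ads.{domain}", "Count": count}
            names.update(row)  # adds the row's keys only
    return sorted(names)


polls = [
    # Values taken from the example above.
    {"unity3d.com": 54, "display.ravm.tv": 90, "hbopenbid.pubmatic.com": 49},
    # Hypothetical second poll: one new domain appears.
    {"display.ravm.tv": 93, "ads.example.net": 12},
]

print(len(flat_attribute_names(polls)))  # 4, and it grows with every new domain
print(row_attribute_names(polls))        # ['Count', 'Name'], however many domains appear
```

In the flat layout every new domain adds a column. In the row layout the schema stays at two fields and only the number of rows changes.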

Phenomenal Cosmic Power, Itty Bitty Command

How is this transformation achieved? Through the remarkably simple use of the jq utility. I mentioned jq in part 2 of this series where the usage was far more complex.

As so often happens in tech, what we're asking for is a much more complex operation, and yet the structure is way easier to understand:

YAML
 
jq: > 
  .[]|.top_queries|to_entries|map({queryname:.key,querycount:.value})


As with the jq wizardry in my last post, this is largely due to the genius of my colleague Haihong Ren. Putting this line into the context of a complete Flex YAML file, it would look like this:

YAML
 
integrations:
  - name: nri-flex
    config:
      name: pihole
      apis:
        - name: pihole_topitems
          url: http://pi.hole/admin/api.php?topItems=10&auth=abcdefg1234567890 #your auth key goes here
          headers:
            accept: application/json
          jq: > 
            .[]|.top_queries|to_entries|map({queryname:.key,querycount:.value})


The result of which, as I showed earlier, is data that is easier to summarize, query, sort, select, and display.
[Image: Display]
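
If the jq filter above still looks a bit opaque, it may help to walk through it one stage at a time. The following is a rough Python equivalent, offered as a sketch and not as what Flex does internally. It assumes the filter receives the API response wrapped in a one-element list (which is what the leading .[] suggests), and the sample domains and counts are invented.

```python
def to_entries(obj):
    """Mimic jq's to_entries: one {"key", "value"} pair per object member."""
    return [{"key": key, "value": value} for key, value in obj.items()]


def rows_from_payload(payload):
    """Equivalent of: .top_queries | to_entries | map({queryname, querycount})."""
    entries = to_entries(payload["top_queries"])       # .top_queries | to_entries
    return [                                           # map({...})
        {"queryname": entry["key"], "querycount": entry["value"]}
        for entry in entries
    ]


# Hypothetical response, wrapped in a list as assumed above.
wrapped_response = [{"top_queries": {"example.com": 120, "example.org": 45}}]

for payload in wrapped_response:                       # .[]
    for row in rows_from_payload(payload):
        print(row)
```

The key step is to_entries, which turns an object whose keys are data (domain names) into a list of objects whose keys are fixed. The map call then renames those fixed keys to something friendlier to query.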

Special Bonus Clip-and-Save Section

There's not really much to summarize here except to underscore that New Relic's platform is flexible enough not only to collect just about any type of telemetry you need, but also to manipulate it so you can transform data into information, which drives thoughtful action within your organization.

If you'd like to try out this entire thing for yourself but would prefer not to have to BUILD it all yourself (and in this, I applaud your commitment to the economy of effort), then below you will find the complete YAML file. And here is a link to a quick start with the dashboard pictured above.

YAML
 
integrations:
  - name: nri-flex
    config:
      name: pihole
      variable_store:
        authkey: abcdefg1234567890 #your auth key goes here
# In order for this integration to work, you need to include your pihole API key.
# You can get the token by logging in to your pihole and going to Settings/API/Show API token
#   or by connecting directly to the pihole device and getting the WEBPASSWORD variable from /etc/pihole/setupVars.conf

      apis:
        - name: pihole_summary
          url: http://pi.hole/admin/api.php?summaryRaw&auth=${var:authkey}
          headers:
            accept: application/json

        - name: pihole_topitems
          url: http://pi.hole/admin/api.php?topItems=10&auth=${var:authkey}
          headers:
            accept: application/json
          jq: > 
            .[]|.top_queries|to_entries|map({queryname:.key,querycount:.value})

        - name: pihole_topclients
          url: http://pi.hole/admin/api.php?topClients=10&auth=${var:authkey}
          headers:
            accept: application/json
          jq: > 
            .[]|.top_sources|to_entries|map({clientname:.key,clientcount:.value})

        - name: pihole_toforwarddest
          url: http://pi.hole/admin/api.php?getForwardDestinations&auth=${var:authkey}
          headers:
            accept: application/json
          jq: > 
            .[]|.forward_destinations|to_entries|map({destinationname:.key,destinationcount:.value})

        - name: pihole_querytypes
          url: http://pi.hole/admin/api.php?getQueryTypes&auth=${var:authkey}
          headers:
            accept: application/json
          jq: > 
            .[]|.querytypes|to_entries|map({querytype:.key,querycount:.value})

        - name: pihole_recentblocked
          url: http://pi.hole/admin/api.php?recentBlocked&auth=${var:authkey}
          headers:
            accept: application/json
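
The four jq filters in the complete file above all share one shape; only the wrapper key and the output field names change. As a sketch of that pattern in Python (the table below is copied from the YAML, while the sample payloads are hypothetical):

```python
# api name -> (wrapper key, name field, count field), taken from the jq filters above.
FILTERS = {
    "pihole_topitems": ("top_queries", "queryname", "querycount"),
    "pihole_topclients": ("top_sources", "clientname", "clientcount"),
    "pihole_toforwarddest": (
        "forward_destinations", "destinationname", "destinationcount",
    ),
    "pihole_querytypes": ("querytypes", "querytype", "querycount"),
}


def reshape(api_name, payload):
    """Turn {wrapper: {name: count, ...}} into a list of two-field rows."""
    wrapper_key, name_field, count_field = FILTERS[api_name]
    return [
        {name_field: name, count_field: count}
        for name, count in payload[wrapper_key].items()
    ]


# Hypothetical getQueryTypes-style payload.
sample = {"querytypes": {"A (IPv4)": 61.5, "AAAA (IPv6)": 20.25}}
for row in reshape("pihole_querytypes", sample):
    print(row)
```

Adding another endpoint that returns the same name-to-count structure would then only mean adding one more line to the table, which mirrors how little changes between the jq lines in the YAML.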


Published at DZone with permission of Leon Adato. See the original article here.
