Historical Data Analytics With Logz.io
A look into how one tool can help teams deal with the challenges of data retention and data retrieval for historical log data.
Have you ever found yourself trying to reconstruct an event from the past only to come up blank because you cannot go so far back in time? If only you could bring back that missing piece of the puzzle!
In the world of IT, logs are the way machines and software record events. They help us understand when an event happened, where it happened and, most importantly, why it happened. In a perfect world, engineers would have access to all the data generated by their applications, regardless of the timestamp attached.
But of course, we don't live in a perfect world. Data volumes are exploding and because retention is extremely expensive, engineers end up making painful decisions about what data to retain and for how long.
Hot vs. Cold Retention
Retention periods vary from organization to organization, use case to use case. You might need to retain data for an entire year for compliance and security needs, or you might make do with just a few days of retention for the purpose of troubleshooting.
Whatever retention period you opt for, once it's over your data is lost. Unless, that is, you've put a system into place that archives this data and "revives" it when necessary by re-ingesting it back into the system. Most organizations will opt for a hybrid solution based on a short "hot" retention period in which data is searchable, and an extended "cold" retention period in which data is archived to a third-party data storage service but is not searchable.
Amazon S3 is one of the more common solutions for storing and archiving log data for extended retention periods because of its cost-efficient storage, its different storage options and integrability with other AWS services.
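As an illustration of those storage options, an S3 lifecycle configuration can automatically move older archived logs into a cheaper storage class and eventually expire them. The rule below is a sketch only; the prefix, transition days, and expiration period are hypothetical values you would tune to your own retention policy:

```json
{
  "Rules": [
    {
      "ID": "tier-archived-logs",
      "Filter": { "Prefix": "logzio-archive/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```

A rule like this keeps recently archived logs in standard storage (where restores are fastest) while pushing cold data to Glacier, which trades retrieval speed for a lower storage price.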
Logz.io makes it easy for users to archive data to S3 and reingest it back into the system via the user interface.
This feature is accessed via the Settings icon > Tools > Archive & restore in the top menu. Here you are required to enter the name of the S3 bucket you want to archive to and specific IAM credentials for the bucket (the IAM user must have the following S3 permissions: PutObject, ListBucket, and GetObject).
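For reference, a minimal IAM policy granting those three permissions might look like the following. The bucket name `my-log-archive` is a placeholder; note that `ListBucket` applies to the bucket itself, while `PutObject` and `GetObject` apply to the objects inside it:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-log-archive"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": "arn:aws:s3:::my-log-archive/*"
    }
  ]
}
```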
You can test the configuration using the Test connection button to make sure Logz.io has access to the bucket.
If all is OK, you'll see a success message at the bottom of the page and all that's left to do now is click the Start archiving button. Archiving is enabled for the specific Logz.io account you set up archiving from, as reflected in the Archiving is on message at the top of the page.
That's one side of the story. How do you restore this historical data for analysis? This is done on the same page, on the Restore tab.
Here, all you have to do is enter the name of the account you want to restore and the specific time frame you want to restore logs for. Currently, the user interface limits you to a maximum of one day's worth of logs. If you need more than that, let the Support team know and we'll help you out.
Logz.io begins the process of restoring the data from the S3 bucket, and you can view the restore actions on the Restored accounts tab.
The restore process runs in the background, and once it completes, the status changes to Active, meaning your data is restored and can be searched in Kibana.
You'll also receive an email notifying you that the restore was completed.
Optimizing Logging Costs
Data retention is one of the most important components of any monitoring solution. On the one hand, you need to be able to search your data for as long as your use case or organization requires. On the other hand, you also need to remain cost-efficient. Logz.io offers users a number of ways to maintain this balance.
Published at DZone with permission of Daniel Berman, DZone MVB. See the original article here.