
Archive and Analysis with Amazon S3 and Glacier: Introduction

Logging is an essential part of any system. It lets you understand what is going on inside the system and is a vital source of information for debugging. Many systems use logging primarily so that developers can debug issues in the production environment. In some systems, however, logs are also the essential means of understanding the following:

  • User Behavior - understanding user behavior patterns, such as which areas of the system users actually use
  • Feature Adoption - evaluating new feature adoption by tracking how users use a new feature. Do they drop off after a particular step in a particular flow? Do people from a specific geography use it at a specific time of day?
  • Click-through analysis - say you place relevant ads across different pages of your website. You would like to know how many users clicked them, their demographic breakdown, and so on
  • System performance
    • Abnormal behavior in certain areas of the system - for example, a particular step in a workflow resulting in error/exception conditions
    • Performance of different areas of the system - for example, finding out whether a particular screen takes longer to load because a long-running query is being executed. Should we optimize the database? Should we introduce a caching layer?
Any architect would enforce logging as a core component of the technical architecture. While logging is definitely required, inefficient logging, such as logging too much or using inappropriate log levels, often leads to the following:
  • Underperformance of the system - the system could be spending more resources on logging than on serving requests
  • Huge log files - log files generally grow very fast, especially when inappropriate log levels are used, such as "debug" for all log statements
  • Inadequate data - if the log contains only the developer's debug information, there is little analysis that can be performed on it
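As a rough illustration of the points above, here is a minimal sketch using Python's standard `logging` module. It assumes a production threshold of INFO, so debug statements are dropped, and it writes analysis-oriented records as JSON so they can be parsed later. The logger name `shop.checkout`, the helper `log_event` and the field names are hypothetical and chosen only for this example.

```python
import io
import json
import logging


def build_logger(name, stream, level=logging.INFO):
    """Create a logger that drops everything below `level` (INFO here)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False      # keep the example's output in one place
    logger.handlers.clear()       # avoid duplicate handlers on re-runs
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def log_event(logger, event, **fields):
    """Write one analysis-friendly record as a JSON object at INFO level."""
    logger.info(json.dumps({"event": event, **fields}, sort_keys=True))


# An in-memory stream stands in for a real log file in this sketch.
buffer = io.StringIO()
log = build_logger("shop.checkout", buffer)

# Developer-only detail: below the INFO threshold, so it is never written.
log.debug("cart contents: %s", ["sku-1", "sku-2"])

# Business event: carries the fields an analyst would need later.
log_event(log, "checkout_step", step="payment", user_id="u-42", country="IN")

output = buffer.getvalue()
```

Only the `checkout_step` record reaches the output. The debug line costs almost nothing because the level check happens before the message is formatted.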
On the other hand, the infrastructure architecture also needs to support efficient logging and analysis:
  • Local Storage - how do you efficiently store log files on the local server without running out of disk space, especially as log files tend to grow?
  • Central Log Storage - how do you store log files centrally so that they can be used later for analysis?
  • Dynamic Server Environment - how do you make sure you collect and store all the log files in a dynamic server environment, where servers are provisioned and de-provisioned on demand depending on load?
  • Multi source - how do you handle log files from different sources, such as your web servers, search servers, Content Distribution Network logs, etc.?
  • Cost effective - as your application grows, so do your log files. How do you store them in the most cost-effective manner without burning a lot of cash?
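For the local storage concern, one common approach (an assumption here, not necessarily the one this series adopts) is size-based log rotation. The sketch below uses `logging.handlers.RotatingFileHandler` from the Python standard library. The 1 KB limit, the three backups and the logger name are illustrative values only; real limits would be far larger.

```python
import logging
import logging.handlers
import os
import tempfile


def build_rotating_logger(path, max_bytes, backups):
    """Logger whose file is rolled over once it would exceed `max_bytes`."""
    logger = logging.getLogger("shop.web.rotating")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, handler


# A temporary directory stands in for the server's log directory.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

logger, handler = build_rotating_logger(log_path, max_bytes=1024, backups=3)
for i in range(500):
    logger.info("request %d served", i)
handler.close()

# Only the active file plus `backups` rotated files remain on disk.
log_files = sorted(os.listdir(log_dir))
total_bytes = sum(os.path.getsize(os.path.join(log_dir, f)) for f in log_files)
```

With these settings, disk usage is capped at roughly `max_bytes * (backups + 1)`. The oldest rotated file is deleted on each rollover, so anything needed for later analysis has to be shipped to central storage before that happens.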
In this multi-post article, let's take the case of a typical e-commerce web application with the above characteristics and set up a best-practice architecture for logging, analysis and archiving in AWS. We will see how different AWS services can be used to store and process logs from different sources in a cost-effective and efficient manner.
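To make the multi-source and dynamic-server concerns a little more concrete, the sketch below builds an object key for a rotated log file in central storage. The layout (`logs/<source>/<yyyy>/<mm>/<dd>/<host>/<file>`), the function name `archive_key` and the sample host name are all hypothetical and only illustrate one possible convention. The idea is that including the source and the host in the key keeps files from short-lived servers from overwriting each other, while the date prefix makes it easy to select one day's logs for analysis.

```python
from datetime import datetime, timezone


def archive_key(source, host, rotated_at, filename):
    """Build a storage key that groups logs by source, UTC day and host."""
    day = rotated_at.astimezone(timezone.utc).strftime("%Y/%m/%d")
    return f"logs/{source}/{day}/{host}/{filename}"


key = archive_key(
    source="web",
    host="ip-10-0-0-12",  # hypothetical server name
    rotated_at=datetime(2013, 5, 17, 23, 30, tzinfo=timezone.utc),
    filename="access.log.1.gz",
)
```

Normalizing the timestamp to UTC before formatting means servers in different time zones agree on which day a file belongs to.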


Published at DZone with permission of Raghuraman Balachandran, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

