
AWS CloudTrail Log Analysis With the ELK Stack


CloudTrail is a useful tool for monitoring access and usage of your AWS-based IT environment. You can find lots of valuable information in the data with Kibana.


CloudTrail records all the activity in your AWS environment, allowing you to monitor who is doing what, when, and where. Every API call to an AWS account is logged by CloudTrail in real time. The information recorded includes the identity of the user, the time of the call, the source, the request parameters, and the response elements returned by the service.
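
To make the rest of this post easier to follow, here is roughly what a single CloudTrail record looks like (abridged, with placeholder values):

{
  "eventVersion": "1.05",
  "eventTime": "2018-03-01T12:34:56Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "CreateBucket",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userAgent": "aws-cli/1.14.32",
  "userIdentity": {
    "type": "IAMUser",
    "userName": "example-user"
  },
  "requestParameters": {
    "bucketName": "example-bucket"
  },
  "responseElements": null
}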

By default, CloudTrail logs are aggregated per region and then delivered to an S3 bucket as compressed JSON files. You can then use the recorded logs to analyze calls and take action accordingly. Of course, you can access these logs on S3 directly, but even a small AWS environment will generate hundreds of compressed log files every day, which makes analyzing this data a real challenge.
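
For example, pulling down one day's worth of logs for a quick look might look something like this (bucket name, account ID, and region are placeholders; the key layout follows CloudTrail's AWSLogs/account-id/CloudTrail/region/year/month/day/ convention):

# Download one day of CloudTrail logs from S3 and decompress them locally
aws s3 sync s3://example-bucket/AWSLogs/123456789012/CloudTrail/us-east-1/2018/03/01/ ./cloudtrail-logs/
gunzip ./cloudtrail-logs/*.json.gz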

Shipping these logs into the ELK Stack will allow you to tackle this challenge by giving you the ability to easily aggregate, analyze and visualize the data.

Let’s take a look at how to set up this integration.

Enabling CloudTrail

If you haven’t already, start by opening your CloudTrail console and adding a new trail (one trail is offered by AWS for free; additional trails will entail a cost).

[Screenshot: adding a new trail]

You will be required to name the trail and specify the S3 bucket to which you want the logs shipped. You will also need to decide whether you want to trail activity in all AWS regions or just the current region. I recommend starting out with a trail for a single region.

Advanced settings will allow you to enable encryption and SNS usage.

Once created, CloudTrail will begin recording API actions and sending the logs to the defined S3 bucket.
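
If you prefer scripting the setup, the same can be done with the AWS CLI. A rough sketch (trail and bucket names are placeholders, and the target bucket needs a bucket policy that allows CloudTrail to write to it):

# Create a single-region trail that delivers logs to an existing S3 bucket
aws cloudtrail create-trail --name example-trail --s3-bucket-name example-bucket

# Start recording API activity for the trail
aws cloudtrail start-logging --name example-trail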

Shipping Into ELK

Once in S3, there are a number of ways to get the data into the ELK Stack.

If you are using Logz.io, you can ingest the data directly from the S3 bucket, and parsing will be applied automatically before the data is indexed by Elasticsearch. If you are using your own ELK deployment, you can use a variety of different shipping methods to get Logstash to pull and enhance the logs.

Let’s take a look at these two methods.

Using Logz.io

To ship into the Logz.io ELK Stack, all you have to do is configure the S3 bucket in which the logs are stored in the Logz.io UI. Before you do so, make sure the bucket has the correct ListBucket and GetObject policies as described in the Logz.io documentation.
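
For reference, the IAM user whose credentials you provide needs permission to list the bucket and read its objects. A minimal policy sketch along these lines (bucket name is a placeholder; follow the exact policy in the Logz.io documentation):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::example-bucket"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::example-bucket/*"]
    }
  ]
}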

Under Log Shipping, open the AWS > CloudTrail tab. Enter the name of the S3 bucket, the path to the directory containing the logs, and the IAM user credentials (access key and secret key), and click Save to apply the configuration.

[Screenshot: applying the configuration in the Logz.io UI]

That’s all there is to it. Logz.io will identify the log type and automatically apply parsing to the logs. After a few seconds, the logs will be displayed in Kibana:

[Screenshot: CloudTrail logs in Kibana]

Using Logstash

As mentioned above, there are a number of ways to ship your CloudTrail logs from the S3 bucket if you are using your own ELK deployment.

You can use the Logstash S3 input plugin or, alternatively, download the files and use the Logstash file input plugin. In the example below, I am using the latter, with a geoip filter in the filter section to enrich the events with geographical information derived from the source IP address. A local Elasticsearch instance is defined as the output:

input {
  file {
    # Path to CloudTrail log files that have been downloaded from S3 and unzipped locally
    path => "/pathtofile/*.log"
    type => "cloudtrail"
  }
}

filter {
  if [type] == "cloudtrail" {
    # Enrich each event with geographical information based on the source IP address
    geoip {
      source => "sourceIPAddress"
      target => "geoip"
      add_tag => [ "cloudtrail-geoip" ]
    }
  }
}

output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
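
Alternatively, the Logstash S3 input plugin can pull the log files from the bucket directly, removing the need to download them first. A minimal sketch, with placeholder bucket, prefix, and credentials:

input {
  s3 {
    # Placeholder bucket and prefix - point these at your CloudTrail bucket
    bucket            => "example-bucket"
    prefix            => "AWSLogs/"
    region            => "us-east-1"
    # Placeholder credentials - an IAM role or instance profile can be used instead
    access_key_id     => "YOUR_ACCESS_KEY"
    secret_access_key => "YOUR_SECRET_KEY"
    type              => "cloudtrail"
  }
}

Keep in mind that each CloudTrail log file wraps its events in a Records array, so you will still need to break that array into individual events (for example, with a split filter or a CloudTrail codec plugin) before per-event filters such as geoip can do their work.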

Analyzing CloudTrail Logs

Since CloudTrail records the API events in JSON format, Elasticsearch easily maps the different fields included in the logs. These fields are displayed on the left side of the Discover page in Kibana. I recommend reading the relevant AWS docs on the different available fields before commencing with the analysis stage.

To get some visibility into the CloudTrail logs, a first step would be to add some of the available fields to the main display area. This depends on what you would like to analyze, but the most obvious fields to add would be the ‘awsRegion’, ‘eventName’, ‘eventSource’ and ‘userIdentity.type’ fields.

[Screenshot: KMS logs in Kibana Discover]

This gives you a better idea of the data you have available and what to analyze. To dive deeper and search for specific alerts, your next step would be to query Kibana.

You could start with a free-text search. So if you know what event you are looking for, simply enter it in the search field at the top of the page:

[Screenshot: EC2 logs in Kibana]
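
For example, a free-text search for a specific API call can be as simple as (the event name here is just an illustration):

RunInstances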

You can use a field-level search to look for events for a specific service:

[Screenshot: field-level search by eventSource for KMS logs]
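
For instance, to see only KMS events (the service value here is just an illustration):

eventSource:kms.amazonaws.com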

You could use a logical statement to search for a combination. For example, say you are looking for only KMS activity for a specific region:

[Screenshot: KMS activity for a specific AWS region]
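
A query along these lines would do it (the region value is a placeholder):

eventSource:kms.amazonaws.com AND awsRegion:us-east-1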

Kibana supports a wide range of queries, and we cover most of them in a separate article. Save the queries that interest you in order to use them in your visualizations and dashboards.

Visualizing CloudTrail Logs

Based on your Kibana queries, you can create your CloudTrail visualizations and dashboards. In this section, we will show a few simple examples of how to visualize your CloudTrail logs.

Top events:

[Visualization: top events]

Events over time:

[Visualization: events over time]

Source IP map:

[Visualization: source IP map]

Creating a CloudTrail Dashboard

These were just a few examples of how to visualize your CloudTrail logs. You can, of course, slice and dice the data in whichever manner you like depending on your needs and requirements. If, for example, you want to focus on S3 usage, you can make use of the relevant fields (e.g., requestParameters.bucketName).
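
For example, a query along these lines would narrow things down to activity against a particular bucket (the bucket name is a placeholder):

eventSource:s3.amazonaws.com AND requestParameters.bucketName:example-bucket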

In any case, once you have your visualizations created, your next step toward building a comprehensive monitoring tool is combining them into a dashboard.

[Screenshot: AWS CloudTrail dashboard in Kibana]

ELK Apps, the Logz.io library of Kibana searches, alerts, visualizations, and dashboards, includes a number of ready-made dashboards for CloudTrail logs, together with searches and alerts.

To install these dashboards, simply open ELK Apps, search for "CloudTrail," and install the dashboard that fits your needs.

Adding Alerting

Analysis and visualization are one thing, but being actively notified in real time when specific events take place is another.

Logz.io ships with an advanced alerting mechanism that lets you get notified, either by email or through a messaging or paging application such as PagerDuty, Datadog, or Slack, about specific API calls and events monitored by CloudTrail.

[Screenshot: creating a CloudTrail alert in Logz.io]

The ELK Stack does not include built-in alerting capabilities, so being notified when, for example, an unauthorized user accesses a specific resource requires additional configuration and cost.

Summary

CloudTrail is an extremely useful tool and is crucial for monitoring access and usage of your AWS-based IT environment. There is a lot of valuable information to be extracted from the data using Kibana's analysis and visualization capabilities.

There are lots of hidden treasures inside these events — and through visualizations and alerts, you can learn a lot about your AWS environment.
