Data Analytics on Application Events and Logs Using Elasticsearch, Logstash, and Kibana
In this post, we will learn how to use Elasticsearch, Logstash, and Kibana for running analytics on application events and logs.
Firstly, I will install all these applications on my local machine.
Installations
You can read my previous posts on how to install Elasticsearch, Logstash, Kibana, and Filebeat on your local machine.
Basic Configuration
I hope by now you have installed Elasticsearch, Logstash, Kibana, and Filebeat on your system. Now, let's do a few basic configurations that are required to run analytics on application events and logs.
Elasticsearch
Open the elasticsearch.yml file in the [ELASTICSEARCH_INSTALLATION_DIR]/config folder and add the below properties to it.
cluster.name: gauravbytes-event-analyzer
node.name: node-1
The cluster name is used by Elasticsearch nodes to form a cluster. Node names within a cluster need to be unique. We are running only a single instance of Elasticsearch on our local machine, but in a production-grade setup there will be master nodes, data nodes, and client nodes that you will configure as per your requirements.
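For reference, a node's role in a multi-node cluster can be declared in its own elasticsearch.yml. The sketch below shows how a dedicated master-eligible node might be configured (the node name is a placeholder; check the Elasticsearch documentation for the exact settings supported by your version):

```yaml
# elasticsearch.yml for a dedicated master-eligible node (sketch)
cluster.name: gauravbytes-event-analyzer
node.name: master-node-1
node.master: true   # eligible to be elected as the cluster master
node.data: false    # does not hold shard data
node.ingest: false  # does not run ingest pipelines
```

Data nodes would flip these flags (node.master: false, node.data: true), and client/coordinating nodes would set all three to false.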
Logstash
Open the logstash.yml file in the [LOGSTASH_INSTALLATION_DIR]/config folder and add the below properties to it.
node.name: gauravbytes-logstash
path.data: [MOUNTED_HDD_LOCATION]
config.reload.automatic: true
config.reload.interval: 30s
Creating Logstash Pipeline for Parsing Application Events and Logs
There are three parts in a pipeline: input, filter, and output. Below is the pipeline configuration for parsing application events and logs.
input {
    beats {
        port => "5044"
    }
}

filter {
    grok {
        match => { "message" => "\[%{TIMESTAMP_ISO8601:loggerTime}\] *%{LOGLEVEL:level} *%{DATA:loggerName} *- (?<event>(.|\r|\n)*)" }
    }

    if ([fields][type] == "appevents") {
        json {
            source => "event"
            target => "appEvent"
        }
        mutate {
            remove_field => "event"
        }
        date {
            match => [ "[appEvent][eventTime]", "ISO8601" ]
            target => "@timestamp"
        }
        mutate {
            replace => { "[type]" => "app-events" }
        }
    }
    else if ([fields][type] == "businesslogs") {
        mutate {
            replace => { "[type]" => "app-logs" }
        }
    }

    mutate {
        remove_field => "message"
    }
}

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "%{type}-%{+YYYY.MM.dd}"
    }
}
In the input section, we are listening on port 5044 for Beats input (Filebeat will send data to this port).
In the output section, we are persisting data in Elasticsearch in an index based on the type and date combination (e.g., app-events-2018.06.15).
Let's discuss the filter section in detail.
- 1) We are using the grok filter plugin to parse plain lines of text into structured data.
grok { match => { "message" => "\[%{TIMESTAMP_ISO8601:loggerTime}\] *%{LOGLEVEL:level} *%{DATA:loggerName} *- (?<event>(.|\r|\n)*)" } }
- 2) We are using the json filter plugin to convert the event field to a JSON object and store it in the appEvent field.
json { source => "event" target => "appEvent" }
- 3) We are using the mutate filter plugin to remove data we don't require.
mutate { remove_field => "event" } mutate { remove_field => "message" }
- 4) We are using the date filter plugin to parse the eventTime from the appEvent field in ISO8601 date format and then replace the @timestamp field with its value.
date { match => [ "[appEvent][eventTime]", "ISO8601" ] target => "@timestamp" }
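To see what the grok pattern extracts from a log line, here is a rough Java approximation of it using java.util.regex. Note that the regex and the sample line are hand-written sketches for illustration, not the exact expression grok compiles:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GrokSketch {
    // Rough Java equivalent of the pipeline's grok pattern:
    // \[%{TIMESTAMP_ISO8601:loggerTime}\] *%{LOGLEVEL:level} *%{DATA:loggerName} *- (?<event>(.|\r|\n)*)
    static final Pattern LOG_LINE = Pattern.compile(
        "\\[(?<loggerTime>[^\\]]+)\\] *(?<level>[A-Z]+) *(?<loggerName>\\S+) *- (?<event>(?s).*)");

    public static void main(String[] args) {
        String line = "[2018-06-15T10:30:12.345Z] INFO c.g.AppEventGenerator - {\"apiName\":\"login-api\"}";
        Matcher m = LOG_LINE.matcher(line);
        if (m.matches()) {
            System.out.println(m.group("loggerTime")); // 2018-06-15T10:30:12.345Z
            System.out.println(m.group("level"));      // INFO
            System.out.println(m.group("loggerName")); // c.g.AppEventGenerator
            System.out.println(m.group("event"));      // {"apiName":"login-api"}
        }
    }
}
```

The trailing (?s).* mirrors the grok pattern's (.|\r|\n)*, so multi-line payloads (such as stack traces joined by Filebeat's multiline settings) end up in the event field.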
Filebeat
Open the file filebeat.yml
in [FILEBEAT_INSTALLATION_DIR]
and add the below configurations.
filebeat.prospectors:
- type: log
enabled: true
paths:
- E:\gauravbytes-log-analyzer\logs\AppEvents.log
fields:
type: appevents
- type: log
enabled: true
paths:
- E:\gauravbytes-log-analyzer\logs\GauravBytesLogs.log
fields:
type: businesslogs
multiline.pattern: ^\[
multiline.negate: true
multiline.match: after
filebeat.config.modules:
path: ${path.config}/modules.d/*.yml
reload.enabled: false
setup.template.settings:
index.number_of_shards: 3
output.logstash:
hosts: ["localhost:5044"]
In the configurations above, we are defining two different types of Filebeat prospectors: one for application events and the other for application logs. We have also defined that the output should be sent to Logstash. The multiline settings ensure that any line not starting with [ (such as a stack trace) is appended to the preceding log line. There are many other configurations you can make by referring to the filebeat.reference.yml file in the Filebeat installation directory.
Kibana
Open the kibana.yml file in the [KIBANA_INSTALLATION_DIR]/config folder and add the below configuration to it.
elasticsearch.url: "http://localhost:9200"
We have only configured the Elasticsearch URL, but you can also change the Kibana host, port, name, and other SSL-related configurations.
Running ELK Stack and Filebeat
// running elasticsearch on windows
bin\elasticsearch.exe

// running logstash
bin\logstash.bat -f config\gauravbytes-config.conf --config.reload.automatic

// running kibana
bin\kibana.bat

// running filebeat
filebeat.exe -e -c filebeat-test.yml -d "publish"
Creating Application Event and Log structure
I have created two classes, AppEvent.java and AppLog.java, which capture information related to application events and logs. Below is the structure of both classes.
//AppEvent.java
public class AppEvent implements BaseEvent {
    public enum AppEventType {
        LOGIN_SUCCESS, LOGIN_FAILURE, DATA_READ, DATA_WRITE, ERROR;
    }

    private String identifier;
    private String hostAddress;
    private String requestIP;
    private ZonedDateTime eventTime;
    private AppEventType eventType;
    private String apiName;
    private String message;
    private Throwable throwable;
}

//AppLog.java
public class AppLog implements BaseEvent {
    private String apiName;
    private String message;
    private Throwable throwable;
}
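For the Logstash pipeline above to work, each event must be written as a single log line whose trailing portion (after the "- " separator) is valid JSON, so the json filter can parse it into the appEvent field. Here is a minimal sketch of producing such a line by hand; the field names come from AppEvent above, but the logger name and API values are hypothetical, and the real project would presumably use a logging framework plus a JSON serializer such as Jackson rather than string formatting:

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class EventLineSketch {
    // Builds a log line in the shape the grok pattern expects:
    // [ISO8601 timestamp] LEVEL loggerName - {json event}
    static String toLogLine(String level, String loggerName,
                            ZonedDateTime eventTime, String eventType, String apiName) {
        String json = String.format(
            "{\"eventTime\":\"%s\",\"eventType\":\"%s\",\"apiName\":\"%s\"}",
            eventTime, eventType, apiName);
        return String.format("[%s] %s %s - %s", eventTime, level, loggerName, json);
    }

    public static void main(String[] args) {
        ZonedDateTime time = ZonedDateTime.of(2018, 6, 15, 10, 30, 12, 0, ZoneOffset.UTC);
        System.out.println(toLogLine("INFO", "c.g.AppEventGenerator", time, "LOGIN_SUCCESS", "login-api"));
        // [2018-06-15T10:30:12Z] INFO c.g.AppEventGenerator - {"eventTime":"2018-06-15T10:30:12Z",...}
    }
}
```

The eventTime inside the JSON is what the pipeline's date filter reads from [appEvent][eventTime] to set @timestamp.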
Let's Generate Events and Logs
I have created a sample application to generate dummy events and logs. You can check out the full project on GitHub. There is an AppEventGenerator Java file. Run this class with the system argument -DLOG_PATH=[YOUR_LOG_DIR] to generate dummy events. If your log path is not the same as the one defined in the filebeat-test.yml file, then copy the log files generated by this project to the location defined in filebeat-test.yml. You will soon see the events and logs get persisted in Elasticsearch.
Running Analytics on Application Events and Logs in Kibana Dashboard
Firstly, we need to define Index patterns in Kibana to view the application events and logs. Follow the step-by-step guide below to create an Index pattern.
- Open Kibana dashboard by opening the URL (http://localhost:5601/).
- Go to the Management tab (Left pane, last option).
- Click on the Index Patterns link.
- You will see an already created index, if any. On the left side, you will see an option to Create Index pattern. Click on it.
- Now, define the index pattern and click Next. Choose the time filter field name. I chose the @timestamp field for this. You can select any other timestamp field present in this Index and finally click on the Create index pattern button.
Let's View the Kibana Dashboard
Once the Index pattern is created, click on the Discover tab on the left pane and select the index pattern you created in the previous steps.
You will see a beautiful GUI with a lot of options to mine the data. On the topmost pane, you will see the Auto-refresh option and the time range of data you want to fetch (last 15 minutes, 30 minutes, 1 hour, 1 day, and so on), and the dashboard will refresh automatically.
The next lane has a search box. You can further write queries to have a more granular view of the data. It uses Apache Lucene's query syntax.
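For example, using the field names from the AppEvent structure defined earlier (the values shown are hypothetical), queries in the search box could look like:

```
appEvent.eventType: LOGIN_FAILURE
appEvent.apiName: "login-api" AND level: ERROR
appEvent.requestIP: 10.0.* AND NOT appEvent.eventType: DATA_READ
```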
You can also define filters to have a more granular view of data.
This is how you can run the analytics using ELK on your application events and logs. You can also define complex custom filters, queries, and create a visualization dashboard. Feel free to explore Kibana's official documentation to use it to its full potential.
Published at DZone with permission of Gaurav Rai Mazra, DZone MVB. See the original article here.