
Parse Elasticsearch Results Using Ruby


One of the modules in our project is an Elasticsearch cluster.

To fine-tune the configuration (shards, replicas, mapping, etc.) and the queries, we created a JMeter environment.

I wanted to test a simple query with many different input parameters, all of which return results, i.e. queries for documents that exist.

The JMeter setup is simple. I created the query I wanted to check as a POST parameter.
Instead of hard-coding one specific value in that query, which would mean sending the same values over and over, I used a parameter and directed JMeter to read the parameter values from a CSV file.

The next step was to create that data file: a file whose rows contain real values from the cluster.

For that I used another query, which I ran against the cluster using curl.
(I changed some of the parameter names.)

{
   "fields":[
      "FIELD_1"
   ],
   "size":10000,
   "query":{
      "constant_score":{
         "filter":{
            "bool":{
               "must":[
                  {
                     "term":{
                        "LIVE":true
                     }
                  },
                  {
                     "exists":{
                        "field":"FIELD_1"
                     }
                  }
               ]
            }
         }
      }
   }
}
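
If the same query is needed for other fields, the body can also be assembled in Ruby instead of being typed by hand. The following is only a sketch: build_exists_query is a hypothetical helper name, and the field names and size are simply the ones from the query above.

```ruby
require 'json'

# Hypothetical helper (not part of the original setup): builds the
# "LIVE documents where the field exists" query body shown above.
def build_exists_query(field_name, size = 10_000)
  {
    'fields' => [field_name],
    'size'   => size,
    'query'  => {
      'constant_score' => {
        'filter' => {
          'bool' => {
            'must' => [
              { 'term'   => { 'LIVE' => true } },
              { 'exists' => { 'field' => field_name } }
            ]
          }
        }
      }
    }
  }
end

# Print the JSON body, ready to be saved to a file and passed to curl
puts JSON.pretty_generate(build_exists_query('FIELD_1'))
```

The printed JSON can be redirected to a file and sent to the cluster the same way as the hand-written query.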

I piped the result into a file.
Here’s a sample of the file (I changed the names of the index, document type and values for this example):

{
  "took" : 586,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 63807792,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "my_index",
      "_type" : "the_document",
      "_id" : "1111111",
      "_score" : 1.0,
      "fields" : {
        "FIELD_1" : "123"
      }
    }, {
      "_index" : "my_index",
      "_type" : "the_document",
      "_id" : "22222222",
      "_score" : 1.0,
      "fields" : {
        "FIELD_1" : "12345"
      }
    }, {
      "_index" : "my_index",
      "_type" : "the_document",
      "_id" : "33333333",
      "_score" : 1.0,
      "fields" : {
        "FIELD_1" : "4456"
      }
    } ]
  }
}

The next step was parsing this JSON file, taking only FIELD_1 from each hit, and writing the values to a new file.
For that I used Ruby:

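#!/usr/bin/ruby

require 'rubygems'
require 'json'

# Arguments: <input_json_file> <output_file>
input_file = ARGV[0]
output_file = ARGV[1]

# Parse the whole search response and drill down to the array of hits
json = File.read(input_file)
obj = JSON.parse(json)
hits = obj['hits']

actual_hits = hits['hits']
begin
  file = File.open(output_file, "w")
  # Write one FIELD_1 value per line
  actual_hits.each do |hit|
    fields = hit['fields']
    field1 = fields['FIELD_1']
    file.puts(field1)
  end
rescue IOError, SystemCallError => e
  # Report the failure instead of silently ignoring it
  warn "Could not write #{output_file}: #{e.message}"
ensure
  file.close unless file.nil?
end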
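
The script above assumes every hit carries a "fields" entry with FIELD_1. A slightly more defensive variant of the extraction step might look like the sketch below. extract_field_values is a hypothetical helper name, and the inline sample is a trimmed version of the response shown earlier, with one hit deliberately missing its fields.

```ruby
require 'json'

# Hypothetical helper: returns the value of field_name from every hit that
# has it, skipping hits without a "fields" entry or without that field.
def extract_field_values(json_text, field_name)
  response = JSON.parse(json_text)
  hits = (response['hits'] || {})['hits'] || []
  values = []
  hits.each do |hit|
    fields = hit['fields']
    next if fields.nil? || !fields.key?(field_name)
    values << fields[field_name]
  end
  values
end

# Trimmed sample: the second hit has no "fields" entry and is skipped
sample = '{"hits":{"hits":[{"_id":"1111111","fields":{"FIELD_1":"123"}},{"_id":"22222222"}]}}'
puts extract_field_values(sample, 'FIELD_1')
```

Keeping the extraction in a function that returns an array also separates parsing from file writing, which makes each part easier to check on its own.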

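Important note:
There’s a shorter way to write to a file in Ruby, File.write, which was added in Ruby 1.9.3:

File.write(output_file, field1)

Note that each call to File.write replaces the file's contents, so the values would have to be collected first and written in a single call. Unfortunately I can’t use it, as I have an older Ruby version that I can’t upgrade in our sandbox environment.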
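
One way to get the shorter form where it is available without breaking on an older Ruby is to check for File.write at runtime. This is a minimal sketch, not something from the original setup: write_lines is a hypothetical helper name, and the demo writes the three sample values to a temporary file.

```ruby
require 'tempfile'

# Hypothetical helper: writes one value per line, using File.write when the
# running Ruby provides it and falling back to File.open otherwise.
def write_lines(path, values)
  content = values.map { |v| "#{v}\n" }.join
  if File.respond_to?(:write)
    File.write(path, content)
  else
    File.open(path, 'w') { |f| f.write(content) }
  end
end

# Demo against a temporary file, using the values from the sample response
tmp = Tempfile.new('field_values')
write_lines(tmp.path, %w[123 12345 4456])
puts File.read(tmp.path)
tmp.close
tmp.unlink
```

Both branches write the whole content in one call, so the file is opened once regardless of how many values there are.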

Published at DZone with permission of Eyal Golan, DZone MVB. See the original article here.
