Perform Bulk Inserts With Elasticsearch's REST High-Level Client

Generating data sets and inserting/ingesting them into databases is a key role of any data scientist. Learn how to do it with Elasticsearch!

By Sujith Menon · Jan. 07, 19 · Tutorial

When experimenting with a database, or when we simply need to throw some data at an application, it is often useful to generate random data. Faker is handy for this. It generates data for the domain objects you might model in your application: a person's first or last name, book titles with their authors and publishers, and so on. The full list of “fakers” (domain objects) is given in the Faker GitHub readme file. Another useful feature is that it can generate locale-specific data.

The Faker GitHub repository can be found here: Faker GitHub

In this tutorial, we will create a sample Spring Boot application, use the Faker dependency to generate some data, and then use that data to populate Elasticsearch. You could use any other database with any other Java-based application, depending on your needs. The tutorial also serves as an example of using Elasticsearch's REST High-Level Client.

1. Let us create a simple Spring Boot application and test the Faker service.

FakerAndESApp.java

package techgabs;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class FakerAndESApp {
    public static void main(String[] args){
        SpringApplication.run(FakerAndESApp.class, args);
    }
}

2. Create a class that prints the output of the Faker service.

TestFaker.java

package techgabs;
import com.github.javafaker.Faker;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

import java.util.Locale;

@Component
public class TestFaker {

    @EventListener
    public void test(ApplicationReadyEvent event){

        Faker faker = new Faker(new Locale("en-IND"));
        System.out.println(faker.name().firstName());
        System.out.println(faker.name().lastName());

        System.out.println(faker.name().firstName());
        System.out.println(faker.name().lastName());
    }
}

Sample output:


Chandira
Iyer
Varalakshmi
Naik

Note that the locale was set to “en-IND.” The entire list of supported locales can be found here: Faker GitHub
Also note that every call to faker.name().firstName() returns a newly generated random value, even when the same Faker object is reused.
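
To make the "new value on every call" behavior concrete, here is a minimal sketch of the idea. It is a simplified, hypothetical model (the class TinyNameFaker and its tiny name list are made up for illustration) and not the javafaker implementation: each call simply draws a random entry from a list of sample values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Simplified, hypothetical model of a "faker"; NOT the javafaker implementation.
class TinyNameFaker {
    // Hypothetical sample values; a real library ships much larger, locale-specific lists.
    private static final List<String> FIRST_NAMES =
            Arrays.asList("Chandira", "Varalakshmi", "Deenabandhu", "Baalaaditya");

    private final Random random;

    TinyNameFaker(Random random) {
        this.random = random;
    }

    // Each call draws a fresh random entry, so consecutive calls may differ.
    String firstName() {
        return FIRST_NAMES.get(random.nextInt(FIRST_NAMES.size()));
    }

    public static void main(String[] args) {
        TinyNameFaker faker = new TinyNameFaker(new Random());
        System.out.println(faker.firstName());
        System.out.println(faker.firstName());
    }
}
```

In this sketch, passing a seeded Random (for example, new Random(42L)) makes the sequence of names repeatable between runs, which can help when tests need stable data. javafaker appears to offer constructors that accept a java.util.Random for the same purpose, but check the version you use.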

3. Now that we know how Faker works, let us generate some book data and insert it into Elasticsearch (ES).

First, here are the Gradle dependencies the project needs (build.gradle):

plugins {
    id 'java'
}

group 'techgabs.faker.es'
version '1.0-SNAPSHOT'

sourceCompatibility = 1.8

repositories {
    mavenCentral()
}
dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.12'

    // https://mvnrepository.com/artifact/org.springframework.boot/spring-boot-starter-web
    compile group: 'org.springframework.boot', name: 'spring-boot-starter-web', version: '2.1.0.RELEASE'

    compile 'com.github.javafaker:javafaker:0.16'

    // https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-high-level-client
    compile 'org.elasticsearch.client:elasticsearch-rest-high-level-client:6.4.2'
}

4. Create a Book model to hold Faker-generated data.

Book.java

package techgabs.model;
public class Book {
    public String getAuthor() {
        return author;
    }
    public void setAuthor(String author) {
        this.author = author;
    }
    public String getGenre() {
        return genre;
    }
    public void setGenre(String genre) {
        this.genre = genre;
    }
    public String getPublisher() {
        return publisher;
    }
    public void setPublisher(String publisher) {
        this.publisher = publisher;
    }
    public String getTitle() {
        return title;
    }
    public void setTitle(String title) {
        this.title = title;
    }
    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }
    private String id;
    private String author;
    private String genre;
    private String publisher;
    private String title;
}

Prerequisites for Elasticsearch:

Please make sure Elasticsearch is up and running.

On Mac, I found it easier to install ES via brew.

brew update
brew install elasticsearch

On Windows, you can download the MSI from here -> Elasticsearch MSI For Windows

A better approach in both cases would be to use Docker to download an ES image and run it.

5. Create a service to generate fake data.

package techgabs.service;


import com.github.javafaker.Faker;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import techgabs.dao.BookDao;
import techgabs.model.Book;

import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.UUID;

@Service
public class BulkService {

    private Faker faker =new Faker(new Locale("en-IND"));

    @Autowired
    private BookDao bookDao;

    public void fakeBulkInsert(int count){

        bookDao.bulkInsert(getFakeBookList(count));
    }

    private List<Book> getFakeBookList(int count) {
        List<Book> bookList = new ArrayList<>();

        for(int i=0;i < count;i++) {
            Book book = new Book();
            book.setId(UUID.randomUUID().toString());
            book.setAuthor(faker.book().author());
            book.setGenre(faker.book().genre());
            book.setPublisher(faker.book().publisher());
            book.setTitle(faker.book().title());
            bookList.add(book);
        }
        return bookList;
    }
}

We will now use the RestHighLevelClient ES module to perform bulk inserts of the data generated in the previous step. Below is the Config class that creates the RestHighLevelClient. Note that it is important to close the client explicitly after use, because it wraps a low-level RestClient whose resources must be released. Please check the Elasticsearch REST High-Level Client docs for more information.

6. Config Class for Rest High-Level Client for ES.

ESConfig.java 
package techgabs.config;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;

import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.config.AbstractFactoryBean;
import org.springframework.context.annotation.Configuration;

import java.io.IOException;

@Configuration
public class ESConfig extends AbstractFactoryBean<RestHighLevelClient> {

    private RestHighLevelClient restHighLevelClient;

    @Override
    public Class<RestHighLevelClient> getObjectType() {
        return RestHighLevelClient.class;
    }

    @Override
    protected RestHighLevelClient createInstance() throws Exception {

        try {
            restHighLevelClient = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http"),
                            new HttpHost("localhost", 9201, "http")
                    )
            );

        }
        catch (Exception ex){
            System.out.println(ex.getMessage());
        }
        return restHighLevelClient;
    }

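    @Override
    public void destroy(){
        // The client is null if createInstance() never ran or failed
        if (restHighLevelClient == null) {
            return;
        }
        try {
            restHighLevelClient.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }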
}

7. Create a DAO layer to perform bulk inserts.

package techgabs.dao;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import techgabs.model.Book;

import java.io.IOException;
import java.util.List;
import java.util.Map;

@Component
public class BookDao {

    private static final String INDEX="book_index";

    private static final String TYPE="book_type";

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    @Autowired
    private ObjectMapper objectMapper;

    public void bulkInsert(List<Book> bookList){

        BulkRequest bulkRequest = new BulkRequest();

        bookList.forEach(book -> {
            IndexRequest indexRequest = new IndexRequest(INDEX,TYPE,book.getId()).
                    source(objectMapper.convertValue(book, Map.class));

            bulkRequest.add(indexRequest);
        });

        try {
            restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
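
For context, Elasticsearch's _bulk REST endpoint accepts newline-delimited JSON: one action line followed by one source line per document. The sketch below shows roughly what such a body looks like for the index actions built above. It is a hand-rolled illustration using only the JDK, not the client's own serialization code; the class BulkBodySketch and its helper methods are hypothetical, and the escaping is deliberately minimal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration of the _bulk request body; not the client's serialization code.
class BulkBodySketch {
    // Minimal JSON string escaping (quotes and backslashes only); a real app would use Jackson.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Renders a flat map of string fields as a one-line JSON object.
    static String toJson(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        fields.forEach((k, v) -> joiner.add(quote(k) + ":" + quote(v)));
        return joiner.toString();
    }

    // One bulk item = an action line plus a source line, each terminated by '\n'.
    static String indexAction(String index, String type, String id, Map<String, String> source) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("_index", index);
        meta.put("_type", type);
        meta.put("_id", id);
        return "{\"index\":" + toJson(meta) + "}\n" + toJson(source) + "\n";
    }

    public static void main(String[] args) {
        Map<String, String> book = new LinkedHashMap<>();
        book.put("id", "1");
        book.put("title", "As I Lay Dying");
        System.out.print(indexAction("book_index", "book_type", "1", book));
    }
}
```

Seeing the wire format makes it clearer why BulkRequest collects many IndexRequest objects: all of them travel to the cluster in a single HTTP call.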

8. We will now create a controller from which we will invoke the bulk insert service method.

package techgabs.controller;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;
import techgabs.service.BulkService;

@RestController
public class Controller {

    @Autowired
    private BulkService bulkService;

    @PostMapping("/faker/bulk/{count}")
    public void bulkInsertWithFakeData(@PathVariable("count") int count){
        bulkService.fakeBulkInsert(count);
    }
}
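
The endpoint passes count straight through, so a single bulk request carries every generated document. For larger counts, one option is to split the list into smaller batches and call bulkInsert once per batch. The following is a minimal sketch of such a splitter using only the JDK; the class BatchSketch, the partition method, and the batch size of 3 are assumptions for illustration and are not part of the tutorial's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for splitting a large list into bulk-sized batches.
class BatchSketch {
    // Splits items into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 7; i++) {
            ids.add(i);
        }
        // Each inner list would become one bulkInsert call.
        System.out.println(partition(ids, 3)); // prints [[1, 2, 3], [4, 5, 6], [7]]
    }
}
```

A suitable batch size depends on document size and cluster capacity, so it is worth measuring rather than assuming a fixed number.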

9. Insert data via a REST client, like Postman or curl.

POST http://localhost:8080/faker/bulk/2

Verify that the data is correctly inserted in ES.

POST http://localhost:9200/book_index/_search

Output:

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
            {
                "_index": "book_index",
                "_type": "book_type",
                "_id": "3f2bcae1-4314-4a05-b9f5-86782320e9da",
                "_score": 1,
                "_source": {
                    "id": "3f2bcae1-4314-4a05-b9f5-86782320e9da",
                    "author": "Deenabandhu Banerjee",
                    "genre": "Suspense/Thriller",
                    "publisher": "André Deutsch",
                    "title": "As I Lay Dying"
                }
            },
            {
                "_index": "book_index",
                "_type": "book_type",
                "_id": "1ea8da99-7df7-407f-a059-a2c31ae95138",
                "_score": 1,
                "_source": {
                    "id": "1ea8da99-7df7-407f-a059-a2c31ae95138",
                    "author": "Baalaaditya Banerjee",
                    "genre": "Speech",
                    "publisher": "Signet Books",
                    "title": "The Moving Toyshop"
                }
            }
        ]
    }
}

Summary

We created a sample application to demonstrate how Faker generates sample data, and then inserted that data into Elasticsearch. In the process, we also saw how to configure the RestHighLevelClient and use it to index documents in bulk. Finally, we verified the results using Elasticsearch's search REST endpoint.
