DZone
The Latest Big Data Topics

Creating Live Dashboards With QuickSight
See how you can bring together AWS Lambda, S3 and QuickSight to create a live dashboard of COVID-19 vaccination.
May 20, 2021
by James Sugrue
· 9,310 Views · 4 Likes
Introduction to Apache Kafka With Spring
Introduction to Apache Kafka with Spring.
May 20, 2021
by Otavio Santana DZone Core
· 13,180 Views · 12 Likes
Looking for the Best Java Data Computation Layer Tool
This essay is a deep dive into 4 types of data computation layer tools (class libraries) to compare structured data computing capabilities and basic functionalities.
May 20, 2021
by Jerry Zhang
· 6,972 Views · 1 Like
Best Practices for Data Pipeline Error Handling in Apache NiFi
Learn actionable strategies for error management modeling in Apache NiFi data pipelines, and understand the benefits of planning for error handling.
May 19, 2021
by Pieter Humphrey
· 18,289 Views · 8 Likes
Migrate Data Across Kafka Cluster Using mirrormaker2 in Strimzi
In this article, we discuss a use case where data from one Kafka cluster has to be migrated to another Kafka cluster, using MirrorMaker 2.
Updated May 18, 2021
by Chandra Shekhar Pandey
· 9,575 Views · 2 Likes
Deploying an Apache Kafka mock service with Microcks
Microcks is an open source Kubernetes-native platform for API mocking and testing. You can use the AsyncAPI specification examples to tell Microcks to generate events to Apache Kafka with a simple configuration.
May 17, 2021
by Hugo Guerrero DZone Core
· 19,347 Views · 9 Likes
High-Performance Batch Processing Using Apache Spark and Spring Batch
Batch processing deals with large amounts of data: it is a method of running high-volume, repetitive data jobs, each of which performs a specific task.
May 16, 2021
by Reza Ganji DZone Core
· 29,590 Views · 7 Likes
Veeva Nitro and AWS SageMaker for Life Sciences Data Scientists
There is a rise in industry-specific data analytics solutions because building up and maintaining a custom data warehouse is difficult.
May 14, 2021
by Istvan Szegedi
· 6,759 Views · 2 Likes
Deploy Elasticsearch on Kubernetes Using OpenEBS LocalPV
Overview

Elastic Stack is a group of open-source tools that includes Elasticsearch for supporting data ingestion, storage, enrichment, visualization, and analysis for containerized applications. As a distributed search and analytics engine, Elasticsearch ingests application data, indexes it, and then stores it for analytics. Since it gathers large volumes of data while indexing different data types, Elasticsearch is often considered write-heavy. To manage such dynamic volumes of data, Kubernetes makes it easy to configure, manage, and scale Elasticsearch clusters. Kubernetes also simplifies the provisioning of resources for Elasticsearch using Infrastructure-as-Code configurations, abstracting cluster management.

While Kubernetes alone cannot store data generated by a cluster, persistent volumes can be used to retain it for future use. To help with this, OpenEBS provisions local persistent volumes (LocalPV) and allows data to be stored on physical disks. Many users have shared their experience of using OpenEBS for local storage management in Kubernetes for Elasticsearch, including the Cloud Native Computing Foundation, ByteDance (TikTok), and Zeta Associates (Lockheed Martin), on the Adopters list in the OpenEBS community.

In this guide, we explore how OpenEBS LocalPV can provision data storage for Elasticsearch clusters. This guide will also cover:

  • Primary functions of Elastic Stack operators in a Kubernetes cluster
  • Integrating Elasticsearch operators with Fluentd and Kibana to form the EFK stack
  • Monitoring Elasticsearch cluster metrics with Prometheus and Grafana

Getting Started with Elasticsearch Analytics

Elasticsearch extends the ability to store and search large amounts of textual, graphical, or numerical data efficiently. Kubernetes makes it easy to manage the connections between Elasticsearch nodes, thereby simplifying the deployment of Elasticsearch on-premises or in hosted cloud environments.

Note that Elasticsearch nodes are different from the Kubernetes nodes of a cluster. While an Elasticsearch node runs a single instance of Elasticsearch, a Kubernetes node is a physical or virtual machine that the orchestrator runs on.

Elasticsearch Cluster Topology

From Kubernetes' point of view, an Elasticsearch node can be considered a Pod. Whenever an Elasticsearch cluster is deployed, three types of Elasticsearch Pods are created:

  • Master - manages the Elasticsearch cluster
  • Client - directs incoming traffic to the appropriate Pods
  • Data - responsible for storing cluster data and making it available

[Diagram: topology of a typical 7-Pod Elasticsearch cluster with 3 master, 2 client, and 2 data nodes]

Deploying Elasticsearch involves creating manifest files for each of the cluster's Pods. By connecting to the cluster, OpenEBS creates a visibility tier that enables cluster monitoring, logging, and topology checks for LocalPV storage. Additionally, to enable cluster-wide analytics, the following tools are deployed:

  • Fluentd - An open-source data collection agent that integrates with Elasticsearch to collect log data, transform it, and then ship it to the Elastic backend. Fluentd is set up on cluster nodes to collect and convert Pod information and send it to the Elasticsearch data Pods for storage and indexing. It is typically set up as a DaemonSet to run on each Kubernetes worker node.
  • Kibana - Once the cluster is deployed on Kubernetes, it needs to be monitored and managed. Kibana is used as the visualization tool for cluster data; it is pointed at the cluster by providing the Elasticsearch client service as an environment variable.

Solution Guide

The following solution guide explains the steps and important considerations for deploying Elasticsearch clusters on Kubernetes using OpenEBS persistent volumes. By following the guide, you can create persistent storage for the EFK stack on a Kubernetes cluster to which OpenEBS is deployed. The guide includes steps on performing metric checks and performance monitoring for the Elasticsearch cluster using Prometheus and Grafana.

Let us know how you use Elasticsearch in production and if you have an interesting use case to share. Also, please check out other OpenEBS deployment guides on common Kubernetes stateful workloads on our website:

  • Deploying Kafka on Kubernetes
  • Deploying WordPress on DigitalOcean Kubernetes
  • Deploying Magento on Kubernetes
  • Deploying Percona on Kubernetes
  • Deploying Cassandra on Kubernetes
  • Deploying MinIO on Kubernetes
  • Deploying Prometheus on Kubernetes

This article has already been published on https://blog.mayadata.io/deploy-elasticsearch-on-kubernetes-using-openebs-localpv and has been authorized by MayaData for a republish.
May 12, 2021
by Sudip Sengupta DZone Core
· 7,903 Views · 3 Likes
How Do AI Systems Identify Duplicate Data?
A discussion of AI concepts, such as comparing records in a database, and how these techniques can be used in conjunction with Salesforce.
May 10, 2021
by Ilya Dudkin DZone Core
· 16,153 Views · 3 Likes
Spring Cloud Stream Channel Interceptor
A Channel Interceptor is used to capture a message before being sent or received in order to view or modify it. Learn how a channel interceptor works and how to use it.
May 5, 2021
by Mohammed ZAHID
· 15,336 Views · 4 Likes
Deploying Kafka on OpenShift
Bringing Kafka to the cloud.
May 2, 2021
by Niklas Heidloff
· 6,188 Views · 4 Likes
Building Hybrid Multi-Cloud Event Mesh With Apache Camel and Kubernetes
A full installation guide for building the event mesh with Apache Camel. We will use a microservice, a function, and a connector as the connector nodes in the mesh.
May 1, 2021
by Christina Lin DZone Core
· 13,578 Views · 4 Likes
Develop a Scraper With Node.js, Socket.IO, and Vue.js/Nuxt.js
Develop a web scraper with Node.js, using Vue.js on the front end and Socket.IO to get real-time data.
Updated April 28, 2021
by Dwayne O. Smith
· 15,602 Views · 6 Likes
Migrate HDFS Data to Azure
A developer and Hadoop expert runs through the processes he and his team used to transfer their data over the network with TLS encryption when switching to Azure.
April 23, 2021
by Radhika Mekala
· 8,388 Views · 3 Likes
Bare Metal Vs The World: When And Why To Use This IoT OS
Bare Metal for IoT runs one application at a time—in stark contrast to regular operating systems. Let's consider the options for developers and devices.
April 22, 2021
by Carsten Rhod Gregersen
· 17,119 Views · 3 Likes
Resolving Permission Issue in Multi-node Hadoop Cluster
When configuring and deploying a multi-node Hadoop cluster or adding new DataNodes, an SSH permission issue can arise in communication with the Hadoop daemons.
April 22, 2021
by Gautam Goswami DZone Core
· 7,852 Views · 2 Likes
Data Platform as a Service
When we design and build a Data Platform, we are working on providing the capacities and tools that other teams need to develop their projects. Let's discuss!
April 20, 2021
by Miguel Garcia DZone Core
· 16,193 Views · 16 Likes
Graph-Based Data Science, Machine Learning, and AI
What does graphing have to do with machine learning and data science? A lot, actually — learn more in The Year of the Graph Newsletter's Spring 2021 edition.
April 20, 2021
by George Anadiotis
· 9,897 Views · 5 Likes
MQTT 5 vs. MQTT v3.1.1 for IoT App Development
See what's new with MQTT and IoT.
April 17, 2021
by Daniil Liadov
· 18,667 Views · 2 Likes