DZone

The Latest Big Data Topics

What We Learned Migrating to a Pub/Sub Architecture: Real-World Case Studies from High-Traffic Systems
Kafka-powered migration of an e-commerce platform—focused on scalability, fault tolerance, and clean event-driven design.
August 19, 2025
by Ravi Teja Thutari, DZone Core
· 8,011 Views · 4 Likes
Prompt-Based ETL: Automating SQL Generation for Data Movement With LLMs
Prompt-based ETL uses LLMs to convert plain English into validated, schema-aware SQL, automating data transformation and enabling self-serve analytics.
August 18, 2025
by Saisuman Singamsetty
· 6,769 Views · 4 Likes
Real-Time Analytics Using Zero-ETL for MySQL
Learn how to set up a zero-ETL integration from Amazon RDS for MySQL to Amazon Redshift using AWS CLI for real-time analytics without complex pipelines.
August 18, 2025
by Harpreet Kaur Chawla
· 1,868 Views · 2 Likes
Amazon EMRFS vs HDFS: Which One Is Right for Your Big Data Needs?
Unlock the potential of your data strategy. Discover how EMRFS and HDFS can optimize big data processing on Amazon EMR. Make an informed choice for success.
August 15, 2025
by Satrajit Basu, DZone Core
· 2,960 Views · 2 Likes
Data Pipeline Architectures: Lessons from Implementing Real-Time Analytics
Real-time pipelines sound great — until you're buried in Kafka ops, broken joins, and silent delays. Here's what actually works, and when simpler tools win.
August 15, 2025
by Chandan Shukla
· 2,401 Views · 2 Likes
How IoT Devices Communicate With Alexa, Google Assistant, and HomeKit — A Developer’s Deep Dive
Discover how voice assistants like Alexa, Google Assistant, and Siri communicate with IoT devices through cloud APIs, secure protocols, and smart home hubs.
August 14, 2025
by Praveen Chinnusamy
· 13,923 Views · 5 Likes
Cloud Data Engineering for Smarter Healthcare Marketing
Cloud data engineering helps healthcare marketers turn vast, unused patient data into real-time, personalized campaigns.
August 14, 2025
by Joydeep Bhattacharya, DZone Core
· 2,268 Views · 2 Likes
No More ETL: How Lakebase Combines OLTP, Analytics in One Platform
Databricks Lakebase is a serverless Postgres database unifying real-time transactions and analytics for modern apps and AI—no ETL or infra hassle.
August 14, 2025
by Junaith Haja
· 3,476 Views · 3 Likes
Declarative Pipelines in Apache Spark 4.0
Apache Spark's declarative pipelines let you define your entire data pipeline's desired outcome, and Spark handles the execution details.
August 12, 2025
by Sandeep Bishnoi
· 7,466 Views · 7 Likes
Build Your Own Customized ChatGPT Using OpenAI
Learn to build a no-code AI bot to generate test cases from user stories using ChatGPT. Customize tone, behavior, and export structured test scripts easily.
August 6, 2025
by Gaurav Sharma
· 2,594 Views · 3 Likes
Apache Spark Framework for Clustering Algorithms in Distributed Mode
A framework for training clustering algorithms that SparkML does not support, run in distributed mode on Apache Spark using custom partitioners and the mapPartition technique.
July 29, 2025
by Arun Kumar Natva
· 1,664 Views · 1 Like
Unity Catalog + AI: How Databricks Is Making Data Governance AI-Native in 2025
Data governance is shifting from a restrictive practice to an enabling force that facilitates intelligent, secure, and agile data management within enterprises.
July 25, 2025
by Sairamakrishna BuchiReddy Karri
· 3,294 Views · 1 Like
Data Partitioning and Bucketing: How Modern Data Systems Organize and Optimize Your Data
This article explains how partitioning and bucketing work in the big data world and talks about best practices that you can follow.
July 24, 2025
by Rajanikantarao Vellaturi
· 3,864 Views · 6 Likes
Building a Modern Data Platform That Delivers Real Business Value
Data modernization transforms how organizations manage and use data by aligning cloud-native tech, governance, and culture to drive value and agility.
July 23, 2025
by Amlan Patnaik
· 2,217 Views · 2 Likes
Designing Retry-Resilient Fare Pipelines With Idempotent Event Handling
Design retry-safe fare pipelines with idempotent event handling to ensure consistency across failures, retries, and duplications.
July 23, 2025
by Ravi Teja Thutari, DZone Core
· 3,182 Views · 2 Likes
Implementing Data Analytics in Healthcare: A Hands-On Approach
Success in healthcare data analytics requires cleaning and integrating data, ensuring privacy, and starting small before scaling up.
July 21, 2025
by Mykhailo Kopyl
· 1,498 Views · 2 Likes
Best Practices for Syncing Hive Data to Apache Doris: From Scenario Matching to Performance Tuning
Learn how to efficiently sync and analyze big data by combining Hive’s storage with Doris’s real-time analytics using various sync strategies and optimizations.
July 17, 2025
by Darren Xu
· 4,115 Views
Migrating Traditional Workloads From Classic Compute to Serverless Compute on Databricks
This tutorial explains the migration of Databricks workloads from Classic Compute to Serverless Compute for efficiency and cost effectiveness.
July 17, 2025
by Prasath Chetty Pandurangan
· 4,375 Views
Fraud Detection in Mobility Services With Apache Kafka and Flink
Mobility services like Uber and Grab use data streaming with Kafka and Flink for fraud detection by applying AI and machine learning in real time.
July 17, 2025
by Kai Wähner, DZone Core
· 4,852 Views · 1 Like
Streamline Your ELT Workflow in Snowflake With Dynamic Tables and Medallion Design
Dynamic Tables in Snowflake bring declarative, incremental ELT. Define SQL + freshness target, and Snowflake handles the orchestration, no dbt or Airflow needed.
July 16, 2025
by Harshavardhan Yedla
· 4,280 Views · 1 Like