DZone
The Latest Big Data Topics

Apache Spark Framework for Clustering Algorithms in Distributed Mode
A framework for training clustering algorithms that SparkML does not support, run in distributed mode on Apache Spark using custom partitioners and the mapPartitions technique.
July 29, 2025
by Arun Kumar Natva
· 1,578 Views · 1 Like
Unity Catalog + AI: How Databricks Is Making Data Governance AI-Native in 2025
Data governance is shifting from a restrictive practice to an enabling force, facilitating intelligent, secure, and agile data management within enterprises.
July 25, 2025
by Sairamakrishna BuchiReddy Karri
· 3,154 Views · 1 Like
Data Partitioning and Bucketing: How Modern Data Systems Organize and Optimize Your Data
This article explains how partitioning and bucketing work in big data systems and covers best practices you can follow.
July 24, 2025
by Rajanikantarao Vellaturi
· 3,704 Views · 6 Likes
Building a Modern Data Platform That Delivers Real Business Value
Data modernization transforms how organizations manage and use data by aligning cloud-native tech, governance, and culture to drive value and agility.
July 23, 2025
by Amlan Patnaik
· 2,161 Views · 2 Likes
Designing Retry-Resilient Fare Pipelines With Idempotent Event Handling
Design retry-safe fare pipelines with idempotent event handling to ensure consistency across failures, retries, and duplications.
July 23, 2025
by Ravi Teja Thutari, DZone Core
· 3,087 Views · 2 Likes
Implementing Data Analytics in Healthcare: A Hands-On Approach
Success in healthcare data analytics requires cleaning and integrating data, ensuring privacy, and starting small before scaling up.
July 21, 2025
by Mykhailo Kopyl
· 1,462 Views · 2 Likes
Best Practices for Syncing Hive Data to Apache Doris: From Scenario Matching to Performance Tuning
Learn how to efficiently sync and analyze big data by combining Hive’s storage with Doris’s real-time analytics using various sync strategies and optimizations.
July 17, 2025
by Darren Xu
· 4,037 Views
Migrating Traditional Workloads From Classic Compute to Serverless Compute on Databricks
This tutorial explains how to migrate Databricks workloads from Classic Compute to Serverless Compute for efficiency and cost-effectiveness.
July 17, 2025
by Prasath Chetty Pandurangan
· 4,337 Views
Fraud Detection in Mobility Services With Apache Kafka and Flink
Mobility services like Uber and Grab use data streaming with Kafka and Flink for fraud detection, applying AI and machine learning in real time.
July 17, 2025
by Kai Wähner, DZone Core
· 4,801 Views · 1 Like
Streamline Your ELT Workflow in Snowflake With Dynamic Tables and Medallion Design
Dynamic Tables in Snowflake bring declarative, incremental ELT. Define the SQL and a freshness target, and Snowflake handles the orchestration, with no dbt or Airflow needed.
July 16, 2025
by Harshavardhan Yedla
· 4,142 Views · 1 Like
Data Ingestion: The Front Door to Modern Data Infrastructure
AWS offers a rich set of ingestion services. This guide provides industry use cases and a cheat sheet to help you choose the right one for your organization.
July 16, 2025
by Junaith Haja
· 2,657 Views · 3 Likes
Dashboards Are Dead Weight Without Context: Why BI Needs More Than Visuals
BI engineers must build context-rich, code-driven dashboards that enable decisions. Actionable pipelines and semantic clarity are now the standard.
July 15, 2025
by Venkata Murali Krishna Nerusu
· 4,364 Views · 1 Like
Designing Configuration-Driven Apache Spark SQL ETL Jobs with Delta Lake CDC
Simplify complex ETL pipelines and enable scalable, maintainable data processing with Spark SQL and Delta Lake Change Data Capture.
July 14, 2025
by Janaki Ganapathi
· 2,065 Views
Contract-Driven ML: The Missing Link to Trustworthy Machine Learning
Model accuracy means nothing if data breaks in production. Learn how data contracts ensure reliability, prevent silent failures, and protect ML performance.
July 10, 2025
by Sana Zia Hassan
· 2,109 Views · 1 Like
Build Real-Time Analytics Applications With AWS Kinesis and Amazon Redshift
Build real-time analytics with AWS Kinesis for streaming, AWS Lambda for processing, and Amazon Redshift for scalable data analysis.
July 10, 2025
by Danil Temnikov, DZone Core
· 3,067 Views · 4 Likes
Top 5 Trends in Big Data Quality and Governance in 2025
Explore the top 5 trends in data quality and governance for 2025, from real-time validation to AI-powered checks and privacy-first practices.
July 10, 2025
by Vivek Venkatesan
· 2,070 Views · 2 Likes
Breaking Free from ZooKeeper: Why Kafka’s KRaft Mode Matters
Kafka is shifting from ZooKeeper to KRaft mode for better scalability, faster recovery, and lower complexity, using a Raft-based quorum for metadata management.
July 9, 2025
by Ammar Husain, DZone Core
· 2,006 Views · 4 Likes
The AWS Playbook for Building Future-Ready Data Systems
Gone are the days when teams dumped everything into a central data warehouse and hoped analytics would magically appear.
July 9, 2025
by Junaith Haja
· 3,110 Views · 7 Likes
How Developers Are Driving Supply Chain Innovation With Modern Tech
This blog shows how developers use modern tech to transform supply chains with smart, code-first solutions and real-time systems.
July 7, 2025
by Jatin Lamba
· 1,906 Views · 1 Like
Understanding k-NN Search in Elasticsearch
Elasticsearch is a powerful distributed search engine, and its k-NN search with text embedding model integration makes it well suited to modern AI-driven search solutions.
July 7, 2025
by Govind Singh Rawat
· 4,110 Views · 7 Likes