DZone
The Latest Big Data Topics

A Fresh Look at Optimizing Apache Spark Programs
Optimize Spark jobs by tuning configurations, writing efficient code (DataFrames, broadcast joins), using optimized storage, and monitoring the Spark UI and logs.
October 14, 2025
by Nataraj Mocherla
· 3,020 Views · 2 Likes
How Developers Use Synthetic Data to Stress-Test Models in Noisy Markets
Synthetic data lets quants stress-test equity strategies beyond noisy markets, preserving volatility, and building resilience before risking real capital.
October 14, 2025
by Jay Mehta
· 1,319 Views · 1 Like
Operationalizing Responsible AI: Turning Ethics Into Engineering
This article provides direction on building a reliable AI system in production by incorporating bias mitigation strategies.
October 13, 2025
by Jofia Jose Prakash
· 2,197 Views · 2 Likes
Apache Iceberg REST Catalog: The Key to Vendor-Agnostic Data Interoperability
This article demonstrates how Iceberg data can be accessed through the Iceberg REST Catalog from a data mesh with a simple Python application.
October 13, 2025
by Pravin Dwiwedi
· 2,606 Views · 1 Like
Introduction to Spring Data Elasticsearch 5.5
Getting started with the latest version of Spring Data Elasticsearch 5.5 and Elasticsearch 8.18 as a NoSQL database for our data storage.
October 10, 2025
by Arnošt Havelka (DZone Core)
· 3,038 Views · 2 Likes
8 Challenges in Multimodal Training Data Creation
Creating high-quality multimodal training data is essential yet complex, involving challenges in synchronization, scalability, context capture, and tooling.
October 8, 2025
by Chirag Shivalker
· 2,953 Views
7 AWS Services Every Data Engineer Should Master
In 2025, S3, Glue, Lambda, Athena, Redshift, EMR, and Kinesis form the core AWS toolkit for building fast, reliable, and scalable data pipelines.
October 6, 2025
by Sai Mounika Yedlapalli
· 4,229 Views · 3 Likes
From Big Data to Agents: My Decade Building Systems
How a simple scraper, a few dashboards, and a lot of curiosity turned into agentic systems that actually ship value. A builder’s path.
October 3, 2025
by Nacho Corcuera
· 2,758 Views · 2 Likes
Building a Scalable and Reliable Marketing Data Stack on GCP
A resilient marketing data stack on GCP leverages BigQuery, Pub/Sub, and Dataflow to deliver real-time insights, handle schema drift, and scale analytics.
October 2, 2025
by Shafeeq Ur Rahaman
· 1,325 Views · 2 Likes
Salesforce Data Cloud: Setting Up and Using the Ingestion API
In this guide, learn to use Salesforce Data Cloud Ingestion API for real-time and bulk data ingestion to deliver accurate, personalized customer experiences.
October 2, 2025
by Ramesh Bellamkonda
· 3,921 Views · 2 Likes
Implementing Governance on Databricks Using Unity Catalog
We learn how to treat data as a product through governance in Unity Catalog, ensuring the right people, metadata about the datasets.
October 1, 2025
by Junaith Haja
· 2,847 Views · 3 Likes
Master Advanced Error-Handling to Make PySpark Pipelines Production-Ready
PySpark jobs often fail because of bad data, network issues, or logic errors, sometimes after hours of processing. Learn how to make your Spark pipelines more reliable.
September 30, 2025
by Ram Ghadiyaram (DZone Core)
· 4,921 Views · 6 Likes
Complex Data Tasks Are Now One-Liners With AI in Databricks SQL
In this guide, learn how to simplify data tasks with AI in Databricks SQL — summarize, translate, analyze sentiment, and mask PII with one-liner queries.
September 26, 2025
by Junaith Haja
· 3,604 Views · 2 Likes
AWS Glue Crawlers: Common Pitfalls, Schema Challenges, and Best Practices
Learn key challenges and best practices for using AWS Glue crawlers, from handling CSV schema issues to schema evolution, partitions, and ETL jobs.
September 25, 2025
by Saradha Nagarajan
· 4,018 Views
LLMs at the Edge: Decentralized Power and Control
Deploying LLMs at the edge decentralizes intelligence, enhances privacy, reduces latency, increases autonomy, and empowers local control.
September 23, 2025
by Bhanuprakash Madupati
· 23,015 Views · 4 Likes
Azure IOT Cloud-to-Device Communication Methods
Learn how to choose the correct cloud-to-device communication method for sending messages to a device through Azure IoT Hub and build an effective IoT system.
September 22, 2025
by Anup Rao
· 2,948 Views
The Real-time Data Transfer Magic of Doris Kafka Connector's "Data Package": Part 1
One Man Stands Guard, and Ten Thousand Cannot Pass! Learn all about real-time data import, transformation, and error handling with Doris Kafka Connector.
September 12, 2025
by Michael Hayden
· 3,007 Views · 1 Like
Azure VM Instance Types and Their Roles in Different Distributed Software Systems
Azure provides various VM instance types optimized for compute, memory, storage, or GPU needs, which serve systems such as Databricks, Snowflake, AKS, Synapse, and Azure Functions.
September 11, 2025
by Srinivasarao Rayankula
· 19,602 Views · 3 Likes
AI on the Fly: Real-Time Data Streaming From Apache Kafka to Live Dashboards
Real-time data streaming plays a key role for AI models because it lets them handle and respond to data as it arrives instead of relying only on old, fixed datasets.
September 11, 2025
by Gautam Goswami (DZone Core)
· 7,040 Views
From HTTP to Kafka: A Custom Source Connector
Learn how to implement a custom Kafka Connect HTTP source connector to integrate with HTTP endpoints, covering connector configuration, deployment, and usage.
September 10, 2025
by Ion Pascari
· 4,727 Views · 5 Likes