Learn how to run PostgreSQL in a Docker container for a portable, isolated, and resource-efficient database setup that is easy to manage and scale.
Learn about Apache Iceberg catalogs, their types, configurations, and the best-fit solutions for managing metadata and large datasets in different environments.
The rapid evolution of big data technologies has highlighted the need for a smooth transition between real-time data analytics and batch processing systems.
Migrating your on-premises data warehouse to the cloud requires thorough planning. This article addresses the critical factors to weigh in the decision-making process.
Delta Live Tables (DLT) in Databricks streamlines data pipelines, enhancing data quality, simplifying pipeline management, and enabling real-time data processing.
Compare MPI and Spark for big data processing and find out which framework best suits your parallel and distributed computing needs.
Learn how to effectively deploy and manage Kafka on Kubernetes with our comprehensive guide. Discover best practices, tips, and tools to optimize your streaming applications.
Learn how open-source BI tools transform and improve DevOps pipelines by enhancing data visibility, automation, and collaboration for streamlined workflows.
When applications cannot publish messages to a Kafka topic, or downstream applications cannot consume them, the system is considered to be experiencing an outage.
Use Dust Java Actors to create a pipeline that automatically finds, reads, and extracts specific information from news articles based on your topic of interest.
Let's discuss the multiple advantages of using cloud computing for big data processing, from scalability to cost-effectiveness and enhanced collaboration.