Big Data Resources

The Shift to Open Industrial IoT Architectures With Data Streaming

Modernize OT with data streaming, Kafka, MQTT, and OPC-UA to replace legacy middleware, enabling real-time data and scalable cloud integration.

July 4, 2025

by Kai Wähner

CORE

· 2,920 Views · 1 Like

How Predictive Analytics Became a Key Enabler for the Future of QA

Predictive analytics turns QA into a proactive process, using data and ML to spot defects early, speed up releases, and reduce bugs by up to 30%.

July 4, 2025

by Nidhi Sharma

· 1,677 Views · 4 Likes

Deploying LLMs Across Hybrid Cloud-Fog Topologies Using Progressive Model Pruning

Deploying LLMs at the edge is hard due to size and resource limits. This guide explores how progressive model pruning enables scalable hybrid cloud–fog inference.

July 2, 2025

by Sam Prakash Bheri

· 2,303 Views · 1 Like

Replacing Legacy Systems With Data Streaming: The Strangler Fig Approach

Modernize legacy systems without the risk of a full rewrite. Strangler Fig + data streaming enables scalable, real-time transformation.

July 1, 2025

by Kai Wähner

CORE

· 1,558 Views · 4 Likes

Transform Settlement Process Using AWS Data Pipeline

Modern AWS data pipelines automate ETL for settlement files using S3, Glue, Lambda, and Step Functions, transforming data from raw to curated with full orchestration.

June 30, 2025

by Prabhakar Mishra

· 1,902 Views · 2 Likes

DevOps at the Edge: Deploying Machine Learning Models on IoT Devices

Deploying ML models on IoT devices using DevOps practices enables scalable, low-latency intelligence at the edge without managing cloud infrastructure.

June 25, 2025

by Bhanu Sekhar Guttikonda

CORE

· 5,046 Views · 4 Likes

How Trustworthy Is Big Data? A Guide to Real-World Challenges and Solutions

Big data only delivers value when it's reliable. Identify and fix trust issues like schema drift, outliers, and silent errors using Deequ and Great Expectations.

June 25, 2025

by Vivek Venkatesan

· 1,887 Views

Optimizing Data Pipelines in Cloud-Based Systems: Tools and Techniques

Learn how to build and optimize data pipelines in cloud-based systems to process and transfer vast amounts of data effectively.

June 24, 2025

by Anil Jonnalagadda

· 19,213 Views · 8 Likes

Storage-Computing Integration vs. Separation: Architectural Trade-offs, Use Cases, and Insights from Apache Doris

Discover the pros, cons, and use cases of storage-computing integration vs. separation, with real-world insights from Apache Doris’s hybrid architecture.

June 24, 2025

by Darren Xu

· 2,207 Views · 1 Like

Enabling more informed, transparent, and responsive policies that directly address societal needs and enhance resilience in the face of issues.

June 24, 2025

by Ram Ghadiyaram

CORE

· 2,423 Views · 4 Likes

Unveiling Supply Chain Transformation: IIoT and Digital Twins

Dive into industrial automation and see how the Industrial Internet of Things (IIoT) and digital twins (DTs) use key technologies to advance the supply chain.

June 24, 2025

by Manvinder Kumra

· 1,962 Views

Snowflake Cortex for Developers: How Generative AI and SaaS Enable Self-Serve Data Analytics

This article explores how Snowflake Cortex, Snowflake’s generative AI solution, advances self-serve analytics for both structured and unstructured data.

June 23, 2025

by Dipankar Saha

· 2,999 Views · 2 Likes

Architects of Ambient Intelligence With IoT and AI Personal Assistants

Traditional IoT and AI setups face latency, privacy, and ecosystem issues. Decentralized AI and federated learning enable real-time, privacy-centric, user-trusted solutions.

June 23, 2025

by Praveen Chinnusamy

· 5,957 Views · 3 Likes

Real-Object Detection at the Edge: AWS IoT Greengrass and YOLOv5

Real-time object detection at the edge using YOLOv5 and AWS IoT Greengrass enables fast, offline, and scalable processing in bandwidth-limited or remote environments.

June 23, 2025

by Anil Jonnalagadda

· 20,726 Views · 17 Likes

Building an IoT Framework: Essential Components for Success

Understanding the building blocks for creating secure, scalable, and efficient IoT frameworks that deliver real-world value and future-proof performance.

June 20, 2025

by Carsten Rhod Gregersen

· 2,337 Views · 2 Likes

Top Trends for Data Streaming With Apache Kafka and Flink

Explore how Apache Kafka and Apache Flink are transforming data streaming, powering real-time analytics, and shaping cloud and future-ready business systems.

June 18, 2025

by Kai Wähner

CORE

· 2,707 Views · 1 Like

A New Era of Unified Lakehouse: Who Will Reign? A Deep Dive into Apache Doris vs. ClickHouse

Apache Doris delivers unified, real-time analytics with flexible updates and high concurrency, outperforming ClickHouse on complex queries.

June 18, 2025

by Michael Hayden

· 4,212 Views · 2 Likes

Driving Streaming Intelligence On-Premises: Real-Time ML With Apache Kafka and Flink

This article explores how to design, build, and deploy a predictive ML model using Flink and Kafka in an on-premises environment to power real-time analytics.

June 17, 2025

by Gautam Goswami

CORE

· 1,973 Views · 6 Likes