Data Resources

Lakehouse: Starting With Apache Doris + S3 Tables

This is the first article in the “Lakehouse: What’s the Big Deal?” series, where I will periodically discuss Lakehouse. Your comments and discussions are welcome.

March 25, 2025

by Mingyu Chen

· 3,933 Views · 3 Likes

Text Clustering With DeepSeek Reasoning

We will develop and understand a novel approach to text clustering and explain our inference results using the DeepSeek reasoning model.

March 24, 2025

by Kalpan Dharamshi

· 3,142 Views · 2 Likes

Data Lake vs. Data Warehouse vs. Data Lakehouse

The pros and cons of the legacy data warehouse, the more recent data lake, and contemporary data lakehouse architectures.

Updated March 21, 2025

by Noa Shavit

· 7,637 Views · 6 Likes

How Doris SQL Cache Saved My Daily Morning Meetings

Doris SQL Cache can intelligently remember query results and make repetitive queries as smooth as silk, thus significantly improving query performance.

March 21, 2025

by Zen Hua

· 4,109 Views · 4 Likes

Patch Management in the Age of IoT: Challenges and Solutions

IoT patch management tackles risks via automation, lightweight patches, and centralized tools, ensuring security despite device variety and resource limits.

March 20, 2025

by Andrew Vereen

· 4,928 Views · 1 Like

Personalized Product Recommendations in E-Commerce Using ML

Use data mining and machine learning, specifically Support Vector Machine (SVM) and Random Forest models, to uncover patterns in data and recommend personalized e-commerce products.

March 20, 2025

by Aditi Choudhary

· 2,791 Views · 1 Like

Memory Management in Couchbase’s Query Service

An overview of the features designed to manage memory in the Query Service. Learn about mechanisms to control the memory usage of SQL++ queries and of the overall Query engine.

March 19, 2025

by Dhanya Gowrish

· 5,243 Views · 4 Likes

OpenAI vs Ollama Using LangChain's SQLDatabaseToolkit

Learn about LangChain's SQLDatabaseToolkit for NL-to-SQL queries and compare OpenAI and Ollama results, highlighting setup, examples, and tool performance.

March 19, 2025

by Akmal Chaudhri

· 5,294 Views · 5 Likes

Comparing DuckDB, Snowflake, and Databricks

In this article, we do an in-depth comparison of DuckDB, Snowflake, and Databricks to help you find the best data processing platform for your organization.

March 19, 2025

by Noa Shavit

· 4,654 Views · 1 Like

Financial Data and RAG Usage in LLMs

AI in finance uses LLMs and RAG to assess credit risk, process data, and adapt in real time, improving decision-making and operational efficiency.

March 18, 2025

by Ajay Tanikonda

· 2,824 Views · 1 Like

Role of Data Annotation Services in AI-Powered Manufacturing

Learn how data annotation services enhance AI-powered manufacturing by improving automation, precision, and decision-making, with their key benefits and applications.

March 18, 2025

by Peter Leo

· 2,509 Views · 2 Likes

SAP HANA Triggers: Enhancing Database Logic and Automation

Learn about SAP HANA triggers, their types, use cases, and best practices to automate tasks, enforce business rules, and optimize database operations.

March 18, 2025

by Govinda Rao Banothu

· 3,409 Views

Secure File Transfer as a Critical Component for AI Success

This article highlights the importance of secure file transfer and its role in supporting an organization's artificial intelligence (AI) initiatives.

March 18, 2025

by Anil Soni

· 2,794 Views · 2 Likes

All About GPU Threads, Warps, and Wavefronts

This article covers GPU threads, warps and warpSize, and the subtle differences between NVIDIA and AMD GPUs.

March 17, 2025

by Jagadish Krishnamoorthy

· 6,226 Views

Attribute-Level Governance Using Apache Iceberg Tables

This article explains how data filters in AWS Lake Formation can help manage fine-grained access to Apache Iceberg tables.

March 17, 2025

by Ankur Srivastava

· 3,866 Views · 2 Likes

Build a Scalable E-commerce Platform: System Design Overview

This article explores the architecture of a distributed and scalable e-commerce platform with multiple services and components, hosted on a cloud platform like AWS.

March 14, 2025

by Prashant Nigam

· 15,719 Views · 2 Likes

Smart Cities With Multi-Modal Retrieval-Augmented Generation

Learn how MM-RAG revolutionizes smart city management by integrating text, images, and IoT data to deliver real-time actionable insights for urban challenges.

March 14, 2025

by Shaik Abdul Kareem

· 41,849 Views · 2 Likes

Handling Concurrent Data Loads in Delta Tables

Learn to handle Delta Lake concurrency issues with retries, exponential backoff, partitioning, Auto-Optimize, MERGE, and monitoring techniques.

March 12, 2025

by Munikrishnaiah Sundararamaiah

· 4,861 Views · 3 Likes

Lightning Data Service for Lightning Web Components

Explore the potential of Lightning Data Service in Lightning Web Components through its three core features: base components, wire adapters, and wire functions.

March 12, 2025

by Jaseem Pookandy

· 4,746 Views · 1 Like

Disaster Recovery Plan for DevOps

Ensuring business continuity requires more than just robust pipelines and agile practices in DevOps; it also requires a reliable disaster recovery (DR) strategy.

March 11, 2025

by Daria Kulikova

· 4,396 Views · 4 Likes

The Latest Data Topics