Data Resources

How a tech company uses ClickHouse database cloning with JuiceFS for efficient end-to-end testing, improving data quality and consistency.

January 27, 2025

by Tao Ma

· 1,774 Views · 1 Like

CUI Document Identification and Classification

This article focuses on how developers can design and implement systems to identify, classify, and tag CUI documents using technical tools and frameworks.

January 27, 2025

by Prashant Kondle

· 1,805 Views

Accelerating HCM Cloud Implementation With RPA

Streamline HCM cloud implementation with RPA: automate data migration, MFA handling, role management, and UAT for faster, accurate, and secure deployments.

January 24, 2025

by Tarun Jain

· 3,417 Views · 1 Like

AI/ML Techniques for Real-Time Fraud Detection

The importance of AI/ML techniques in real-time fraud detection, with a focus on behavioral analytics to combat increasingly sophisticated fraud methods.

January 24, 2025

by Milavkumar Shah

· 22,451 Views · 4 Likes

Phased Migration Strategy for Zero Downtime in Systems

Software migrations are inevitable, but clean execution is crucial to avoid future chaos like rollbacks or backfilling. Here are some tips to ensure smooth migrations.

January 23, 2025

by Sandeep Kumar Gond

· 3,092 Views · 2 Likes

NoSQL for Relational Minds

If you’ve mostly worked with relational databases, you’re missing out on the vast possibilities of NoSQL. Time to discover a world beyond rows and tables.

January 23, 2025

by Bhrigu Srivastava

· 4,527 Views · 5 Likes

Refactoring Design Patterns in Python

After a year, I finally converted all examples from the book Refactoring to Patterns by Joshua Kerievsky to Python. I explain some examples in this article.

January 23, 2025

by Douglas Cardoso

· 2,020 Views

An Introduction to Bloom Filters

An introduction to the Bloom filter data structure, explaining what it is, when to use it, and key technical details about its implementation and functionality.

January 23, 2025

by Sandeep Kumar Gond

· 2,091 Views · 3 Likes

Understanding HyperLogLog for Estimating Cardinality

Learn about the probabilistic algorithm designed to estimate the cardinality of a dataset with both high accuracy and low memory usage.

January 23, 2025

by Bhala Ranganathan

CORE

· 3,515 Views

Caching Strategies for Resilient Distributed Systems

Caching boosts performance in distributed systems but can fail. Mitigate risks with intermediate caches and smart caching strategies. Always design for cache failure.

January 22, 2025

by Rajesh Pandey

· 2,606 Views · 2 Likes

Hybrid Search Using Postgres DB

This article explains how to implement hybrid search (lexical and semantic) in a single Postgres database using full-text search and pgvector.

January 22, 2025

by Suraj Dharmapuram

· 4,115 Views · 6 Likes

5 Key Steps for a Successful Cloud Migration Strategy

Applying the right strategy is essential when moving a company from on-premises to the cloud or from one cloud to another.

January 21, 2025

by Srinivas Chippagiri

CORE

· 2,817 Views · 1 Like

Choose a Database With Hybrid Vector Search for AI Apps

Semantic and traditional search capabilities are needed for AI applications. In this article, we look at the features that LLM/RAG applications need to succeed.

January 21, 2025

by John Lafleur

· 2,579 Views · 2 Likes

OPC-UA and MQTT: A Guide to Protocols, Python Implementations

Explore two essential IoT protocols: OPC-UA, for secure and structured industrial device communication, and MQTT, a lightweight, real-time protocol for telemetry.

January 21, 2025

by Nikhil Makhija

· 3,477 Views · 2 Likes

Seamless Transition from Elasticsearch to OpenSearch

A step-by-step guide for migrating from Elasticsearch to OpenSearch, ensuring compatibility, performance, and cost-efficiency.

January 20, 2025

by Nilesh Jain

· 4,232 Views · 3 Likes

Real-Time Data Streaming With AI

Real-time data streaming combined with AI transforms traditional analytics by enabling instant insights, scalability, and improved decision-making.

January 20, 2025

by Suri Nuthalapati

CORE

· 15,995 Views · 2 Likes

Building a Reactive Event-Driven App With Dead Letter Queue

Learn how to build fault-tolerant, reactive event-driven applications using Spring WebFlux, Apache Kafka, and a Dead Letter Queue to handle data loss efficiently.

January 20, 2025

by Sulakshana Singh

· 3,111 Views · 3 Likes

Dark Data: Recovering the Lost Opportunities

Dark data is the vast amount of unstructured information that organizations collect but often leave unused. It includes emails, customer interactions, sensor data, etc.

January 17, 2025

by Vijay Singh Khatri

CORE

· 2,881 Views

ArangoDB: Achieving Success With a Multivalue Database

ArangoDB's multimodel capabilities simplify handling key-value, document, and graph data in one database. Jakarta NoSQL enables seamless integration.

January 16, 2025

by Otavio Santana

CORE

· 3,032 Views · 1 Like

Understanding Leaderless Replication for Distributed Data

Learn about leaderless replication: its trade-offs, direct writes vs. coordination-based approaches, failure handling, and commercial databases in distributed systems.

January 16, 2025

by Stelios Manioudakis, PhD

CORE

· 8,256 Views · 4 Likes

The Latest Data Topics