Knowledge Graphs and Analytics Without Graph Databases for Gen-AI

A new paradigm that is cost-effective and efficient.

By Sudheer Kandula · Dec. 31, 23 · Analysis

Graphs are more relevant and useful today than ever. With the AI revolution under way, engineers are exploring the opportunities around Gen-AI and building on open Gen-AI solutions with dynamic prompting, data grounding, and masking, which in turn pushes them toward effective solutions such as knowledge graphs.

Mary, an engineer, is working on a data grounding problem and is considering building a knowledge graph for an AI solution that makes personalized product recommendations at work. She starts to wonder:

  • How to build these graphs,
  • Where to store them,
  • How to integrate them with the vast amounts of data held across many databases, warehouses, and lakehouses.

Mary’s concerns are reasonable. She would have to write application logic to generate graphs and connect to a new graph database to store them, which brings its own challenges: integration, security, cost, reliability, and a new technology to learn.

Mary can overcome these cumbersome issues with a simple yet powerful application of Native Graph Analytics Engines.

Yes, it’s possible today to achieve graph queries on existing data without materializing the graphs or using graph databases.

Wondering how graph analytics and graph queries can run natively on existing data in databases, warehouses, and lakes? Let’s take a sneak peek.

Let me take a step back and explain what graphs are and how graph analytics is beneficial over traditional data analytics.

In software engineering, graphs are data structures that model relationships between entities. They consist of vertices (nodes) and edges (relationships) that connect those vertices, and they can be directed or undirected, weighted or unweighted.
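As a minimal sketch of that definition, the adjacency-list structure below stores vertices and weighted edges and can be directed or undirected. The `Graph` class and the sample vertex names are hypothetical and chosen only for illustration.

```python
class Graph:
    """Minimal adjacency-list graph (illustrative, not a library API)."""

    def __init__(self, directed=False):
        self.directed = directed
        self.adj = {}  # vertex -> {neighbor: edge weight}

    def add_edge(self, u, v, weight=1.0):
        # Register both endpoints as vertices and store the edge u -> v.
        self.adj.setdefault(u, {})[v] = weight
        reverse = self.adj.setdefault(v, {})
        if not self.directed:
            # An undirected edge is stored in both directions.
            reverse[u] = weight

    def neighbors(self, u):
        # Return a copy so callers cannot mutate the graph by accident.
        return dict(self.adj.get(u, {}))


# Directed, weighted graph: mary -> laptop -> electronics
g = Graph(directed=True)
g.add_edge("mary", "laptop", weight=2.0)
g.add_edge("laptop", "electronics")

# Undirected graph: the edge is visible from both ends
h = Graph()
h.add_edge("a", "b")
```

In the directed graph, `g.neighbors("laptop")` contains only `electronics`, while in the undirected graph `a` and `b` list each other.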

Graph analytics is a powerful, emerging form of data analysis on graph-based data that helps businesses understand complex relationships between data entities. It helps in understanding and visualizing those relationships and in deriving meaningful insights from them.

How is graph analytics with graph databases better than traditional SQL analytics on relational stores?

Table: Comparison of Graph vs Traditional Analytics


We can see that graph analytics is more performant, flexible, extensible, and scalable, and more relevant for intelligent, AI-driven analytics.

How Graph Analytics Is Achieved Today

To leverage graph analytics today, most enterprises generate graphs and store them in graph databases. Neo4j, TigerGraph, Amazon Neptune, and OrientDB are widely adopted graph databases in the industry.

Figure: enterprises boundary. Source: Author


The New Paradigm

Native Graph Analytics Engines are the new paradigm: we can run graph queries and visualizations directly on existing relational/SQL data, with no graph database in between, while still keeping the advantages of both the graph and the traditional analytics approaches.
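To make the idea concrete, here is a minimal sketch of a graph-style traversal that runs directly on a relational table, using a recursive SQL query in SQLite. It is not how any particular engine works internally. The `follows` table, its rows, and the `reachable` helper are assumptions made up for this example.

```python
import sqlite3

# Hypothetical relational data: each row is already an edge (src -> dst).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (src TEXT, dst TEXT);
    INSERT INTO follows VALUES
        ('alice', 'bob'), ('bob', 'carol'), ('carol', 'dave'), ('erin', 'alice');
""")

# Recursive CTE: start at one vertex and follow edges up to max_hops.
# UNION (not UNION ALL) removes duplicate rows, and the hop limit
# guarantees termination even if the data contains cycles.
REACHABLE_SQL = """
WITH RECURSIVE reach(node, hops) AS (
    SELECT ?, 0
    UNION
    SELECT f.dst, r.hops + 1
    FROM reach AS r JOIN follows AS f ON f.src = r.node
    WHERE r.hops < ?
)
SELECT node, MIN(hops) AS dist
FROM reach
WHERE hops > 0
GROUP BY node
ORDER BY dist, node
"""


def reachable(start, max_hops):
    """Return (vertex, shortest hop count) pairs reachable from start."""
    return conn.execute(REACHABLE_SQL, (start, max_hops)).fetchall()


print(reachable("alice", 2))
```

No graph is built or stored anywhere: the traversal is computed at query time from the rows that already exist.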

This looks like a very powerful tool that opens up many opportunities in graph analytics. It suggests we may be able to drop redundant graph databases altogether and shift to this new paradigm of native graph analytics.

Figure: Removing graph databases from the flow

Figure: Ultimate user flow diagram

Three Birds With One Stone!

If we can apply graph queries on existing traditional data stores such as relational databases, warehouses, lakes, or lakehouses, we achieve three things in one go:

  1. Zero-ETL: No need to copy, migrate, or ETL data out of existing data stores to build and store graphs, and no need to copy base relational data from one lake to another. A virtual layer can query the data on the fly.
  2. No new graph databases: No need to materialize and store data in graph format, because graphs can be derived on the fly at run time. There is no new graph database to bring in, and no integration, cost, or security constraints to worry about.
  3. High performance: You also get the performance benefits of graph queries, applied to relational data.
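The "virtual layer" in the first two points can be sketched as a thin class that answers graph-style questions by querying the base table on demand, without copying rows or storing a graph. This is an illustrative assumption, not the design of any product. The `purchases` table, its rows, and the `VirtualGraph` class are hypothetical.

```python
import sqlite3

# Hypothetical existing relational data; nothing is copied out of it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (customer TEXT, product TEXT);
    INSERT INTO purchases VALUES
        ('mary', 'laptop'), ('mary', 'mouse'),
        ('john', 'laptop'), ('john', 'monitor'),
        ('ravi', 'mouse'),  ('ravi', 'keyboard'), ('ravi', 'monitor');
""")


class VirtualGraph:
    """Treats purchase rows as customer-product edges, resolved at query time."""

    def __init__(self, connection):
        self.conn = connection

    def products_of(self, customer):
        # One-hop neighbors of a customer vertex.
        rows = self.conn.execute(
            "SELECT product FROM purchases WHERE customer = ? ORDER BY product",
            (customer,),
        )
        return [row[0] for row in rows]

    def recommend(self, customer):
        # Path: customer -> product -> other customer -> other product.
        # Score = number of distinct co-buyers who bought the product.
        sql = """
            SELECT theirs.product, COUNT(DISTINCT theirs.customer) AS score
            FROM purchases AS mine
            JOIN purchases AS shared
              ON shared.product = mine.product
             AND shared.customer <> mine.customer
            JOIN purchases AS theirs
              ON theirs.customer = shared.customer
            WHERE mine.customer = ?
              AND theirs.product NOT IN
                  (SELECT product FROM purchases WHERE customer = ?)
            GROUP BY theirs.product
            ORDER BY score DESC, theirs.product
        """
        return self.conn.execute(sql, (customer, customer)).fetchall()


graph = VirtualGraph(conn)
print(graph.recommend("mary"))
```

Because every call goes back to the base table, a new purchase row is visible to the next query without any graph rebuild.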

The industry is catching up quickly with this new approach, and a few players are already active on this front.

Open Source Libraries for Development

Apache Spark GraphX: GraphX is Spark's component for graphs and graph-parallel computation. It includes a growing collection of graph algorithms and builders to simplify graph analytics tasks.

Apache Flink Gelly: Gelly is Apache Flink’s graph-processing API and library. Flink’s native support for iterations makes it a suitable platform for large-scale graph analytics.
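To show why iteration matters for graph analytics, the sketch below finds connected components by repeatedly propagating the smallest label along edges until nothing changes. It is plain Python for illustration only and does not use the GraphX or Gelly APIs. The function name and sample data are assumptions.

```python
def connected_components(vertices, edges):
    """Label each vertex with the smallest vertex id in its component.

    Iterative label propagation: each pass pushes the smaller label across
    every edge, and the loop stops once a full pass changes nothing.
    """
    label = {v: v for v in vertices}  # every vertex starts in its own component
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            low = min(label[u], label[v])
            if label[u] != low:
                label[u] = low
                changed = True
            if label[v] != low:
                label[v] = low
                changed = True
    return label


# Two components: {1, 2, 3} and {4, 5}
components = connected_components([1, 2, 3, 4, 5], [(2, 3), (1, 2), (4, 5)])
print(components)
```

Distributed graph libraries run this kind of loop in parallel across partitions, which is where built-in support for iterations helps.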

Readily Available Players/Engines for Native Support

PuppyGraph: Using PuppyGraph, you can run graph queries on your existing data in warehouses, lakes, and lakehouses in any of the following open table formats, with seamless no-ETL integration:

  • Apache Iceberg
  • Apache Hudi
  • Apache Hive
  • Delta Lake

It also supports relational data from the following databases:

  • MySQL
  • PostgreSQL

Timbr.ai: Timbr’s Semantic Graph Platform is a SQL-native knowledge graph that turns your databases into reasoning machines, so we can run optimized, graph-like SQL queries on the data. It supports full back-end integration with any relational database that is SQL / ANSI SQL compliant or can be queried in SQL. The connection can be established through either a JDBC or an ODBC connector, with no ETL.

  • Relational databases (MySQL, MariaDB, SQL Server, PostgreSQL, SAP HANA, Aurora, Oracle)
  • NoSQL databases (MongoDB)
  • Data lakes (S3, GCS, Microsoft ADLS)
  • Warehouses (Redshift, BigQuery, Snowflake, Databricks, Synapse, Athena)
  • Engines (Apache Spark, Presto, Trino)
  • Data formats (Parquet/JSON/CSV)

To summarize, these options are worth exploring as a way to achieve graph analytics without graph databases and without materializing graphs. Good luck trying!

