High-Performance Analytics for the Data Lakehouse

New platform enables lakehouse analytics and reduces the cost of infrastructure by conducting analytics without ingesting data into the central warehouse.

By Tom Smith · Mar. 15, 23 · Interview

CelerData, a unified analytics platform for the modern, real-time enterprise, has announced the latest version of its enterprise analytics platform.

“The data lakehouse has added critical capabilities to the data lake architecture by introducing ACID control, table formats, and data governance,” said James Li, CEO, CelerData. “However, analytics capabilities on the lakehouse are still limited and cost prohibitive. Most query engines struggle to support interactive ad-hoc queries, are not able to support real-time analytics, and fall apart when facing a large number of concurrent users.”

The new platform enables lakehouse users to conduct high-performance analytics without ingesting data into a central data warehouse. According to CelerData, queries complete three times faster and at a significantly lower cost.

Data lakehouse users can perform analytics by querying across streaming and historical data in real time, without waiting to combine streaming data into batches for analysis. This simplifies the data architecture and improves the timeliness of lakehouse analytics. The advanced query engine can support thousands of concurrent users at 10,000 queries per second (QPS), enabling new use cases.
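To make the idea of querying streaming and historical data together more concrete, here is a minimal Python sketch. It is not CelerData's engine or API; the LiveView class, its method names, and the sample rows are all hypothetical. It only illustrates the principle that a query can read already-stored rows and newly arrived events in the same pass, rather than waiting for events to be collected into a batch first.

```python
from collections import deque
from itertools import chain


class LiveView:
    """Hypothetical illustration: one query path over historical and streaming rows."""

    def __init__(self, historical_rows):
        # Rows that already sit in storage (for example, files in a data lake).
        self.historical = list(historical_rows)
        # Events that have arrived but were never grouped into a batch.
        self.stream = deque()

    def on_event(self, row):
        # A new event becomes visible to queries as soon as it is appended.
        self.stream.append(row)

    def total_by_key(self, key, value):
        # Aggregate over both sources in a single pass at query time.
        totals = {}
        for row in chain(self.historical, self.stream):
            totals[row[key]] = totals.get(row[key], 0) + row[value]
        return totals


view = LiveView([
    {"store": "A", "sales": 100},
    {"store": "B", "sales": 50},
])
view.on_event({"store": "A", "sales": 5})
result = view.total_by_key("store", "sales")
print(result)
```

A production engine would add indexing, distribution, and consistency guarantees that this toy omits; the point is only that the newest event is included in the answer without a separate batch-load step.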

I had the opportunity to interview Li Kang, VP of Strategy at CelerData, to learn more about the benefits of the platform evolution:

Question 1

I am not familiar with the table formats Iceberg, Hudi, and Delta Lake. Can you provide a use case for each that I can relate to?

Answer

“Table formats are a way to organize data files. Files stored in the data lake don’t have standard metadata information like database tables do. Table formats try to bring database-like features to the data lake, such as tables, columns, transaction logs, etc. The result is that a data lake can be managed in a way similar to how a database is managed.”
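The answer above can be illustrated with a toy example. The following Python sketch is not how Iceberg, Hudi, or Delta Lake are implemented; the ToyTable class, the _log.jsonl file name, and the data file naming are assumptions made up for illustration. It shows the general idea: plain data files gain table-like behavior once a schema and an append-only transaction log record which files belong to which version of the table.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


class ToyTable:
    """Toy table format: plain data files plus an append-only transaction log."""

    def __init__(self, root: Path, columns: List[str]):
        self.root = root
        self.columns = columns                 # the table's column names
        self.log_path = root / "_log.jsonl"    # hypothetical log file name
        root.mkdir(parents=True, exist_ok=True)
        self.log_path.touch()

    def _entries(self) -> List[Dict]:
        # Each non-empty log line describes one committed version.
        lines = self.log_path.read_text().splitlines()
        return [json.loads(line) for line in lines if line]

    def current_version(self) -> int:
        return len(self._entries())

    def commit(self, rows: List[Dict]) -> int:
        """Write rows to a new data file, then record that file in the log."""
        for row in rows:
            if set(row) != set(self.columns):
                raise ValueError(f"row does not match columns: {row}")
        version = self.current_version() + 1
        data_file = self.root / f"data-{version:05d}.json"
        data_file.write_text(json.dumps(rows))
        with self.log_path.open("a") as log:
            log.write(json.dumps({"version": version, "add": data_file.name}) + "\n")
        return version

    def scan(self, as_of: Optional[int] = None) -> List[Dict]:
        """Read all rows, optionally only those committed up to version as_of."""
        rows: List[Dict] = []
        for entry in self._entries():
            if as_of is not None and entry["version"] > as_of:
                break
            rows.extend(json.loads((self.root / entry["add"]).read_text()))
        return rows


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        table = ToyTable(Path(tmp) / "orders", ["order_id", "amount"])
        table.commit([{"order_id": 1, "amount": 20.0}])
        table.commit([{"order_id": 2, "amount": 35.5}])
        return table.scan(), table.scan(as_of=1)


latest, first_version = demo()
print(latest)
print(first_version)
```

Because readers consult the log rather than listing the directory, a data file that was written but never logged is invisible to queries, and an older version of the table can be read by stopping at an earlier log entry. Real table formats build far more on the same foundation, such as concurrent writers, deletes, and schema evolution.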

Question 2

Why is it important to be able to perform high-performance analytics without ingesting data into a central data warehouse? Is this where the 3X query performance is achieved?

Answer

“Data ingestion has three side effects: It’s expensive because dedicated hardware resources are required to ingest data, and the cloud vendors may also charge for the network traffic. It’s time-consuming because, depending on the amount of data and hardware resources, this process can take minutes to hours. There is also a potential for data quality issues, such as inconsistency, because the data is now duplicated in both its original location and the data warehouse.”

“People need data warehouses because data warehouses provide metadata information and great query performance. With CelerData and table formats, we can address these two concerns in the data lake.”

Question 3

What are some examples of business problems you are solving, or will be able to solve for clients?

Answer

“CelerData can be used across industries for any customer who needs to analyze large amounts of data to drive business decisions. Examples of business problems we solve include real-time fraud detection, digital advertisement placement and performance analysis, retail promotion and product recommendations, supply chain management, social media platform analysis, and much more.”

Question 4

Why will developers like this news? How will it make their lives simpler and easier?

Answer

“Modern applications increasingly need built-in analytics, so application developers, whether they are building SaaS, mobile, or enterprise applications, will appreciate a high-performance query engine that allows them to derive insight from large amounts of data easily, quickly, and cost-effectively.”

Conclusion

I hope you have learned something new from this interview with Li Kang at CelerData and will take this information and apply it to your software development career or hobby.

Opinions expressed by DZone contributors are their own.
