High-Performance Analytics for the Data Lakehouse

New platform enables lakehouse analytics and reduces the cost of infrastructure by conducting analytics without ingesting data into the central warehouse.

By Tom Smith · Mar. 15, 23 · Interview

CelerData, provider of a unified analytics platform for the modern, real-time enterprise, has announced the latest version of that platform.

“The data lakehouse has added critical capabilities to the data lake architecture by introducing ACID control, table formats, and data governance,” said James Li, CEO, CelerData. “However, analytics capabilities on the lakehouse are still limited and cost prohibitive. Most query engines struggle to support interactive ad-hoc queries, are not able to support real-time analytics, and fall apart when facing a large number of concurrent users.”

The new platform enables lakehouse users to conduct high-performance analytics without ingesting data into a central data warehouse. CelerData says queries will complete three times faster and at significantly lower cost.

Data lakehouse users can query across streaming data and historical data in real time, without waiting for streaming data to be combined into batches for analysis. This simplifies the data architecture and improves the timeliness of lakehouse analytics. The advanced query engine can support thousands of concurrent users at 10,000 queries per second (QPS), enabling new use cases.
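
As a rough illustration of querying across streaming and historical data without first batching the stream, the sketch below answers one aggregate from both sources at query time. It is a toy in plain Python, not CelerData's engine or API; the names `historical_rows`, `stream_buffer`, and `total_by_user`, and all sample values, are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical rows already stored in the lakehouse (historical data).
historical_rows = [
    {"user": "a", "amount": 10.0},
    {"user": "b", "amount": 5.0},
]
# Hypothetical rows that just arrived from a stream and are not yet batched.
stream_buffer = [
    {"user": "a", "amount": 2.5},
    {"user": "c", "amount": 7.0},
]

def total_by_user(*sources):
    """Sum amounts per user across every source at query time."""
    totals = defaultdict(float)
    for source in sources:
        for row in source:
            totals[row["user"]] += row["amount"]
    return dict(totals)

# One query sees both the stored rows and the freshly streamed rows.
result = total_by_user(historical_rows, stream_buffer)
print(result)
```

The point of the design is that the stream is read in place at query time, so there is no separate batch-and-load step between an event arriving and it showing up in results.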

I had the opportunity to interview Li Kang, VP of Strategy at CelerData, to learn more about the benefits of the platform's evolution:

Question 1

I am not familiar with the table formats Iceberg, Hudi, and Delta Lake. Can you provide a use case for each that I can relate to?

Answer

“Table formats are a way to organize data files. Files stored in the data lake don’t have standard metadata information like database tables do. Table formats try to bring database-like features to the data lake, such as tables, columns, transaction logs, etc. The result is that a data lake can be managed in a way similar to how a database is managed.”
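
To make the idea of a table format more concrete, here is a minimal sketch of a table described by column metadata plus an append-only transaction log over data files. It is a toy model only and does not reflect the actual on-disk layout of Iceberg, Hudi, or Delta Lake; the `ToyTable` class, its methods, and the file names are hypothetical.

```python
class ToyTable:
    """Toy table format: column metadata plus an append-only transaction log."""

    def __init__(self, columns):
        self.columns = list(columns)  # column names, as a database table would have
        self.log = []                 # one entry per committed transaction

    def commit(self, added_files, removed_files=()):
        """Record which data files a transaction adds and removes.

        Returns the new table version (1 for the first commit).
        """
        self.log.append({"add": list(added_files), "remove": list(removed_files)})
        return len(self.log)

    def files_at(self, version=None):
        """Replay the log to list the data files visible at a given version."""
        if version is None:
            version = len(self.log)
        live = []
        for entry in self.log[:version]:
            live = [f for f in live if f not in entry["remove"]]
            live.extend(entry["add"])
        return live


table = ToyTable(["order_id", "amount"])
v1 = table.commit(["part-000.parquet"])
v2 = table.commit(["part-001.parquet"])
# A rewrite: one file replaces another in a single transaction.
v3 = table.commit(["part-002.parquet"], removed_files=["part-000.parquet"])

print(table.files_at(v1))
print(table.files_at())
```

Because readers resolve the file list from the log rather than by listing a directory, a query can see a consistent set of files for one version even while later transactions are being committed.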

Question 2

Why is it important to be able to perform high-performance analytics without ingesting data into a central data warehouse? Is this where the 3X query performance is achieved?

Answer

“Data ingestion has three side effects: It’s expensive because dedicated hardware resources are required to ingest data, and the cloud vendors may also charge for the network traffic. It’s time-consuming because, depending on the amount of data and hardware resources, this process can take minutes to hours. There is also a potential for data quality issues, such as inconsistency, because there is now a duplication of data in its original location and data warehouse.”

“People need data warehouses because data warehouses provide metadata information and great query performance. With CelerData and table formats, we can address these two concerns in the data lake.”

Question 3

What are some examples of business problems you are solving, or will be able to solve for clients?

Answer

“CelerData can be used across industries for any customer who needs to analyze large amounts of data to drive business decisions. Examples of business problems we solve include real-time fraud detection, digital advertisement placement and performance analysis, retail promotion and product recommendations, supply chain management, social media platform analysis, and much more.”

Question 4

Why will developers like this news? How will it make their lives simpler and easier?

Answer

“Modern applications increasingly need built-in analytics, so application developers, whether they are building SaaS, mobile, or enterprise applications, will appreciate a high-performance query engine that allows them to derive insight from large amounts of data easily, quickly, and cost-effectively.”

Conclusion

I hope you have learned something new from this interview with Li Kang at CelerData and will take this information and apply it to your software development career or hobby.
