DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
View Events Video Library
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Mobile Database Essentials: Assess data needs, storage requirements, and more when leveraging databases for cloud and edge applications.

Full-Stack Observability Essentials: Explore the fundamentals of system-wide observability and key components of the OpenTelemetry standard.

Monitoring and Observability for LLMs: Datadog and Google Cloud discuss how to achieve optimal AI model performance.

Automated Testing: The latest on architecture, TDD, and the benefits of AI and low-code tools.

Related

  • Navigating the Divide: Distinctions Between Time Series Data and Relational Data
  • Architecture and Code Design, Pt. 2: Polyglot Persistence Insights To Use Today and in the Upcoming Years
  • Design to Support New Query Parameters in GET Call Through Configurations Without Making Code Changes
  • Time Series Compression Algorithms and Their Applications

Trending

  • Leveraging FastAPI for Building Secure and High-Performance Banking APIs
  • Revolutionizing Software Testing
  • 16 K8s Worst Practices That Are Causing You Pain (Or Will Soon)
  • What Is GitOps?
  1. DZone
  2. Data Engineering
  3. Databases
  4. Time Series Feature Design: Scale Out / High Availability

Time Series Feature Design: Scale Out / High Availability

Oren Eini user avatar by
Oren Eini
·
Mar. 11, 14 · Interview
Like (0)
Save
Tweet
Share
4.04K Views

Join the DZone community and get the full member experience.

Join For Free

I expect (at a bare minimum), to be able to do about 25,000 sustained writes per second on a single machine. I actually expect to be able to do significantly more. But let us go with that amount for now as something that if it drops below that value, we are in trouble. Over a day, that is 2.16 billion records. I don’t think this likely, but that is a lot of data to work with.

That leads to interesting questions on scale out story. Do we need one? Well, probably not for performance reasons (said the person who haven’t written the code, much less benchmarked it) at least not if my hope for the actual system performance comes about.

However, just writing the data isn’t very interesting, we also need to read and work with it. One of the really nice thing about time series data is that it is very easily sharded, because pretty much all the operations you have are naturally accumulative. You have too much data for a single server, go ahead and split it. On query time, you can easily merge it all back up again and be done with it.

However, a much more interesting scenario for us is the high availability / replication story. This is where things gets… interesting. With RavenDB, we do async server side replication. With time series data, we can do that (and we have the big advantage of not having to worry about conflicts), but the question is how.

The underlying mechanism for all the replication in RavenDB is the notion of etags, of a way to track the order of changes to the database. In time series data, that means tracking the changes that happens in some form of a sane fashion.

It means having to track, per write, at least another 16 bytes. (8 bytes for a monotonically increasing etag number, another 8 for the time that was changed). And we haven’t yet spoken of things like replication of deletes, replication of series tag changes, etc. I can tell you that dealing with replication of deletes in RavenDB, for example, was no picnic.

A much simpler alternative would be to not handle this at the data level. One nice thing about Voron is that we can actually do log shipping. That would trivially solve pretty much the entire problem set of replication, because it would happen at the storage layer, and take care of all of it.

However, it does come with its own set of issues. In particular, it means that the secondary server has to be read only, it cannot accept any writes. And by that I mean that it would physically reject them. That leads to a great scale out story with read replicas, but it means that you have to carefully consider how you are dealing the case of the primary server going down for a duration, or any master/master story.

I am not sure that I have any good ideas about this at this point in time. I would love to hear suggestions.

Time series Database Design

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Navigating the Divide: Distinctions Between Time Series Data and Relational Data
  • Architecture and Code Design, Pt. 2: Polyglot Persistence Insights To Use Today and in the Upcoming Years
  • Design to Support New Query Parameters in GET Call Through Configurations Without Making Code Changes
  • Time Series Compression Algorithms and Their Applications

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: