DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
View Events Video Library
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Integrating PostgreSQL Databases with ANF: Join this workshop to learn how to create a PostgreSQL server using Instaclustr’s managed service

Mobile Database Essentials: Assess data needs, storage requirements, and more when leveraging databases for cloud and edge applications.

Monitoring and Observability for LLMs: Datadog and Google Cloud discuss how to achieve optimal AI model performance.

Automated Testing: The latest on architecture, TDD, and the benefits of AI and low-code tools.

Related

  • Inside the Elastic Shard
  • How to Optimize Elasticsearch for Better Search Performance
  • Logging With the Elastic Stack
  • Microservices With Apache Camel and Quarkus (Part 3)

Trending

  • Unleashing the Power of Microservices With Spring Cloud
  • JWT Token Revocation: Centralized Control vs. Distributed Kafka Handling
  • What Is Kubernetes RBAC and Why Do You Need It?
  • Exploring Sorting Algorithms: A Comprehensive Guide
  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. RavenDB Sharding: Adding a New Shard to an Existing Cluster, the easy way

RavenDB Sharding: Adding a New Shard to an Existing Cluster, the easy way

We continue our series of answers to questions from our mailing list. We have a sharded cluster, and we want to add a new node to the cluster. How do we do it?

Oren Eini user avatar by
Oren Eini
·
May. 23, 15 · Interview
Like (0)
Save
Tweet
Share
4.94K Views

Join the DZone community and get the full member experience.

Join For Free

Continuing on the theme of giving a full answer to interesting questions on the mailing list in the blog, we have the following issue. 

We have a sharded cluster, and we want to add a new node to the cluster, how do we do it? I’ll discuss a few ways in which you can handle this scenario. But first, let us lay out the actual scenario.

We’ll use the Customers & Invoices system, and we put the data in the various shard along the following scheme:

CustomersSharded by region
InvoicesSharded by customer
UsersShared database (not sharded)

We can configure this using the following:

var shards = new Dictionary<string, IDocumentStore>
{
    {"Shared", new DocumentStore {Url ="http://rvn1:8080", DefaultDatabase = "Shared"}},
    {"EU", new DocumentStore {Url = "http://rvn2:8080", DefaultDatabase = "Europe"}},
    {"NA", new DocumentStore {Url = "http://rvn3:8080", DefaultDatabase = "NorthAmerica"}},
};

ShardStrategy shardStrategy = new ShardStrategy(shards)
    .ShardingOn<Company>(company =>company.Region, region =>
    {
        switch (region)
        {
            case "USA":
            case "Canada":
                return "NA";
            case "UK":
            case "France":
                return "EU";
            default:
                return "Shared";
        }
    })
    .ShardingOn<Invoice>(invoice => invoice.CompanyId)
    .ShardingOn<User>(user=> "Shared");


So far, so good. Now, we have so much work that we can’t just have two servers for customers & invoices, we need more. We change the sharding configuration to include 2 new servers, and we get:

var shards = new Dictionary<string, IDocumentStore>
     {
         {"Shared", new DocumentStore {Url = "http://rvn1:8080", DefaultDatabase = "Shared"}},
         {"EU", new DocumentStore {Url = "http://rvn2:8080", DefaultDatabase = "Europe"}},
         {"NA", new DocumentStore {Url = "http://rvn3:8080", DefaultDatabase = "NorthAmerica"}},
        {"EU2", new DocumentStore {Url = "http://rvn4:8080", DefaultDatabase = "Europe-2"}},
        {"NA2", new DocumentStore {Url = "http://rvn5:8080", DefaultDatabase = "NorthAmerica-2"}},
     };

    var shardStrategy = new ShardStrategy(shards);
    shardStrategy.ShardResolutionStrategy = new NewServerBiasedShardResolutionStrategy(shards.Keys, shardStrategy)
        .ShardingOn<Company>(company => company.Region, region =>
        {
            switch (region)
            {
                case "USA":
                case "Canada":
                    return "NA";
                case "UK":
                case "France":
                    return "EU";
                default:
                    return "Shared";
            }
        })
        .ShardingOn<Invoice>(invoice => invoice.CompanyId)
        .ShardingOn<User>(user => user.Id, userId => "Shared");

Note that we have a new shard resolution strategy, what is that? This is how we control lower level details of the sharding behavior, in this case, we want to control where we’ll write new documents.

public class NewServerBiasedShardResolutionStrategy : DefaultShardResolutionStrategy
{
    public NewServerBiasedShardResolutionStrategy(IEnumerable<string> shardIds, ShardStrategy shardStrategy)
        : base(shardIds, shardStrategy)
    {
    }

    public override string GenerateShardIdFor(object entity, ITransactionalDocumentSession sessionMetadata)
    {
        var generatedShardId = base.GenerateShardIdFor(entity, sessionMetadata);
        if (entity is Company)
        {
            if (DateTime.Today < new DateTime(2015, 8, 1) ||
                DateTime.Now.Ticks % 3 != 0)
            {
                return generatedShardId + "2";
            }
            return generatedShardId;
        }
        return generatedShardId;
    }
} 

What is the logic? If we have a new company, we’ll call the GenerateShardIdFor(entity) method, and for the next 3 months, we’ll create all new companies (and as a result, their invoices) in the new servers. After the 3 months have passed, we’ll still generate the new companies on the new servers at a rate of two thirds on the new servers vs. one third on the old servers.

Note that in this case, we didn’t have to modify any data whatsoever. And over time, the data will balance itself out between the servers. In my next post, we’ll deal with how we actually split an existing shard into multiple shards.

Shard (database architecture) cluster

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Inside the Elastic Shard
  • How to Optimize Elasticsearch for Better Search Performance
  • Logging With the Elastic Stack
  • Microservices With Apache Camel and Quarkus (Part 3)

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: