
How Milvus Balances Query Load Across Nodes

Milvus 2.0 balances query load across query nodes automatically, but you can still trigger load balancing manually. In this article, we'll show you how to use this feature.

By Jun Gu and xi ge · Mar. 10, 2022 · Database Zone · Tutorial


In previous blog articles, we introduced the deletion, bitset, and compaction functions in Milvus 2.0. To conclude this series, we would like to share the design behind load balancing, a vital function in a distributed Milvus cluster.

Usage

Milvus 2.0 supports automatic load balancing by default, but you can still trigger load balancing manually. Note that only sealed segments can be transferred across query nodes.

Now, without further ado, let's try the feature. (The following example uses PyMilvus 2.0.0 against Milvus 2.0.0.)

from pymilvus import connections, utility, Collection, DataType, FieldSchema, CollectionSchema
# Connect to Milvus
connections.connect(
    alias="default", 
    host='x.x.x.x', 
    port='19530'
)
# Create a collection
pk_field = FieldSchema(
    name="id", 
    dtype=DataType.INT64, 
    is_primary=True, 
)
vector_field = FieldSchema(
    name="vector", 
    dtype=DataType.FLOAT_VECTOR, 
    dim=2
)
schema = CollectionSchema(
    fields=[pk_field, vector_field], 
    description="Try out load balance"
)
collection_name = "try_load_balance"
collection = Collection(
    name=collection_name, 
    schema=schema, 
    using='default', 
    shards_num=2
)
# Insert randomly generated vectors
import random
data = [
    [i for i in range(100000)],
    [[random.random() for _ in range(2)] for _ in range(100000)]
]
collection.insert(data)
# Load the collection
collection.load()
# Get the entity count; calling num_entities flushes the data, which seals the growing segments
collection.num_entities
# Check the segment information
utility.get_query_segment_info(collection_name)
# [segmentID: 431092560210952194    // Segment ID
# collectionID: 431092538449330177  // Collection ID corresponding to the collection name
# partitionID: 431092538449330178
# mem_size: 2162688                 // Memory usage of the segment
# num_rows: 49937                   // Row count of the segment
# nodeID: 29                        // ID of the query node
# state: Sealed                     // State of the segment, including {Growing, Sealed}
# , segmentID: 431092560210952193
# collectionID: 431092538449330177
# partitionID: 431092538449330178
# mem_size: 2162688
# num_rows: 50063
# nodeID: 29
# state: Sealed
# ]

# Transfer segment with ID 431092560210952193 from query node with ID 29 to that with 28
utility.load_balance(
    src_node_id=29, 
    dst_node_ids=[28], 
    sealed_segment_ids=[431092560210952193]
)


Implementation

Because the number and size of the segments buffered on each query node differ, search performance across the query nodes may also vary. In the worst case, a few query nodes are exhausted searching a large amount of data while newly created query nodes sit idle because no segments have been distributed to them, wasting CPU resources and severely degrading search performance.

To avoid this, the query coordinator (query coord) distributes segments evenly across the query nodes according to each node's RAM usage. CPU resources are then consumed evenly across the nodes, which significantly improves search performance.
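To make the idea concrete, here is a simplified, illustrative sketch of memory-aware segment placement. This is not Milvus's actual Go implementation; the function name and data shapes are hypothetical, and it only demonstrates the principle of keeping per-node memory usage even:

```python
def distribute_segments(segment_sizes, node_ids):
    """Greedily assign each segment (by memory size) to the query node
    with the lowest accumulated memory usage."""
    usage = {node: 0 for node in node_ids}
    assignment = {}
    # Place large segments first so the greedy choice stays balanced.
    for seg_id, size in sorted(segment_sizes.items(), key=lambda kv: -kv[1]):
        target = min(usage, key=usage.get)
        assignment[seg_id] = target
        usage[target] += size
    return assignment, usage

# Example: four sealed segments across two query nodes
segments = {"seg1": 40, "seg2": 35, "seg3": 30, "seg4": 25}
assignment, usage = distribute_segments(segments, [28, 29])
print(usage)  # {28: 65, 29: 65}
```

Sorting by descending size before the greedy assignment is a standard heuristic for this kind of bin-balancing problem; it keeps the final spread small without needing an exact solver.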

Trigger Automatic Load Balance

By default (configured via queryCoord.balanceIntervalSeconds), the query coord checks the RAM usage (in percent) of all query nodes every 60 seconds. If either of the following conditions is satisfied, the query coord starts balancing the query load across the query nodes:

  1. RAM usage of any query node in the cluster is larger than queryCoord.overloadedMemoryThresholdPercentage (default: 90);
  2. Or the absolute value of any two query nodes' RAM usage difference is larger than queryCoord.memoryUsageMaxDifferencePercentage (default: 30).

A segment transfer from the source query node to the destination query node is executed only if both of the following conditions hold after the transfer:

  1. RAM usage of the destination query node is no larger than queryCoord.overloadedMemoryThresholdPercentage (default: 90);
  2. The absolute value of the source and destination query nodes' RAM usage difference after load balancing is less than that before load balancing.

With the above conditions satisfied, the query coord proceeds to balance the query load across the nodes.

Load Balance

When load balancing is triggered, the query coord first loads the target segment(s) onto the destination query node. At this point, both query nodes can serve search requests on the target segment(s), which guarantees the completeness of search results during the transfer.

After the destination query node successfully loads the target segment, the query coord publishes a sealedSegmentChangeInfo message to the query channel. In this message, onlineNodeID and onlineSegmentIDs identify the query node that loaded the segment(s) and the segment(s) loaded, while offlineNodeID and offlineSegmentIDs identify the query node that must release the segment(s) and the segment(s) to release.

Upon receiving the sealedSegmentChangeInfo, the source query node releases the target segment.
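The load-then-release handoff can be sketched as follows. The four field names follow the article; the message class, the handler, and the in-memory segment map are illustrative stand-ins, not Milvus internals:

```python
from dataclasses import dataclass

@dataclass
class SealedSegmentChangeInfo:
    onlineNodeID: int        # node that has loaded the segment(s)
    onlineSegmentIDs: list
    offlineNodeID: int       # node that must release the segment(s)
    offlineSegmentIDs: list

# In-memory stand-in for the sealed segments held by each query node.
node_segments = {28: set(), 29: {431092560210952193, 431092560210952194}}

def apply_change_info(info, node_segments):
    """Mark the segment online on the destination before taking it offline
    on the source, so it is always served by at least one node."""
    node_segments[info.onlineNodeID].update(info.onlineSegmentIDs)
    node_segments[info.offlineNodeID].difference_update(info.offlineSegmentIDs)

info = SealedSegmentChangeInfo(28, [431092560210952193],
                               29, [431092560210952193])
apply_change_info(info, node_segments)
print(node_segments)  # the segment now lives on node 28, not node 29
```

The ordering is the whole point of the protocol: because release only happens after the destination confirms the load, no search request ever observes a window in which the segment is unavailable.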

The whole process succeeds once the source query node releases the target segment. The query load is then balanced across the query nodes: the RAM usage of every query node is no larger than queryCoord.overloadedMemoryThresholdPercentage, and the RAM usage difference between the source and destination nodes is smaller than it was before balancing.

This is the finale of the Milvus 2.0 new feature blog series. Following this series, we are planning a new series, Milvus Deep Dive, which will introduce the basic architecture of Milvus 2.0. Please stay tuned.

Published at DZone with permission of Jun Gu.
