How Does the Milvus Vector Database Ensure Data Security?

This article analyzes how the Milvus vector database ensures data security with user authentication and TLS connections.

By Charles Xie · Sep. 05, 22 · Tutorial
To protect your data, user authentication and transport layer security (TLS) connections are officially available as of Milvus 2.1. Without user authentication, anyone can access all the data in your vector database through the SDK. Starting with Milvus 2.1, however, only those with a valid username and password can access the Milvus vector database. In addition, Milvus 2.1 further protects data with TLS, which secures communication over a computer network.

This article analyzes how the Milvus vector database ensures data security with user authentication and TLS connections, and explains how you can use these two features to keep your data secure when working with the vector database.

What Is Database Security and Why Is It Important?

Database security refers to the measures taken to keep all data in a database safe and confidential. Recent data breaches and leaks at Twitter, Marriott, the Texas Department of Insurance, and others make us all the more vigilant about data security. These cases are a constant reminder that companies and businesses can suffer severe losses if their data is not well protected and the databases they use are not secure.

How Does the Milvus Vector Database Ensure Data Security?

In the current 2.1 release, the Milvus vector database secures the database through authentication and encryption. At the access level, Milvus supports basic user authentication to control who can access the database. At the transport level, Milvus adopts the transport layer security (TLS) encryption protocol to protect data in transit.

User Authentication

The basic user authentication feature in the Milvus vector database restricts access to the database with a username and password. Clients can access a Milvus instance only after providing a valid username and password.

The Authentication Workflow in the Milvus Vector Database

All gRPC requests are handled by the Milvus proxy, so the proxy performs the authentication. Logging in with credentials to connect to a Milvus instance works as follows.
  1. Create credentials for each Milvus instance. The passwords are hashed with bcrypt, which implements Provos and Mazières's adaptive hashing algorithm, and the hashes are stored in etcd.
  2. On the client side, the SDK sends the credentials when connecting to the Milvus service. The base64-encoded string <username>:<password> is attached to the request metadata under the key authorization.
  3. The Milvus proxy intercepts the request and verifies the credentials.
  4. Credentials are cached locally in the proxy.
Authentication Workflow
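The client and proxy steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual Milvus or PyMilvus implementation; the helper names build_auth_metadata and parse_auth_metadata are hypothetical, and only the authorization key and the <username>:<password> layout come from the workflow described above.

```python
import base64

AUTH_KEY = "authorization"  # metadata key named in the workflow above


def build_auth_metadata(username: str, password: str) -> list[tuple[str, str]]:
    """Client side (hypothetical helper): base64-encode '<username>:<password>'."""
    raw = f"{username}:{password}".encode("utf-8")
    return [(AUTH_KEY, base64.b64encode(raw).decode("ascii"))]


def parse_auth_metadata(metadata: list[tuple[str, str]]) -> tuple[str, str]:
    """Proxy side (hypothetical helper): recover the username and password."""
    token = dict(metadata)[AUTH_KEY]
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only, so a password may itself contain ':'.
    username, _, password = decoded.partition(":")
    return username, password


metadata = build_auth_metadata("user", "password")
print(metadata)
print(parse_auth_metadata(metadata))
```

Note that base64 is an encoding and is trivially reversible, which is one reason to pair authentication with the TLS connection described later in this article.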


When the credentials are updated, the system workflow in the Milvus vector database is as follows.
  1. Root coord is in charge of the credentials when insert, query, and delete APIs are called.
  2. When you update the credentials, for instance because you forgot the password, the new password is persisted in etcd. All the old credentials in the proxy's local cache are then invalidated.
  3. The authentication interceptor looks up records in the local cache first. If the credentials in the cache are not correct, an RPC call is triggered to fetch the most recent record from root coord, and the credentials in the local cache are updated accordingly.

Credentials update workflow.
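The cache-first lookup described above might look something like the sketch below. It is an assumption-laden illustration rather than Milvus source code: the CredentialCache class and the fetch_from_root_coord callback are hypothetical names, a plain dictionary stands in for etcd and root coord, and SHA-256 stands in for bcrypt only so that the example runs with the standard library.

```python
import hashlib
import hmac
from typing import Callable, Optional


def toy_hash(password: str) -> str:
    # Stand-in for bcrypt so the sketch needs no third-party package.
    # Do not use a bare SHA-256 for real password storage.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


class CredentialCache:
    """Hypothetical proxy-side cache of password hashes, keyed by username."""

    def __init__(self, fetch_from_root_coord: Callable[[str], Optional[str]]):
        self._fetch = fetch_from_root_coord
        self._records: dict[str, str] = {}

    def invalidate(self, username: str) -> None:
        # Called when a password is updated, so stale entries are dropped.
        self._records.pop(username, None)

    def verify(self, username: str, password: str) -> bool:
        candidate = toy_hash(password)
        cached = self._records.get(username)
        if cached is not None and hmac.compare_digest(cached, candidate):
            return True  # cache hit, no call to root coord needed
        # Cache miss or mismatch: fetch the latest record and refresh the cache.
        fresh = self._fetch(username)
        if fresh is None:
            self._records.pop(username, None)
            return False
        self._records[username] = fresh
        return hmac.compare_digest(fresh, candidate)


store = {"user": toy_hash("old_password")}  # stands in for etcd / root coord
cache = CredentialCache(store.get)
print(cache.verify("user", "old_password"))

store["user"] = toy_hash("new_password")  # password reset is persisted
cache.invalidate("user")                  # old cached credentials are invalidated
print(cache.verify("user", "new_password"))
```

Falling back to the authoritative record on a mismatch means a proxy with a stale cache entry still accepts a freshly reset password.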


How to Manage User Authentication in the Milvus Vector Database

To enable authentication, first set common.security.authorizationEnabled to true in the milvus.yaml configuration file.
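As a sketch, and assuming the dotted key maps onto nested YAML keys in the same way as the common.security.tlsEnabled example later in this article, the relevant fragment of milvus.yaml would look roughly like this:

```yaml
common:
  security:
    authorizationEnabled: true
```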

Once authentication is enabled, a root user (username root) is created for the Milvus instance. This root user can connect to the Milvus vector database with the initial password, Milvus.
 
from pymilvus import connections
# Connect as the default root user with its initial password
connections.connect(
    alias='default',
    host='localhost',
    port='19530',
    user='root',
    password='Milvus',
)


We highly recommend changing the root user's password when you start the Milvus vector database for the first time.
The root user can then create new users for authenticated access by running the following command.
 
from pymilvus import utility
utility.create_credential('user', 'password', using='default')


There are two things to remember when creating new users:
  1. The username cannot exceed 32 characters and must start with a letter. Only underscores, letters, and numbers are allowed. For example, the username "2abc!" is not accepted.
  2. The password must be 6 to 256 characters long.
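If you want to check credentials on the client before calling Milvus, the two rules above can be expressed as a small validator. This is only a sketch: the function names are hypothetical, and it assumes that "letters" means ASCII letters, which the rules above do not state explicitly.

```python
import re

# Starts with a letter, then up to 31 more letters, digits, or underscores
# (32 characters in total). ASCII-only letters are an assumption.
USERNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,31}")


def is_valid_username(username: str) -> bool:
    return USERNAME_RE.fullmatch(username) is not None


def is_valid_password(password: str) -> bool:
    return 6 <= len(password) <= 256


print(is_valid_username("2abc!"))   # starts with a digit and contains '!'
print(is_valid_username("user_1"))
print(is_valid_password("12345"))   # too short
```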
Once the new credential is set up, the new user can connect to the Milvus instance with the username and password.
 
from pymilvus import connections
connections.connect(
    alias='default',
    host='localhost',
    port='19530',
    user='user',
    password='password',
)


As with most authentication systems, you do not have to worry if you forget the password. The password for an existing user can be reset with the following command.
 
from pymilvus import utility
utility.reset_password('user', 'new_password', using='default')


TLS Connection

Transport layer security (TLS) is a cryptographic protocol that provides communications security in a computer network. TLS encrypts traffic and uses certificates to authenticate the communicating parties to each other.
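To make the role of certificates more concrete, here is a generic client-side TLS context built with Python's standard ssl module. It is not Milvus or PyMilvus code; it only illustrates that a TLS client normally refuses a server whose certificate it cannot verify.

```python
import ssl

# A default client context verifies the server certificate against trusted
# CA certificates and checks that the certificate matches the host name.
context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)

# For two-way authentication, the client would also present its own
# certificate, for example:
#   context.load_cert_chain(certfile="client.pem", keyfile="client.key")
# This is left commented out because it needs real certificate files.
```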

How to Enable TLS in the Milvus Vector Database

To enable TLS in the Milvus vector database, first run the following commands to prepare two files for generating the certificates: a default OpenSSL configuration file named openssl.cnf and a script named gen.sh that generates the relevant certificates.
 
mkdir cert && cd cert
touch openssl.cnf gen.sh


Then you can copy and paste the configuration we provide here into the two files, or modify our configuration to better suit your application.
When the two files are ready, run the gen.sh file to create nine certificate files. Likewise, you can modify the configurations in the nine certificate files to suit your needs.
 
chmod +x gen.sh
./gen.sh


There is one final step before you can connect to the Milvus service with TLS. You have to set tlsEnabled to true and configure the file paths of server.pem, server.key, and ca.pem for the server in config/milvus.yaml. The code below is an example.
 
tls:
  serverPemPath: configs/cert/server.pem
  serverKeyPath: configs/cert/server.key
  caPemPath: configs/cert/ca.pem

common:
  security:
    tlsEnabled: true


Then you are all set and can connect to the Milvus service with TLS, as long as you specify the file paths of client.pem, client.key, and ca.pem for the client when using the Milvus connection SDK. The code below is also an example.
 
from pymilvus import connections

_HOST = '127.0.0.1'
_PORT = '19530'

# Connect over TLS, passing the client certificate, client key, and CA certificate
print("\nCreate connection...")
connections.connect(
    host=_HOST,
    port=_PORT,
    secure=True,
    client_pem_path="cert/client.pem",
    client_key_path="cert/client.key",
    ca_pem_path="cert/ca.pem",
    server_name="localhost",
)
print("\nList connections:")
print(connections.list_connections())
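A missing or misnamed certificate file is an easy mistake to make here, so it can help to check the paths before connecting. The helper below is a hypothetical convenience function, not part of PyMilvus; it assumes the three client-side files sit in one directory under the names used above and returns the same keyword arguments that the example passes to connections.connect.

```python
from pathlib import Path


def build_tls_kwargs(cert_dir: str, server_name: str = "localhost") -> dict:
    """Hypothetical helper: validate the client TLS files and build connect kwargs."""
    cert_path = Path(cert_dir)
    files = {
        "client_pem_path": cert_path / "client.pem",
        "client_key_path": cert_path / "client.key",
        "ca_pem_path": cert_path / "ca.pem",
    }
    # Fail early with a clear message instead of a connection-time error.
    missing = [str(path) for path in files.values() if not path.is_file()]
    if missing:
        raise FileNotFoundError("Missing TLS files: " + ", ".join(missing))
    kwargs = {name: str(path) for name, path in files.items()}
    kwargs.update(secure=True, server_name=server_name)
    return kwargs


# Possible usage with the connection example above:
#   connections.connect(host=_HOST, port=_PORT, **build_tls_kwargs("cert"))
```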



Published at DZone with permission of Charles Xie. See the original article here.
