
Getting to Know the MariaDB ColumnStore

Learn more about how MariaDB's ColumnStore works by glancing at its architecture and how it can be integrated into your work.

by David Thompson · Nov. 09, 16 · Tutorial

MariaDB ColumnStore is a GPLv2 open source columnar database built on MariaDB Server. It is a fork and evolution of the former InfiniDB product. It can be deployed in the cloud (optimized for Amazon Web Services) or on a local cluster of Linux servers using either local or networked storage.

MariaDB ColumnStore: A Massively Parallel, Distributed Database

MariaDB ColumnStore consists of two main component service classes:

  • User modules: providing the MariaDB SQL engine front end and query orchestration.
  • Performance modules: providing distributed query processing.

By utilizing the MariaDB server as the front end, all server capabilities of MariaDB can also be leveraged, including secure connections, the audit plugin, and other storage engines. For the latter, ColumnStore supports cross-engine joins, which allow querying a ColumnStore table against, say, an InnoDB table.

The MariaDB server processes incoming connection requests and queries for each user connection, as shown in the diagram below. Once a SQL query is received by the user module, it processes that query and distributes query operations across the performance modules. The performance modules execute the query operations in a distributed manner, read and write the MariaDB ColumnStore columnar data files, and return intermediate query operation results to the user modules. Any operation that cannot be distributed is performed at the user module level before results are returned through the MariaDB server process to the client.

[Diagram: query flow between client connections, the MariaDB server, user modules, and performance modules]
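The split of work described above can be sketched in a few lines of Python. This is only an illustration of the scatter/gather idea, not ColumnStore's actual implementation: the names `PM_PARTITIONS`, `pm_partial_aggregate`, and `um_average` are hypothetical, and a thread pool stands in for the network of performance modules.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the values of one column
# held by a single performance module (PM).
PM_PARTITIONS = [
    [12, 7, 30],
    [5, 18],
    [22, 9, 4, 1],
]


def pm_partial_aggregate(values):
    """Work done on one PM: return a partial (sum, count) for its rows."""
    return sum(values), len(values)


def um_average(partitions):
    """Work done on the user module (UM): fan out, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(pm_partial_aggregate, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    # The final division cannot be distributed, so it happens on the UM.
    return total / count


if __name__ == "__main__":
    print(um_average(PM_PARTITIONS))
```

Each partition is aggregated independently, which is why adding partitions (and the nodes that hold them) shortens an individual query, while the merge step stays small because it only sees one partial result per partition.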

Both user modules and performance modules are horizontally scalable. Scaling performance modules first allows for the greatest reduction in individual query time. Scaling user modules allows for high availability and increased query concurrency. Both user modules and performance modules are multi-threaded, further increasing performance on a per-node level.

MariaDB ColumnStore: A Columnar Database

As a columnar database, MariaDB ColumnStore stores table data in columns rather than rows. This allows the query optimizer to read only the columns necessary to fulfill a given query and its result set. Once a particular column value has been identified, the corresponding values in the same row can easily be determined through a logical offset into the other column files. Data partitioning by columns is also called vertical partitioning. Horizontal partitioning of data is achieved by distributing the data across performance modules. Further data elimination within a performance module is achieved by maintaining range metadata within a distributed extent map, which allows a column extent file to be skipped when the value being searched for falls outside that extent's range. Storing data as columns also makes it much easier to add and remove columns over time, even online.
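A toy model may make the extent map and the logical offset more concrete. The sketch below is an assumption-laden illustration rather than ColumnStore's on-disk format: `EXTENT_ROWS`, `Extent`, `build_extents`, and `find_offsets` are hypothetical names, and the extents are tiny so the behavior is easy to follow.

```python
from dataclasses import dataclass

EXTENT_ROWS = 4  # hypothetical size; real extents hold far more rows


@dataclass
class Extent:
    start: int    # row offset of the first value in this extent
    values: list  # column values stored in this extent
    lo: int       # minimum value (range metadata)
    hi: int       # maximum value (range metadata)


def build_extents(column):
    """Split one column into extents and record each extent's value range."""
    extents = []
    for start in range(0, len(column), EXTENT_ROWS):
        chunk = column[start:start + EXTENT_ROWS]
        extents.append(Extent(start, chunk, min(chunk), max(chunk)))
    return extents


def find_offsets(extents, target):
    """Return (matching row offsets, number of extents actually scanned)."""
    offsets, scanned = [], 0
    for ext in extents:
        if target < ext.lo or target > ext.hi:
            continue  # range metadata rules this extent out
        scanned += 1
        offsets.extend(
            ext.start + i for i, v in enumerate(ext.values) if v == target
        )
    return offsets, scanned


# Two columns of the same hypothetical table, stored separately.
order_day = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
amount = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]

day_extents = build_extents(order_day)
offsets, scanned = find_offsets(day_extents, 5)
# The row offset found in one column indexes directly into the other.
amounts = [amount[i] for i in offsets]
```

Here only one of the three `order_day` extents is scanned, and the `amount` column is touched only at the two matching offsets. Elimination works this well because the sample values arrive in sorted order; with randomly ordered values the ranges would overlap and fewer extents could be skipped.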

It is important to recognize that both vertical and horizontal partitioning are automatically provided and managed by MariaDB ColumnStore. Very little configuration and maintenance is required to maintain high performance of the system. As a result, indexes do not need to be defined or maintained either.

MariaDB ColumnStore: The Big Data Platform

If your analytical query workload is up to a hundred thousand rows and your table's size remains under a million rows, an OLTP engine such as InnoDB or MyISAM will handle this with reasonable performance. Beyond that, performance is much harder to tune for and maintain. MariaDB ColumnStore is designed for these larger workloads.

It is suitable for reporting or analysis of millions to billions of rows from data sets containing millions to trillions of rows. As the data size grows, MariaDB ColumnStore allows you to add more performance module (PM) nodes to scale your performance linearly.

My next blog post will continue this topic and go deeper into the specifics of how MariaDB ColumnStore is able to handle such big data workloads. You can follow me @davidwbt for more insights into MariaDB ColumnStore.

