A Look at Oracle Sharding
A Look at Oracle Sharding
Oracle is getting into the sharding game, promising the ability to scale up to 1,000 shards while staying ACID compliant. See how it works and whether it's right for you.
Join the DZone community and get the full member experience.Join For Free
Built by the engineers behind Netezza and the technology behind Amazon Redshift, AnzoGraph™ is a native, Massively Parallel Processing (MPP) distributed Graph OLAP (GOLAP) database that executes queries more than 100x faster than other vendors.
Since its inception, Oracle Database has seen a lot of revisions. It's constantly been revisited to meet the expectations of the data industry. Being an Oracle DBA for over 18 years, I have seen and worked with at least six versions and have experience in other databases like MS-SQL, MySQL, Vertica, Postgres, Redshift, MongoDB, Cassandra, etc. In my experience, the rich features and the database instrumentation for performance troubleshooting and flexibility in operating that Oracle provides haven't been met by any other database so far.
In recent years, we've been seeing trends and motion toward distributed databases due to the increasingly demanding requirement of scalability for web applications, but I'm not here to talk about those databases and their capabilities.
Again, to meet industry requirements, Oracle is releasing a new feature seen in the world distributed databases (shared-nothing architecture), which has been dominated by many other databases in recent years. In the upcoming release Oracle 12.2, the Oracle Sharding feature hopes to provide the exact same capabilities of shared-nothing architectures with leader nodes and shard nodes to distribute data and the ability scale up to 1,000 shards without compromising high availability and relational database properties.
With all other distributed databases, consistency is the main issue. None of them, or maybe a couple of them at most, provide full consistency due to non-implementation of MVCC (undo) in their databases. Users have to compromise on data consistency or set the consistency level to full, knowing that it's costing them performance. But Oracle, with its new sharding feature, provides full consistency and also relational schema support.
Oracle Sharding is a combination of connection pools, ONS, sharding software (GSM), partitioning, and the Oracle Database. It is fully ACID compliant, like other RDBMSes.That could be a major breakthrough.
Supported Features of Oracle Sharding
- Relational schemas
- Database partitioning
- ACID properties and read consistency (very rich feature when compared to other databases)
- SQL and other programmatic interfaces
- Complex data types
- Online schema changes
- Multi-core scalability
- Advanced security
- High availability features
- Enterprise-scale backup and recovery
- Sharded database (SDB): A single logical Oracle database that is horizontally partitioned across a pool of physical Oracle Databases (shards) that share no hardware or software. The schema of this database is partitioned in other databases (different hosts).
- Shards: Independent physical Oracle databases that host a subset of the sharded database SDB (schema).
- Global service: Database services that provide access to data in an SDB and handle implementation of general services to a distributed service.
- Shard catalog: An Oracle database that supports automated shard deployment, centralized management of a sharded database, and multi-shard queries, like leader nodes and config instances in MongoDB.
- Shard directors: Network listeners that enable high-performance connection routing based on a sharding key. It's like a mongos instance and holds the key information stored in the shard catalog.
- Connection pools: At runtime, these act as shard directors by routing database requests across pooled connections.
- Management interfaces: GDSCTL (command-line utility) and Oracle Enterprise Manager (GUI)
Considerations Before Sharding
Licensing for sharding/partitioning.
Application suitability. In general, OLTP applications fit the best with regional data distributed to single node and access through that node.
Design of the relational schema/table, especially the data distribution key, like other databases.
It's not RAC (shared everything) architecture. It's distributed database (shard/partitioning/shared nothing).
How Is the Data Distributed?
When a table is created, one of the columns need to specified as the distribution key. This is common in any distributed databases.
The distribution key can be of type consistent, hash, or list.
The distribution metadata is stored in the shard catalog, called gsmcatalog.
The table is now created in multiple shards (databases) with partitions evenly distributed to each shard.
Data Access flow
A service called GSM global service will be created in the shard catalog database with its type, region affinity, etc. using gsdctl.
This service will be used by the user/application.
When a user fires a query, the service will connect to the shard catalog and get the metadata of the distribution. The shard directors reroute the connection to a specific nodes or all nodes.
The shard catalog acts as the leader node/config node.
Let's take a look at the architecture. In this example, we're using two shards and one shard catalog database.
General Software Requirements of Oracle Sharding
Oracle Database 12c Release 2 , Non-Container Databases
Oracle 12c Global Service Manager (Separate Oracle Home)
Oracle Non-container databases for shard catalog (SDB) database
Creating a Sample Oracle Shard Table
ALTER SESSION ENABLE SHARD DDL; CREATE SHARDED TABLE customers ( cust_id NUMBER NOT NULL , name VARCHAR2(50) , address VARCHAR2(250) , region VARCHAR2(20) , class VARCHAR2(3) , signup DATE CONSTRAINT cust_pk PRIMARY KEY(cust_id) ) PARTITION BY CONSISTENT HASH (cust_id) TABLESPACE SET ts1 PARTITIONS AUTO ;
- Make sure the session has the shard DDL enabled,
- The DDL commands will be fired across the shard nodes through database links created in te SDB.
- A tablespace, ts1, will be created in shardcatalog and shard nodes
- The table partitions will be created in the shard catalog (the first metadata) and shard nodes (exact partitions) distributed evenly.
In next post, we will demonstrate how to install and configure Oracle Sharding and explore more of its capabilities.
Opinions expressed by DZone contributors are their own.