DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Vector Databases Are Reinventing How Unstructured Data Is Analyzed
  • JQueue: A Library to Implement the Outbox Pattern
  • How Milvus Realizes the Delete Function
  • FHIR Data Model With Couchbase N1QL

Trending

  • How Kubernetes Cluster Sizing Affects Performance and Cost Efficiency in Cloud Deployments
  • Exploring Intercooler.js: Simplify AJAX With HTML Attributes
  • Unit Testing Large Codebases: Principles, Practices, and C++ Examples
  • Designing a Java Connector for Software Integrations
  1. DZone
  2. Data Engineering
  3. Databases
  4. Introduction to Couchbase for Oracle Developers and Experts: Part 6: Indexing

Introduction to Couchbase for Oracle Developers and Experts: Part 6: Indexing

Today, we're exploring data and data processing issues and interests. SQL, NoSQL, flexible schema, scale-up, scale-out, transactions, and high availability.

By 
Keshav Murthy user avatar
Keshav Murthy
DZone Core CORE ·
Feb. 06, 22 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
3.8K Views

Join the DZone community and get the full member experience.

Join For Free

Here are the previous articles comparing architecture, database objects, data types, data modeling, and statements and features of Oracle with Couchbase. This post will focus on indexing.  

Funny illustration

"Use the Index, Luke!"  -- Source 

Overview




Oracle

Couchbase

Index Documentation

Index Documentation

Types of Indexes: 

Primary & secondary Index (B-tree)

Bitmap Index

Partial Index

Partitioned Index

Function-based index

Spatial index

Search indexes (full-text search)

Types of Indexes: 

Primary & secondary index (lock-free skiplist)

Partial index

Partitioned index

Functional-key index

Array index

Flex indexes

Search index

Spatial index


Indexing data structures

B-Tree

Bitmap

Spatial

Inverted tree

Indexing data structures

Lock-free skiplist

Inverted tree (text)

Z-curve (spatial)

CLASSES of INDEXES


Oracle indexes can be large and are managed through bufferpools.  Oracle text index can be creed with an in-memory option.

Couchbase has two classes of indexes: Standard secondary index which can be large and relevant entries are paged in based on usage, memory-optimized index, optimized for performance is entirely kept in memory and a write is done to the disk for recovery purposes.

INDEXING FEATURES


Updates;

Indexes are updated synchronously.  Changes are visible within the transaction (read your own writes).

Updates:

The indexes are updated asynchronously. However, within each transaction, changes are visible immediately (read your own writes) for all the access methods (including index scan).  Yes, we do magic!

Table Scans

For SQL indexes are optional. You need it for improving query latency, throughput, and meeting SLAs. Tables have internal mechanisms to scan the entire table, partitioned or otherwise.   

Collection Scans

Couchbase collections are hash partitioned distributed table/collection. You can retrieve a document if you have the document key. There isn’t a full scan access method just in the collection. You should build a PRIMARY INDEX that provides the equivalent of a table scan for collection. You can issue arbitrary queries on it.

Index consistency

Index updates are synchronous and are visible based on the isolation level. 

Index consistency:

Index updates are asynchronous and are visible based on index scan consistency. In Couchbase transactions, the index scan levels are request_plus by default and read-your-own-writes are always in place.  These subtle features are taken for granted in a single system and transactional RDBMS.  These options are available in modern distributed systems so applications can trade off consistency for availability and performance.

TYPES OF INDEXES


PRIMARY KEY index

CREATE TABLE t1(c1 int primary key)

CREATE TABLE t1(c1 int, constraint c1pk PRIMARY KEY(c1))

You can create a primary key one or more columns and an index is automatically created to enforce this primary key constraint.

PRIMARY KEY index

CREATE PRIMARY INDEX ON t1;

CREATE PRIMARY INDEX ip1 ON t1; 

Each JSON document you insert into Couchbase has a separate, user-generated, unique per-collection document key that can be up to 250 bytes. The primary key is simply an index on the document keys.  The uniqueness is enforced by the collection without the need for the primary key.

SECONDARY index

CREATE INDEX i1 ON t(c1)

CREATE INDEX i1 ON t(c1, c2, c3)

CREATE INDEX i1 ON t(c1 ASC, c2 DESC, c3 DESC)

These are the workhorses for an OLTP workload. These form the kernel of the index and can be combined with most other features to form sophisticated, sometimes complex indexes to power the workload. This is Luke’s lightsaber. 

SECONDARY index

CREATE INDEX i1 ON t(c1)

CREATE INDEX i1 ON t(c1, c2, c3)

CREATE INDEX i1 ON t(c1 ASC, c2 DESC, c3 DESC)

The secondary indexes are similar to Oracle at a high level. Couchbase indexes do not support reverse scans. If you do need them, you need to specify the DESC order in the index definition. You can define that for each key.  Each type of index will be used for all types of range scans, it only makes a difference in performance for order-by query optimization. 

PARTIAL index

“Partial Index gives us the ability to create both local and global indexes on only a subset of partitions within a partitioned table. Prior to Oracle 12c, you could not create indexes on selective partitions; Indexes always meant on all of the data.” 

PARTIAL INDEX

The concept here is similar to PostgreSQL to Oracle.

You can create indexes on any arbitrary subset of documents and the optimizer will choose the index automatically and when appropriate.

CREATE INDEX i1 ON t(c1) where c2 = “USA”;

CREATE INDEX i2 ON t(c1, c2) where c3 IN [“C”, 23, 24];

CREATE INDEX i3 ON t(c1) WHERE c4 LIKE “xyz%”

FUNCTION based index

Instead of indexing the column value AS-IS, you index the result of a function or an expression on it.

CREATE INDEX i1 ON t(LOWER(c1));

CREATE INDEX i2 ON t(c1 + (c2 * c4));


FUNCTION based index

The functionality is similar.

CREATE INDEX i1 ON t(LOWER(c1));

CREATE INDEX i2 ON t(c1 + (c2 * c4));


PARTITION index

Oracle has the widest functionality and support for partitioning for table and therefore Index: range, list, hash, interval, reference, and all. All of these combined with other features makes it powerful. This is mainly targeted for data warehousing where you’re potentially analyzing large sets of data and “logically pruning” partitions to scan for the query has a significant benefit.

PARTITION index

Couchbase collections are always hash partitioned. By default, the index is global and in a single partition. You’ll have defined the partitioned index as part of CREATE INDEX. Couchbase supports hash partitioned index, but this can be on any arbitrary expression.  Just like any hash partitioned object, queries with equality or IN expression on the partitioning key will benefit from partition pruning; others will get the benefit from parallel scans.

For partitioning by range, you’ll have to use partial index syntax and create multiple indexes.  Oracle-like range or interval index syntaxes are unavailable. 

DOMAIN indexes (Search)

You can index and query text for language awareness, stemming, etc using Oracle text.  These indexes are used by the optimizer when you have CONTAINS, CATSEARCH, or MATCHES predicates. Oracle text supports only character types and cannot index numerical or DateTime data types. 

Full-Text Search (Search)

Text search is similar to Oracle. Couchbase FTS can index text(string), numbers, booleans, and datetimes making the search usable in a larger number of use cases.  Couchbase FTS has an elaborate query language and all of its features can be used by N1QL using the SEARCH() predicate.  N1QL also exploits the FTS index for its FLEX indexing optimization for the queries. Flex index is a technology where a single index in FTS can support arbitrarily complex predicates in N1QL.  This helps when users are given the flexibility to choose custom predicates for their reports. 

Use cases:

  1. Flexible query support: Index on fields (a, b, c) can have predicates (a= 10 and (b = 20 or c between 30 and 40)); It can also have predicates ((a = 10 or c = 20) or (b between 10 and 20); etc, etc.

  2. Dynamic query support: when you create this flex index with (dynamic property set to true), FTS will index all the fields in the document, including the ones unknown at the time of index creation. This takes flexibility to new heights. Index evolves as the schema evolves. 

  3. All the cases above support searching in addition to range queries.

  4. In addition to search, FTS allows simple aggregation known as facets.

  5. FTS also helps efficient processing of multiple array predicates because it can index any number of arrays in a single index efficiently. 

The full-text search has a fuller functionality to enable modern agile development and effective search.

DOMAIN indexes (Spatial)

These indexes non-tradition types like point, line, polygon, etc with the ability to issues filters like overlaps, contains, and nearest neighbor.  

Couchbase SPATIAL index

Couchbase spatial index uses a z-curve data structure and is integrated into FTS indexing and querying. This also enables you to index and query scalars, arrays, text, and spatial using a single index!  You can learn more here, here, and here 

ARRAY index

Oracle calls this MULTIVALUE index – essentially, the index will have multiple entries pointing to the same doing. A normal index will have one index entry per row.


CREATE MULTIVALUE INDEX mvi_1 ON mytable t

      (t.jcol.credit_score.numberOnly());



ARRAY index

The array is THE difference between the relational model and the JSON model.   — Gerald Sangudi

As we saw in the data model section, it’s easy to store the array, but difficult to index.  Couchbase supports the most generalized array indexing in databases. From simple array indexes on a field, set of fields, and expressions on each one of those. As mentioned above, Using FTS, Couchbase can index multiple array fields in a single index and push down multiple query predicates to the index scan to improve performance. 

  

BITMAP index

Unavailable. 

Index High Availability

Oracle allows the creation of multiple indexes with the same index definition. So, if one of the indexes is unavailable (rebuild/etc), the other can be used for queries.

Index High Availability

You can simply create additional copies of the index by specifying the num_replica parameter to CREATE INDEX statement. The query engine automatically distributes the workload between the indexes based on the performance and workload. 

Index consistency

Index updates are synchronous and are visible based on the isolation level. 

Index consistency:

Index updates are asynchronous and are visible based on index scan consistency. In Couchbase transactions, the index scan levels are request_plus by default and read-your-own-writes is always in place.  These subtle features are taken for granted in a single system and transactional RDBMS.  These options are available in modern distributed systems so applications can trade off consistency for availability and performance.

INDEX ADVISOR

Oracle’s SQL advisor includes an index advisor among other things.

INDEX ADVISOR

Couchbase has to ADVISE statement, ADVISOR function, and advisor service. 


With so many types of indexes, the choices for the optimizer are plenty. In the next article, we compare the two optimizers!
Database Relational database Data structure

Published at DZone with permission of Keshav Murthy, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Vector Databases Are Reinventing How Unstructured Data Is Analyzed
  • JQueue: A Library to Implement the Outbox Pattern
  • How Milvus Realizes the Delete Function
  • FHIR Data Model With Couchbase N1QL

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!