DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Data Engineering
  3. Databases
  4. Couchbase N1QL: To Query or To Analyze?

Couchbase N1QL: To Query or To Analyze?

Which service should run your query? Well, it depends on your workload.

Keshav Murthy user avatar by
Keshav Murthy
CORE ·
Till Westmann user avatar by
Till Westmann
·
Feb. 06, 19 · Opinion
Like (2)
Save
Tweet
Share
8.01K Views

Join the DZone community and get the full member experience.

Join For Free

Note: This recommendation note was written in collaboration with Till Westmann and Prof. Mike Carey, who lead the Couchbase analytics R&D.


Transaction-Analytics spectrum.  Source: BI Research

Transactions-Analytics Spectrum. Source: BI Research

Couchbase’s Query and Analytics Services both support N1QL (SQL for JSON). A common question is: “Which service should run my query?” The quick answer is, “It depends on your workload.” This article explains this answer.

A transaction is when you buy coffee and cake at the grocery store. The analysis is when the grocery store looks at all the sales data to see what day of the week, what month of the year coffee and/or cake sells more — so they can plan the inventory, sales, and prices.

Every business does these three things in a cycle or a spiral [The Goal].

  1. Run the business process to deliver products or services to the customers.
  2. Analyze the business to determine what to change and what to change to.
  3. Make the change happen.

In modern-day business, each step requires applications to perform the steps.

The Query Service is used by the applications needed to run the business; it is designed for a large number of concurrent queries, each doing a small amount of work. In the RDBMS world, this workload is called the OLTP workload.

Applications or tools used for analysis have different workload characteristics. These typically use the Analytics Service; it is designed for a smaller number of concurrent queries analyzing a larger number of documents. In the RDBMS world, this workload is called the OLAP workload.

Query Service

Analytics Service

Tutorials

https://query-tutorial.couchbase.com/tutorial/#1 https://sqlplusplus-tutorial.couchbase.com/tutorial/#1

High Level Comparison

Used for data manipulation within application logic Used for reports, analysis (historical, interactive), dashboards
Short queries

— Relatively simple SQL

— Typically involving small amounts of data

Longer operations

— Complex SQL with analytics

— Typically involving larger amounts of data

Run SELECT, INSERT, UPDATE, DELETE, MERGE for operational applications Run SELECT for analysis
Random Updates

— Update few documents per query

No Updates

— Changes ingested from Data Service

Millisecond to a second latency,

high throughput (10-1000 qps), performance goal is queries/second

Second to minutes response time,

<1-10 queries per second,

performance goal is seconds/query

Large number of Indexes Fewer indexes
Developer written queries; generated queries are well known Ad-hoc queries; complex reports, dashboards, BI workload
Queries execute on a single query node using the distributed index and data infrastructure Queries execute on all analytics nodes using its distributed compute, index, and data infrastructure

Technical Comparison: Architecture

Queries run in SMP mode, throughput is scalable by adding new query nodes Queries run in MPP mode, can handle larger data or reduce query execution times by adding new analytics nodes

Technical Comparison: Optimizer

Nested loop join by default

Hash join via a query hint

Parallel hash join by default,

(Index-)Nested loop join or broadcast join via a query hint

Technical Comparison: Indexes

Global secondary indexes Local secondary indexes (co-located with data partitions)
Uses Memory Optimized Indexes; standard secondary indexes (Plasma) Log-structured Merge-Tree (LSM) based secondary indexes
Support for both covered and non-covered index scans Non-covered index scans
Can use full-text index for queries (6.5)

Technical Comparison: Execution

Most queries are prepared once and executed many times Ad-hoc and exploratory queries
SDKs use the prepare-execute model based on an ad-hoc flag SDKs offer ad-hoc and parameterized queries
Most operations are done in memory; only when an index scan returns large data is it written to the disk backfill Bounded-memory operations on large data (larger than cluster memory) with graceful spilling as needed
Single node query parallelism Multi-node partitioned-parallel join, sort, aggregate, and grouped aggregate operators
Expects a single (possibly RAIDed) storage device Un-RAIDed use of multiple storage devices
Performance isolation via covering indexes that support scan, grouping, and aggregation operations on Index Service nodes Performance isolation for all queries via shadowing of data on Analytics Service nodes

Technical Documentation

https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/index.html https://docs.couchbase.com/server/6.0/analytics/introduction.html

Technical Information

https://www.couchbase.com/products/n1ql https://www.couchbase.com/sqlplusplus

Technical Books

https://blog.couchbase.com/a-guide-to-n1ql-features-in-couchbase-5-5-special-edition/ https://resources.couchbase.com/sql_tutorial

https://www.amazon.com/SQL-Users-Tutorial-Don-Chamberlin/dp/0692184503/


  1. [The GOAL]: The Goal: The process of ongoing improvement. https://www.amazon.com/Goal-Process-Ongoing-Improvement/dp/0884271951/ref=sr_1_1?keywords=the+goal&qid=1547969233&sr=8-1
  2. Systems of Engagement for Customer Jobs: https://marketing.cioreview.com/cxoinsight/systems-of-engagement-for-customer-jobs-nid-24677-cid-51.html
Database Data infrastructure Analyze (imaging software) Analytics application Joins (concurrency library) Comparison (grammar) Nested loop join Technical documentation

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Kotlin Is More Fun Than Java And This Is a Big Deal
  • How To Avoid “Schema Drift”
  • Problems of Cloud Cost Management: A Socio-Technical Analysis
  • DevSecOps Benefits and Challenges

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: