Content provided by Couchbase

Couchbase Eventing: Small Scripts that Solve Big Problems at Scale

With simple functions come great power. Couchbase Eventing is a must-have tool in your big data arsenal, allowing small scripts to overcome hard-to-solve problems.

by Jon Strabala · Jun. 16, 20 · Tutorial

Eventing: Simple Yet Powerful

Eventing allows small scripts to overcome hard-to-solve problems. First off, let's look at the basic Eventing life cycle.

[Figure: Eventing Life Cycle 6.5]

The steps below show how easy it is to write and use an Eventing Function:

  • Add (or import) an Eventing Function via Couchbase Server's UI.
  • Assign a data source, a scratchpad bucket, and some bindings to manipulate documents or communicate with the outside world.
  • Implement some JavaScript code to process the received mutation.
  • Save your new Function and hit "Deploy".

That's it; you now have a distributed function responding to mutations in your data set in real time across your entire cluster. The Eventing service provides an infrastructure-less platform that can scale your Eventing Functions as your business grows, whether through a one-time spike or a steady monthly increase in stored data or clients served. You don't need to concern yourself with the fact that your JavaScript-based Eventing Functions are running in a robust, reliable, parallel, distributed fashion. To learn more about Couchbase Eventing, please refer to the Eventing Overview in our documentation. The examples in this article show that in some cases Eventing can act like a drop of oil, just what is needed to "free up" the moving parts of your applications.
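To make the steps above a little more concrete, here is a minimal sketch of the general shape of an Eventing Function: an OnUpdate handler that receives each mutation and an OnDelete handler for removals. The alias name `dst` and the marker field `seen` are hypothetical, and the alias is simulated with a plain object so the snippet can run outside Couchbase. In a deployed Function the alias would instead come from the bindings in the Function's settings.

```javascript
// Stand-in for a bucket alias. Assumption: in a real Function, "dst" would be
// a binding defined in the Function's settings, not a local object.
var dst = {};

// Called for every insert or update (mutation) in the source bucket.
function OnUpdate(doc, meta) {
  // Example reaction: copy the document and add a marker field.
  dst[meta.id] = Object.assign({}, doc, { seen: true });
}

// Called when a document is deleted or expires.
function OnDelete(meta) {
  delete dst[meta.id];
}

// Local simulation of two mutations followed by a delete.
OnUpdate({ type: "demo", value: 1 }, { id: "doc::1" });
OnUpdate({ type: "demo", value: 2 }, { id: "doc::2" });
OnDelete({ id: "doc::1" });
console.log(Object.keys(dst)); // [ 'doc::2' ]
```

The last four lines only exist to exercise the handlers locally; the Eventing service would invoke the handlers itself as mutations arrive.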

Couchbase: A Database You Build to Order

I like to think of a Couchbase cluster as a set of scalable, interoperable "microservices". These services can be wired together and sized to meet a specific set of operational and business needs. Couchbase provides Multi-Dimensional Scaling, or MDS, across five key services with minimal resource interference between them. Furthermore, each Couchbase service is independently scalable simply by adding more nodes. This allows customers to create the ideal multi-node cluster at the lowest possible TCO for the task(s) at hand.

  • The Data nodes easily scale to provide JSON-aware KV operations at scale; single and multi-record lookups are extremely fast. Did I mention that single and multi-record lookups are extremely fast?
  • The Query nodes easily scale to provide indexes and also N1QL, a SQL enhancement that works natively with JSON documents and allows programmers to get up and running quickly without having to learn a new way of thinking.
  • The Search nodes easily scale to provide full-text search for natural-language querying, featuring language-aware searching, scoring of results, and fast FTS-based indexes.
  • The Analytics nodes easily scale to provide a parallel data management capability that efficiently runs complex queries: large ad hoc join, set, aggregation, and grouping operations across large datasets.
  • The Eventing nodes easily scale to provide a computing paradigm that developers can use to handle data changes (or mutations) and react to them in real time.

Like all product lines, the newer services in Couchbase (Query, Search, Eventing, and Analytics) have a few warts, but taken as a whole the complete basket provides a unified suite, a one-stop shop to solve a myriad of problems. Seriously, if you don't care about a unified product and all you are going to do is use FTS, you might just consider using Elasticsearch. But once you need to integrate your FTS results with N1QL (SQL for JSON), you might have been much better off starting with Couchbase. Today we are going to use just two services: 1) the primary KV service provided by the Data nodes, and 2) the Eventing service. Through a few tiny JavaScript functions, I will highlight how you can overcome some hard at-scale problems by leveraging the Eventing service.

Prerequisites

In this article we will be using the latest GA version, Couchbase 6.5.1. If you are not familiar with Couchbase or the Eventing service, please walk through GET STARTED and one Eventing example. Specifically:

  • Set up a working Couchbase 6.5.1 server as per the directions in Start Here!
  • Understand how to deploy a basic Eventing Function as per the directions in the Data Enrichment example, specifically "Case 2", where we will use only one bucket, the 'source' bucket.

Enriching Data via Eventing

A Typical Customer Problem

A live production system has stored billions of documents. A new business need has arisen that requires the existing data to be enriched. This requirement for additional data impacts the complete document set: both old (historic) data and new (mutating) data. The business needs to keep the production system running non-stop and continuously respond to new real-time data. Consider a somewhat contrived example application, a GeoIP lookup service. This utility needs a dataset that enables looking up countries by IPv4 address ranges. The initial implementation stored records as follows:

JSON
{
  "type": "ip_country_map",
  "country": "RU",
  "ip_start": "7.12.60.1",
  "ip_end": "7.62.60.9"
}


Months later, the business requirements changed. Engineering needed the JSON documents to be enriched with new fields: the numeric representations of the two existing IPv4 addresses.

JSON
{
  "type": "ip_country_map",
  "country": "RU",
  "ip_end": "7.62.60.9",
  "ip_start": "7.12.60.1",
  "ip_num_start": 118242305,
  "ip_num_end": 121519113
}
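
As a quick check of the arithmetic behind the new fields, the following sketch converts a dotted-quad IPv4 string to its numeric form and back. The function names `ipToNumber` and `numberToIp` are hypothetical helpers introduced only for this illustration; they are not part of Couchbase or of the original application.

```javascript
// Convert a dotted-quad IPv4 string to its numeric form:
// A * 256^3 + B * 256^2 + C * 256 + D
function ipToNumber(ip) {
  var parts = ip.split(".").map(function (p) { return parseInt(p, 10); });
  return parts[0] * 16777216 + parts[1] * 65536 + parts[2] * 256 + parts[3];
}

// Inverse conversion, useful for checking the arithmetic.
function numberToIp(num) {
  return [
    Math.floor(num / 16777216) % 256,
    Math.floor(num / 65536) % 256,
    Math.floor(num / 256) % 256,
    num % 256
  ].join(".");
}

console.log(ipToNumber("7.12.60.1")); // 118242305
console.log(ipToNumber("7.62.60.9")); // 121519113
console.log(numberToIp(118242305));   // 7.12.60.1
```

For example, 7.12.60.1 works out to 7 x 16777216 + 12 x 65536 + 60 x 256 + 1 = 118242305, which matches the "ip_num_start" value in the enriched document.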


Eventing to The Rescue

A simple thirteen (13) line JavaScript Eventing Function (2 lines of which are comments) can be written and deployed to solve the issue with minimal resources.

JavaScript
function OnUpdate(doc, meta) {
  doc["ip_num_start"] = get_numip_first_3_octets(doc["ip_start"]);
  doc["ip_num_end"]   = get_numip_first_3_octets(doc["ip_end"]);
  // src is a bucket alias to the source bucket in settings, write back to it
  src[meta.id] = doc;
}
function get_numip_first_3_octets(ip) {
  if (!ip) return 0;
  var parts = ip.split('.');
  // IP Number = A x (256*256*256) + B x (256*256) + C x 256 + D
  return (parts[0]*(256*256*256)) + (parts[1]*(256*256)) +
         (parts[2]*256) + parseInt(parts[3], 10);
}


By deploying the above Function with a "Feed boundary" set to "Everything", all documents of type "ip_country_map" are processed and enriched. The Eventing Function is left deployed, reacting to all new inserts or updates (mutations) in real time. It enriches new items and, whenever "ip_start" or "ip_end" changes on an existing item, updates "ip_num_start" and "ip_num_end" to the proper numeric representations. Because the documents are enriched (the old fields are still present), the existing production applications will still work. All new or changed data is updated in real time to the new schema. The GeoIP application components are decoupled via this simple Eventing Function, so they can be upgraded one at a time. When all production components have been updated, the Eventing Function can be undeployed and decommissioned.
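
One possible refinement of the Function above is to filter explicitly on the document type and to skip the write-back when the numeric fields are already current. The sketch below is only an illustration under stated assumptions: the `src` alias is simulated with a plain object so the code can run locally, and the helper name `toIpNumber` and the `writes` counter are hypothetical additions used to show the effect.

```javascript
var src = {};   // stand-in for the "src" bucket alias (assumption, for local runs)
var writes = 0; // counts write-backs, for illustration only

// Same arithmetic as the article's helper: A*256^3 + B*256^2 + C*256 + D.
function toIpNumber(ip) {
  if (!ip) return 0;
  var p = ip.split(".");
  return p[0] * 16777216 + p[1] * 65536 + p[2] * 256 + parseInt(p[3], 10);
}

function OnUpdate(doc, meta) {
  // Only touch the GeoIP mapping documents.
  if (doc.type !== "ip_country_map") return;
  var start = toIpNumber(doc.ip_start);
  var end = toIpNumber(doc.ip_end);
  // Nothing to do if the numeric fields already match the dotted addresses.
  if (doc.ip_num_start === start && doc.ip_num_end === end) return;
  doc.ip_num_start = start;
  doc.ip_num_end = end;
  src[meta.id] = doc;
  writes++;
}

// Local simulation: a new document, the same document again, an unrelated one.
OnUpdate({ type: "ip_country_map", country: "RU",
           ip_start: "7.12.60.1", ip_end: "7.62.60.9" }, { id: "ip::1" });
OnUpdate(src["ip::1"], { id: "ip::1" });
OnUpdate({ type: "other" }, { id: "misc::1" });
console.log(writes); // 1
```

The early return on unchanged values keeps the handler idempotent, so replaying the same mutation does not produce another write.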

Purging Stale Data via Eventing

A Typical Customer Problem

A live production system has stored over 7 billion documents. All documents are meant to have an automatic expiration (or TTL, for time to live) set. The production environment constantly receives new data and constantly expires old data. An operational mistake was made, resulting in 2 billion documents being created without an expiration. The customer didn't have the resources (nor the desire to pay for them) to create a large index and use N1QL to identify and purge the documents that lacked an active (non-zero) TTL once they were no longer useful.

Eventing to The Rescue

A simple six (6) line JavaScript Eventing Function (2 lines of which are comments) was deployed and solved the issue with minimal resources.

JavaScript
function OnUpdate(doc, meta) {
    if (meta.expiration !== 0) return;
    // delete all items that have TTL or expiration of 0
    // src is a bucket alias to the source bucket in settings, delete from it.
    delete src[meta.id];
}


By deploying the above Function with a "Feed boundary" set to "Everything", the entire document set in the source bucket was scanned. All documents with a non-zero TTL (meaning they had an expiration set) were ignored. Only the documents with a TTL of zero (meaning they had no expiration) were deleted. Once the source bucket was cleaned, the Eventing Function was undeployed, as it was used purely as an administrative tool. Note that we could replace the meta.expiration !== 0 test in our JavaScript to filter data for any needed purpose. It is pretty easy to guess what we are doing below:

JavaScript
function OnUpdate(doc, meta) {
    if (!(doc.type === "customer" && doc.active === false)) return;
    // archive the customer to the bucket alias arc and remove it from the bucket alias src
    arc[meta.id] = doc;
    delete src[meta.id];
}


In fact, we could easily update the above to perform a cascade archive and delete not only of the customer but of any other related information such as orders, returns and shipping addresses. Refer to the example Cascade Delete in the Eventing Documentation.
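
As a rough sketch of what such a cascade could look like, the variant below archives a customer's related documents before archiving the customer itself. It rests on assumptions that are not in the original: each customer document is assumed to carry a hypothetical `order_ids` array listing the keys of its related documents, and the `src` and `arc` aliases are simulated with plain objects so the snippet runs locally. The Cascade Delete example in the documentation may take a different approach.

```javascript
// Stand-ins for the "src" and "arc" bucket aliases (assumption, for local runs).
var src = {
  "customer::1": { type: "customer", active: false,
                   order_ids: ["order::10", "order::11"] },
  "order::10": { type: "order", customer: "customer::1" },
  "order::11": { type: "order", customer: "customer::1" },
  "customer::2": { type: "customer", active: true, order_ids: ["order::20"] },
  "order::20": { type: "order", customer: "customer::2" }
};
var arc = {};

function OnUpdate(doc, meta) {
  if (!(doc.type === "customer" && doc.active === false)) return;
  // Archive and remove each related document first
  // (order_ids is a hypothetical field, not part of the article's data model).
  (doc.order_ids || []).forEach(function (id) {
    var related = src[id];
    if (related) {
      arc[id] = related;
      delete src[id];
    }
  });
  // Then archive and remove the customer itself.
  arc[meta.id] = doc;
  delete src[meta.id];
}

// Local simulation: feed every document in the stand-in bucket to the handler.
Object.keys(src).forEach(function (id) {
  if (src[id]) OnUpdate(src[id], { id: id });
});
console.log(Object.keys(src)); // [ 'customer::2', 'order::20' ]
```

Only the inactive customer and its two orders move to the archive; the active customer and its order stay in place.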

Stripping Sensitive Data via Eventing

A Typical Customer Problem

A company running Couchbase on-premises in production needed to share customer profile information (150M and growing) with a business partner. The partner is also running Couchbase, but in a cloud provider, AWS. Given a typical profile record like the following:

JSON
{
  "type": "master_profile",
  "first_name": "Peter",
  "last_name": "Chang",
  "id": 80927079070,
  "basic_profile": {
    "partner_id": 80980221,
    "services": [
      {
        "music": true
      },
      {
        "radio": true
      },
      {
        "video": false
      }
    ]
  },
  "sensitive_profile": {
    "ssn": "111-11-1111",
    "credit_card": {
      "number": "3333-333-3333-3333",
      "expires": "01/09",
      "ccv": "111"
    }
  },
  "address": {
    "home": {
      "street": "4032 Kenwood Drive",
      "city": "Boston",
      "zip": "02102"
    },
    "billing": {
      "street": "541 Bronx Street",
      "city": "Boston",
      "zip": "02102"
    }
  },
  "phone": {
    "home": "800-555-9201",
    "work": "877-123-8811",
    "cell": "878-234-8171"
  },
  "locale": "en_US",
  "timezone": -7,
  "gender": "M"
}

They couldn't just share the entire profile since it contained sensitive data on user preferences and payment methods.  They only needed to share a limited subset like the following:

JSON
{
  "type": "shared_profile",
  "first_name": "Peter",
  "id": 80927079070,
  "basic_profile": {
    "partner_id": 80980221,
    "services": [
      {
        "music": true
      },
      {
        "radio": true
      },
      {
        "video": false
      }
    ]
  },
  "timezone": -7
}


The customer wanted to replace a middleware Spark solution, which required hours to reinitialize after a failure and provided only a slow batch process (hours to reflect updates), and instead sync the profile information in real time.

How Eventing Helps

A simple nine (9) line JavaScript Eventing Function (3 lines of which are comments) was deployed and solved the issue with minimal resources.

JavaScript
function OnUpdate(doc, meta) {
    // only process our profile documents
    if (doc.type !== "master_profile") return;
    // aws_bkt is a bucket alias to the target bucket to replicate to AWS via
    // XDCR. Write the minimal (non-sensitive) profile doc to the bucket for AWS.
    aws_bkt["shared_profile::" + doc.id] =
        { "type": "shared_profile", "first_name": doc.first_name, "id": doc.id,
          "basic_profile": doc.basic_profile, "timezone": doc.timezone };
}


By deploying the above Function with a "Feed boundary" set to "Everything", the entire document set in the source bucket was scanned and all documents of type "master_profile" were processed. For each profile, only a sub-document without the sensitive information was copied to the shared destination bucket. The Eventing Function is left deployed permanently, reacting to all user profile changes (mutations) in real time and forwarding each and every mutation to the AWS destination bucket.
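
If the list of shared fields is expected to change over time, one option is to drive the copy from an explicit allow-list rather than a hand-written object literal. The sketch below is an illustration only: the names `SHARED_FIELDS` and `pickShared` are hypothetical, the field list is taken from the shared_profile example in this article, and the `aws_bkt` alias is simulated with a plain object so the code can run locally.

```javascript
// Fields considered safe to share (assumption: taken from the shared_profile
// example in this article).
var SHARED_FIELDS = ["first_name", "id", "basic_profile", "timezone"];

var aws_bkt = {}; // stand-in for the "aws_bkt" bucket alias (for local runs)

// Build a new object containing only the allow-listed fields.
function pickShared(doc) {
  var out = { type: "shared_profile" };
  SHARED_FIELDS.forEach(function (field) {
    if (doc[field] !== undefined) out[field] = doc[field];
  });
  return out;
}

function OnUpdate(doc, meta) {
  // only process our profile documents
  if (doc.type !== "master_profile") return;
  aws_bkt["shared_profile::" + doc.id] = pickShared(doc);
}

// Local simulation with a trimmed-down version of the profile shown earlier.
OnUpdate({
  type: "master_profile",
  first_name: "Peter",
  last_name: "Chang",
  id: 80927079070,
  basic_profile: { partner_id: 80980221 },
  sensitive_profile: { ssn: "111-11-1111" },
  timezone: -7
}, { id: "profile::1" });

console.log(Object.keys(aws_bkt["shared_profile::80927079070"]));
// [ 'type', 'first_name', 'id', 'basic_profile', 'timezone' ]
```

An allow-list fails safe: a field added to the master profile later is not shared unless someone deliberately adds it to the list.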

Resources

  • Download: Download Couchbase Server 6.5.1

References

  • Couchbase Eventing documentation: https://docs.couchbase.com/server/current/eventing/eventing-overview.html
  • Couchbase Server 6.5 What’s New: https://docs.couchbase.com/server/6.5/introduction/whats-new.html
  • Couchbase blogs on Eventing: https://blog.couchbase.com/tag/eventing/
