DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
11 Monitoring and Observability Tools for 2023
Learn more
  1. DZone
  2. Data Engineering
  3. Databases
  4. Correcting My Misconceptions With REGIONAL BY ROW Tables

Correcting My Misconceptions With REGIONAL BY ROW Tables

This tutorial will dive deeper into nuances with CockroachDB REGIONAL BY ROW tables. I learned the hard way about the limitations and to pay close attention to the documentation on the topic.

Artem Ervits user avatar by
Artem Ervits
CORE ·
Mar. 07, 23 · Tutorial
Like (1)
Save
Tweet
Share
3.73K Views

Join the DZone community and get the full member experience.

Join For Free

Previous Articles on Multi-Region

CockroachDB Multi-Region Abstractions for MongoDB Developers With FerretDB


Motivation

I was working with a prospect on a use case where my knowledge of CockroachDB multi-region abstractions was put to the test. After trying and failing to gain low latency write behavior from a geographically distributed table, I reached out to the engineering team who corrected my understanding. This is my way of documenting my understanding and committing it to memory.

High-Level Steps

  • Provision a multi-region CockroachDB cluster
  • Demonstrate the problem
  • Demonstrate the solution
  • Workaround for non-UUID primary keys

Step-by-Step Instructions

Provision a Multi-Region CockroachDB Cluster

You will need access to a multi-region CockroachDB cluster. The easiest and most cost-effective way is signing up for CockroachDB Serverless. That's what I am using in my demonstration. My serverless cluster spans three regions in GCP.

We can connect to the regional endpoints using the Connect modal. The default connection string given will be localized as CockroachDB Serverless automatically detects the end user locality. To access the cluster from the remote regions, you must append the regions to the connection string. In essence, the host artem-serverless-mr-26.h4f.cockroachlabs.cloud becomes artem-serverless-mr-26.h4f.gcp-us-east1.cockroachlabs.cloud, artem-serverless-mr-26.h4f.gcp-us-west2.cockroachlabs.cloud, and artem-serverless-mr-26.h4f.gcp-europe-west1.cockroachlabs.cloud.

We can verify the gateway region we're connecting from by querying the helper function select gateway_region(); from each of the connections.

  gateway_region
------------------
  gcp-us-east1

  gateway_region
------------------
  gcp-us-west2

  gateway_region
--------------------
  gcp-europe-west1


Demonstrate the Problem

Considering an inventory system moving to CockroachDB and leveraging multi-region abstractions, typically the challenge is getting the primary key correct to optimize for writes. The following examples should demonstrate where the challenges are and how to work around them.

To get started, we're going to create a database called demo and enable multi-region abstractions on it.

CREATE DATABASE demo;
ALTER DATABASE demo SET PRIMARY REGION "gcp-us-east1";
ALTER DATABASE demo ADD region "gcp-us-west2";
ALTER DATABASE demo ADD region "gcp-europe-west1";

SET override_multi_region_zone_config = true;


Feel free to review the docs for more information but in summary, we're enrolling our database into multi-region abstractions and setting gcp-us-east1 as the primary region as it's the closest region to my location.

Then I'm passing a property to override certain zone configurations. This step is optional unless you are coming back to make further changes after.

Finally, we want to pin the leaseholders to the US region so that schema changes in multi-region context complete faster. You can learn more here.

ALTER DATABASE system CONFIGURE ZONE USING constraints = '{"+region=gcp-us-east1": 1}', lease_preferences = '[[+region=gcp-us-east1]]';


Let's create a table to demonstrate the problem.

CREATE TABLE demo_int_pk (productID PRIMARY KEY, count) AS SELECT unordered_unique_rowid() AS productID, generate_series(1, 10) AS count;


Make the table REGIONAL BY ROW.

ALTER TABLE demo_int_pk SET LOCALITY REGIONAL BY ROW;


The primary key on the table is an integer type. This problem is not specific to integers. You will have similar behavior with other types too, except for UUID — but I'm jumping ahead a bit.

Let's look at the schema of the table.

  CREATE TABLE public.demo_int_pk (
      productid INT8 NOT NULL,
      count INT8 NULL,
      crdb_region demo.public.crdb_internal_region NOT VISIBLE NOT NULL DEFAULT default_to_database_primary_region(gateway_region())::demo.public.crdb_internal_region,
      CONSTRAINT demo_int_pk_pkey PRIMARY KEY (productid ASC)
  ) LOCALITY REGIONAL BY ROW


We can confirm the PK is indeed an integer. We can also see a new column called crdb_region which was added when we enabled REGIONAL BY ROW.

Let's look at the data.

 SELECT crdb_region, * FROM demo_int_pk;
  crdb_region  |      productid      | count
---------------+---------------------+--------
  gcp-us-east1 | 1021351265574322179 |     5
  gcp-us-east1 | 2174272770181169155 |     9
  gcp-us-east1 | 2750733522484592643 |     1
  gcp-us-east1 | 3327194274788016131 |     7
  gcp-us-east1 | 3903655027091439619 |     3
  gcp-us-east1 | 5633037284001710083 |     6
  gcp-us-east1 | 6209498036305133571 |     2
  gcp-us-east1 | 6785958788608557059 |    10
  gcp-us-east1 | 7938880293215404035 |     8
  gcp-us-east1 | 8515341045518827523 |     4


Notice the crdb_region column reports the US East region which is the region where I executed the insert statement and the rows are homed in. Let's attempt to do an update on the inventory count for one of the records.

WITH update_demo_int_pk AS (
    UPDATE demo_int_pk SET count = 100 WHERE productID = 2750733522484592643 RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM update_demo_int_pk;
  gateway_region | crdb_region  | count
-----------------+--------------+--------
  gcp-us-east1   | gcp-us-east1 |   100

Time: 40ms total (execution 11ms / network 29ms)


It takes 11ms to update the count using the U.S. East connection. Notice I am using the gateway_region() function, which will confirm I am using the US East connection. Let's now update the record from another region. I am going to use gcp-us-west2 region.

WITH update_demo_int_pk AS (
    UPDATE demo_int_pk SET count = 100 WHERE productID = 2750733522484592643 RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM update_demo_int_pk;
  gateway_region | crdb_region  | count
-----------------+--------------+--------
  gcp-us-west2   | gcp-us-east1 |   100

Time: 1.280s total (execution 1.204s / network 0.076s)


The query took 1.2 seconds to execute. This is not surprising considering the record is physically stored in the US East region. Let's look at the plan of the query:

distribution: local
  vectorized: true

  • root
  │
  ├── • render
  │   │
  │   └── • scan buffer
  │         estimated row count: 1
  │         label: buffer 2 (update_demo_int_pk)
  │
  └── • subquery
      │ id: @S1
      │ original sql: UPDATE demo_int_pk SET count = 100 WHERE productid = 2750733522484592643 RETURNING crdb_region, count
      │ exec mode: all rows
      │
      └── • buffer
          │ label: buffer 2 (update_demo_int_pk)
          │
          └── • update
              │ estimated row count: 1
              │ table: demo_int_pk
              │ set: count
              │
              └── • render
                  │
                  └── • union all
                      │ estimated row count: 1
                      │ limit: 1
                      │
                      ├── • scan
                      │     estimated row count: 1 (10% of the table; stats collected 19 seconds ago; using stats forecast for 12 minutes in the future)
                      │     table: demo_int_pk@demo_int_pk_pkey
                      │     spans: [/'gcp-us-west2'/2750733522484592643 - /'gcp-us-west2'/2750733522484592643]
                      │
                      └── • scan
                            estimated row count: 1 (10% of the table; stats collected 19 seconds ago; using stats forecast for 12 minutes in the future)
                            table: demo_int_pk@demo_int_pk_pkey
                            spans: [/'gcp-europe-west1'/2750733522484592643 - /'gcp-europe-west1'/2750733522484592643] [/'gcp-us-east1'/2750733522484592643 - /'gcp-us-east1'/2750733522484592643]


If you read the bottom part, we union all the results of the scans in the gcp-us-west2 and gcp-europe-west1 regions to validate the record is unique across all regions.

Let's add a record in the US West region. We're expecting the write to be quick.

INSERT INTO demo_int_pk (productID, count) VALUES (unordered_unique_rowid(), 1) RETURNING gateway_region(), crdb_region, productID, count;
  gateway_region | crdb_region  |      productid      | count
-----------------+--------------+---------------------+--------
  gcp-us-west2   | gcp-us-west2 | 5746135747274211350 |     1

Time: 296ms total (execution 217ms / network 78ms)


It takes 217ms to insert, even though we are writing from the US West region.

The story is the same if we were to insert from the EU:

   gateway_region  |   crdb_region    |      productid      | count
-------------------+------------------+---------------------+--------
  gcp-europe-west1 | gcp-europe-west1 | 1222035725603831819 |     1

Time: 338ms total (execution 250ms / network 88ms)


Updating a record from the outside region will also be costly:

WITH update_demo_int_pk AS (
    UPDATE demo_int_pk SET count = 100 WHERE productID = 2750733522484592643 RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM update_demo_int_pk;
  gateway_region | crdb_region  | count
-----------------+--------------+--------
  gcp-us-west2   | gcp-us-east1 |   100

  Time: 354ms total (execution 277ms / network 76ms)


Just to explain the statement — I updated the record physically stored in the US East (crdb_region) going through the US West endpoint (gateway_region).

We now have very fast writes from the local region but very slow writes from everywhere else.

Let's now focus on how we can make the scenario better. The hard lesson for me was to realize that the docs call out an optimization that I had failed to see originally.

Note: When using DEFAULT gen_random_uuid() on columns in REGIONAL BY ROW tables, uniqueness checks on those columns are disabled by default for performance purposes. CockroachDB assumes uniqueness based on the way this column generates UUIDs. To enable this check, you can modify the sql.optimizer.uniqueness_checks_for_gen_random_uuid.enabled cluster setting. Note that while there is virtually no chance of a collision occurring when enabling this setting, it is not truly zero.

With that, let's create a new table with PRIMARY KEY being UUID.

Demonstrate the Solution

CREATE TABLE demo_uuid_pk (productID PRIMARY KEY, count) AS SELECT gen_random_uuid() AS productID, generate_series(1, 10) AS count;
ALTER TABLE demo_uuid_pk SET LOCALITY REGIONAL BY ROW;


The create statement is below:

  CREATE TABLE public.demo_uuid_pk (
      productid UUID NOT NULL,
      count INT8 NULL,
      crdb_region demo.public.crdb_internal_region NOT VISIBLE NOT NULL DEFAULT default_to_database_primary_region(gateway_region())::demo.public.crdb_internal_region,
      CONSTRAINT demo_uuid_pk_pkey PRIMARY KEY (productid ASC)
  ) LOCALITY REGIONAL BY ROW


The data looks like so:

  crdb_region  |              productid               | count
---------------+--------------------------------------+--------
  gcp-us-east1 | 229f4258-cac2-4b06-815f-5dd8be8483c7 |     9
  gcp-us-east1 | 525845a9-30ab-42dd-95bb-fb1d72fc7137 |     8
  gcp-us-east1 | 58b174b5-c2f6-433e-a373-74879d81ff92 |     5
  gcp-us-east1 | 5b10da77-e750-431d-a7ea-f0b5f0d61c77 |     3
  gcp-us-east1 | 930e64db-fbae-4743-9419-c274853b4cb3 |     2
  gcp-us-east1 | 994f1714-5d0e-4b9a-8cbb-ab0c94efe226 |     4
  gcp-us-east1 | a5b70013-b8cb-4ba1-86f5-1f892285cf99 |     7
  gcp-us-east1 | d3778386-153b-4431-ab8c-7d9ee9215bec |    10
  gcp-us-east1 | de6b91f8-f5a8-4575-ae3d-fe1b516ae906 |     6
  gcp-us-east1 | f5fa80ea-4fbf-4234-86b7-5592f1bfdda2 |     1


Let's update the inventory from the US East region.

WITH update_demo_uuid_pk AS (
    UPDATE demo_uuid_pk SET count = 10 WHERE productID = '229f4258-cac2-4b06-815f-5dd8be8483c7' RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM update_demo_uuid_pk;
  gateway_region | crdb_region  | count
-----------------+--------------+--------
  gcp-us-east1   | gcp-us-east1 |    10

Time: 42ms total (execution 12ms / network 30ms)


This is similar to our original results. Let's add records in the US West region and update the inventory.

INSERT INTO demo_uuid_pk (productID, count) SELECT gen_random_uuid(), generate_series(1, 10) RETURNING gateway_region(), crdb_region, productID, count;
  gateway_region | crdb_region  |              productid               | count
-----------------+--------------+--------------------------------------+--------
  gcp-us-west2   | gcp-us-west2 | cfb86888-e9b5-4328-9c60-0c102035b5e6 |     1
  gcp-us-west2   | gcp-us-west2 | b243affe-47e9-47b2-a7ba-cbc966d42797 |     2
  gcp-us-west2   | gcp-us-west2 | 82dc438b-d819-4a52-85b8-5ac4ecf4c725 |     3
  gcp-us-west2   | gcp-us-west2 | 4a7bda92-b37a-461a-8f76-ea33dba899dc |     4
  gcp-us-west2   | gcp-us-west2 | 55954dba-6892-4c8a-9f84-2a70b7b6f204 |     5
  gcp-us-west2   | gcp-us-west2 | 95c7dbf6-093a-4afb-af71-196833ab469c |     6
  gcp-us-west2   | gcp-us-west2 | 01689a64-9ae5-408a-b150-590f51efde31 |     7
  gcp-us-west2   | gcp-us-west2 | 7387bac4-e5c6-43b3-9bf6-149774de7ba5 |     8
  gcp-us-west2   | gcp-us-west2 | 6f0f8abe-6735-41d1-a549-755daddfa14a |     9
  gcp-us-west2   | gcp-us-west2 | 3d8c9ad6-d8ac-4551-b25a-067b8efd2db4 |    10

Time: 143ms total (execution 66ms / network 77ms)


Notice the execution is 66ms, which is 5x better than 250ms in the earlier attempts with integers.

Similarly, updates are just as fast.

WITH update_demo_uuid_pk AS (
    UPDATE demo_uuid_pk SET count = 10 WHERE productID = 'e19aa60b-d449-478d-8867-7ecabe6eb361' RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM update_demo_uuid_pk;
  gateway_region | crdb_region  | count
-----------------+--------------+--------
  gcp-us-west2   | gcp-us-west2 |    10

Time: 86ms total (execution 10ms / network 77ms)


Notice the execution is now 10ms, which is significantly faster.

For posterity, here are results for inserting and updating in the EU:

   gateway_region  |   crdb_region    |              productid               | count
-------------------+------------------+--------------------------------------+--------
  gcp-europe-west1 | gcp-europe-west1 | f27db778-20ba-4458-b75f-0f3a6d46c6ab |     1
  gcp-europe-west1 | gcp-europe-west1 | fbff7233-468f-49d1-a7a2-719f851fed54 |     2
  gcp-europe-west1 | gcp-europe-west1 | 01709624-1cf6-4f72-8e8c-6188db8e517a |     3
  gcp-europe-west1 | gcp-europe-west1 | 21c4999a-d4e4-47c4-a034-ad005dd9e9b2 |     4
  gcp-europe-west1 | gcp-europe-west1 | 64ea0759-129d-4174-88cb-020ef7e947a8 |     5
  gcp-europe-west1 | gcp-europe-west1 | 946daa03-3932-4694-a7bc-a87b0fa5d242 |     6
  gcp-europe-west1 | gcp-europe-west1 | c1cbe899-d981-4223-993c-0d5987b9320b |     7
  gcp-europe-west1 | gcp-europe-west1 | 5dcca454-2b66-4dd1-aee2-d540eac6b710 |     8
  gcp-europe-west1 | gcp-europe-west1 | 1a6209c4-d804-432e-a239-60916c2f889b |     9
  gcp-europe-west1 | gcp-europe-west1 | 2df436d4-0f3f-40c3-8303-bec776e591c7 |    10

Time: 115ms total (execution 27ms / network 88ms)
WITH update_demo_uuid_pk AS (
    UPDATE demo_uuid_pk SET count = 10 WHERE productID = 'f27db778-20ba-4458-b75f-0f3a6d46c6ab' RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM update_demo_uuid_pk;
   gateway_region  |   crdb_region    | count
-------------------+------------------+--------
  gcp-europe-west1 | gcp-europe-west1 |    10

Time: 100ms total (execution 11ms / network 88ms)


The reason this is so fast is because we're using UUID type for our primary key. The chance of collision for UUID PK across all regions is minimal, therefore we skip uniqueness checks in the remote regions.

Workaround for Non-UUID Primary Keys

You may be asking, what if we would like to maintain the old type for primary keys? Well, in that case, we do have one more way to maintain backwards compatibility to your legacy primary keys and still leverage great performance. We can take advantage of composite primary keys where one column is going to be your legacy non-UUID primary key and the other being a locality column like crdb_region.

CREATE TABLE demo_composite_pk (productID PRIMARY KEY, count) AS SELECT unordered_unique_rowid() AS productID, generate_series(1, 10) AS count;


Now we're going to add a new column called region, which you've seen before — but this time, we are going to add it as a default expression. The reason being, it will force to evaluate gateway_region() which is the function we determine the locality of the write statement, at the insert time. The additional benefit to this approach is that region column in PK will avoid the uniqueness checks globally.

ALTER TABLE demo_composite_pk ADD COLUMN region crdb_internal_region NOT NULL DEFAULT gateway_region()::crdb_internal_region;


CockroachDB is capable of changing primary keys online. Here is a highly simplified example on a small table, and in many cases, this is immensely useful! Changing PK in production is not a frequent occurrence, but it's good to know there are products that can do that.

ALTER TABLE demo_composite_pk ALTER PRIMARY KEY USING COLUMNS (region, productID);


Finally, let's set the table to REGIONAL BY ROW.

ALTER TABLE demo_composite_pk SET LOCALITY REGIONAL BY ROW;


The final schema is below:

  CREATE TABLE public.demo_composite_pk (
      productid INT8 NOT NULL,
      count INT8 NULL,
      region demo.public.crdb_internal_region NOT NULL DEFAULT gateway_region()::demo.public.crdb_internal_region,
      crdb_region demo.public.crdb_internal_region NOT VISIBLE NOT NULL DEFAULT default_to_database_primary_region(gateway_region())::demo.public.crdb_internal_region,
      CONSTRAINT demo_composite_pk_pkey PRIMARY KEY (region ASC, productid ASC),
      UNIQUE INDEX demo_composite_pk_productid_key (productid ASC)
  ) LOCALITY REGIONAL BY ROW


Notice there's an additional unique index on the productid. It is not just the remnants of the legacy PK (CockroachDB promotes a former PK to Unique Index when PK changes to a new one). Additionally, it serves another purpose to maintain uniqueness for productids.

Let's test this from the US East:

INSERT INTO demo_composite_pk (productID, count) SELECT unordered_unique_rowid(), generate_series(1, 10) RETURNING gateway_region(), crdb_region, productID, count;
  gateway_region | crdb_region  |      productid      | count
-----------------+--------------+---------------------+--------
  gcp-us-east1   | gcp-us-east1 | 5199323410302500866 |     1
  gcp-us-east1   | gcp-us-east1 | 8658087924123041794 |     2
  gcp-us-east1   | gcp-us-east1 |  299407015723401218 |     3
  gcp-us-east1   | gcp-us-east1 | 4911093034150789122 |     4
  gcp-us-east1   | gcp-us-east1 | 2605250024937095170 |     5
  gcp-us-east1   | gcp-us-east1 | 7216936043364483074 |     6
  gcp-us-east1   | gcp-us-east1 | 1452328520330248194 |     7
  gcp-us-east1   | gcp-us-east1 | 6064014538757636098 |     8
  gcp-us-east1   | gcp-us-east1 | 3758171529543942146 |     9
  gcp-us-east1   | gcp-us-east1 | 8369857547971330050 |    10

Time: 352ms total (execution 329ms / network 23ms)


Let's update a row homed in the US East region:

WITH demo_composite_pk AS (
    UPDATE demo_composite_pk SET count = 10 WHERE productID = '5199323410302500866' RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM demo_composite_pk;
  gateway_region | crdb_region  | count
-----------------+--------------+--------
  gcp-us-east1   | gcp-us-east1 |    10

Time: 35ms total (execution 12ms / network 22ms)


Alright, we're not seeing anything we haven't seen before. Let's insert a row in the US West region and then update it.

  gateway_region | crdb_region  |      productid      | count
-----------------+--------------+---------------------+--------
  gcp-us-west2   | gcp-us-west2 |  695146676766375958 |     1
  gcp-us-west2   | gcp-us-west2 | 6459754199800610838 |     2
  gcp-us-west2   | gcp-us-west2 | 4153911190586916886 |     3
  gcp-us-west2   | gcp-us-west2 | 8765597209014304790 |     4
  gcp-us-west2   | gcp-us-west2 |  406916300614664214 |     5
  gcp-us-west2   | gcp-us-west2 | 5018602319042052118 |     6
  gcp-us-west2   | gcp-us-west2 | 2712759309828358166 |     7
  gcp-us-west2   | gcp-us-west2 | 7324445328255746070 |     8
  gcp-us-west2   | gcp-us-west2 | 1559837805221511190 |     9
  gcp-us-west2   | gcp-us-west2 | 6171523823648899094 |    10

Time: 986ms total (execution 724ms / network 262ms)
WITH demo_composite_pk AS (
    UPDATE demo_composite_pk SET count = 10 WHERE productID = '6459754199800610838' RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM demo_composite_pk;
  gateway_region | crdb_region  | count
-----------------+--------------+--------
  gcp-us-west2   | gcp-us-west2 |    10

Time: 85ms total (execution 10ms / network 75ms)


This is good. We now have similar update performance from the US East and the West regions. What about EU?

   gateway_region  |   crdb_region    |      productid      | count
-------------------+------------------+---------------------+--------
  gcp-europe-west1 | gcp-europe-west1 | 3574678019714056204 |     1
  gcp-europe-west1 | gcp-europe-west1 | 2998217267410632716 |     2
  gcp-europe-west1 | gcp-europe-west1 | 7609903285838020620 |     3
  gcp-europe-west1 | gcp-europe-west1 | 1845295762803785740 |     4
  gcp-europe-west1 | gcp-europe-west1 | 6456981781231173644 |     5
  gcp-europe-west1 | gcp-europe-west1 | 4151138772017479692 |     6
  gcp-europe-west1 | gcp-europe-west1 | 8762824790444867596 |     7
  gcp-europe-west1 | gcp-europe-west1 | 5015829900472614924 |     8
  gcp-europe-west1 | gcp-europe-west1 | 2709986891258920972 |     9
  gcp-europe-west1 | gcp-europe-west1 | 7321672909686308876 |    10

Time: 1.046s total (execution 0.885s / network 0.162s)
WITH demo_composite_pk AS (
    UPDATE demo_composite_pk SET count = 10 WHERE productID = '3574678019714056204' RETURNING crdb_region, count
) SELECT gateway_region(), crdb_region, count FROM demo_composite_pk;
   gateway_region  |   crdb_region    | count
-------------------+------------------+--------
  gcp-europe-west1 | gcp-europe-west1 |    10

Time: 129ms total (execution 14ms / network 116ms)


This is great. We're able to maintain backward compatibility to legacy keys and take advantage of local writes.

It's worth calling out that having a partition column as part of PK is not absolutely necessary. As long as the field you're partitioning by region has a unique constraint, you should be good to go. To emphasize this point, let's do the following. Let's drop the unique constraint on the productid field.

DROP INDEX demo_composite_pk_productid_key;   


Execute an update on the table from any region.

   gateway_region  |   crdb_region    | count
-------------------+------------------+--------
  gcp-europe-west1 | gcp-europe-west1 |    10

Time: 411ms total (execution 157ms / network 254ms)


Latency goes up immediately. We can no longer maintain uniqueness across regions and must visit each region.

distribution: local
  vectorized: true

  • root
  │
  ├── • render
  │   │
  │   └── • scan buffer
  │         estimated row count: 0
  │         label: buffer 2 (demo_composite_pk)
  │
  └── • subquery
      │ id: @S1
      │ original sql: UPDATE demo_composite_pk SET count = 10 WHERE productid = '3574678019714056204' RETURNING crdb_region, count
      │ exec mode: all rows
      │
      └── • buffer
          │ label: buffer 2 (demo_composite_pk)
          │
          └── • update
              │ estimated row count: 0
              │ table: demo_composite_pk
              │ set: count
              │
              └── • render
                  │
                  └── • scan
                        estimated row count: 0 (0.26% of the table; stats collected 16 minutes ago; using stats forecast for 5 hours in the future)
                        table: demo_composite_pk@demo_composite_pk_pkey
                        spans: [/'gcp-europe-west1'/'gcp-europe-west1'/3574678019714056204 - /'gcp-europe-west1'/'gcp-europe-west1'/3574678019714056204] [/'gcp-europe-west1'/'gcp-us-east1'/3574678019714056204 - /'gcp-europe-west1'/'gcp-us-east1'/3574678019714056204] [/'gcp-europe-west1'/'gcp-us-west2'/3574678019714056204 - /'gcp-europe-west1'/'gcp-us-west2'/3574678019714056204] [/'gcp-us-east1'/'gcp-europe-west1'/3574678019714056204 - /'gcp-us-east1'/'gcp-europe-west1'/3574678019714056204] … (5 more)
                        locking strength: for update


Let's add it back using the example in the link above.

ALTER TABLE demo_composite_pk ADD CONSTRAINT productid_unique UNIQUE (productid);


Let's take a look at the partitions for productid across every region. We are going to filter the output of the following command as it is verbose.

SHOW PARTITIONS FROM TABLE demo_composite_pk;


Changing it to:

SELECT table_name, partition_name, column_names, index_name FROM [SHOW PARTITIONS FROM TABLE demo_composite_pk];
     table_name     |  partition_name  | column_names |                index_name
--------------------+------------------+--------------+-------------------------------------------
  demo_composite_pk | gcp-europe-west1 | crdb_region  | demo_composite_pk@demo_composite_pk_pkey
  demo_composite_pk | gcp-us-east1     | crdb_region  | demo_composite_pk@demo_composite_pk_pkey
  demo_composite_pk | gcp-us-west2     | crdb_region  | demo_composite_pk@demo_composite_pk_pkey
  demo_composite_pk | gcp-europe-west1 | crdb_region  | demo_composite_pk@productid_unique
  demo_composite_pk | gcp-us-east1     | crdb_region  | demo_composite_pk@productid_unique
  demo_composite_pk | gcp-us-west2     | crdb_region  | demo_composite_pk@productid_unique


Conclusion

In summary, if your application has a choice to migrate and pay the one-time cost to capture all of the benefits of what a multi-region database can offer, then opt for UUID PK. If your app is not that flexible, make sure you add a deterministic partition column and add a unique index on it. In my experiments, I still see a better overall performance with UUID, but it's the next best alternative.

CockroachDB Database cluster Cloud database

Published at DZone with permission of . See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Top 10 Best Practices for Web Application Testing
  • Fargate vs. Lambda: The Battle of the Future
  • Little's Law and Lots of Kubernetes
  • 10 Easy Steps To Start Using Git and GitHub

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: