CockroachDB TIL: Volume 5
This is the fifth installment in this series of articles covering short "Today I learned" topics as I work with various layers of CockroachDB.
Read the previous installments here:
- Volume 1
- Volume 2
- Volume 3
- Volume 4
Topics
Topic 1: Multi-Row Updates
There are multi-row inserts, upserts, and deletes, but I had not heard of a multi-row update. To my surprise, PostgreSQL supports multi-row updates with the UPDATE ... FROM syntax, and so does CockroachDB. It sends several rows in a single batch, thereby improving performance and reducing network hops.
CREATE TABLE t1 (key INT PRIMARY KEY, val1 STRING, val2 STRING);
INSERT INTO t1
(key, val1, val2)
VALUES
(1, 'a', 'b'),
(2, 'c', 'd'),
(3, 'e', 'f');
SELECT * FROM t1;
key | val1 | val2
------+------+-------
1 | a | b
2 | c | d
3 | e | f
UPDATE t1 AS t SET
val1 = t2.val1,
val2 = t2.val2
FROM (VALUES
(1, 'aa', 'bb'),
(2, 'cc', 'dd'),
(3, 'ee', 'ff')
) as t2(key, val1, val2)
WHERE t2.key = t.key;
key | val1 | val2
------+------+-------
1 | aa | bb
2 | cc | dd
3 | ee | ff
It took a few tries to get right; a Stack Overflow post helped me land on the correct syntax.
The following syntax also works; in other words, aliasing is optional, and you can reference the table by its actual name:
UPDATE t1
SET
val1 = t2.val1,
val2 = t2.val2
FROM (VALUES
(1, 'aaa', 'bba'),
(2, 'ccc', 'ddd'),
(3, 'eee', 'fff')
) as t2(key, val1, val2)
WHERE t2.key = t1.key;
key | val1 | val2
------+------+-------
1 | aaa | bba
2 | ccc | ddd
3 | eee | fff
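If the rows to update arrive as data rather than hand-written SQL, the VALUES list can be generated. Here is a hedged sketch with a hypothetical csv_to_update helper that reads key,val1,val2 lines from stdin and emits the same UPDATE ... FROM statement shown above; it does no quoting or escaping, so it is for trusted input only:

```shell
# Hypothetical helper: turn CSV rows into a multi-row UPDATE ... FROM.
# Table and column names match the t1 example above.
csv_to_update() {
  # Build the VALUES list: one "(key, 'val1', 'val2')" tuple per CSV line.
  values=$(awk -F, -v q="'" \
    '{ if (NR > 1) printf ",\n  "; printf "(%s, %s%s%s, %s%s%s)", $1, q, $2, q, q, $3, q }')
  # Emit the full statement around the generated tuples.
  printf 'UPDATE t1 SET val1 = t2.val1, val2 = t2.val2\nFROM (VALUES\n  %s\n) AS t2(key, val1, val2)\nWHERE t2.key = t1.key;\n' "$values"
}

# Example: print the statement (pipe it to `cockroach sql` to actually run it).
printf '1,aa,bb\n2,cc,dd\n3,ee,ff\n' | csv_to_update
```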
Topic 2: Batching Statements
Building on the previous topic: if the goal is to reduce round trips, sending an entire batch of updates on a single line also works. A single-line statement is sent to the gateway node as one batch, and CockroachDB implicitly batches the entire block.
UPDATE t1 SET val1 = 'aaaa', val2 = 'bbbb'
WHERE key = 1; UPDATE t1 SET val1 = 'cccc', val2 = 'dddd'
WHERE key = 2; UPDATE t1 SET val1 = 'eeef', val2 = 'ffff'
WHERE key = 3;
artem@free-tier.gcp-us-central1.cockroachlabs.cloud:26257/defaultdb> UPDATE t1 SET val1 = 'aaaa', val2 = 'bbbb'
WHERE key = 1; UPDATE t1 SET val1 = 'cccc', val2 = 'dddd'
WHERE key = 2; UPDATE t1 SET val1 = 'eeef', val2 = 'ffff'
WHERE key = 3;
UPDATE 1
Note: timings for multiple statements on a single line are not supported. See https://go.crdb.dev/issue-v/48180/v21.1.
As opposed to sending each statement one at a time:
UPDATE t1 SET val1 = 'aaaaa', val2 = 'bbbbb'
WHERE key = 1;
UPDATE t1 SET val1 = 'ccccc', val2 = 'ddddd'
WHERE key = 2;
UPDATE t1 SET val1 = 'eeeee', val2 = 'fffff'
WHERE key = 3;
artem@free-tier.gcp-us-central1.cockroachlabs.cloud:26257/defaultdb> UPDATE t1 SET val1 = 'aaaaa', val2 = 'bbbbb'
WHERE key = 1;
UPDATE t1 SET val1 = 'ccccc', val2 = 'ddddd'
WHERE key = 2;
UPDATE t1 SET val1 = 'eeeee', val2 = 'fffff'
WHERE key = 3;
UPDATE 1
Time: 47ms total (execution 5ms / network 41ms)
UPDATE 1
Time: 44ms total (execution 6ms / network 38ms)
UPDATE 1
Time: 43ms total (execution 6ms / network 38ms)
Topic 3: Inspect Binary SST Files in CockroachDB
CockroachDB is backed by the Pebble storage engine. We recently switched from RocksDB to Pebble, our own storage engine written in Go and inspired by RocksDB; I'll leave it to the blog post linked above to explain why. The main point of this section is that the underlying data in both RocksDB and Pebble is stored in files with the .sst extension. If you try to browse such a file with a standard utility, you won't see much, as the file is binary. There are various SST viewers available, and we have our own, built right into the cockroach binary. Remember, CockroachDB comes with all the batteries included! The best part is that you can read these files with the cluster offline.
I have a CockroachDB directory with some data, let's take a look:
cd ./cockroach-data/
tree | grep "\.sst"
├── 000021.sst
The following command will help us introspect the file:
cockroach debug pebble sstable scan 000021.sst --count 10
000021.sst
/Local/RangeID/1/r/RangeGCThreshold/0,0#0,SET : EMPTY
/Local/RangeID/1/r/RangeAppliedState/0,0#0,SET raft_applied_index:123 lease_applied_index:32 range_stats:<last_update_nanos:1642197117283621000 gc_bytes_age:42086 live_bytes:1944 live_count:46 key_bytes:899 key_count:46 val_bytes:1784 val_count:61 sys_bytes:291 sys_count:6 > raft_closed_timestamp:<wall_time:1642197114400256000 >
/Local/RangeID/1/r/RangeLease/0,0#0,SET repl=(n1,s1):1 seq=1 start=0,0 exp=1642197146.792966000,0 pro=1642197137.792966000,0
/Local/RangeID/1/r/RangePriorReadSummary/0,0#0,SET {Txn:<nil> Timestamp:0,0 Deleted:false KeyBytes:0 ValBytes:0 RawBytes:[65 103 5 187 3 10 2 10 0 18 2 10 0] IntentHistory:[] MergeTimestamp:<nil> TxnDidNotUpdateMeta:<nil>}
/Local/RangeID/1/r/RangeVersion/0,0#0,SET 21.2
/Local/RangeID/1/u/RaftHardState/0,0#0,SET term:6 vote:1 commit:123
/Local/RangeID/1/u/RaftLog/logIndex:11/0,0#0,SET Term:6 Index:11 : EMPTY
/Local/RangeID/1/u/RaftLog/logIndex:12/0,0#0,SET Term:6 Index:12 by lease #0
replicated_eval_result:<write_timestamp:<> delta:<> >
write batch:
<nil>
/Local/RangeID/1/u/RaftLog/logIndex:13/0,0#0,SET Term:6 Index:13 by lease #0
closed_timestamp:<> replicated_eval_result:<state:<lease:<start:<> expiration:<wall_time:1642196984800220000 > replica:<node_id:1 store_id:1 replica_id:1 > deprecated_start_stasis:<wall_time:1642196984800220000 > proposed_ts:<wall_time:1642196975800220000 > sequence:1 acquisition_type:Request > raft_closed_timestamp:<> > is_lease_request:true write_timestamp:<wall_time:1642196975801033000 > delta:<sys_bytes:76 sys_count:1 > prior_read_summary:<local:<low_water:<> > global:<low_water:<> > > > logical_op_log:<>
write batch:
Put: 0,0 /Local/RangeID/1/r/RangeLease (0x01698972726c6c2d00): repl=(n1,s1):1 seq=1 start=0,0 exp=1642196984.800220000,0 pro=1642196975.800220000,0
Put: 0,0 /Local/RangeID/1/r/RangePriorReadSummary (0x016989727270727300): {Txn:<nil> Timestamp:0,0 Deleted:false KeyBytes:0 ValBytes:0 RawBytes:[65 103 5 187 3 10 2 10 0 18 2 10 0] IntentHistory:[] MergeTimestamp:<nil> TxnDidNotUpdateMeta:<nil>}
/Local/RangeID/1/u/RaftLog/logIndex:14/0,0#0,SET Term:6 Index:14 by lease #1
proposer_lease_sequence:1 max_lease_index:1 closed_timestamp:<wall_time:1642196974801675000 > replicated_eval_result:<write_timestamp:<wall_time:1642196977801415000 > delta:<sys_bytes:57 sys_count:1 > > logical_op_log:<>
write batch:
Put: 0,0 /Local/Range/Min/QueueLastProcessed/"consistencyChecker" (0x016b120001716c7074636f6e73697374656e6379436865636b657200): {Txn:<nil> Timestamp:0,0 Deleted:false KeyBytes:0 ValBytes:0 RawBytes:[112 221 93 101 3 8 240 156 197 200 181 174 144 229 22] IntentHistory:[] MergeTimestamp:<nil> TxnDidNotUpdateMeta:<nil>}
Explaining the entire contents of the file is beyond the scope of this article, but one helpful application of this capability is confirming that native encryption at rest is functioning properly.
I have some data loaded into this cluster, let's look for any data containing "New York":
cockroach debug pebble sstable scan 000021.sst | grep "New York" | tail -n 10
/Table/52/1/"\xfeD\xe7\x1f\x90;H\x00\xbf\"s\x8f\xc8\x1d\xea\xc4"/0/1642197103.541702000,0#0,SET "\xa8\xf5&\xb7\n&\x02NY\x16\rNew York City\x13\xbc\xe9\x81\b"
/Table/52/1/"\xfe{\x90,\xdb\xebH\x00\xbf=\xc8\x16m\xf5\xc4\x00"/0/1642196996.701019000,0#0,SET "\xe5\xa0O\xa7\n&\x02NY\x16\rNew York City\x13\xa4\xa6\xb6\x02"
/Table/52/1/"\xfe\xe9\xec\xdc\xc7KH\x00\xbft\xf6nc\xa5\xeaR"/0/1642196996.701019000,0#0,SET "\xf5\xb2\xce\xf3\n&\x02NY\x16\rNew York City\x13\xa2\xfe\x8a\x04"
/Table/52/1/"\xff\x00\xb7\xdd~\x01H\x00\xbf\x80[\xee\xbf\x00\xfc\xf5"/0/1642197080.834041000,0#0,SET "\xa7\xf9/W\n&\x02NY\x16\rNew York City\x13\x92\xb3\xc2\x04"
/Table/52/1/"\xff\x1a\xae\xfb\xe9\xbdH\x00\xbf\x8dW}\xf4\u07ba$"/0/1642197066.200311000,0#0,SET "\xab\xe6\xdaV\n&\x02NY\x16\rNew York City\x13\xee\xb2\xda\x01"
/Table/52/1/"\xffB\x1d8\x01\xef@\x00\xbf\xa1\x0e\x9c\x00\xf7\xb6\xe9"/0/1642197003.063291000,0#0,SET "\x01\x91\x15\xcc\n&\x02NY\x16\rNew York City\x13\xfa\xa6\xe2\x01"
/Table/52/1/"\xffX\xbaّb@\x00\xbf\xac]lȱ>\x10"/0/1642196996.701019000,0#0,SET "\xbb\xfb\f\xd2\n&\x02NY\x16\rNew York City\x13\x86\x92\xab\x04"
/Table/52/1/"\xff\xc60\xef+AH\x00\xbf\xe3\x18w\x95\xa0\xd4C"/0/1642196996.701019000,0#0,SET "\xfc!\xf2\xd6\n&\x02NY\x16\rNew York City\x13\xa2\xb9\xe8\a"
/Table/52/1/"\xff\xe8>\x1c2\xde@\x00\xbf\xf4\x1f\x0e\x19o8\xa6"/0/1642197052.306718000,0#0,SET "\x9f\x80l\xf6\n&\x02NY\x16\rNew York City\x13ƣ\xcb\x02"
/Table/52/1/"\xff\xf7n\x8b\xb5\xda@\x00\xbf\xfb\xb7E\xda\xed)4"/0/1642197109.857052000,0#0,SET "-\xd1:\xec\n&\x02NY\x16\rNew York City\x13\xaa\x91\x93\x03"
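One crude way to run that confirmation from a script is to grep the raw SST bytes for a string you know lives in the data. This is a sketch with a hypothetical check_plaintext helper; grep -a forces text mode on a binary file, and a match proves the data is stored in plaintext, while the absence of a match is suggestive, not proof, that encryption is working:

```shell
# Hypothetical helper: scan a raw SST file for a known plaintext value.
# $1 = path to the .sst file, $2 = string to look for.
check_plaintext() {
  if grep -aq "$2" "$1"; then
    # The needle is readable straight from the raw bytes.
    echo "plaintext found: encryption at rest is NOT in effect"
    return 1
  fi
  echo "no plaintext match"
}

# usage: check_plaintext cockroach-data/000021.sst "New York"
```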
Topic 4: Save Docker Containers as Files
This topic is not so much about CockroachDB as about its ecosystem. Once in a while, we get customer inquiries where strict compliance rules prevent an organization from executing the typical workflows we take for granted. Case in point: a financial services organization with rigid guidelines for the container image lifecycle. In a nutshell, they cannot pull images straight from public registries; instead, they maintain their own regulated registry, which they populate with images their internal team deems safe. This particular customer had no means of pulling our Kubernetes Operator directly.
There are several ways to fetch our Kubernetes artifacts, either through Docker Hub or via our GitHub repo. To accelerate the path to success for this customer, and out of genuine curiosity, I decided to see what the possibilities were. Little did I know that this capability is built in and fairly straightforward. The magic is in the docker save command. I was able to quickly stitch together a runbook for this particular customer and, at the same time, learn a new trick with Docker! Without further ado:
docker pull cockroachdb/cockroach-operator:v2.5.0
docker save cockroachdb/cockroach-operator:v2.5.0 | gzip > cockroach-operator.tar.gz
docker pull cockroachdb/cockroach:v21.2.4
docker save cockroachdb/cockroach:v21.2.4 | gzip > cockroachdb.tar.gz
The first command pulls the tagged version of our operator. The second command saves the container image and pipes it via stdout to the gzip utility, which compresses the image into a file. Since we'll also need the CockroachDB image for this to work, I repeat the steps for that container.
Now that you've exported the images to files, you can load them back with the docker load command.
docker image load -i cockroachdb.tar.gz
e8228e50fe18: Loading layer 107MB/107MB
4716779a2c02: Loading layer 20.48kB/20.48kB
2da1a510186d: Loading layer 16.61MB/16.61MB
1a93f3a5fad3: Loading layer 5.632kB/5.632kB
0dc09568edcd: Loading layer 234.8MB/234.8MB
bbcfddf195ea: Loading layer 153.1kB/153.1kB
884cf86fb787: Loading layer 12.6MB/12.6MB
Loaded image: cockroachdb/cockroach:v21.2.4
docker image load -i cockroach-operator.tar.gz
d77bd8c80675: Loading layer 85.11MB/85.11MB
0237df4acf88: Loading layer 20.48kB/20.48kB
c82cef4821d2: Loading layer 211.5MB/211.5MB
d74de946f1f0: Loading layer 51.91MB/51.91MB
Loaded image: cockroachdb/cockroach-operator:v2.5.0
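To save several images in one go, the two-command pattern above can be wrapped in a small loop. A sketch with hypothetical helper names; the archive naming rule (last path segment, tag dropped) is my own convention, and the docker calls are left commented out so nothing is pulled by accident:

```shell
# Hypothetical helper: derive an archive file name from an image reference.
archive_name() {
  name="${1##*/}"                       # drop registry/namespace: cockroachdb/cockroach:v21.2.4 -> cockroach:v21.2.4
  printf '%s.tar.gz\n' "${name%%:*}"    # drop the tag -> cockroach.tar.gz
}

# Hypothetical helper: pull an image, then save and compress it to a file.
save_image() {
  docker pull "$1"
  docker save "$1" | gzip > "$(archive_name "$1")"
}

# for img in cockroachdb/cockroach-operator:v2.5.0 cockroachdb/cockroach:v21.2.4; do
#   save_image "$img"
# done

archive_name cockroachdb/cockroach:v21.2.4
```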
Topic 5: Evaluate Health Check Options for CockroachDB
Remember my TIL 3 post on pg_isready? Well, as part of my investigation of Unleash with CockroachDB, I stumbled on the Unleash docker-compose file, where they use a health check for PostgreSQL.
db:
  expose:
    - "5432"
  image: postgres:13
  environment:
    POSTGRES_DB: "db"
    POSTGRES_HOST_AUTH_METHOD: "trust"
  healthcheck:
    test: ["CMD", "pg_isready", "--username=postgres", "--host=127.0.0.1", "--port=5432"]
    interval: 2s
    timeout: 1m
    retries: 5
    start_period: 10s
Since we know pg_isready works with CockroachDB, wouldn't it be nice to use it as a container health check as well? The problem stems from our CockroachDB image not shipping with pg_isready installed. We could build a new image that includes both the cockroach binary and pg_isready, but that goes against the mantra of keeping images slim. We do, however, expose a health check endpoint we can use instead. I created a quick docker-compose file to illustrate the scenario:
version: '3.9'
services:
  roach-0:
    container_name: roach-0
    hostname: roach-0
    image: cockroachdb/cockroach:v21.2.4
    command: start-single-node --insecure
    ports:
      - 26257:26257
      - 8080:8080
    healthcheck:
      test: ["CMD", "curl", "http://roach-0:8080/health?ready=1"]
      interval: 2s
      timeout: 1m
      retries: 5
      start_period: 10s
You can now monitor your containers and leverage health checks with them.
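The same endpoint is handy outside of Compose too, e.g. as a readiness gate in a deployment script. A minimal sketch, assuming curl is available; host, port, and retry settings are my own placeholders, and any failure to get a response is treated as unhealthy:

```shell
# Hypothetical helper: poll a health URL until it responds or retries run out.
# $1 = URL, $2 = retries (default 5), $3 = seconds between attempts (default 2).
wait_healthy() {
  url="$1"; retries="${2:-5}"; delay="${3:-2}"
  n=1
  while [ "$n" -le "$retries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy"
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  echo "unhealthy"
  return 1
}

# usage: wait_healthy "http://roach-0:8080/health?ready=1"
```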
Published at DZone with permission of Artem Ervits. See the original article here.