Kubernetes 1.8: Using Cronjobs to Take Neo4j Backups

It's easier than ever to run Neo4j backup jobs against Kubernetes clusters. Check out how to use a Cronjob to execute a backup now that Kubernetes 1.8 has been released.

Mark Needham

Dec. 31, 17 · Tutorial

Likes (6)

Comment

Save

7.7K Views

With the release of Kubernetes 1.8, Cronjobs have graduated to beta, which means we can now more easily run Neo4j backup jobs against Kubernetes clusters.

Before we learn how to write a Cronjob, let’s first create a local Kubernetes cluster and deploy Neo4j.

Spin Up Kubernetes and Helm

minikube start --memory 8192
helm init && kubectl rollout status -w deployment/tiller-deploy --namespace=kube-system

Deploy a Neo4j Cluster

helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
helm install incubator/neo4j --name neo-helm --wait --set authEnabled=false,core.extraVars.NEO4J_dbms_backup_address=0.0.0.0:6362

Populate Our Cluster With Data

We can run the following command to check that our cluster is ready:

kubectl exec neo-helm-neo4j-core-0 \
  -- bin/cypher-shell --format verbose \
  "CALL dbms.cluster.overview() YIELD id, role RETURN id, role"
 
+-----------------------------------------------------+
| id                                     | role       |
+-----------------------------------------------------+
| "0b3bfe6c-6a68-4af5-9dd2-e96b564df6e5" | "LEADER"   |
| "09e9bee8-bdc5-4e95-926c-16ea8213e6e7" | "FOLLOWER" |
| "859b9b56-9bfc-42ae-90c3-02cedacfe720" | "FOLLOWER" |
+-----------------------------------------------------+

Now, let’s create some data:

kubectl exec neo-helm-neo4j-core-0 \
  -- bin/cypher-shell --format verbose \
  "UNWIND range(0,1000) AS id CREATE (:Person {id: id})"
 
0 rows available after 653 ms, consumed after another 0 ms
Added 1001 nodes, Set 1001 properties, Added 1001 labels

Now that our Neo4j cluster is running and contains data, we want to take regular backups.

Neo4j Backups

The Neo4j backup tool supports full and incremental backups.

Full Backup

A full backup streams a copy of the store files to the backup location and any transactions that happened during the backup. Those transactions are then applied to the backed-up copy of the database.

Incremental Backup

An incremental backup is triggered if there is already a Neo4j database in the backup location. In this case, there will be no copying of store files. Instead, the tool will copy any new transactions from Neo4j and apply them to the backup.

Backup Cronjob

We will use a Cronjob to execute a backup. In the example below, we attach a PersistentVolumeClaim to our Cronjob so that we can see both of the backup scenarios in action.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: backupdir-neo4j
  labels:
    app: neo4j-backup
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: neo4j-backup
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
            - name: backupdir-neo4j
              persistentVolumeClaim:
                claimName: backupdir-neo4j
          containers:
            - name: neo4j-backup
              image: neo4j:3.3.0-enterprise
              env:
                - name: NEO4J_ACCEPT_LICENSE_AGREEMENT
                  value: "yes"
              volumeMounts:
              - name: backupdir-neo4j
                mountPath: /tmp
              command: ["bin/neo4j-admin",  "backup", "--backup-dir", "/tmp", "--name", "backup", "--from", "neo-helm-neo4j-core-2.neo-helm-neo4j.default.svc.cluster.local:6362"]
          restartPolicy: OnFailure

$ kubectl apply -f cronjob.yaml 
cronjob "neo4j-backup" created

kubectl get jobs -w
NAME                      DESIRED   SUCCESSFUL   AGE
neo4j-backup-1510940940   1         1            34s

Now let’s view the logs from the job:

kubectl logs $(kubectl get pods -a --selector=job-name=neo4j-backup-1510940940 --output=jsonpath={.items..metadata.name})
 
Doing full backup...
2017-11-17 17:49:05.920+0000 INFO [o.n.c.s.StoreCopyClient] Copying neostore.nodestore.db.labels
...
2017-11-17 17:49:06.038+0000 INFO [o.n.c.s.StoreCopyClient] Copied neostore.labelscanstore.db 48.00 kB
2017-11-17 17:49:06.038+0000 INFO [o.n.c.s.StoreCopyClient] Done, copied 18 files
2017-11-17 17:49:06.094+0000 INFO [o.n.b.BackupService] Start recovering store
2017-11-17 17:49:07.669+0000 INFO [o.n.b.BackupService] Finish recovering store
Doing consistency check...
2017-11-17 17:49:07.716+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] Selected RecordFormat:StandardV3_2[v0.A.8] record format from store /tmp/backup
2017-11-17 17:49:07.716+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] Format not configured. Selected format from the store: RecordFormat:StandardV3_2[v0.A.8]
2017-11-17 17:49:07.755+0000 INFO [o.n.m.MetricsExtension] Initiating metrics...
....................  10%
...
.................... 100%
Backup complete.

All good so far. Now, let’s create more data:

kubectl exec neo-helm-neo4j-core-0 \
  -- bin/cypher-shell --format verbose \
  "UNWIND range(0,1000) AS id CREATE (:Person {id: id})"
 
 
0 rows available after 114 ms, consumed after another 0 ms
Added 1001 nodes, Set 1001 properties, Added 1001 labels

And wait for another backup job to run:

kubectl get jobs -w
NAME                      DESIRED   SUCCESSFUL   AGE
neo4j-backup-1510941180   1         1            2m
neo4j-backup-1510941240   1         1            1m
neo4j-backup-1510941300   1         1            17s

kubectl logs $(kubectl get pods -a --selector=job-name=neo4j-backup-1510941300 --output=jsonpath={.items..metadata.name})
 
Destination is not empty, doing incremental backup...
Doing consistency check...
2017-11-17 17:55:07.958+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] Selected RecordFormat:StandardV3_2[v0.A.8] record format from store /tmp/backup
2017-11-17 17:55:07.959+0000 INFO [o.n.k.i.s.f.RecordFormatSelector] Format not configured. Selected format from the store: RecordFormat:StandardV3_2[v0.A.8]
2017-11-17 17:55:07.995+0000 INFO [o.n.m.MetricsExtension] Initiating metrics...
....................  10%
...
.................... 100%
Backup complete.

If we were to extend this job further, we could have it copy each backup it takes to an S3 bucket, but I’ll leave that for another day.

Backup Kubernetes Neo4j

Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

Trending