Cassandra backup and recovery
This section discusses how to configure data backup and recovery for the Apache Cassandra database ring installed in the Apigee hybrid runtime plane. See also Cassandra datastore.
What you need to know about Cassandra backups
Cassandra is a replicated database that is configured to have at least three copies of your data in each region or data center. Cassandra uses streaming replication and read repairs to maintain the data replicas in each region or data center at any given point.
In hybrid, Cassandra backups are not enabled by default. It's a good practice, however, to enable Cassandra backups in case your data is accidentally deleted.
What is backed up?
The backup configuration described in this topic backs up the following entities:
- Cassandra schema including the user schema (Apigee keyspace definitions)
- Cassandra partition token information per node
- A snapshot of the Cassandra data
Where is backup data stored?
Backed up data is stored in a Google Cloud Storage bucket that you must create. Bucket creation and configuration are covered in this topic.
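As a rough illustration of the bucket setup, the sketch below wraps two `gsutil` calls in a helper that can print the commands instead of running them. The function names, the `DRY_RUN` switch, and the bucket name and location in the usage line are assumptions made for this example; only the 15-day retention period comes from this topic. Note that a Cloud Storage retention policy sets a minimum age before objects can be deleted; it does not delete old backups by itself.

```shell
# Sketch only: helper names and DRY_RUN are illustrative, not part of Apigee.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# create_backup_bucket BUCKET_NAME LOCATION
# Creates gs://BUCKET_NAME and sets a 15-day retention policy on it.
create_backup_bucket() {
  bucket="gs://$1"
  run gsutil mb -l "$2" "$bucket" &&
  run gsutil retention set 15d "$bucket"
}
```

To preview the commands without touching Cloud Storage, you could run `DRY_RUN=1 create_backup_bucket myname-cassandra-backup us-central1` (both values are placeholders).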
Scheduling Cassandra backups
Backups are scheduled as cron jobs in the runtime plane. To schedule Cassandra backups:
- Run the following create-service-account command to create a Google Cloud service account (SA) with the standard roles/storage.objectAdmin role. This SA role allows you to write backup data to Cloud Storage. Execute the following command in the hybrid installation root directory:
  ./tools/create-service-account apigee-cassandra OUTPUT_DIR
  For example:
  ./tools/create-service-account apigee-cassandra ./service-accounts
  For more information about Google Cloud service accounts, see Creating and managing service accounts.
- The create-service-account command saves a JSON file containing the service account private key. The file is saved in the same directory where the command executes. You will need the path to this file in the following steps.
- Create a Cloud Storage bucket. Specify a reasonable data retention policy for the bucket. Apigee recommends a data retention policy of 15 days.
- Open your overrides.yaml file.
- Add the following cassandra.backup properties to enable backup. Do not remove any of the properties that are already configured.

  Parameters

      cassandra:
        ...
        backup:
          enabled: true
          serviceAccountPath: SA_JSON_FILE_PATH
          dbStorageBucket: CLOUD_STORAGE_BUCKET_PATH
          schedule: BACKUP_SCHEDULE_CODE
        ...

  Example

      ...
      cassandra:
        storage:
          type: gcepd
          capacity: 50Gi
          gcepd:
            replicationType: regional-pd
        sslRootCAPath: "/Users/myhome/ssh/cassandra.crt"
        sslCertPath: "/Users/myhome/ssh/cassandra.crt"
        sslKeyPath: "/Users/myhome/ssh/cassandra.key"
        auth:
          default:
            password: "abc123"
          admin:
            password: "abc234"
          ddl:
            password: "abc345"
          dml:
            password: "abc456"
        nodeSelector:
          key: cloud.google.com/gke-nodepool
          value: apigee-data
        backup:
          enabled: true
          serviceAccountPath: "/Users/myhome/.ssh/my_cassandra_backup.json"
          dbStorageBucket: "gs://myname-cassandra-backup"
          schedule: "45 23 * * 6"
        ...
  Where:

  | Property | Description |
  |---|---|
  | backup:enabled | Backup is disabled by default. You must set this property to true. |
  | backup:serviceAccountPath | SA_JSON_FILE_PATH: The path on your filesystem to the service account JSON file that was downloaded when you ran the create-service-account command. |
  | backup:dbStorageBucket | CLOUD_STORAGE_BUCKET_PATH: The Cloud Storage bucket path in this format: gs://BUCKET_NAME. The gs:// is required. |
  | backup:schedule | BACKUP_SCHEDULE_CODE: The time when the backup starts, specified in standard crontab syntax. |

- Apply the configuration changes to the new cluster. For example:
  ./apigeectl apply -f overrides.yaml
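Before applying the overrides, it can help to sanity-check the two values that are easiest to mistype. The sketch below is a hypothetical pre-flight check: the function names are invented for this example, and the checks (a gs:// prefix and exactly five crontab fields) only approximate what the hybrid runtime actually accepts.

```shell
# Sketch only: function names are illustrative, not part of apigeectl.

# check_bucket_path PATH: succeed only if PATH looks like gs://BUCKET_NAME.
check_bucket_path() {
  case "$1" in
    gs://?*) return 0 ;;
    *) echo "dbStorageBucket must start with gs://" >&2; return 1 ;;
  esac
}

# check_schedule "CRON": succeed only if the string has exactly five
# whitespace-separated fields, and print them with their crontab meaning.
check_schedule() {
  set -f        # keep the shell from expanding the * fields as file globs
  set -- $1     # split the schedule into positional parameters
  set +f
  if [ "$#" -ne 5 ]; then
    echo "schedule needs 5 crontab fields" >&2
    return 1
  fi
  echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
}
```

For the example schedule, `check_schedule "45 23 * * 6"` prints `minute=45 hour=23 day-of-month=* month=* day-of-week=6`, that is, 23:45 every Saturday.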
Restoring backups
Restoration takes your data from the backup location and restores the data into a new Cassandra cluster with the same number of nodes. No data is taken from the old Cassandra cluster.
The restoration instructions below are for single region deployments that use Google Cloud Storage for backups. For other deployments, see the following:
- For single region deployments that do not use Google Cloud Storage for backups, see Backup and recovery without Google Cloud.
- For multi-region deployments, see Multi-region deployment on GKE and GKE on-prem.
To restore Cassandra backups:
- Create a new namespace within the existing Kubernetes cluster that will be used to restore the hybrid runtime deployment. The new namespace must have a different name from the original namespace; do not restore into the old namespace.
- In the root hybrid installation directory, create a new overrides-restore.yaml file.
- Copy the complete Cassandra configuration from your original overrides.yaml file into the new overrides-restore.yaml file. See the following command for an example.
  cp ./overrides.yaml ./overrides-restore.yaml
- Add a namespace element to the new overrides-restore.yaml file. Do not use the same namespace that was used for your original cluster.

  Parameters

      namespace: YOUR_RESTORE_NAMESPACE
      cassandra:
        ...
        restore:
          enabled: true
          snapshotTimestamp: TIMESTAMP
          serviceAccountPath: SA_JSON_FILE_PATH
          dbStorageBucket: CLOUD_STORAGE_BUCKET_PATH
          image:
            pullPolicy: Always
        ...

  Example

      ...
      namespace: cassandra-restore
      cassandra:
        storage:
          type: gcepd
          capacity: 50Gi
          gcepd:
            replicationType: regional-pd
        sslRootCAPath: "/Users/myhome/ssh/cassandra.crt"
        sslCertPath: "/Users/myhome/ssh/cassandra.crt"
        sslKeyPath: "/Users/myhome/ssh/cassandra.key"
        auth:
          default:
            password: "abc123"
          admin:
            password: "abc234"
          ddl:
            password: "abc345"
          dml:
            password: "abc456"
        nodeSelector:
          key: cloud.google.com/gke-nodepool
          value: apigee-data
        restore:
          enabled: true
          snapshotTimestamp: "20210203213003"
          serviceAccountPath: "/Users/myhome/.ssh/my_cassandra_backup.json"
          dbStorageBucket: "gs://myname-cassandra-backup"
          image:
            pullPolicy: Always
        ...
  Where:

  | Property | Description |
  |---|---|
  | namespace | YOUR_RESTORE_NAMESPACE: The name of the new namespace you created in step 1 for the new Cassandra cluster. Do not use the same namespace you used for your original cluster. |
  | restore:enabled | Restore is disabled by default. You must set this property to true. |
  | restore:snapshotTimestamp | TIMESTAMP: The timestamp of the backup snapshot to restore. To check what timestamps can be used, go to the dbStorageBucket and look at the files that are present in the bucket. Each file name contains a timestamp value such as the following: backup_20210203213003_apigee-cassandra-default-0.tgz, where 20210203213003 is the snapshotTimestamp value you would use if you wanted to restore the backups created at that point in time. |
  | restore:serviceAccountPath | SA_JSON_FILE_PATH: The path on your filesystem to the service account you created for the backup. |
  | restore:dbStorageBucket | CLOUD_STORAGE_BUCKET_PATH: The Cloud Storage bucket path where your backup data is stored in the following format: gs://BUCKET_NAME. The gs:// is required. |

- Change the app label on any Cassandra nodes in the old namespace by executing the following command:
  kubectl label pods --overwrite --namespace=OLD_NAMESPACE -l app=apigee-cassandra app=apigee-cassandra-old
- Create a new hybrid runtime deployment. This will create a new Cassandra cluster and begin restoring the backup data into the cluster:
./apigeectl init -f ../overrides-restore.yaml
./apigeectl apply -f ../overrides-restore.yaml
Once the restoration is complete, the traffic must be switched to use the Cassandra cluster in the new namespace. Run the following commands to switch the traffic:
kubectl get rs -n OLD_NAMESPACE # look for the 'apigee-connect' replicaset
kubectl patch rs -n OLD_NAMESPACE APIGEE_CONNECT_RS_NAME -p '{"spec":{"replicas" : 0}}'
- Once the traffic switch is complete, you can reconfigure backups on the restored cluster by removing the restore configuration and adding the backup configuration to the overrides-restore.yaml file. Replace YOUR_RESTORE_NAMESPACE with the new namespace name created in step 1.

      namespace: YOUR_RESTORE_NAMESPACE
      cassandra:
        ...
        backup:
          enabled: true
          serviceAccountPath: SA_JSON_FILE_PATH
          dbStorageBucket: CLOUD_STORAGE_BUCKET_PATH
          schedule: BACKUP_SCHEDULE_CODE
        ...

  Then apply the backup configuration with the following command:
  ./apigeectl apply -f ../overrides-restore.yaml
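Choosing a snapshotTimestamp for the restore means reading timestamps out of the backup file names in the bucket. The sketch below shows one way to do that, assuming object names follow the backup_TIMESTAMP_*.tgz shape and 14-digit timestamps seen in the examples in this topic; the function name is invented for this example.

```shell
# Sketch only: assumes names shaped like backup_<14 digits>_<anything>.tgz.

# list_snapshot_timestamps: read object names on stdin and print each
# distinct timestamp once, oldest first.
list_snapshot_timestamps() {
  sed -n 's/^.*backup_\([0-9]\{14\}\)_.*\.tgz$/\1/p' | sort -u
}
```

A possible use is `gsutil ls -r gs://BUCKET_NAME | list_snapshot_timestamps`, where BUCKET_NAME is your backup bucket. Lines that do not match the naming pattern are ignored.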
Viewing the restore logs
You can check the restore job logs and use grep to search for the string error to confirm that the restore log has no errors.
Verify the restore completed
Use the following command to check if the restore operation completed:
kubectl get pods
The output is similar to the following:
    NAME                             READY   STATUS      RESTARTS   AGE
    apigee-cassandra-default-0       1/1     Running     0          1h
    apigee-cassandra-default-1       1/1     Running     0          1h
    apigee-cassandra-default-2       1/1     Running     0          59m
    apigee-cassandra-restore-b4lgf   0/1     Completed   0          51m
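If you want to script this check rather than read the table by eye, a small filter over the kubectl output is one option. The sketch below assumes the column order shown above (NAME, READY, STATUS) and the apigee-cassandra-restore- pod name prefix from the sample output; the function name is made up for this example.

```shell
# Sketch only: relies on the default `kubectl get pods` column layout.

# restore_completed: read `kubectl get pods` output on stdin and exit 0 only
# if at least one apigee-cassandra-restore-* pod has STATUS "Completed".
restore_completed() {
  awk '$1 ~ /^apigee-cassandra-restore-/ && $3 == "Completed" { found = 1 }
       END { exit !found }'
}
```

For example: `kubectl get pods -n YOUR_RESTORE_NAMESPACE | restore_completed && echo "restore job finished"`.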
View the restore logs
Use the following command to view the restore logs:
kubectl logs -f apigee-cassandra-restore-b4lgf
The output is similar to the following:
    Restore Logs:

    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    to download file gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1/backup_20190405011309_schema.tgz
    INFO: download successfully extracted the backup files from gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
    finished downloading schema.cql
    to create schema from 10.32.0.28

    Warnings:
    dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0
    dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0

    Warnings:
    dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0
    dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0

    INFO: the schema has been restored
    starting apigee-cassandra-default-0 in default
    starting apigee-cassandra-default-1 in default
    starting apigee-cassandra-default-2 in default
    84 95 106
    waiting on waiting nodes $pid to finish 84
    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    INFO: restore downloaded tarball and extracted the file from gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
    INFO: restore downloaded tarball and extracted the file from gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
    INFO: restore downloaded tarball and extracted the file from gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
    INFO 12:02:28 Configuration location: file:/etc/cassandra/cassandra.yaml
    …...
    INFO 12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed

    Summary statistics:
       Connections per host    : 3
       Total files transferred : 2
       Total bytes transferred : 0.378KiB
       Total duration          : 5048 ms
       Average transfer rate   : 0.074KiB/s
       Peak transfer rate      : 0.075KiB/s

    progress: [/10.32.1.155]0:1/1 100% 1:1/1 100% [/10.32.0.28]1:1/1 100% 0:1/1 100% [/10.32.3.220]0:1/1 100% 1:1/1 100% total: 100% 0.000KiB/s (avg: 0.074KiB/s)
    INFO 12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed
    progress: [/10.32.1.155]0:1/1 100% 1:1/1 100% [/10.32.0.28]1:1/1 100% 0:1/1 100% [/10.32.3.220]0:1/1 100% 1:1/1 100% total: 100% 0.000KiB/s (avg: 0.074KiB/s)
    INFO 12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed
    INFO 12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed
    INFO: ./apigee/data/cassandra/data/ks1/user-9fbae960571411e99652c7b15b2db6cc restored successfully
    INFO: Restore 20190405011309 completed
    INFO: ./apigee/data/cassandra/data/ks1/user-9fbae960571411e99652c7b15b2db6cc restored successfully
    INFO: Restore 20190405011309 completed
    waiting on waiting nodes $pid to finish 106
    Restore finished
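The grep check for errors mentioned under "Viewing the restore logs" could be wrapped as follows. This is a minimal sketch: the function name is illustrative, and matching the word "error" case-insensitively is an assumption about how failures appear in the log, so a clean result is a hint rather than proof that the restore succeeded.

```shell
# Sketch only: treats any line containing "error" (any case) as a failure.

# log_errors: read a log on stdin, print the matching lines, and exit 0 if
# the log is clean or 1 if at least one line matched.
log_errors() {
  if grep -i 'error'; then
    return 1
  fi
  return 0
}
```

For example, using the pod name from the sample output: `kubectl logs apigee-cassandra-restore-b4lgf | log_errors && echo "no errors found"`. Deprecation warnings such as the dclocal_read_repair_chance lines above are not matched.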
Verify backup job
After your backup cronjob is scheduled, you can also verify the backup job by listing the pods:
kubectl get pods
The output is similar to the following:
    NAME                                       READY   STATUS    RESTARTS   AGE
    apigee-cassandra-default-0                 1/1     Running   0          2h
    apigee-cassandra-default-1                 1/1     Running   0          2h
    apigee-cassandra-default-2                 1/1     Running   0          2h
    apigee-cassandra-backup-1554515580-pff6s   0/1     Running   0          54s
Check the backup logs
The backup job:
- Creates a schema.cql file.
- Uploads it to your storage bucket.
- Echoes the node to back up the data and uploads it at the same time.
- Waits until all of the data is uploaded.
Use the following command to view the backup logs:
kubectl logs -f apigee-cassandra-backup-1554515580-pff6s
The output is similar to the following:
    myusername-macbookpro:cassandra-backup-utility myusername$ kubectl logs -f apigee-cassandra-backup-1554577680-f9sc4
    starting apigee-cassandra-default-0 in default
    starting apigee-cassandra-default-1 in default
    starting apigee-cassandra-default-2 in default
    35 46 57
    waiting on process 35
    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    Requested creating snapshot(s) for [all keyspaces] with snapshot name [20190406190808] and options {skipFlush=false}
    Snapshot directory: 20190406190808
    INFO: backup created cassandra snapshot 20190406190808
    tar: Removing leading `/' from member names
    /apigee/data/cassandra/data/ks1/mytest3-37bc2df0587811e98e8d875b0ed64754/snapshots/
    /apigee/data/cassandra/data/ks1/mytest3-37bc2df0587811e98e8d875b0ed64754/snapshots/20190406190808/
    /apigee/data/cassandra/data/ks1/mytest3-37bc2df0587811e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Data.db
    Requested creating snapshot(s) for [all keyspaces] with snapshot name [20190406190808] and options {skipFlush=false}
    Requested creating snapshot(s) for [all keyspaces] with snapshot name [20190406190808] and options {skipFlush=false}
    Snapshot directory: 20190406190808
    INFO: backup created cassandra snapshot 20190406190808
    tar: Removing leading `/' from member names
    /apigee/data/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/snapshots/
    /apigee/data/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/snapshots/20190406190808/
    /apigee/data/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/snapshots/20190406190808/manifest.json
    /apigee/data/cassandra/data/system/prepared_statements-18a9c2576a0c3841ba718cd529849fef/snapshots/
    /apigee/data/cassandra/data/system/prepared_statements-18a9c2576a0c3841ba718cd529849fef/snapshots/20190406190808/
    /apigee/data/cassandra/data/system/prepared_statements-18a9c2576a0c3841ba718cd529849fef/snapshots/20190406190808/manifest.json
    /apigee/data/cassandra/data/system/range_xfers-55d764384e553f8b9f6e676d4af3976d/snapshots/
    /apigee/data/cassandra/data/system/range_xfers-55d764384e553f8b9f6e676d4af3976d/snapshots/20190406190808/
    /apigee/data/cassandra/data/system/range_xfers-55d764384e553f8b9f6e676d4af3976d/snapshots/20190406190808/manifest.json
    /apigee/data/cassandra/data/system/peer_events-59dfeaea8db2334191ef109974d81484/snapshots/
    /apigee/data/cassandra/data/system/peer_events-59dfeaea8db2334191ef109974d81484/snapshots/20190406190808/
    /apigee/data/cassandra/data/system/peer_events-59dfeaea8db2334191ef109974d81484/snapshots/20190406190808/manifest.json
    /apigee/data/cassandra/data/system/built_views-4b3c50a9ea873d7691016dbc9c38494a/snapshots/
    /apigee/data/cassandra/data/system/built_views-4b3c50a9ea873d7691016dbc9c38494a/snapshots/20190406190808/
    /apigee/data/cassandra/data/system/built_views-4b3c50a9ea873d7691016dbc9c38494a/snapshots/20190406190808/manifest.json
    ……
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Filter.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-CompressionInfo.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Index.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Statistics.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Data.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Index.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Statistics.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-TOC.txt
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Statistics.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Summary.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Filter.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Summary.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Index.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/manifest.json
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Filter.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Digest.crc32
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Summary.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Data.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-TOC.txt
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/schema.cql
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-CompressionInfo.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Digest.crc32
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-TOC.txt
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Data.db
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Digest.crc32
    /apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-CompressionInfo.db
    ……
    /tmp/tokens.txt
    / [1 files][ 0.0 B/ 0.0 B]
    Operation completed over 1 objects.
    / [1 files][ 0.0 B/ 0.0 B]
    Operation completed over 1 objects.
    INFO: backup created tarball and transferred the file to gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
    INFO: removing cassandra snapshot
    INFO: backup created tarball and transferred the file to gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
    INFO: removing cassandra snapshot
    Requested clearing snapshot(s) for [all keyspaces]
    INFO: Backup 20190406190808 completed
    waiting on process 46
    Requested clearing snapshot(s) for [all keyspaces]
    INFO: Backup 20190406190808 completed
    Requested clearing snapshot(s) for [all keyspaces]
    waiting on process 57
    INFO: Backup 20190406190808 completed
    waiting result
    to get schema from 10.32.0.28
    INFO: /tmp/schema.cql has been generated
    Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
    tar: removing leading '/' from member names
    tmp/schema.cql
    Copying from <TDIN>...
    / [1 files][ 0.0 B/ 0.0 B]
    Operation completed over 1 objects.
    INFO: backup created tarball and transferred the file to gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
    finished uploading schema.cql
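A long backup log like the one above is easier to check with a filter than by eye. The sketch below counts the per-node "Backup ... completed" lines and looks for the closing "finished uploading schema.cql" line. Both patterns are taken from this sample output and are an assumption about what a healthy log contains; the wording may differ between hybrid versions. The function name is invented for this example.

```shell
# Sketch only: the matched phrases come from the sample log, not a documented contract.

# backup_log_ok EXPECTED_NODES: read a backup log on stdin and exit 0 only if
# it has EXPECTED_NODES "INFO: Backup <timestamp> completed" lines and the
# final schema upload line.
backup_log_ok() {
  awk -v want="$1" '
    /INFO: *Backup *[0-9]+ *completed/ { completed++ }
    /finished *uploading *schema\.cql/ { schema = 1 }
    END { exit !((completed + 0) == (want + 0) && schema) }'
}
```

For a three-node ring you might run `kubectl logs apigee-cassandra-backup-1554515580-pff6s | backup_log_ok 3`, substituting the name of your own backup pod.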
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.