Cassandra backup and recovery

You are currently viewing version 1.2 of the Apigee hybrid documentation. This version is end of life. You should upgrade to a newer version. For more information, see Supported versions.

This section discusses how to configure data backup and recovery for the Apache Cassandra database ring installed in the Apigee hybrid runtime plane. See also Cassandra database.

What you need to know about Cassandra backups

Cassandra is a replicated database that is configured to have at least 3 copies of your data in each region or data center. Cassandra uses streaming replication and read repairs to maintain the data replicas in each region or data center at any given point.

In hybrid, Cassandra backups are not enabled by default. It's a good practice, however, to enable Cassandra backups in case your data is accidentally deleted.

What is backed up?

The backup configuration described in this topic backs up the following entities:

  • Cassandra schema including the user schema (Apigee keyspace definitions)
  • Cassandra partition token information per node
  • A snapshot of the Cassandra data

Where is backup data stored?

Backed up data is stored in a Google Cloud Storage (GCS) bucket that you must create. Bucket creation and configuration is covered in this topic.

Scheduling Cassandra backups

Backups are scheduled as cron jobs in the runtime plane. To schedule Cassandra backups:

  1. Run the following create-service-account command to create a GCP service account (SA) with the standard roles/storage.objectAdmin role. This SA role allows you to write backup data to Google Cloud Storage (GCS). Execute the following command in the hybrid installation root directory:
    ./tools/create-service-account apigee-cassandra output-dir
    For example:
    ./tools/create-service-account apigee-cassandra ./service-accounts
    For more information about GCP service accounts, see Creating and managing service accounts.
  2. The create-service-account command saves a JSON file containing the service account private key. The file is saved in the same directory where the command executes. You will need the path to this file in the following steps.
  3. Create a GCS bucket. Specify a reasonable dataretention policy for the bucket. Apigee recommends a data retention policy of 15 days.
  4. Open your overrides.yaml file.
  5. Add the following cassandra.backup properties to enable backup. Do not remove any of the properties that are already configured.
    cassandra:
      ...
      backup:
        enabled: true
        serviceAccountPath: sa_json_file_path
        dbStorageBucket: gcs_bucket_path
        schedule: backup_schedule_code
      ...
    Where:
    Property            Description
    enabled             Backup is disabled by default. You must set this property to true.
    serviceAccountPath  The path on your filesystem to the service account JSON file that was downloaded when you ran ./tools/create-service-account.
    dbStorageBucket     GCS storage bucket path in this format: gs://bucket_name. The gs:// is required.
    schedule            The time when the backup starts, specified in standard crontab syntax. Default: 0 2 * * *

    Note: Avoid scheduling a backup that starts a short time after you apply the backup configuration to your cluster. When you apply the backup configuration, Kubernetes recreates the Cassandra nodes. If the backup starts before the nodes restart (possibly several minutes) the backup will fail.

    For example:
    ...
    cassandra:
      storage:
        type: gcepd
        capacity: 50Gi
        gcepd:
          replicationType: regional-pd
      sslRootCAPath: "/Users/myhome/ssh/cassandra.crt"
      sslCertPath: "/Users/myhome/ssh/cassandra.crt"
      sslKeyPath: "/Users/myhome/ssh/cassandra.key"
      auth:
        default:
          password: "abc123"
        admin:
          password: "abc234"
        ddl:
          password: "abc345"
        dml:
          password: "abc456"
      nodeSelector:
        key: cloud.google.com/gke-nodepool
        value: apigee-data
      backup:
        enabled: true
        serviceAccountPath: "/Users/myhome/.ssh/my_cassandra_backup.json"
        dbStorageBucket: "gs://myname-cassandra-backup"
        schedule: "45 23 * * 6"
    ...
  6. Apply the configuration changes to the new cluster. For example:
    ./apigeectl apply -c cassandra -f my-overrides.yaml
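
Before applying the configuration, it can help to sanity-check the three values added in step 5. The following is a minimal bash sketch under stated assumptions: it is not part of apigeectl, the function names are hypothetical, and the schedule check only counts the five crontab fields without validating their ranges.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight checks for cassandra.backup values.
# These helpers are hypothetical and are not shipped with Apigee hybrid.

# Succeeds when the path starts with gs:// followed by at least one character.
valid_bucket_path() {
  case "$1" in
    gs://?*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Succeeds when the schedule has exactly five whitespace-separated fields
# (minute, hour, day of month, month, day of week).
valid_cron_field_count() {
  printf '%s\n' "$1" | awk '{ exit (NF == 5 ? 0 : 1) }'
}

# Checks the service account file, bucket path, and schedule together.
# Prints one message per problem to stderr and returns non-zero if any check fails.
check_backup_settings() {
  local sa_path=$1 bucket=$2 schedule=$3 status=0
  if [ ! -f "$sa_path" ]; then
    echo "service account file not found: $sa_path" >&2
    status=1
  fi
  if ! valid_bucket_path "$bucket"; then
    echo "bucket path must look like gs://bucket_name: $bucket" >&2
    status=1
  fi
  if ! valid_cron_field_count "$schedule"; then
    echo "schedule must have five crontab fields: $schedule" >&2
    status=1
  fi
  return "$status"
}
```

With the values from the example above, the call would be check_backup_settings "/Users/myhome/.ssh/my_cassandra_backup.json" "gs://myname-cassandra-backup" "45 23 * * 6". In crontab syntax, 45 23 * * 6 means 23:45 every Saturday.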

Restoring backups

Restoration takes the data from the backup location and restores it into a new Cassandra cluster with the same number of pods. The new cluster must have a namespace that is different than your runtime plane cluster.

To restore Cassandra backups:

  1. Create a new Kubernetes cluster with a new namespace. You cannot use the same cluster/namespace that you used for the original hybrid installation.
  2. In the root hybrid installation directory, create a new overrides-restore.yaml file.
  3. Copy the complete Cassandra configuration from your original overrides.yaml file into the new one.
  4. Add a namespace element. Do not use the same namespace you used for your original cluster.
  5. Add the cassandra.restore properties. The resulting file follows this template:
    namespace: your-restore-namespace
    cassandra:
      storage:
        type: gcepd
        capacity: 50Gi
        gcepd:
          replicationType: regional-pd
      nodeSelector:
        key: cloud.google.com/gke-nodepool
        value: apigee-data
      sslRootCAPath: path_to_root_ca_file
      sslCertPath: path_to_ssl_cert_file
      sslKeyPath: path_to_ssl_key_file
      auth:
        default:
          password: your_cassandra_password
        admin:
          password: admin_password
        ddl:
          password: ddl_password
        dml:
          password: dml_password
      restore:
        enabled: true
        snapshotTimestamp: timestamp
        serviceAccountPath: sa_json_file_path
        dbStorageBucket: gcs_bucket_path
        image:
          pullPolicy: Always
    Where:
    Property            Description
    ssl*Path, auth.*    Use the same TLS auth credentials you used to create the original Cassandra database.
    snapshotTimestamp   The timestamp of the backup snapshot to restore.
    serviceAccountPath  The path on your filesystem to the service account you created for the backup.
    dbStorageBucket     GCS storage bucket path where your backup is stored, in this format: gs://bucket_name. The gs:// is required.
    For example:
    namespace: cassandra-restore
    cassandra:
      storage:
        type: gcepd
        capacity: 50Gi
        gcepd:
          replicationType: regional-pd
      sslRootCAPath: "/Users/myhome/ssh/cassandra.crt"
      sslCertPath: "/Users/myhome/ssh/cassandra.crt"
      sslKeyPath: "/Users/myhome/ssh/cassandra.key"
      auth:
        default:
          password: "abc123"
        admin:
          password: "abc234"
        ddl:
          password: "abc345"
        dml:
          password: "abc456"
      nodeSelector:
        key: cloud.google.com/gke-nodepool
        value: apigee-data
      restore:
        enabled: true
        snapshotTimestamp: "20190417002207"
        serviceAccountPath: "/Users/myhome/.ssh/my_cassandra_backup.json"
        dbStorageBucket: "gs://myname-cassandra-backup"
        image:
          pullPolicy: Always

    Where snapshotTimestamp is the timestamp associated with the backup you are restoring.

  6. Create the new Cassandra cluster by applying the restore configuration. Note that when you apply a new or modified backup configuration to a cluster, the existing Cassandra nodes are restarted.
    ./apigeectl apply -c cassandra -f ./overrides-restore.yaml
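
To find a value for snapshotTimestamp, one option is to list the objects in the backup bucket and pull the timestamps out of their names. The sketch below is a hedged example: it assumes that each backup writes an object named backup_<timestamp>_schema.tgz, which is inferred only from the sample restore log on this page, and the function name is hypothetical. Check the actual object names in your bucket before relying on it.

```shell
#!/usr/bin/env bash
# Reads object names on stdin and prints the unique 14-digit timestamps
# found in names that end with backup_<timestamp>_schema.tgz, oldest first.
# The naming pattern is an assumption taken from the sample restore log.
list_snapshot_timestamps() {
  sed -n 's/.*backup_\([0-9]\{14\}\)_schema\.tgz$/\1/p' | sort -u
}
```

For example, gsutil ls -r gs://myname-cassandra-backup | list_snapshot_timestamps | tail -n 1 would print the most recent timestamp under that assumption. Because the timestamps are fixed-width digit strings, a plain lexical sort also orders them chronologically.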

Viewing the restore logs

You can check the restore job logs and grep for error to make sure the restore log has no errors.

Verify the restore completed

To check if the restore operation completed:

kubectl get pods
NAME                           READY     STATUS      RESTARTS   AGE
apigee-cassandra-0             1/1       Running     0          1h
apigee-cassandra-1             1/1       Running     0          1h
apigee-cassandra-2             1/1       Running     0          59m
apigee-cassandra-restore-b4lgf 0/1       Completed   0          51m
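
If you want to check the restore pod's status from a script rather than by eye, the STATUS column can be filtered out of the kubectl get pods output. This is a small sketch with a hypothetical helper name; it assumes the default column layout shown above (NAME, READY, STATUS) and that restore pods are named with the apigee-cassandra-restore- prefix, as in the sample.

```shell
#!/usr/bin/env bash
# Reads `kubectl get pods` output on stdin and prints the STATUS column
# of every pod whose name starts with the given prefix.
pod_status() {
  awk -v prefix="$1" 'index($1, prefix) == 1 { print $3 }'
}
```

For example, kubectl get pods | pod_status apigee-cassandra-restore- prints Completed for the sample output above. If the restore cluster uses a non-default namespace, add kubectl's -n option with that namespace.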

View the restore logs

To view the restore logs:

kubectl logs -f apigee-cassandra-restore-b4lgf

Restore Logs:

Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
to download file gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1/backup_20190405011309_schema.tgz
INFO: download sucessfully extracted the backup files from gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
finished downloading schema.cql
to create schema from 10.32.0.28
Warnings :
dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0
dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0
Warnings :
dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0
dclocal_read_repair_chance table option has been deprecated and will be removed in version 4.0
INFO: the schema has been restored
starting apigee-cassandra-0 in default
starting apigee-cassandra-1 in default
starting apigee-cassandra-2 in default
84 95 106
waiting on waiting nodes $pid to finish  84
Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
INFO: restore downloaded  tarball and extracted the file from  gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
INFO: restore downloaded  tarball and extracted the file from  gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
INFO: restore downloaded  tarball and extracted the file from  gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
INFO  12:02:28 Configuration location: file:/etc/cassandra/cassandra.yaml
…...
INFO  12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed
Summary statistics:
   Connections per host    : 3
   Total files transferred : 2
   Total bytes transferred : 0.378KiB
   Total duration          : 5048 ms
   Average transfer rate   : 0.074KiB/s
   Peak transfer rate      : 0.075KiB/s
progress: [/10.32.1.155]0:1/1 100% 1:1/1 100% [/10.32.0.28]1:1/1 100% 0:1/1 100% [/10.32.3.220]0:1/1 100% 1:1/1 100% total: 100% 0.000KiB/s (avg: 0.074KiB/s)
INFO  12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed
progress: [/10.32.1.155]0:1/1 100% 1:1/1 100% [/10.32.0.28]1:1/1 100% 0:1/1 100% [/10.32.3.220]0:1/1 100% 1:1/1 100% total: 100% 0.000KiB/s (avg: 0.074KiB/s)
INFO  12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed
INFO  12:02:41 [Stream #e013ee80-5863-11e9-8458-353e9e3cb7f9] All sessions completed
INFO: ./apigee/data/cassandra/data/ks1/user-9fbae960571411e99652c7b15b2db6cc restored successfully
INFO: Restore 20190405011309 completed
INFO: ./apigee/data/cassandra/data/ks1/user-9fbae960571411e99652c7b15b2db6cc restored successfully
INFO: Restore 20190405011309 completed
waiting on waiting nodes $pid to finish  106
Restore finished
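
The earlier advice to grep the restore log for error can be wrapped in a small helper so that a script can act on the result. This is a minimal sketch with a hypothetical function name. It does a plain case-insensitive text match, so it will also flag harmless lines that merely contain the word, and it will not flag the deprecation warnings shown in the sample log.

```shell
#!/usr/bin/env bash
# Reads log text on stdin and prints every line that contains "error"
# (case-insensitive), prefixed with its line number.
# Returns 0 when no such line exists and 1 otherwise.
report_log_errors() {
  if grep -n -i 'error'; then
    return 1
  fi
  return 0
}
```

For example, kubectl logs apigee-cassandra-restore-b4lgf | report_log_errors prints nothing and succeeds for the sample log above. Omit -f here so that the command ends when the log has been read.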

Verify backup job

You can also verify your backup job after your backup cronjob is scheduled. After the cronjob has been scheduled, you should see something like this:

kubectl get pods
NAME                        READY     STATUS      RESTARTS   AGE
apigee-cassandra-0          1/1       Running     0          2h
apigee-cassandra-1          1/1       Running     0          2h
apigee-cassandra-2          1/1       Running     0          2h
apigee-cassandra-backup-1554515580-pff6s   0/1       Running     0          54s

Check the backup logs

The backup job:

  • Creates a schema.cql file.
  • Uploads it to your storage bucket.
  • Echoes the node to back up the data and uploads it at the same time.
  • Waits until all of the data is uploaded.

kubectl logs -f apigee-cassandra-backup-1554515580-pff6s
myusername-macbookpro:cassandra-backup-utility myusername$ kubectl logs -f apigee-cassandra-backup-1554577680-f9sc4
starting apigee-cassandra-0 in default
starting apigee-cassandra-1 in default
starting apigee-cassandra-2 in default
35 46 57
waiting on process  35
Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
Requested creating snapshot(s) for [all keyspaces] with snapshot name [20190406190808] and options {skipFlush=false}
Snapshot directory: 20190406190808
INFO: backup created cassandra snapshot 20190406190808
tar: Removing leading `/' from member names
/apigee/data/cassandra/data/ks1/mytest3-37bc2df0587811e98e8d875b0ed64754/snapshots/
/apigee/data/cassandra/data/ks1/mytest3-37bc2df0587811e98e8d875b0ed64754/snapshots/20190406190808/
/apigee/data/cassandra/data/ks1/mytest3-37bc2df0587811e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Data.db
Requested creating snapshot(s) for [all keyspaces] with snapshot name [20190406190808] and options {skipFlush=false}
Requested creating snapshot(s) for [all keyspaces] with snapshot name [20190406190808] and options {skipFlush=false}
Snapshot directory: 20190406190808
INFO: backup created cassandra snapshot 20190406190808
tar: Removing leading `/' from member names
/apigee/data/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/snapshots/
/apigee/data/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/snapshots/20190406190808/
/apigee/data/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/snapshots/20190406190808/manifest.json
/apigee/data/cassandra/data/system/prepared_statements-18a9c2576a0c3841ba718cd529849fef/snapshots/
/apigee/data/cassandra/data/system/prepared_statements-18a9c2576a0c3841ba718cd529849fef/snapshots/20190406190808/
/apigee/data/cassandra/data/system/prepared_statements-18a9c2576a0c3841ba718cd529849fef/snapshots/20190406190808/manifest.json
/apigee/data/cassandra/data/system/range_xfers-55d764384e553f8b9f6e676d4af3976d/snapshots/
/apigee/data/cassandra/data/system/range_xfers-55d764384e553f8b9f6e676d4af3976d/snapshots/20190406190808/
/apigee/data/cassandra/data/system/range_xfers-55d764384e553f8b9f6e676d4af3976d/snapshots/20190406190808/manifest.json
/apigee/data/cassandra/data/system/peer_events-59dfeaea8db2334191ef109974d81484/snapshots/
/apigee/data/cassandra/data/system/peer_events-59dfeaea8db2334191ef109974d81484/snapshots/20190406190808/
/apigee/data/cassandra/data/system/peer_events-59dfeaea8db2334191ef109974d81484/snapshots/20190406190808/manifest.json
/apigee/data/cassandra/data/system/built_views-4b3c50a9ea873d7691016dbc9c38494a/snapshots/
/apigee/data/cassandra/data/system/built_views-4b3c50a9ea873d7691016dbc9c38494a/snapshots/20190406190808/
/apigee/data/cassandra/data/system/built_views-4b3c50a9ea873d7691016dbc9c38494a/snapshots/20190406190808/manifest.json
……
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Filter.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-CompressionInfo.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Index.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Statistics.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Data.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Index.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Statistics.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-TOC.txt
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Statistics.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Summary.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Filter.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Summary.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Index.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/manifest.json
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Filter.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-2-big-Digest.crc32
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Summary.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Data.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-TOC.txt
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/schema.cql
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-CompressionInfo.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Digest.crc32
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-TOC.txt
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-Data.db
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-3-big-Digest.crc32
/apigee/data/cassandra/data/ks2/user-d6d39d70586311e98e8d875b0ed64754/snapshots/20190406190808/mc-1-big-CompressionInfo.db
……
/tmp/tokens.txt
/ [1 files][    0.0 B/    0.0 B]
Operation completed over 1 objects.
/ [1 files][    0.0 B/    0.0 B]
Operation completed over 1 objects.
INFO: backup created tarball and transfered the file to gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
INFO: removing cassandra snapshot
INFO: backup created tarball and transfered the file to gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
INFO: removing cassandra snapshot
Requested clearing snapshot(s) for [all keyspaces]
INFO: Backup 20190406190808 completed
waiting on process  46
Requested clearing snapshot(s) for [all keyspaces]
INFO: Backup 20190406190808 completed
Requested clearing snapshot(s) for [all keyspaces]
waiting on process  57
INFO: Backup 20190406190808 completed
waiting result
to get schema from 10.32.0.28
INFO: /tmp/schema.cql has been generated
Activated service account credentials for: [apigee-cassandra-backup-svc@gce-myusername.iam.gserviceaccount.com]
tar: removing leading '/' from member names
tmp/schema.cql
Copying from...
/ [1 files][    0.0 B/    0.0 B]
Operation completed over 1 objects.
INFO: backup created tarball and transfered the file to gs://gce-myusername-apigee-cassandra-backup/apigeecluster/dc-1
finished uploading schema.cql
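
In the sample log above, the line "INFO: Backup <timestamp> completed" appears three times, which matches the three Cassandra pods. Assuming that the job logs one such line per pod (an inference from this sample, not a documented guarantee), a quick check is to count those lines and compare the result with your pod count. The function name below is hypothetical.

```shell
#!/usr/bin/env bash
# Reads a backup job log on stdin and prints the number of lines of the
# form "INFO: Backup <digits> completed". Prints 0 when there are none.
count_backup_completions() {
  # grep -c exits non-zero when the count is 0, so mask that exit status.
  grep -c -E '^INFO: Backup [0-9]+ completed$' || true
}
```

For example, kubectl logs apigee-cassandra-backup-1554577680-f9sc4 | count_backup_completions would print 3 for the sample log. A lower number than the pod count suggests that at least one node did not finish its backup.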

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.