Restoring GCP Cloud SQL Instance using Backup of a different Instance in a different region

Harinderjit Singh · ITNEXT · May 24, 2023

Purpose

This augments my previous post about testing business continuity of a sample application using GKE and GCP Cloud SQL. The purpose of this post is to share the caveats related to restoring backups when the target instance is different and is in a different region. We will test and document the restoration process for a Cloud SQL instance and its databases in a different target region (the DR region).

Note: This is the current state as of May 2023. The intention of this post is to share the caveats related to Cloud SQL backups and exports. I hope this helps users choose the appropriate RDBMS PaaS product for their enterprise needs.

We will create the following infrastructure using gcloud commands:

  1. Three KMS key rings: one in region us-central1, one in region us-east4, and one multi-region.
  2. One key in each key ring.
  3. A service identity for Cloud SQL.
  4. A Cloud SQL instance named server10 in region us-central1, running Cloud SQL for SQL Server 2019 Enterprise, with CMEK disk encryption.
  5. A database named “thordb” in instance server10.
  6. A Cloud SQL instance named replica1 in region us-east4, a read replica of server10, with CMEK disk encryption.
  7. A multi-region storage bucket.

We will conduct the following tests:

  1. Restore a Cloud SQL instance in region us-east4 from a snapshot backup of a Cloud SQL instance in region us-central1, to test instance-level restore.
  2. Restore a database in an instance in region us-east4 from an export backup of a Cloud SQL database in region us-central1, to test database-level restore.

We will use CMEK disk encryption for the instances in both regions. We will also use CMEK encryption for the multi-region bucket where the export files (backup sets) will be stored.

NOTE: Point-in-time recovery is not covered in this post.

Create and configure KMS keys

Currently, you can only use gcloud CLI commands to create the type of service account you need for customer-managed encryption keys.

gcloud auth login
PROJECT=<your project-id>
gcloud config set project $PROJECT

PROJ_NUM=`gcloud projects describe $PROJECT --format json | jq -r .projectNumber`

gcloud beta services identity create \
--service=sqladmin.googleapis.com \
--project=$PROJECT
Service identity created: service-302619737320@gcp-sa-cloud-sql.iam.gserviceaccount.com

Create a new key ring, and a key on that key ring, in each of the two regions.

gcloud services enable cloudkms.googleapis.com && \
gcloud kms keyrings create kmskeyring1 --location=us-central1 && \
gcloud kms keys create kmskey1 --location=us-central1 --keyring=kmskeyring1 --purpose=encryption

Note: The Cloud KMS key ring location must match the region where you want to create a Cloud SQL instance. A multi-region or global key will not work; the request to create a Cloud SQL instance fails if the regions don’t match. If no CMEK is supplied, a Google-managed encryption key is used, and that key is not exposed to the customer.

gcloud kms keyrings create kmskeyring2 --location=us-east4 && \
gcloud kms keys create kmskey2 --location=us-east4 --keyring=kmskeyring2 --purpose=encryption

Create the key ring and the key for the bucket as well. Why a different key ring and key for the bucket? Because Cloud SQL does not accept a multi-region key, while the bucket storing the exports needs one.

The next question might be: why does the storage bucket for database exports need a multi-region encryption key? The bucket must be multi-region because we have a read replica in a different region. In the event of a disaster, when the primary region is unavailable, the replica is promoted; if any database then needs to be restored from an older export backup taken on the primary site, the export bucket must still be accessible. And a multi-region bucket needs a multi-region encryption key.

gcloud kms keyrings create kmskeyringstorage --location=us && \
gcloud kms keys create kmskey3 --location=us --keyring=kmskeyringstorage --purpose=encryption

Grant the Cloud SQL service account (service identity/service agent) access to the keys. You only need to perform this procedure if you are using gcloud or the API. We also need to grant the service agent used by Cloud Storage access to the multi-region key, at the project level.

# granting access of Keys to service identity of the Cloud SQL at project Level

gcloud kms keys add-iam-policy-binding kmskey1 \
--location=us-central1 \
--keyring=kmskeyring1 \
--member=serviceAccount:service-$PROJ_NUM@gcp-sa-cloud-sql.iam.gserviceaccount.com \
--role=roles/cloudkms.cryptoKeyEncrypterDecrypter
gcloud kms keys add-iam-policy-binding kmskey2 \
--location=us-east4 \
--keyring=kmskeyring2 \
--member=serviceAccount:service-$PROJ_NUM@gcp-sa-cloud-sql.iam.gserviceaccount.com \
--role=roles/cloudkms.cryptoKeyEncrypterDecrypter
gcloud kms keys add-iam-policy-binding kmskey3 \
--location=us \
--keyring=kmskeyringstorage \
--member=serviceAccount:service-$PROJ_NUM@gcp-sa-cloud-sql.iam.gserviceaccount.com \
--role=roles/cloudkms.cryptoKeyEncrypterDecrypter

# granting access of Key to service account used by Storage Buckets at project Level
gcloud kms keys add-iam-policy-binding kmskey3 \
--location=us \
--keyring=kmskeyringstorage \
--member=serviceAccount:service-$PROJ_NUM@gs-project-accounts.iam.gserviceaccount.com \
--role=roles/cloudkms.cryptoKeyEncrypterDecrypter

Create and configure Cloud SQL Instance and Replica

Enable the sqladmin API and generate a password for the root account (here using pass, the standard Unix password manager; any password generator will do).

pass generate -n temp "20"
PASSWORD=`pass temp`
gcloud services enable sqladmin.googleapis.com

Create the primary Cloud SQL instance, named server10, in us-central1.

gcloud sql instances create server10 --project=$PROJECT \
--disk-encryption-key=projects/$PROJECT/locations/us-central1/keyRings/kmskeyring1/cryptoKeys/kmskey1 \
--database-version=SQLSERVER_2019_ENTERPRISE --region=us-central1 --root-password=$PASSWORD --cpu=2 --memory=4GB

Create a database named “thordb” in the Cloud SQL instance.

gcloud sql databases create thordb --instance=server10

Create a read replica named replica1 in region us-east4.

gcloud sql instances create replica1 \
--master-instance-name=server10 \
--region us-east4 --tier="db-custom-2-6144" \
--disk-encryption-key=projects/$PROJECT/locations/us-east4/keyRings/kmskeyring2/cryptoKeys/kmskey2

Execute Backup and Exports

Snapshot Backup

Take a backup on the instance server10. On-demand backups are, by default, available in multiple regions. Snapshot backups include both databases and logins.

gcloud sql backups create --async --instance=server10

gcloud sql backups list --instance=server10

Your backup will be encrypted with this instance’s customer-managed encryption key. If anyone destroys this key, all data encrypted with it will be permanently lost.
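
To confirm which CMEK protects the instance (and therefore its snapshot backups), you can inspect the instance’s disk encryption configuration. A quick check; the field path here is my assumption based on the Cloud SQL Admin API resource, so adjust it if your gcloud version exposes it differently:

# Show the CMEK protecting server10; its backups are encrypted with the same key
gcloud sql instances describe server10 \
--format="value(diskEncryptionConfiguration.kmsKeyName)"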

Database Exports

Create the bucket where we will store the database exports.

BUCKET_NAME=exportsbucket524203

gcloud storage buckets create gs://$BUCKET_NAME \
--project $PROJECT --location US \
--uniform-bucket-level-access \
--default-encryption-key projects/$PROJECT/locations/us/keyRings/kmskeyringstorage/cryptoKeys/kmskey3 \
--public-access-prevention

Grant the appropriate permissions on the storage bucket to the Cloud SQL Instances.

  • The server10 instance requires the role storage.objectCreator because it needs to write to the storage bucket.
  • The replica1 instance requires the role storage.objectViewer because it needs to read from the storage bucket.

PSQLSA=`gcloud sql instances describe server10 --format json | jq -r .serviceAccountEmailAddress`
RSQLSA=`gcloud sql instances describe replica1 --format json | jq -r .serviceAccountEmailAddress`

gcloud storage buckets add-iam-policy-binding gs://$BUCKET_NAME \
--member=serviceAccount:$PSQLSA \
--role=roles/storage.objectCreator

gcloud storage buckets add-iam-policy-binding gs://$BUCKET_NAME \
--member=serviceAccount:$RSQLSA \
--role=roles/storage.objectViewer

Execute an export backup of the database named “thordb”. The export file (backup set) will be encrypted with the key configured when the storage bucket was created. An export backs up only a specific database within the Cloud SQL instance.

gcloud sql export bak server10 gs://$BUCKET_NAME/thordb.bak --database=thordb

Restoring Instance from Snapshot backup

Since the target instance has to be running, we promote the replica to be the new primary.

gcloud sql instances promote-replica replica1

Before restoring, we verify the existing backups:

sinh@sinh01:~$ gcloud sql backups list --instance=server10
ID WINDOW_START_TIME ERROR STATUS INSTANCE
1684949461280 2023-05-24T17:31:01.280+00:00 - SUCCESSFUL server10
1684948765170 2023-05-24T17:19:25.170+00:00 - SUCCESSFUL server10
sinh@sinh01:~$

Perform the restore on instance replica1 using the backup taken on instance server10.

gcloud sql backups \
restore $(gcloud sql backups list --instance=server10 --format json | jq -r .[0].id) \
--restore-instance=replica1 \
--backup-instance=server10

Restoring Cloud SQL instance...done.
Restored [https://sqladmin.googleapis.com/sql/v1beta4/projects/harin-sandbox/instances/replica1].
Restore successful

If the primary instance where the backup was taken is offline (down), would the restore work?

I tried restoring both while the primary was running and while it was down; in both scenarios it worked fine.
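
If you want to reproduce the “primary down” case yourself, one option (an approach I am suggesting here, not the only one) is to stop the primary via its activation policy before running the restore:

# Stop the primary to simulate it being offline
gcloud sql instances patch server10 --activation-policy=NEVER

# ...run the restore on replica1 as shown above, then bring the primary back
gcloud sql instances patch server10 --activation-policy=ALWAYS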

What if the region of the Cloud SQL instance server10 were unavailable, would the restore still work?

If the region itself is down, the Cloud SQL instance is down (which, as we saw above, doesn’t matter on its own), but the regional key used for CMEK encryption is also unavailable. So even though the backup is accessible from the other region (us-east4, where replica1 lives), it cannot be used for a restore, because the key is regional and its region is unavailable.

This is the same scenario as a real disaster: the primary region goes down and you fail over to your DR region. It means you cannot restore from the snapshot backups of the Cloud SQL instance in your original primary region once that region goes down.
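
For reference, you can confirm that the key lives in a single region with a quick check of the key created earlier:

# kmskey1 exists only in us-central1, so it is unreachable if that region is down
gcloud kms keys describe kmskey1 --keyring=kmskeyring1 --location=us-central1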

If the source Cloud SQL instance is deleted, can we restore the backups related to that instance, in the same region or in a different region?

If the source instance is deleted, you can’t restore its backups. All data on an instance, including backups, is permanently lost when that instance is deleted. To preserve your data, export it to Cloud Storage before you delete the instance. This is a major drawback when we switch over to the DR region and want to keep the same instance names in both regions: to fail back with the original names you have to recreate the read replica and promote it, and in that process you lose the older snapshot backups. The only way around this is to use database exports, but those don’t have point-in-time recovery capabilities.
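
So, if you do plan to delete an instance (for example, to recreate a replica under the original name during failback), take an export first. A minimal sketch, reusing the bucket from this post and a hypothetical object name:

# Preserve the database before the instance (and its snapshot backups) disappears
gcloud sql export bak server10 gs://$BUCKET_NAME/thordb-pre-delete.bak --database=thordb

# Only then delete the instance
gcloud sql instances delete server10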

Tips about performing a restore

When you restore an instance from a backup, whether to the same instance or to a different instance, consider the following items:

  • The restore operation overwrites all data and users/logins on the target instance.
  • The target instance is unavailable for connections during the restore operation and existing connections are lost.
  • If you are restoring to an instance with read replicas, you must delete all replicas and recreate them after the restore operation completes (see the sketch after this list).
  • The restore operation restarts the instance.
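
For the read-replica point above, the sequence is roughly the following (a sketch with placeholder instance names, not commands from this walkthrough; reuse the tier, region, and disk encryption flags you originally created the replica with):

# Drop the replica before restoring onto its primary
gcloud sql instances delete my-replica

# ...restore the backup onto my-primary, then recreate the replica
gcloud sql instances create my-replica --master-instance-name=my-primary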

Tips for restoring to a different instance

When you are restoring a backup to a different instance, keep in mind the following restrictions and best practices:

  • The target instance must exist within the same project.
  • The target instance must have the same database version and edition as the instance from which the backup was taken.
  • Cloud SQL sets the storage capacity of the target instance to the larger of its configured disk size and the backup’s disk size (the size of the source disk at the time the backup was taken).
  • The storage capacity of the target instance must be at least as large as the capacity of the instance being backed up. This requirement applies whether or not you are doing a point-in-time recovery of a single database.
  • The target instance must be in the RUNNABLE state.
  • The target instance can be in a different region from the source instance.
  • During an outage, you can still retrieve a list of backups in a particular project.

gcloud sql backups list --instance -
ID WINDOW_START_TIME ERROR STATUS INSTANCE
1684955917401 2023-05-24T19:18:37.401+00:00 - SUCCESSFUL replica1
1684949461280 2023-05-24T17:31:01.280+00:00 - SUCCESSFUL server10
1684948765170 2023-05-24T17:19:25.170+00:00 - SUCCESSFUL server10

Restoring the Database by Importing BAK export

Check the available exports before performing the import:

gsutil ls -l gs://$BUCKET_NAME/

503808 2023-05-24T17:36:35Z gs://exportsbucket524203/thordb.bak
TOTAL: 1 objects, 503808 bytes (492 KiB)

The database name should be different, or you first have to delete the existing database of the same name before you import. Here we import into a new database named thordb1; a sketch for reusing the original name follows the import output.

gcloud sql import bak replica1 gs://$BUCKET_NAME/thordb.bak --database=thordb1

Importing data into Cloud SQL instance...done.
Imported data from [gs://exportsbucket524203/thordb.bak] into [https://sqladmin.googleapis.com/sql/v1beta4/projects/harin-sandbox/instances/replica1].
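
If you instead want to reuse the original name, a minimal sketch (note that this drops the copy of thordb that already exists on replica1):

# Drop the existing database so the import can recreate it under the same name
gcloud sql databases delete thordb --instance=replica1
gcloud sql import bak replica1 gs://$BUCKET_NAME/thordb.bak --database=thordb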

What if the region of the Cloud SQL instance server10 were unavailable, would the import still work?

Imports are completely independent of the source Cloud SQL instance. To execute an import from a backup set, you need to grant the Cloud SQL service identity access to the bucket’s encryption key, and grant the target instance’s service account objectViewer access to the bucket (both were done earlier). Since the bucket and its encryption key are both multi-region, unavailability of the source region doesn’t matter in this case.

Things to consider when performing exports and imports

  • Before starting a large export, ensure that at least 25 percent of the database size is free (on the instance). Doing so helps prevent issues with aggressive autogrowth, which can affect the availability of the instance.
  • Exports use database resources, but exports do not interfere with normal database operations unless the instance is under-provisioned.
  • The target Cloud SQL instance may be in a different region, and even in a different project.
  • You will need to map your database users to instance logins again.
  • The database name should be different, or you first have to delete the existing database of the same name before you execute the import.
  • You cannot import a database that was exported from a higher version of SQL Server or import from a higher compatibility level into a lower one. For example, if you exported a SQL Server 2017 version, you cannot import it into a SQL Server 2014 version.
  • If your instance version is a Microsoft SQL Server Enterprise Edition, you can import encrypted BAK files.
  • Cloud SQL only supports importing a full backup with a single backup set.
  • The larger the database, the longer a complete database export takes, which affects the achievable RPO.

Please read my other articles as well and share your feedback. If you like the content, please like, comment, and subscribe for new articles.
