How to do cross-cloud backup replication

Introduction

A good practice for backups is to follow the 3-2-1 rule. That means having:

  • 3 copies of the same data

  • 2 of them on separate media

  • 1 of them off-site

Translating this to the cloud, we can have:

  • 3 copies of the same data

  • 2 of them on separate clouds

  • 1 of them offline

In this guide we'll deal with the "separate cloud" approach.

Options you may have

There are various options out there, and they generally fall into these categories:

  1. Self-built scripts

  2. Native Cloud products (AWS, GCP, Azure, etc.)

  3. Off-the-shelf 3rd party products

While any of the above could solve your problem, this guide will focus on #2, specifically AWS => GCP.

Initial setup

You'll need the following:

  • An S3 bucket on AWS with something to back up

  • A GCP bucket you'll clone the files to

The tools we'll use:

  • AWS S3

  • AWS IAM

  • GCP Storage Transfer Service

  • GCP Cloud Storage

  • GCP Pub/Sub

  • GCP Cloud Functions

  • (GCP Eventarc)

How will it work?

  1. Item is uploaded to AWS S3

  2. GCP Storage Transfer Service fires weekly/daily/etc. to copy over files

  3. On success/failure, the Storage Transfer Service publishes a message on GCP Pub/Sub

  4. On a Pub/Sub message, Eventarc is triggered, which calls GCP Cloud Functions

  5. The GCP Cloud Function takes the event, extracts the status, and sends it to Slack

Step-by-step tutorial on how to do the above

Create a GCP bucket

This is a fairly simple task: pick a name, storage class, region, etc. for your bucket.

Make sure the bucket isn't public, and its name is unique.
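
If you'd rather script this step, here's a minimal sketch using the @google-cloud/storage Node.js client; the bucket name, location, and storage class below are placeholders:

const {Storage} = require('@google-cloud/storage');

const storage = new Storage();

async function createBackupBucket() {
  // Placeholder name/region/class: pick your own. Buckets are private by default,
  // so just avoid granting allUsers/allAuthenticatedUsers access.
  const [bucket] = await storage.createBucket('your-db-backups-replica', {
    location: 'EU',
    storageClass: 'NEARLINE',
  });
  console.log(`Created bucket ${bucket.name}`);
}

createBackupBucket().catch(console.error);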

Create an IAM user and access key in AWS

This will be used by the GCP Storage Transfer Service to read from your S3 bucket.

Here’s an example IAM policy for this user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::your-db-backups",
                "arn:aws:s3:::your-db-backups/*"
            ]
        }
    ]
}

Then create an Access Key and Secret Key for this user and take (temporary) note of them.
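
If you want to script this as well, a rough sketch with the AWS SDK for JavaScript (v3) could look like the following; the user name is hypothetical, and the policy above is assumed to already be attached to that user:

const {IAMClient, CreateAccessKeyCommand} = require('@aws-sdk/client-iam');

const iam = new IAMClient({});

async function createKeyForTransferUser() {
  // 'gcp-storage-transfer' is a made-up name: use the IAM user you created above.
  const {AccessKey} = await iam.send(
    new CreateAccessKeyCommand({UserName: 'gcp-storage-transfer'})
  );
  // The secret is only returned once, so store it somewhere safe (temporarily).
  console.log('AccessKeyId:', AccessKey.AccessKeyId);
  console.log('SecretAccessKey:', AccessKey.SecretAccessKey);
}

createKeyForTransferUser().catch(console.error);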

That's all we need from AWS for this tutorial. On to GCP!

Create a Storage Transfer Service job

  1. Choose source (S3) and destination (GCP)

  2. Set up the source details with the Access key you generated in the previous step

  3. Choose the destination GCP bucket (you may need to grant permissions to the transfer service account here)

  4. Choose how often to run the job (or just run it once)

  5. Choose further misc. options (like deletion, overwrite, etc.) and make sure "Get transfer operation status updates via Cloud Pub/Sub notifications" is checked.
    Create a new Pub/Sub topic here and select it.

  6. Done. At this point you can already test the job. (A scripted way to create the same job is sketched below.)
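
If you prefer creating the job from code rather than clicking through the console, here's a rough sketch using the @google-cloud/storage-transfer Node.js client. The project ID, bucket names, topic, schedule, and credentials are all placeholders, and the field names are worth double-checking against the client's docs:

const {StorageTransferServiceClient} = require('@google-cloud/storage-transfer');

const client = new StorageTransferServiceClient();

async function createReplicationJob() {
  const [transferJob] = await client.createTransferJob({
    transferJob: {
      projectId: 'your-gcp-project',                  // placeholder
      description: 'Weekly S3 -> GCS backup replication',
      status: 'ENABLED',
      transferSpec: {
        awsS3DataSource: {
          bucketName: 'your-db-backups',              // the S3 bucket from earlier
          awsAccessKey: {
            accessKeyId: 'AKIA...',                   // the IAM key from the previous step
            secretAccessKey: 'your-secret-access-key',
          },
        },
        gcsDataSink: {bucketName: 'your-db-backups-replica'},
      },
      schedule: {
        scheduleStartDate: {year: 2024, month: 1, day: 1},
        repeatInterval: {seconds: 7 * 24 * 60 * 60},  // roughly weekly
      },
      notificationConfig: {
        pubsubTopic: 'projects/your-gcp-project/topics/transfer-notifications',
        eventTypes: ['TRANSFER_OPERATION_SUCCESS', 'TRANSFER_OPERATION_FAILED'],
        payloadFormat: 'JSON',
      },
    },
  });
  console.log(`Created transfer job ${transferJob.name}`);
}

createReplicationJob().catch(console.error);

Either way, make sure the notification topic configured here is the same one your Cloud Function will listen to in the next step.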

Create a Cloud function

  1. Go to the Pub/Sub topic that was created when you set up the Storage Transfer job

  2. Click on “Trigger Cloud Function”, which will prompt you to create a new function there

  3. Add anything there (for now). This will create a basic function along with an Eventarc trigger, and hook them up together

  4. Go to your function and write the code to call Slack. Here’s the one I used:

const axios = require('axios');
const functions = require('@google-cloud/functions-framework');

// Slack webhook URL: filled in after creating the Slack workflow below.
const url = `to be filled in`;

functions.cloudEvent('notifySlack', async cloudEvent => {
  let payload;

  try {
    console.log('CloudEvent:', JSON.stringify(cloudEvent, null, 2));

    // The Storage Transfer Service notification arrives via Pub/Sub; the job
    // status and name are exposed as message attributes.
    payload = {
      status: cloudEvent?.data?.message?.attributes?.eventType,
      description: `JOB: ${cloudEvent?.data?.message?.attributes?.transferJobName}`
    };
  }
  catch (error) {
    payload = {
      status: "GCP db backup status",
      description: "An exception has happened in GCP Cloud Functions, backups have failed."
    };
  }

  try {
    const response = await axios.post(url, payload);
    console.log('Slack webhook responded with status', response.status);
  }
  catch (error) {
    console.error('Failed to call the Slack webhook:', error.message);
    throw error;
  }
});
  5. You can test it via Cloud Shell and see the console log output some text; an example of the incoming event is shown below.
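
For reference, the notification arrives as a Pub/Sub-backed CloudEvent, and the two attributes the function reads (eventType and transferJobName) sit under data.message.attributes. A trimmed, illustrative event might look roughly like this; the exact set of attributes and values can vary:

{
    "data": {
        "message": {
            "attributes": {
                "eventType": "TRANSFER_OPERATION_SUCCESS",
                "transferJobName": "transferJobs/1234567890"
            },
            "data": "<base64-encoded transfer operation payload>"
        }
    }
}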

Create your Slack Webhook

You can of course use any other notification method, or call an endpoint, but here we'll use Slack's built-in webhooks.

  1. Find the "Workflow builder" in Slack Automations and create a new "Workflow"

  2. As the trigger, create a webhook and take note of its URL

  3. As an action, send a message to your Slack channel

  4. Go back to your Cloud function and add the webhook URL

  5. Done!

At this point you should be able to test your implementation.
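
If you want to check the Slack side in isolation first, you can post the same payload shape straight to the webhook from Node; the URL and values below are placeholders:

const axios = require('axios');

// Placeholder: use the URL generated by your Slack workflow.
const url = 'https://hooks.slack.com/workflows/...';

axios
  .post(url, {
    status: 'TRANSFER_OPERATION_SUCCESS',
    description: 'JOB: transferJobs/1234567890',
  })
  .then(() => console.log('Webhook accepted the payload'))
  .catch(err => console.error('Webhook call failed:', err.message));

Note that the workflow's webhook trigger typically needs variables matching the payload keys (status and description here) if you want to reference them in the Slack message.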

A good way to test it without much side effect is to disable the access key you created in AWS IAM.
This will make the Storage Transfer job fail and send a failure message to Slack.