How to set up automatic SQL dumps from an EC2 hosted DB to S3 bucket

This guide covers setting up automated, regular SQL dumps from a database hosted on EC2 to an S3 bucket.

The AWS services used for this are:

  • Systems Manager Run Command to start the backup

  • S3 to store the backups

  • A Lambda function that issues the Run Command operation

  • EventBridge to schedule the Lambda execution

Prerequisites:

- AWS Session Manager (within Systems Manager)

Advantages

  • You don't need a separate backup machine, and you don't have to manage the backup job on the DB host EC2 itself.

  • No inbound ports need to be opened (only outbound port 443 to the Systems Manager endpoint is required)

  • You can add logging and alerts in CloudWatch to complement your existing logging

  • All backup logic stays in one central place, within Lambda and EventBridge

  • Data flows straight from the database to S3 (no intermediary)

Disadvantages

  • Takes a little while to set up

  • Costs a tiny bit of money
    No charge for => Run Command, Session Manager, EventBridge (free tier), Lambda (free tier)
    Small charge for => S3, depending on DB size

Set up

Session manager

Make sure you have Session Manager enabled and working with the EC2 instance that hosts the database.
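If you'd rather confirm this from code than from the console, a minimal boto3 sketch like the one below checks that the instance shows up as a managed instance in SSM (the instance ID is the same placeholder used later in this guide):

import boto3

# Minimal sketch: verify the EC2 instance is registered with Systems Manager
# and reachable by the SSM agent. The instance ID is a placeholder.
ssm = boto3.client('ssm')

info = ssm.describe_instance_information(
    Filters=[{'Key': 'InstanceIds', 'Values': ['i-123456789']}]
)

for instance in info['InstanceInformationList']:
    # PingStatus should be "Online" for Run Command to work
    print(instance['InstanceId'], instance['PingStatus'], instance.get('PlatformName', ''))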

Lambda setup

Create a Lambda function with the following Python code:

import boto3
from botocore.exceptions import ClientError

def lambda_handler(event, context):
    ssm = boto3.client('ssm')
    instance_id = 'i-123456789'
    database_user = 'db_user'
    database_password = 'db_password'
    database_name = 'db_name'
    bucket_name = 'db-backups-bucket'
    backup_file = 'backup_filename'

    # The command to back up the database and stream the dump straight to S3
    backup_command = (
        f"mysqldump -h 127.0.0.1 -u {database_user} -p{database_password} {database_name} "
        f"| aws s3 cp - s3://{bucket_name}/{backup_file}_$(date +%Y%m%d).sql"
    )

    try:
        response = ssm.send_command(
            InstanceIds=[
                instance_id,
            ],
            DocumentName='AWS-RunShellScript',  # runs commands in the EC2 instance
            Parameters={
                'commands': [backup_command]
            },
        )
    except ClientError as e:
        print(f"Unexpected error when sending command to EC2 instance: {e}")
        return {
            'statusCode': 500,
            'body': f"Failed to send command to instance {instance_id}: {e}"
        }

    print(f"Command sent to instance {instance_id}. Response: {response}")
    return {
        'statusCode': 200,
        'body': f"Command sent to instance {instance_id}. Command ID: {response['Command']['CommandId']}"
    }

Creating the Lambda also creates an IAM role for it. Add this extra permission to that role (beyond the log-creation policy that was attached automatically on creation):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ssm:SendCommand",
            "Resource": [
                "arn:aws:ec2:region:account:instance/i-123456789",
                "arn:aws:ssm:region::document/AWS-RunShellScript"
            ]
        }
    ]
}

This gives your Lambda function permission to run a shell script on that particular EC2 instance.

💡
Lambda will ask you about logging. If you use CloudWatch for logs, turn the log retention down to your liking, as the default is "never expire".
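If you prefer setting the retention from code rather than the console, a minimal sketch could look like this (the log group name assumes the default /aws/lambda/<function-name> naming, with a placeholder function name):

import boto3

# Minimal sketch: cap log retention for the Lambda's CloudWatch log group.
# The log group name is a placeholder; adjust it to your function name.
logs = boto3.client('logs')

logs.put_retention_policy(
    logGroupName='/aws/lambda/db-backup-function',  # placeholder name
    retentionInDays=30  # pick whatever retention suits you
)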

You can test this code by clicking "Test" and running it with no test input.

If you see an error => it's probably a permission issue.

If you see no error => your shell script ran, but the script itself may still have failed.

You can look at the execution logs in Run Command within Systems Manager. Clicking on the Failed item shows some details about the error.
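You can also check the result programmatically by feeding the command ID returned by send_command into get_command_invocation. A minimal sketch, with both IDs as placeholders:

import boto3

# Minimal sketch: inspect the result of a Run Command invocation.
# Both IDs below are placeholders taken from the send_command response.
ssm = boto3.client('ssm')

result = ssm.get_command_invocation(
    CommandId='11111111-2222-3333-4444-555555555555',
    InstanceId='i-123456789'
)

print(result['Status'])                # e.g. "Success" or "Failed"
print(result['StandardErrorContent'])  # stderr from the shell script, handy for debugging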

As the setup isn't complete at this point, you'll probably see an error about not having access to S3. Let's fix that next!

S3 setup

Make sure you have a bucket you want to place the backups in. Please don't make this bucket public 🙂

Add an IAM role to the EC2 instance to allow uploads to S3 (this way Lambda doesn't have to handle large amounts of data or wait for the execution to finish).

This policy worked for me for accessing the bucket. It omits delete permissions for safety:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets"
            ],
            "Resource": "arn:aws:s3:::*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::db-backups-bucket"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts",
                "s3:ListBucketMultipartUploads"
            ],
            "Resource": [
                "arn:aws:s3:::db-backups-bucket",
                "arn:aws:s3:::db-backups-bucket/*"
            ]
        }
    ]
}

Once this is done, re-test your Lambda and it should work.

You can verify that the SQL backup file is in S3.
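You can do that in the console or, if you prefer, with a small boto3 sketch like this one (the bucket name and prefix match the placeholders used in the Lambda code):

import boto3

# Minimal sketch: list the backup objects to confirm the dump landed in S3.
# Bucket name and key prefix match the placeholders used earlier.
s3 = boto3.client('s3')

objects = s3.list_objects_v2(Bucket='db-backups-bucket', Prefix='backup_filename')

for obj in objects.get('Contents', []):
    print(obj['Key'], obj['Size'], obj['LastModified'])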

Eventbridge

Go to EventBridge Scheduler (it's a relatively new feature).

Create a daily schedule to run a Lambda function:

EventBridge will then create an IAM role for itself with permission to invoke the Lambda function (or you can use your own).
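If you'd rather define the schedule from code than through the console, a rough equivalent with the EventBridge Scheduler API might look like the sketch below (the schedule name, Lambda ARN, and role ARN are placeholders for your own):

import boto3

# Minimal sketch: a daily schedule that invokes the backup Lambda.
# Both ARNs are placeholders; the role must be allowed to invoke the function.
scheduler = boto3.client('scheduler')

scheduler.create_schedule(
    Name='daily-db-backup',
    ScheduleExpression='cron(0 3 * * ? *)',   # every day at 03:00 UTC
    FlexibleTimeWindow={'Mode': 'OFF'},
    Target={
        'Arn': 'arn:aws:lambda:region:account:function:db-backup-function',
        'RoleArn': 'arn:aws:iam::account:role/scheduler-invoke-backup-lambda',
    },
)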

Once this is done, you have a functioning daily backup.

Some things to improve

  • As S3 currently keeps every backup forever, you can expire older backups regularly with S3 lifecycle rules (see the sketch after this list).

  • Create an alarm via CloudWatch to notify you of any issues with the backup

  • Regularly check the integrity of the backups
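For the lifecycle point above, a minimal sketch of an expiration rule (the 30-day window is just an example; adjust it to how long you want to keep backups):

import boto3

# Minimal sketch: expire backup objects after 30 days so the bucket
# doesn't grow forever. The bucket name matches the placeholder used earlier.
s3 = boto3.client('s3')

s3.put_bucket_lifecycle_configuration(
    Bucket='db-backups-bucket',
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'expire-old-backups',
                'Status': 'Enabled',
                'Filter': {'Prefix': ''},    # apply to every object in the bucket
                'Expiration': {'Days': 30},  # example retention period
            }
        ]
    },
)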