“I’d love to restore a database backup,” said no one ever. When you’re forced to do so, it means the production system you maintain is going belly up, and your company (or the company you work for) is probably losing an outrageous amount of money for every second of downtime. It can happen for a variety of reasons:

  • Someone running a DELETE query with a WHERE clause broader than intended, or worse, with no WHERE clause at all.
  • TRUNCATE or DROP queries executed on the wrong tables or databases.
  • A faulty disk that suddenly stops working.
  • A hacker decided to ruin your company and deleted everything, or even worse, you got ransomware that is now asking you to pay some Bitcoins to get your data back (spoiler: you won’t).

So, if you’ve worked in IT for more than a day, it should be clear that storing your database backups appropriately is extremely important. We live in the hope that we won’t need them, but when urgency calls, you’ll be glad you went the extra mile to protect your data and to have that backup ready to be restored.

In these situations you, the DevOps engineer, or SysAdmin, or Cloud expert, or even “the only developer in the company”, can either be the hero who saved the day or the one who didn’t care about taking backups.


If you are using Amazon RDS you are pretty much sorted, as RDS will automatically create and safely store backups for you every night.

But if you are not in the Cloud, or you run a database server on an EC2 instance to retain more control, or on a third-party VPS to save some cents, you may have to manage this yourself.

In this article, we will use Amazon S3 as a safe haven for one of our most precious assets.

Amazon S3 has many advanced features that make it a great place to save your backups. In this tutorial we are going to use:

  • The AWS CLI: a command line interface to the AWS API, to manage the bucket and to upload the backups;
  • Object Lifecycle Management: reduce costs by moving old backups to Infrequent Access Storage Class;
  • Server-Side Encryption: make sure that data is encrypted at rest;
  • MFA Delete: avoid accidental or malicious backup deletion with a Multi Factor Authentication token;
  • Logging: keep track of who uploads and downloads objects, and from where.

Having some experience with AWS, even a tiny amount, will make your life much easier. This guide, however, should be easy to follow even if it is your first time in the Cloud.

Create an AWS account

If you already have an AWS account, you can skip this section.

  • Open the AWS Homepage and click on “Create an AWS Account”
  • Insert your email address, select “I am a new user” and click on “Sign in using our secure server”:

At this point you will be asked to fill in some information about you or your company, and a telephone number that will be verified with a call.

Finally, they will ask for credit card details. But don’t worry: you don’t pay a penny just for having an AWS account. All AWS services are priced by the hour or by the GB of storage, and an empty AWS account is free. By the way, the first year of AWS comes with plenty of freebies to let you experiment.

Get API Credentials

If you do have some experience with AWS, you should already know how this works: avoid using Root Credentials and prefer an IAM user with restricted access. Keep the keys as secret as your bank account credentials.

For everyone else yet to be initiated into the Amazing world of Amazon Web Services, some more details:

When you create an AWS account you log in using what is called the “Root Account”. The Root Account is pretty much the Unix root user: it is almighty, powerful, and pays the bills. Its credentials should be kept with the utmost secrecy, as they have the power to start multiple $9,600/month instances on your card.

Alongside the login credentials you just created, you can also generate a pair of API credentials that applications and CLIs use to interact with AWS. Needless to say, API credentials for the Root Account are just as dangerous as the login ones, and you should avoid generating them entirely.

Countless botnets have been created using stolen AWS credentials leaked by someone naïve enough to push them to GitHub. Save yourself the pain and stay away from root user API credentials.

Instead learn to use AWS Identity and Access Management (IAM) and IAM Users.

You can think of IAM Users as subaccounts you would generally hand to employees in your company, with well-defined powers and capabilities, to let them do strictly what they need to do. By default a new IAM User has no permissions at all, and we use policies to declare what they should be allowed to do.

As the root user, we can generate login credentials and/or API credentials for an IAM User. In this case we will only need the latter. These keys can still do damage, but with a much more limited scope.

If your database server is running on an AWS EC2 instance, you may not need an IAM User at all. You can assign a credential-less IAM Role to the EC2 instance with the permissions needed to save backups to S3. This is by far the safest solution, as it doesn’t involve handling dangerous secrets. IAM Roles can be a bit hard to grasp; ask your AWS expert for more details.

Back to the original route. Open the IAM Console. On the left sidebar select “Users”, then “New User”.

Give the user a name (for example backups), and check the “Programmatic Access” checkbox. This will enable API Access for the user and generate its credentials. Proceed to the next page.

Here you have to choose permissions for the user. If you are an experienced AWS user, you may want to write your own policy to pick only the exact permissions needed for the job. In this tutorial we will use the AWS managed AmazonS3FullAccess policy, which grants S3 superpowers to the user.

Review the settings, and on the final page you will receive your secret credentials. Please, remember to keep them safe! The safety of your backups, and of your AWS account in general, depends on it.

First rule: do not push them to repositories, whether public or private.

From now on, I will assume the user executing the commands has all the right permissions. In a production environment, remember to follow the Principle of Least Privilege, which in this case means the user creating the bucket should not be the same one uploading the backups. To do that you’ll have to learn to write IAM Policies.
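As a starting point, here is a minimal sketch of such a policy: it allows the backups user to do nothing but upload objects into the backups/ prefix of a single bucket (no downloads, no deletions). The bucket name is a placeholder, and you may need to broaden the actions for your workflow:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowBackupUploadsOnly",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<bucketname>/backups/*"
        }
    ]
}
```

Attach it to the IAM user as an inline or customer-managed policy instead of AmazonS3FullAccess.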

Install the AWS CLI

The Command Line Interface is a great tool to experiment and to interface with the AWS Platform. Most of the steps listed in this tutorial could be manually applied from the AWS Console.

But you don’t want to upload backups manually every day. You will need the CLI anyway.

If you are lucky enough to work with a Mac, and already use Homebrew, install the CLI using:

brew install awscli

On Linux and on Windows, ensure that you have Python and pip installed, then run:

pip install awscli

Before running any commands with the CLI, you must set the access credentials that it uses to authenticate API requests. Run aws configure and set the appropriate values for Access Key ID and Secret Access Key.

You will also be asked for a region. In this case you should probably use the region that is geographically closest to you. You can find a list of available regions in the AWS Documentation.
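For reference, aws configure stores what you enter in two plain-text files under ~/.aws, which look roughly like this (the keys below are AWS’s documented dummy examples, and the region is just a sample):

```
# ~/.aws/credentials
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# ~/.aws/config
[default]
region = eu-west-1
```

Keep these files readable only by your user, as they contain the same secrets you just generated.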

Setting up the Bucket

At this point you have to create a bucket in which to save backups. You could use one you already own, but the purpose of buckets is to separate objects of a different nature, and some of the customizations we are going to apply may not work for your other objects.

Time to choose your bucket name. Remember that bucket names must be globally unique, so “database-backups” is not going to work for every one of you. My suggestion for non-public buckets is to suffix them with a UUID to ensure uniqueness (like backups-b3bd1643-8cbf-4927-a64a-f0cf9b58dfab). Once you have the name:

aws s3api create-bucket --bucket <bucketname>

Note that if your CLI is configured for a region other than us-east-1, you also need to pass --create-bucket-configuration LocationConstraint=<region>, or the command will fail.
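If you don’t have a UUID generator at hand when picking the name, a random suffix from /dev/urandom works just as well. A minimal sketch (S3 bucket names must be lowercase):

```shell
# Build a hard-to-guess, effectively unique bucket name by appending
# 12 random lowercase alphanumeric characters.
# `uuidgen | tr 'A-Z' 'a-z'` is an equally valid alternative.
SUFFIX=$(LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 12)
BUCKET="backups-$SUFFIX"
echo "$BUCKET"
```

You can then pass $BUCKET to the create-bucket command above.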

Since we are going to enable logging, we’ll use the backups/ key prefix (or directory if you insist) to store the actual database dumps, and the logs/ prefix for S3 access logs.

If you prefer using your mouse and a UI, open the S3 Console and click on “Create Bucket”.

When the popup appears, insert the name for your bucket and the region. Then leave all other settings to defaults and continue. By default your bucket is only accessible by the Root Account and authorized IAM Users.

Containing backup costs

Using Object Lifecycle Management, we are going to move objects older than 30 days to the One Zone-Infrequent Access Storage Class (pay less for storage, more for retrieval). After 6 months the backups will probably be too old to have any real use case, so we are going to expire them.
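To get a feel for the savings, here is a back-of-the-envelope calculation, assuming indicative us-east-1 prices at the time of writing (check the S3 pricing page for current figures):

```shell
# Rough monthly storage cost for 100 GB of backups.
# ~$0.023/GB-month for Standard, ~$0.010/GB-month for One Zone-IA
# (indicative prices; IA retrieval fees are extra).
awk 'BEGIN {
    gb = 100
    printf "Standard:    $%.2f/month\n", gb * 0.023
    printf "One Zone-IA: $%.2f/month\n", gb * 0.010
}'
```

The gap widens as backups pile up, which is exactly what happens with daily dumps.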

Copy the following JSON Lifecycle Configuration to a file (I will name mine lifecycle.json) and feel free to make the appropriate edits for your case:

{
  "Rules": [
    {
      "ID": "Backups Lifecycle Configuration",
      "Status": "Enabled",
      "Prefix": "backups/",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "ONEZONE_IA"
        }
      ],
      "Expiration": {
        "Days": 180
      },
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 2
      }
    }
  ]
}
This configuration contains a single “Rule” definition, which is applied to objects prefixed with backups/. We are instructing the bucket to move objects to the ONEZONE_IA Storage Class after 30 days, and to expire them after 180 days (roughly 6 months). Finally, we make sure that incomplete Multipart Uploads are aborted after 2 days, which is a best practice for any bucket.

Run the following command to apply this configuration to your newly created bucket:

aws s3api put-bucket-lifecycle-configuration \
          --bucket <bucketname> \
          --lifecycle-configuration file://lifecycle.json

The same can easily be achieved from the S3 Console too: select your bucket’s Management tab and click on the “Add lifecycle rule” button. Familiarize yourself with the wizard and fill in all the required fields.
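To double-check that the configuration was applied, you can read it back with the CLI; this simply prints the JSON you just uploaded:

```shell
aws s3api get-bucket-lifecycle-configuration --bucket <bucketname>
```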

Protecting against accidental deletion

The last thing you want is to accidentally delete your precious backups. Or even worse, to have some malicious actor try to ruin the company.

MFA Delete protects against this scenario: all delete requests must be further authenticated with a Multi Factor Authentication token, like the one you may use to protect your Gmail, Facebook or bank account. There are two requirements to enable it:

  • First, your AWS Root account must have MFA enabled. Head over to your IAM Dashboard and enable it with either a Virtual or Physical device.
  • Second, MFA Delete requires Bucket Versioning to be enabled. We don’t really need versioning, as we are not going to overwrite our objects, and it may add some costs should that happen.

Note: this step is optional. If it feels like overkill for your Kitten Blog WordPress database, feel free to skip to the next section. Also, MFA Delete cannot be enabled on buckets using Object Lifecycles (as those may delete files!), so choose which of the two features you care about the most.

Once you have MFA enabled on your account, proceed to enable MFA Delete on the bucket. This is done with the same command used to enable versioning, which comes in handy:

aws s3api put-bucket-versioning \
          --mfa <otp> \
          --bucket <bucketname> \
          --versioning-configuration Status=Enabled,MFADelete=Enabled

You will notice that this command requires a One Time Password. Yes, AWS authenticates the request to enable MFA Delete with MFA. This ensures both that your root account has MFA enabled and that you are authorized to make such a change.

Unfortunately this operation cannot be completed with the S3 Console. You’ll have to use the CLI to enable MFA Delete.


Enabling access logging

By enabling Logging on the bucket, we keep track of all uploads and downloads, authorized or malicious. This can be required for compliance reasons, or just to have an IP trail in case of a data leak. This step is optional too.

We are going to prefix log objects with logs/.

Create a second JSON file (logging.json) with this content:

{
    "LoggingEnabled": {
        "TargetBucket": "<bucketname>",
        "TargetPrefix": "logs/"
    }
}

And execute:

aws s3api put-bucket-logging \
          --bucket <bucketname> \
          --bucket-logging-status file://logging.json

Again, to do the same using the friendlier AWS Console, select your bucket from the S3 Console, click on the Properties tab, then on the Logging box.

Select the current bucket as target and logs/ as prefix and Save.

Great! Your bucket is all set to receive your DB Backups!

Generating a backup

The first step to saving backups is of course creating them.

If you are using PostgreSQL you can use pg_dumpall:

pg_dumpall -h [host] \
           -U [user] | gzip > postgres_backup.sql.gz

If you are backing up a single database, you can exploit the Postgres “Custom” dump format, which is an already compressed and optimized backup format:

pg_dump -h [host] \
        -U [user] \
        -Fc --file=postgres_db.dump [database_name]
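A backup is only as good as your ability to restore it. A custom-format dump like the one above can later be restored with pg_restore, using the same placeholder conventions (the target database must already exist):

```shell
pg_restore -h [host] \
           -U [user] \
           -d [database_name] postgres_db.dump
```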

If you are instead running a MySQL server, you can backup your database with:

mysqldump -u [user] \
          --password=[password] \
          -h [host] \
          --single-transaction \
          --routines --triggers \
          --all-databases | gzip > mysql_backup.sql.gz

Naturally, you can follow this tutorial with any other database engine, like Oracle or SQL Server, once you find the appropriate command to generate a snapshot.
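Whatever engine you use, it’s worth sanity-checking the compressed dump before uploading it: a zero-byte or truncated file restores nothing. A minimal sketch (the first line fabricates a stand-in dump so the example is self-contained):

```shell
# Create a stand-in compressed "dump" just for this demo.
printf 'SELECT 1;\n' | gzip > mysql_backup.sql.gz

# The file must be non-empty (-s) and a valid gzip stream (gunzip -t).
if [ -s mysql_backup.sql.gz ] && gunzip -t mysql_backup.sql.gz; then
    echo "backup looks sane"
else
    echo "backup is empty or corrupt" >&2
fi
```

In a real backup script you would run the check against the file produced by your dump command and abort before the upload if it fails.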

Storing the Backup in the Bucket

We’re almost there. Now we have a snapshot of the whole database into a single file. The last step is to actually upload the file to the bucket.

If you can use standard uploads, the next command will do the job:

S3_KEY=<bucketname>/backups/$(date "+%Y-%m-%d")-backup.gz
aws s3 cp <backupfile> s3://$S3_KEY --sse AES256

For the first time in this tutorial we used aws s3 instead of aws s3api. The latter is the “official” S3 client, supporting all API operations, while the s3 client is a useful abstraction on top of s3api that supports a smaller number of operations with a reduced set of options.

In this case it makes our life easier by automatically switching to multipart uploads as our backup files grow in size, which, unless your files are only a few megabytes, is very much needed.

Doing multipart uploads manually with the s3api is quite painful. The s3 client takes care of all the nitty-gritty details for us and just works nicely.

By passing --sse AES256 we ask S3 to encrypt data at rest. This is usually only needed for compliance reasons, unless you’re worried that an AWS employee may steal your data.

So you’re looking for a script to automate this?

Once you have set up the bucket, it’s very easy to script this and run it daily:


#!/bin/bash

export AWS_ACCESS_KEY_ID=<iam_user_access_key>
export AWS_SECRET_ACCESS_KEY=<iam_user_secret_key>

BUCKET=<bucketname>
MYSQL_USER=<mysql_user>
MYSQL_PASSWORD=<mysql_password>
MYSQL_HOST=<mysql_host>

mysqldump -u $MYSQL_USER \
          --password=$MYSQL_PASSWORD \
          -h $MYSQL_HOST \
          --single-transaction \
          --routines --triggers \
          --all-databases | gzip > backup.gz

S3_KEY=$BUCKET/backups/$(date "+%Y-%m-%d")-backup.gz
aws s3 cp backup.gz s3://$S3_KEY --sse AES256

rm -f backup.gz

Save this to a file somewhere on your server, for example in your home, and make it executable:

chmod +x ./backup.sh

Of course, replace the <placeholders> with your actual values. Also, if you’re not using MySQL, replace the dump line with the appropriate command.

Setting up a cron job

To run this every day, at 12pm for example, run crontab -e and add the following line:

0 12 * * * /home/<youruser>/backup.sh
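By default, cron mails or discards your script’s output. If you want a trail to inspect when something goes wrong, redirect it to a log file instead (the log path here is just an example):

```shell
0 12 * * * /home/<youruser>/backup.sh >> /home/<youruser>/backup.log 2>&1
```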

Save and celebrate. 🎉

If you prefer a more “cloud native” approach, check out my article on how to use a Lambda function to trigger scheduled events.

Bonus: infrastructure as code

Since we all love Terraform and don’t use the AWS Console at all, the following Terraform stack is all you need to create a bucket configured exactly as discussed up to now, with the exception of mfa_delete, which must be enabled by hand using the CLI.

variable "bucket_name" {}
variable "region" {}

provider "aws" {
  version = "~> 1.2"
  region  = var.region
}

resource "aws_s3_bucket" "backup" {
  bucket = var.bucket_name
  acl    = "private"

  versioning {
    enabled = true
  }

  logging {
    target_bucket = var.bucket_name
    target_prefix = "logs/"
  }

  lifecycle_rule {
    id      = "backups"
    enabled = true

    prefix = "backups/"

    transition {
      days          = 30
      storage_class = "ONEZONE_IA"
    }

    expiration {
      days = 180
    }

    abort_incomplete_multipart_upload_days = 2
  }
}


If you made it this far, you are now generating and storing backups in a proper and secure way.

If your job is to maintain the infrastructure of a company (or your own company), you probably cannot just copy and paste whatever I did here: your legal or compliance requirements will affect what you do and how. You may not need Server-Side Encryption or Logging, for example, or you may be asked to never expire database backups. In that case, take a look at Amazon Glacier for long-term storage of cold data. It’s your job as the “AWS expert” to find and customize the solution that best fits your use case.