Guillermo Ojeda
Cloudy Things: How to build on AWS

Best way to Automate AWS EBS Snapshots (without scripts)

How to use Amazon Data Lifecycle Manager to automate the creation, retention, and deletion of EBS Snapshots

Guillermo Ojeda
·Dec 16, 2022·

7 min read

I'm sure you're familiar with EBS. If you're not: It stands for Elastic Block Store, and it's a block-level storage service for EC2 instances. Essentially, it's a persistent (virtual) storage device that you can attach to your EC2 instances and use like a physical hard drive. For reference, General Purpose (gp2) EBS volumes are priced at $0.10 per GB-month in the us-east-1 region. There are multiple volume types, each with its own pricing and use cases, but I won't dive into that in this article. The point of this article is not EBS itself, but EBS snapshots.

What are EBS snapshots

EBS snapshots are point-in-time copies of your EBS volumes that you can use to back up your data. You create a backup (an EBS snapshot) of your disk (the EBS volume), and then you can restore a new EBS volume from that snapshot.

One thing to note about EBS snapshots is that they're incremental. This means that they only capture the data that has changed since the last snapshot. So, if you have a 100 GB volume and you take a snapshot, then make a small change to the volume and take another snapshot, the second snapshot will only contain the data that has changed since the first snapshot. This makes EBS snapshots more efficient and cost-effective than full-volume backups. The first snapshot of a volume is a full copy of the data written to it, and each later snapshot is billed only for the blocks that changed since the previous one.
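
To make the incremental behavior concrete, here's a toy model in Python. It's a sketch, not how EBS is implemented: the volume is just a dictionary of blocks, and the 1 GB block size and the `take_snapshot` helper are made up for illustration.

```python
from typing import Dict

BLOCK_SIZE_GB = 1  # hypothetical block size, chosen only to keep the numbers readable


def take_snapshot(volume: Dict[int, str], previous: Dict[int, str]) -> Dict[int, str]:
    """Return only the blocks whose contents differ from the previous snapshot state."""
    return {idx: data for idx, data in volume.items() if previous.get(idx) != data}


# A 100-block volume where every block holds some data.
volume = {i: f"v1-{i}" for i in range(100)}

first = take_snapshot(volume, previous={})  # nothing to diff against, so it's a full copy
state_after_first = dict(volume)

volume[42] = "v2-42"  # a small change to a single block
second = take_snapshot(volume, previous=state_after_first)

print(len(first) * BLOCK_SIZE_GB)   # 100
print(len(second) * BLOCK_SIZE_GB)  # 1
```

The first snapshot stores all 100 blocks, the second stores only the one block that changed.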

There are two categories of EBS snapshots: Standard snapshots and Archive snapshots. Standard snapshots are stored in Amazon S3 and are designed for fast recovery of data. They're the default type of snapshot and are suitable for most use cases. Archive snapshots, on the other hand, are stored in Amazon S3 Glacier and are designed for long-term data retention. They're more cost-effective than Standard snapshots, but restoring one takes 24 to 72 hours (because they're retrieved from Glacier instead of S3).

Using EBS snapshots for Disaster Recovery

The thing with EBS snapshots is that they're regional. This means that they can only be used in the region where they were created. If you need to use an EBS snapshot in another region (e.g. for disaster recovery) you'll need to copy it there. Keep in mind, if you're considering the scenario where a region becomes unavailable, you need to copy the snapshot BEFORE the region becomes unavailable! Sounds obvious, but it's worth saying.

Here's the CLI command to copy a snapshot to another region (in case you want to write a script):

aws ec2 copy-snapshot --source-region <source-region> --source-snapshot-id <snapshot-id> --region <destination-region> --description <description>

Let's break it down:

  • aws ec2 copy-snapshot: This is the command to copy an EBS snapshot.

  • --source-region <source-region>: This parameter specifies the region where the snapshot is located. Replace <source-region> with the name of the source region, such as us-east-1.

  • --source-snapshot-id <snapshot-id>: This parameter specifies the ID of the snapshot to be copied. Replace <snapshot-id> with the actual ID of the snapshot.

  • --region <destination-region>: This parameter specifies the region the snapshot should be copied to. The command runs against this region. Replace <destination-region> with the name of the destination region, such as eu-west-1.

  • --description <description>: This parameter specifies a description for the snapshot. Replace <description> with a brief, quoted description of the snapshot, such as "backup of X data". This is for humans to read, so be descriptive!

For example, let's copy my awesome data stored in snapshot snap-0123456789abcdef from us-east-1 to eu-west-1:

aws ec2 copy-snapshot --source-region us-east-1 --source-snapshot-id snap-0123456789abcdef --region eu-west-1 --description "backup of my awesome data"
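
If you'd rather do the same copy from Python, here's a minimal sketch. It assumes boto3 (the AWS SDK for Python) is installed and credentials are configured. The helper names `copy_snapshot_kwargs` and `copy_snapshot` are mine, not part of any library. Only the first helper runs in the example, so nothing gets copied by accident.

```python
from typing import Any, Dict


def copy_snapshot_kwargs(source_region: str, snapshot_id: str, description: str) -> Dict[str, Any]:
    """Build the keyword arguments for the EC2 CopySnapshot call."""
    return {
        "SourceRegion": source_region,
        "SourceSnapshotId": snapshot_id,
        "Description": description,
    }


def copy_snapshot(source_region: str, snapshot_id: str,
                  destination_region: str, description: str) -> str:
    """Copy a snapshot to another region and return the ID of the new snapshot."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    # The client points at the DESTINATION region, just like --region in the CLI command.
    ec2 = boto3.client("ec2", region_name=destination_region)
    response = ec2.copy_snapshot(**copy_snapshot_kwargs(source_region, snapshot_id, description))
    return response["SnapshotId"]


# Same values as the CLI example above; this only builds the request, it doesn't send it.
kwargs = copy_snapshot_kwargs("us-east-1", "snap-0123456789abcdef", "backup of my awesome data")
print(kwargs)
```

To actually run the copy you'd call `copy_snapshot("us-east-1", "snap-0123456789abcdef", "eu-west-1", "backup of my awesome data")`.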

How to automate EBS snapshots with Amazon Data Lifecycle Manager (DLM)

There are 3 ways to automate EBS snapshot creation and copying to another region:

  • The bad way: Write a script to automate EBS snapshot creation, using the CLI command above.

  • The good way: Use AWS Systems Manager Automation to create automated EBS snapshots.

  • The better way: Use Data Lifecycle Manager (DLM) to automate EBS snapshot creation, retention and deletion.

Amazon Data Lifecycle Manager (DLM) allows you to automate the creation and management of EBS snapshots. With DLM you create snapshot policies that specify the schedule, retention, and other settings for snapshot creation.

To use DLM to automatically take snapshots of EBS volumes, you can specify the volumes to include in the snapshot policy using tags. For example, you can create a snapshot policy that applies to all EBS volumes with the tag key "Snapshot" and value "true". This way, you can easily identify which volumes should be included in the snapshot policy by simply tagging them with this tag.
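
Here's a tiny Python model of that tag-based selection, just to show which volumes a "Snapshot" = "true" policy would pick up. The volume IDs and the `targeted_volumes` helper are hypothetical, and I'm assuming the match is an exact string comparison on key and value.

```python
from typing import Dict, List

TARGET_KEY, TARGET_VALUE = "Snapshot", "true"

# Hypothetical volumes, mapping volume ID to its tags.
volumes: Dict[str, Dict[str, str]] = {
    "vol-aaa": {"Snapshot": "true", "Name": "data"},
    "vol-bbb": {"Name": "scratch"},      # no Snapshot tag at all
    "vol-ccc": {"Snapshot": "false"},    # tagged, but with a different value
}


def targeted_volumes(all_volumes: Dict[str, Dict[str, str]], key: str, value: str) -> List[str]:
    """Return the IDs of volumes whose tag `key` is exactly `value`."""
    return sorted(vid for vid, tags in all_volumes.items() if tags.get(key) == value)


print(targeted_volumes(volumes, TARGET_KEY, TARGET_VALUE))  # ['vol-aaa']
```

Only the volume tagged exactly "Snapshot" = "true" is selected, so untagging a volume is enough to take it out of the policy.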

In addition to taking snapshots of EBS volumes, DLM also allows you to enable cross-region copy, which copies the snapshots to another region. This can be useful for disaster recovery or to create a backup of your data in a different location. To enable cross-region copy, you need to specify the destination region in the snapshot policy and, if the copies are encrypted, the KMS key to use for encryption. Once enabled, DLM will automatically copy each snapshot to the specified region after it's created.

Here's a sample CloudFormation template to automate EBS snapshot creation and copying to another region using DLM. I haven't tested this thoroughly, so use it with care.

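---
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  ExecutionRoleArn:
    Type: String
    Description: The ARN of the IAM role that DLM assumes to manage snapshots
  KmsKeyArn:
    Type: String
    Description: The ARN of the KMS key to use for encrypting cross-Region snapshot copies
  DestinationRegion:
    Type: String
    Description: The destination region to copy the snapshots to
Resources:
  SnapshotPolicy:
    Type: AWS::DLM::LifecyclePolicy
    Properties:
      Description: EBS snapshot policy with cross-Region copy
      State: ENABLED
      ExecutionRoleArn: !Ref ExecutionRoleArn
      PolicyDetails:
        ResourceTypes:
          - VOLUME
        # Volumes tagged Snapshot=true are included in the policy
        TargetTags:
          - Key: Snapshot
            Value: 'true'  # quoted so YAML reads it as a string, not a boolean
        Schedules:
          - Name: DailySnapshot
            CopyTags: true
            CreateRule:
              # Interval-based schedules are expressed in hours; 24 means once a day
              Interval: 24
              IntervalUnit: HOURS
            RetainRule:
              Count: 7
            # Cross-Region copy is configured per schedule
            CrossRegionCopyRules:
              - Target: !Ref DestinationRegion
                Encrypted: true
                CmkArn: !Ref KmsKeyArn
                CopyTags: true
                # Assumption: keep the copies for 7 days, mirroring the source retention
                RetainRule:
                  Interval: 7
                  IntervalUnit: DAYS
        # ExcludeBootVolume only applies to policies that target instances, so it is omitted here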
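
To see what a count-based retention rule like the Count: 7 above does in practice, here's a small Python sketch. It only models the idea (keep the N most recent snapshots). The `retained` helper and the dates are made up for illustration.

```python
from datetime import date, timedelta
from typing import List


def retained(snapshot_dates: List[date], count: int) -> List[date]:
    """Keep the `count` most recent snapshots, returned oldest first."""
    return sorted(snapshot_dates)[-count:]


start = date(2022, 12, 1)
daily = [start + timedelta(days=i) for i in range(10)]  # one snapshot per day, Dec 1 to Dec 10
kept = retained(daily, count=7)

print(len(kept))          # 7
print(kept[0], kept[-1])  # 2022-12-04 2022-12-10
```

After 10 daily snapshots, the three oldest ones (Dec 1 to Dec 3) are gone and the last 7 remain.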

How to restore an EBS snapshot

You can create a new EBS volume from the snapshot and attach it to an EC2 instance like a regular EBS volume (I mean, it is a regular volume after all). If you're using a Standard snapshot, the process is fairly straightforward and the volume is ready for use as soon as it's created (blocks are loaded from the snapshot in the background, so the first read of each block is slower). If you're using an Archive snapshot, you first need to restore the snapshot from the archive tier to the standard tier, and then create the volume from it. Like I said earlier, this is slow: it can take 24 to 72 hours, depending on the size of the snapshot. Told ya!

The pricing for restoring an Archive EBS snapshot is based on the size of the snapshot. In us-east-1, Archive snapshot restores are priced at $0.03 per GB of data retrieved. For example, if you need to restore a 200 GB Archive snapshot, the cost would be 200 * $0.03 = $6. Ok, still cheaper than storing it as Standard (if you restore it once a month). But remember the delay!
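
Here's that math as a quick Python sketch, using the us-east-1 prices from this article (double-check current pricing before relying on them). The function names are mine. The model is deliberately simple: it only compares storage and restore prices, and it ignores things like the minimum archive period.

```python
RESTORE_PER_GB = 0.03           # Archive restore price per GB retrieved
STANDARD_PER_GB_MONTH = 0.05    # Standard storage price per GB-month
ARCHIVE_PER_GB_MONTH = 0.0125   # Archive storage price per GB-month


def restore_cost(size_gb: float) -> float:
    """Cost of restoring one Archive snapshot of the given size."""
    return size_gb * RESTORE_PER_GB


def break_even_restores_per_month() -> float:
    """Restores per month at which Archive stops being cheaper than Standard."""
    return (STANDARD_PER_GB_MONTH - ARCHIVE_PER_GB_MONTH) / RESTORE_PER_GB


print(round(restore_cost(200), 2))                 # 6.0
print(round(break_even_restores_per_month(), 2))   # 1.25
```

So with these prices, Archive only wins if you restore a given snapshot about once a month or less, no matter its size.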

How much do EBS snapshots cost

They're priced per GB-month of data stored. In the us-east-1 region, Standard snapshots are priced at $0.05 per GB-month and Archive snapshots are priced at $0.0125 per GB-month.

Let's say you have a 200 GB Standard snapshot that you store for a month (30 days). The cost would be 200 * $0.05 = $10. If you have a 200 GB Archive snapshot that you store for the same month, the cost would be 200 * $0.0125 = $2.50. A LOT cheaper, right? Remember the long retrieval times though.
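
Since the price is per GB-month, the storage cost is just size times price times months. Here's a short Python sketch with the us-east-1 prices quoted above. The `storage_cost` helper is hypothetical, and it treats the snapshot size as the billed size.

```python
PRICES_PER_GB_MONTH = {"standard": 0.05, "archive": 0.0125}


def storage_cost(size_gb: float, tier: str, months: float = 1.0) -> float:
    """Storage cost in USD for a snapshot of `size_gb` kept for `months` months."""
    return size_gb * PRICES_PER_GB_MONTH[tier] * months


standard = storage_cost(200, "standard")
archive = storage_cost(200, "archive")

print(round(standard, 2), round(archive, 2))   # 10.0 2.5
print(f"{1 - archive / standard:.0%}")         # 75%
```

For a 200 GB snapshot stored for one month, that's $10 in Standard versus $2.50 in Archive, a 75% saving on storage.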

Wrapping up:

  • EBS snapshots are backups of EBS volumes

  • They're incremental, i.e. they only save the diff since the last snapshot

  • There are Standard EBS snapshots (from S3, $0.05/GB-month) and Archive EBS snapshots (from Glacier, $0.0125/GB-month)

  • To restore, you create a new volume from the EBS snapshot. Super fast from Standard, very slow + additional price from Archive.

  • They're regional. If you want to use EBS snapshots for disaster recovery, copy them to another region

  • You can automate EBS snapshot creation with a script (I put the command above) or with SSM Automation

  • Or you can automate EBS snapshot creation with DLM, either manually or using the sample CloudFormation template I put above.

Some alternatives to EBS:

  • S3: If you need shared storage where you do infrequent reads and writes, and don't mind that it's object storage instead of block storage.

  • EFS: If you need a shared file system (it's file storage, not block storage) and/or need frequent reads or writes.

Alternatives to EBS snapshots if you're using EBS but only need to back up a part of the data:

  • Set up 2 EBS volumes: one for the OS and one for data that needs to be backed up. Use automated EBS snapshots for the 2nd volume.

  • Set up a script that copies data to an S3 bucket that has Cross-Region Replication (CRR) enabled. This is harder to maintain in the long run. You might find some sources online that tell you this is a good option, because it's easier to set up than automated EBS snapshots. Except that I already gave you everything you need to set up automated EBS snapshots. So do it the right way!


Thanks for reading!

If you're interested in building on AWS, check out my newsletter: Simple AWS.

It's free, runs every Monday, and tackles one use case at a time, with all the best practices you need.

If you want to know more about me, visit my website, www.guilleojeda.com