Update 07-14-2018: AWS has introduced Amazon Data Lifecycle Manager (DLM) for EBS Snapshots, so you no longer need custom scripts like the one discussed in this post to snapshot your EBS volumes. I still find this post a good beginner's lesson if you are trying to learn what AWS Lambda is all about.
Update 07-27-2016: While configuring a Lambda EBS Backup Job for a client, I noticed that AWS has updated the Lambda dashboard since the original blog post. Event Sources is now Triggers.
In our previous blog post we discussed and explained AWS Lambda Pricing vs. EC2. In this post we will show how to back up EBS volumes using Python scripts that create and clean up snapshots.
In the past I have used an EC2 instance as an Ops administrative server, running cron jobs for different tasks in AWS EC2 such as daily snapshots for backup purposes, resource tag auditing, and cleanup of unattached EBS volumes. I had heard about the Lambda service and wanted to see if I could run my batch administrative tasks with Lambda instead. I found a great post by Ryan S. Brown over at Serverless Code that helped me get my feet wet in the Lambda world. AWS Lambda is a service that lets you run code without provisioning or managing EC2 servers, and you only pay for the compute time consumed when the code runs. Lambda can be triggered by other AWS services or by an HTTP call. In our setup we will create a scheduled event in CloudWatch to trigger our code to run daily. It's as easy as uploading your code, and AWS takes care of the rest. As of this writing, Java 8, Node.js 0.10 & 4.3, and Python 2.7 are the only supported languages.
Any future changes to the code will be hosted on GitHub, but below I will walk through how to set up the backup jobs in Lambda.
First, let's create an IAM policy called “ebs-backup-worker” with the following policy document:
Next we will create an IAM role also called “ebs-backup-worker”, select “AWS Lambda” as the Role type, and attach the “ebs-backup-worker” policy created above. When that is complete and you check the trust relationship in the role through “Edit Trust Relationship”, it should look like below:
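For reference, here is a minimal sketch of the trust relationship a Lambda role normally carries, written as a Python dict and printed as JSON. The variable name is mine, and the console may show extra fields (such as an empty "Sid"), so treat this as an approximation rather than an exact copy of what you will see.

```python
import json

# Trust policy that allows the Lambda service to assume the role.
# Sketch only: the console output may include additional fields.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

The important part is the principal: if it is not lambda.amazonaws.com, Lambda will not be able to run the function under this role.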
Now let's create the Lambda functions. First we will configure the Backup job, so go into Lambda in the AWS console and create a new function. Skip the “select blueprint” step and move on to step 2. Next we will configure the actual function with our backup Python script from GitHub. Let's give the function a name of “EBS-Backups” and enter a description to your liking. Now change the Runtime to “Python 2.7” and copy and paste the raw Python backup script into the “Lambda function code” section. We will leave the “Handler” section as the default, but assign the “ebs-backup-worker” role we created above in IAM. We will keep the memory at 128 MB but raise the timeout to 10 seconds (make it higher if you have a larger environment), and select no VPC. Below is a screenshot of what you should be seeing in the “Configure function” section.
I have added functionality to the Python scripts to run them against multiple regions. In the README on GitHub I explain how to add your regions as an environment variable. Because of limitations in Lambda, the list needs to be a Base64 encoded string with commas delimiting the regions.
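If you are unsure how to produce that string, here is a small sketch of encoding and decoding a comma-delimited region list with Python's standard base64 module. The function names are hypothetical and this is not the code from the repository; check the README for the exact environment variable name the scripts expect.

```python
import base64


def encode_regions(regions):
    """Join region names with commas and Base64 encode the result."""
    joined = ",".join(regions)
    return base64.b64encode(joined.encode("ascii")).decode("ascii")


def decode_regions(encoded):
    """Reverse of encode_regions: Base64 decode, then split on commas."""
    decoded = base64.b64decode(encoded).decode("ascii")
    return [r.strip() for r in decoded.split(",") if r.strip()]


encoded = encode_regions(["us-east-1", "us-west-2"])
print(encoded)                  # paste this value into the environment variable
print(decode_regions(encoded))  # what a script would recover from it
```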
I made a few modifications to the backup Python script from the folks at Serverless Code. When a snapshot is created, I add the corresponding Instance ID to the snapshot's description, and the snapshot's name is the Device Name the volume is attached as (e.g. /dev/sda1). This just makes it easier to see which instance and device a snapshot belongs to.
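To illustrate the idea, here is a rough sketch of how a description and Name tag could be built for each snapshot. The function name and the exact wording of the description are my assumptions for this example; the real script on GitHub may format them differently.

```python
def snapshot_labels(instance_id, volume_id, device_name):
    """Return a (description, tags) pair for a new snapshot.

    The description wording is hypothetical; the point is that it carries
    the Instance ID, while the Name tag carries the device name.
    """
    description = "Backup of %s (volume %s)" % (instance_id, volume_id)
    tags = [{"Key": "Name", "Value": device_name}]
    return description, tags


# Placeholder IDs, for illustration only.
description, tags = snapshot_labels("i-0abc123", "vol-0def456", "/dev/sda1")
print(description)
print(tags)
```

The tags list uses the Key/Value shape that the EC2 API works with, so it could be passed along when tagging the snapshot.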
Click “Next” and “Create function” and we will move on to testing that the script works. Open up the EC2 dashboard in a new tab so we can run the Lambda function and see that it is creating snapshots for the instances we tag with “Backup”. Take an instance or two and create two tags: one of just “Backup”, which tells the backup script to snapshot all EBS volumes attached to the instance, and one of “Retention” for how many days you want to retain your snapshots. If a retention tag is not found, the script defaults to 7 days.
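The retention logic can be sketched roughly as follows. This assumes tags arrive as a list of Key/Value dicts, which is how the EC2 API returns them; the function names are hypothetical and the real script may differ in the details.

```python
import datetime

DEFAULT_RETENTION_DAYS = 7  # default from the post when no Retention tag is set


def retention_days(tags):
    """Return the value of the Retention tag, or the default if it is absent."""
    for tag in tags:
        if tag["Key"] == "Retention":
            return int(tag["Value"])
    return DEFAULT_RETENTION_DAYS


def delete_on(tags, today):
    """Date on which a snapshot taken today would become eligible for cleanup."""
    return today + datetime.timedelta(days=retention_days(tags))


tags = [{"Key": "Backup", "Value": ""}, {"Key": "Retention", "Value": "14"}]
print(delete_on(tags, datetime.date(2016, 7, 27)))  # 2016-08-10
```

An instance tagged only with “Backup” would get the 7 day default, so a snapshot taken on 2016-07-27 would be kept until 2016-08-03.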
Now let's test the function. Click the “Test” button in the Lambda dashboard (the sample event template of “Hello World” is fine), then click “Save and test” and you should see a Log output telling you whether the job was successful or not.
We can see my job was successful and had a duration of 2740.71 ms. Billing is always rounded up, so we were billed for 2800 ms. The maximum amount of memory used was 34 MB, so the minimum configuration of 128 MB was more than enough. We can check the Snapshots in the EC2 dashboard to see they were created with the information we want.
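The rounding is easy to reproduce. The sketch below assumes a 100 ms billing increment, which is what the figures above imply; the function name is mine.

```python
import math


def billed_ms(duration_ms, increment_ms=100):
    """Round a duration up to the next billing increment (assumed 100 ms)."""
    return int(math.ceil(duration_ms / float(increment_ms)) * increment_ms)


print(billed_ms(2740.71))  # 2800
```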
Now we need to configure an Event source so that the job is triggered. In our case it will be a “CloudWatch Events - Schedule”, so go to the “Event sources” tab and click “Add event source.” Below is what we have set up: we are using a cron expression to invoke the job daily at 06:00 UTC. For more information see the AWS documentation on Schedule Expressions.
That’s it. You should now start getting snapshots daily at the time specified.
The final piece is to create a Lambda function for our Snapshot cleanup script. Perform the same steps we did above for the snapshot creation script, but set the time for the CloudWatch Event to be, say, 5 minutes later.
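If the cron syntax is new to you, here is a small sketch that builds a daily schedule expression. AWS schedule expressions take six fields (minutes, hours, day of month, month, day of week, year) and are evaluated in UTC; the helper name is hypothetical.

```python
def daily_cron(hour, minute=0):
    """Build a CloudWatch Events cron expression that fires once a day (UTC)."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Fields: minutes hours day-of-month month day-of-week year
    return "cron(%d %d * * ? *)" % (minute, hour)


print(daily_cron(6))     # cron(0 6 * * ? *)  -> daily at 06:00 UTC
print(daily_cron(6, 5))  # cron(5 6 * * ? *)  -> daily at 06:05 UTC
```

The second expression is the kind of value you would use for a job that should run five minutes after the first one.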
All code and up to date information can be found on GitHub (https://github.com/cmachler/aws-lambda-ebs-backups).
Here is a quick YouTube video running through the setup discussed above:
In my next blog post I will be discussing using Code Manager in Puppet Enterprise 2016.1 with an in-house GitLab server to manage your Puppet code.