Introduction
One of the Lambda use cases I find intriguing is report and chart generation. For example, groups within an organization who are familiar with Python and other reporting tools may want to generate reports and charts for an executive audience. If they can be given some training on developing their Python code as a Lambda and pushing it to a repository tied to a CI/CD pipeline, it frees them up to focus on what they do best and reduces their dependency on the IT team to deploy artifacts that provide critical business insights. IT or the web team can simply provide the cloud locations for these reports, and the Lambda developer ensures that the code they write is configured with that target location. The ultimate destination, such as a web site or portal, will just render the reports from those locations within a web layout already familiar to the target audience.
This blog post shows how to implement a non-trivial Python Lambda that generates a chart from a public data source. Since the chart library dependencies pushed the unzipped deployment package past the 250 MB size limit, I had to implement it as a Python Lambda container. The Python code in question is the same as in my previous post showcasing some of Python's charting capabilities. It is fairly straightforward to take Python code written in a Jupyter notebook and modify it to work as an AWS Lambda function, but the unfamiliar nuances for a developer are all around packaging and deployment. This post will also walk you through the Terraform script that builds the container and deploys the Lambda to AWS. If you implement this code in your AWS account, you may incur some minimal charges when you run the Lambda. If you are also interested in other cloud serverless offerings such as Azure Functions and Google Cloud Functions, a good learning exercise is to take this code and modify it to work on those platforms as well.
You will need a test AWS account with programmatic access enabled from your local development environment in order to deploy the code described in this post. If you do not have an AWS account, please create a free account using this. Next, set up programmatic access following the steps in here.
Implementing existing python code as AWS Lambda
I added a def handler(event, context) function to the code from my previous post to implement the Lambda, as you can see here. The Lambda is triggered when the source CSV file is uploaded to an S3 location. This S3 location is specified as the resource inputlocation in the Terraform script. The Lambda processes the CSV and generates the chart in an output location, which is specified as outputlocation in terraform/lambda.tf. The chart is a .png file whose name is defined in locals.outputfilename in the Terraform script. See the code on Github for more details.
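To make the shape of that change concrete, here is a sketch of what such a handler can look like. The environment variable names, the chart logic, and the use of pandas are placeholder assumptions for illustration; the real code is in the Github repo. Note that S3 object keys arrive URL-encoded in the event, which is easy to miss:

```python
import os
import urllib.parse

# Hypothetical environment variable names; the actual names are set in terraform/lambda.tf.
OUTPUT_BUCKET_ENV = "OUTPUT_BUCKET"
OUTPUT_FILENAME_ENV = "OUTPUT_FILENAME"


def source_from_event(event):
    """Extract the (bucket, key) of the uploaded CSV from an S3 put event."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys are URL-encoded in the event (e.g. spaces become '+').
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return bucket, key


def handler(event, context):
    """Read the CSV from the input bucket, render the chart, upload the .png."""
    import boto3                       # provided by the Lambda container image
    import matplotlib
    matplotlib.use("Agg")              # headless backend; Lambda has no display
    import matplotlib.pyplot as plt
    import pandas as pd

    bucket, key = source_from_event(event)
    # Reading s3:// paths directly requires the s3fs package in requirements.txt;
    # alternatively, download the object to /tmp with boto3 first.
    df = pd.read_csv(f"s3://{bucket}/{key}")

    fig, ax = plt.subplots()
    df.plot(ax=ax)                     # placeholder for the real chart logic
    out_path = "/tmp/chart.png"        # /tmp is the only writable filesystem
    fig.savefig(out_path)

    boto3.client("s3").upload_file(
        out_path,
        os.environ[OUTPUT_BUCKET_ENV],
        os.environ[OUTPUT_FILENAME_ENV],
    )
    return {"status": "ok"}
```

The heavy imports are deferred into the handler body so that the event-parsing helper stays testable without the AWS SDK or plotting libraries installed.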
The Dockerfile that describes what goes into the container
Here is what the Dockerfile does:
- Copies requirements.txt, the list of dependent libraries, into the container's working directory.
- Installs the libraries into the folder defined by the AWS reserved variable LAMBDA_TASK_ROOT.
- The matplotlib library uses the folder named in the environment variable MPLCONFIGDIR to store some caches that improve performance. The Lambda execution environment provides a writable file system for our code at /tmp, so the Dockerfile creates a /tmp/matplotlib folder and sets the environment variable to point to it.
- Finally, sets CMD to app.handler, the handler() function in the app.py file.
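Putting those steps together, the Dockerfile looks roughly like the sketch below. It assumes the AWS-provided Python base image; the exact Python version tag and file names are assumptions, so check the Github repo for the real file:

```dockerfile
# Assumed AWS Lambda Python base image; the actual version may differ
FROM public.ecr.aws/lambda/python:3.9

# Copy the dependency list and install everything under LAMBDA_TASK_ROOT
COPY requirements.txt .
RUN pip install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"

# Give matplotlib a writable cache directory; /tmp is the only writable path in Lambda
RUN mkdir -p /tmp/matplotlib
ENV MPLCONFIGDIR=/tmp/matplotlib

# Copy the function code and point the runtime at the handler() function in app.py
COPY app.py "${LAMBDA_TASK_ROOT}"
CMD [ "app.handler" ]
```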
The Terraform code to package the Lambda and deploy it
Install Terraform. The Terraform script does the following:
- Creates the input and output S3 locations.
- Creates the ECR repository, configured so that it can be deleted when terraform destroy is called.
- Builds the Docker image and pushes it to the ECR repository.
- Creates the AWS IAM role that allows the AWS Lambda service to execute the Lambda function.
- Creates the IAM policy and attaches it to the role. The policy allows the creation of the CloudWatch log group and log stream, grants read access on the input S3 location, and grants write access on the output S3 location.
- Creates the Lambda function and assigns the IAM role to it. It also adds environment variables for the output S3 bucket and the output filename.
- Finally, creates the trigger for the Lambda: the upload of a file to the input S3 location.
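As a rough sketch, the core resources in terraform/lambda.tf look something like the following. The resource names, bucket names, and image tag here are assumptions for illustration (the build-and-push step and the IAM role/policy are omitted for brevity); refer to the Github repo for the full script:

```hcl
resource "aws_s3_bucket" "inputlocation" {
  bucket = "reportgen-input"
}

resource "aws_s3_bucket" "outputlocation" {
  bucket = "reportgen-output"
}

resource "aws_ecr_repository" "repo" {
  name         = "reportgen"
  force_delete = true # lets `terraform destroy` remove a non-empty repository
}

resource "aws_lambda_function" "reportgen" {
  function_name = "reportgen"
  package_type  = "Image"
  image_uri     = "${aws_ecr_repository.repo.repository_url}:latest"
  role          = aws_iam_role.lambda_role.arn # role defined elsewhere in the script
  timeout       = 60

  environment {
    variables = {
      OUTPUT_BUCKET   = aws_s3_bucket.outputlocation.bucket
      OUTPUT_FILENAME = "chart.png"
    }
  }
}

# Allow S3 to invoke the function, then wire up the upload trigger
resource "aws_lambda_permission" "allow_s3" {
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.reportgen.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.inputlocation.arn
}

resource "aws_s3_bucket_notification" "trigger" {
  bucket = aws_s3_bucket.inputlocation.id
  lambda_function {
    lambda_function_arn = aws_lambda_function.reportgen.arn
    events              = ["s3:ObjectCreated:*"]
  }
  depends_on = [aws_lambda_permission.allow_s3]
}
```

The aws_lambda_permission resource is the piece most often forgotten: without it, the bucket notification is created but S3 is not authorized to invoke the function.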
The Terraform script has been tested on the Windows Subsystem for Linux, using Docker Desktop on Windows. So if you are on Windows:
- Install Docker desktop from here.
- Run Docker Desktop.
- Enable the Windows Subsystem for Linux. See these instructions.
- Launch the bash terminal and execute these:
cd <location where you cloned the repository from Github>/terraform
terraform init
terraform plan -out=./plan.txt
terraform apply ./plan.txt
If you see no errors from the above commands, you should see the Lambda deployed in your AWS account as
reportgen-
Once you have tested, clean up using:
terraform destroy
so that you do not incur charges.
Hope this post was useful to you. You can leave comments in the "Enter comments here..." box at the end of this page and hit Submit Reply.