How to install external modules in a Python Lambda Function created by AWS CDK?

Question:

I’m using the Python AWS CDK in Cloud9 and I’m deploying a simple Lambda function that is supposed to send an API request to Atlassian’s API when an Object is uploaded to an S3 Bucket (also created by the CDK). Here is my code for CDK Stack:

from aws_cdk import core
from aws_cdk import aws_s3
from aws_cdk import aws_lambda
from aws_cdk.aws_lambda_event_sources import S3EventSource


class JiraPythonStack(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        # The code that defines your stack goes here
        jira_bucket = aws_s3.Bucket(self,
                                    "JiraBucket",
                                    encryption=aws_s3.BucketEncryption.KMS)

        event_lambda = aws_lambda.Function(
            self,
            "JiraFileLambda",
            code=aws_lambda.Code.asset("lambda"),
            handler='JiraFileLambda.handler',
            runtime=aws_lambda.Runtime.PYTHON_3_6,
            function_name="JiraPythonFromCDK")

        event_lambda.add_event_source(
            S3EventSource(jira_bucket,
                          events=[aws_s3.EventType.OBJECT_CREATED]))

The lambda function code uses the requests module which I’ve imported. However, when I check the CloudWatch Logs, and test the lambda function – I get:

Unable to import module 'JiraFileLambda': No module named 'requests'

My Question is: How do I install the requests module via the Python CDK?

I’ve already looked around online and found this. But it seems to directly modify the Lambda function, which would result in stack drift (which I’ve been told is bad for infrastructure as code). I’ve also looked at the AWS CDK docs but didn’t find any mention of external modules/libraries (I’m doing a more thorough check now). Does anybody know how I can work around this?

Edit: It would appear I’m not the only one looking for this.

Here’s another GitHub issue that’s been raised.

Asked By: Jamie

||

Answers:

You should install your Lambda’s dependencies locally before deploying it via CDK. CDK does not know how to install the dependencies or which libraries should be installed.

In your case, you should install requests and any other libraries before executing cdk deploy.

For example,

pip install requests --target ./asset/package

There is an example for reference.
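
If you would rather run that step from Python (for example at the top of app.py, before the stack is synthesized), a minimal sketch could look like the following. The helper names build_pip_command and vendor_dependencies are made up for illustration, and the target directory is the one from the command above; it has to be inside the directory you hand to CDK as the Lambda asset.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Sequence


def build_pip_command(requirements: Sequence[str], target_dir: str) -> List[str]:
    """Return the pip invocation that installs packages into target_dir."""
    # Using "python -m pip" ensures pip belongs to the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", *requirements, "--target", target_dir]


def vendor_dependencies(requirements: Sequence[str], target_dir: str) -> None:
    """Install the packages next to the handler code (hypothetical helper)."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    # check_call raises CalledProcessError if pip fails, so a broken
    # install stops the deployment instead of shipping a partial package.
    subprocess.check_call(build_pip_command(requirements, target_dir))


if __name__ == "__main__":
    vendor_dependencies(["requests"], "./asset/package")
```

Keep in mind that packages installed this way are built for the machine running pip, which matters for packages with compiled extensions.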

Answered By: Kane

UPDATE:

It now appears as though there is a new type of (experimental) Lambda Function in the CDK known as the PythonFunction. The Python docs for it are here. It supports a requirements.txt file and uses a Docker container to install the listed packages into your function. See more details on that here. Specifically:

If requirements.txt or Pipfile exists at the entry path, the construct will handle installing all required modules in a Lambda compatible Docker container according to the runtime.
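
As a rough illustration of the quoted behaviour, the sketch below first checks locally which dependency file the construct would find, then declares the function. It assumes CDK v1, where the construct lives in aws_cdk.aws_lambda_python (in CDK v2 the module is aws_cdk.aws_lambda_python_alpha). The helper names, the lookup order, the Python 3.8 runtime and the "lambda" entry directory are assumptions made for this example, not something the docs prescribe.

```python
from pathlib import Path
from typing import Optional

# File names taken from the quoted documentation; the lookup order is an assumption.
DEPENDENCY_FILES = ("requirements.txt", "Pipfile")


def find_dependency_file(entry: str) -> Optional[str]:
    """Return the first dependency file present at the entry path, else None."""
    for name in DEPENDENCY_FILES:
        if (Path(entry) / name).is_file():
            return name
    return None


def add_python_function(scope, entry: str = "lambda"):
    """Declare a PythonFunction inside a stack (hypothetical helper)."""
    # Imported lazily so find_dependency_file works without the CDK installed.
    from aws_cdk import aws_lambda
    from aws_cdk.aws_lambda_python import PythonFunction  # CDK v1 module name

    return PythonFunction(
        scope,
        "JiraFileLambda",
        entry=entry,                 # directory holding the handler and requirements.txt
        index="JiraFileLambda.py",   # file containing the handler
        handler="handler",           # function name inside that file
        runtime=aws_lambda.Runtime.PYTHON_3_8,
    )
```

Docker has to be available on the machine that runs cdk synth or cdk deploy for the install step to work.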

Original Answer:

So this is the awesome bit of code my manager wrote that we now use:


    # Needs `import os` and `import subprocess` at the top of the stack module.
    def create_dependencies_layer(self, project_name, function_name: str) -> aws_lambda.LayerVersion:
        requirements_file = "lambda_dependencies/" + function_name + ".txt"
        output_dir = ".lambda_dependencies/" + function_name
        
        # Install requirements for layer in the output_dir
        if not os.environ.get("SKIP_PIP"):
            # Note: Pip will create the output dir if it does not exist
            subprocess.check_call(
                f"pip install -r {requirements_file} -t {output_dir}/python".split()
            )
        return aws_lambda.LayerVersion(
            self,
            project_name + "-" + function_name + "-dependencies",
            code=aws_lambda.Code.from_asset(output_dir)
        )

It’s actually part of the Stack class as a method (not inside the init). The way we have it set up here is that we have a folder called lambda_dependencies which contains a text file for every lambda function we are deploying which just has a list of dependencies, like a requirements.txt.

And to utilise this code, we include it in the lambda function definition like this:


        get_data_lambda = aws_lambda.Function(
            self,
            .....
            layers=[self.create_dependencies_layer(PROJECT_NAME, GET_DATA_LAMBDA_NAME)]
        )
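
One detail worth spelling out: pip installs into a python/ subdirectory of the output directory because Lambda extracts layers under /opt, and the Python runtimes put /opt/python on the import path. The sketch below isolates the path and environment logic of the method above so it can be checked without running pip. LayerPaths, layer_paths and should_run_pip are names invented for this example.

```python
import os
from typing import Mapping, NamedTuple, Optional


class LayerPaths(NamedTuple):
    requirements_file: str  # per-function dependency list
    output_dir: str         # directory passed to Code.from_asset
    pip_target: str         # where pip installs; must end in /python


def layer_paths(function_name: str) -> LayerPaths:
    """Mirror the path conventions used by create_dependencies_layer."""
    output_dir = ".lambda_dependencies/" + function_name
    return LayerPaths(
        requirements_file="lambda_dependencies/" + function_name + ".txt",
        output_dir=output_dir,
        pip_target=output_dir + "/python",
    )


def should_run_pip(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Pip runs unless SKIP_PIP is set to a non-empty value."""
    env = os.environ if environ is None else environ
    return not env.get("SKIP_PIP")
```

Setting SKIP_PIP=1 is handy when running commands such as cdk diff repeatedly, since the dependencies from the previous run are reused.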

Answered By: Jamie

I ran into this issue as well. I used a solution like the ones @Kane and @Jamie suggest just fine when I was working on my Ubuntu machine. However, I ran into issues when working on macOS. Apparently some (all?) Python packages don’t work on Lambda (a Linux environment) if they are pip installed on a different OS (see stackoverflow post).

My solution was to run the pip install inside a docker container. This allowed me to cdk deploy from my macbook and not run into issues with my python packages in lambda.

Suppose you have a dir lambda_layers/python in your cdk project that will house your python packages for the lambda layer.

current_path = str(pathlib.Path(__file__).parent.absolute())
pip_install_command = ("docker run --rm --entrypoint /bin/bash -v "
            + current_path
            + "/lambda_layers:/lambda_layers python:3.8 -c "
            + "'pip3 install Pillow==8.1.0 -t /lambda_layers/python'")
# check=True raises if the docker/pip step fails instead of silently ignoring it.
subprocess.run(pip_install_command, shell=True, check=True)
lambda_layer = aws_lambda.LayerVersion(
    self,
    "PIL-layer",
    compatible_runtimes=[aws_lambda.Runtime.PYTHON_3_8],
    code=aws_lambda.Code.asset("lambda_layers"))
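
The same docker invocation can also be built as an argument list, which avoids shell=True and the quoting problems that come with paths containing spaces. This is only a sketch: build_docker_pip_command and install_layer_packages are hypothetical helpers, and the image and mount point are the ones used above.

```python
import shlex
import subprocess
from typing import List, Sequence


def build_docker_pip_command(
    host_layer_dir: str,
    packages: Sequence[str],
    image: str = "python:3.8",
) -> List[str]:
    """Return a docker command that pip-installs packages into the layer dir."""
    # shlex.join quotes each token so the string is safe to hand to bash -c.
    pip_cmd = shlex.join(["pip3", "install", *packages, "-t", "/lambda_layers/python"])
    return [
        "docker", "run", "--rm",
        "--entrypoint", "/bin/bash",
        "-v", f"{host_layer_dir}:/lambda_layers",  # mount the host dir into the container
        image,
        "-c", pip_cmd,
    ]


def install_layer_packages(host_layer_dir: str, packages: Sequence[str]) -> None:
    """Run the install; raises CalledProcessError if docker or pip fails."""
    subprocess.run(build_docker_pip_command(host_layer_dir, packages), check=True)
```

Calling install_layer_packages(current_path + "/lambda_layers", ["Pillow==8.1.0"]) before creating the LayerVersion would reproduce the install step above.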

Answered By: alex9311

It is not even necessary to use the experimental PythonFunction construct in CDK. There is support built into CDK to build the dependencies into a simple Lambda package (not a docker image). It uses docker to do the build, but the final result is still a simple zip of files. The documentation shows it here: https://docs.aws.amazon.com/cdk/api/latest/docs/aws-lambda-readme.html#bundling-asset-code ; the gist is:

new Function(this, 'Function', {
  code: Code.fromAsset(path.join(__dirname, 'my-python-handler'), {
    bundling: {
      image: Runtime.PYTHON_3_9.bundlingImage,
      command: [
        'bash', '-c',
        'pip install -r requirements.txt -t /asset-output && cp -au . /asset-output'
      ],
    },
  }),
  runtime: Runtime.PYTHON_3_9,
  handler: 'index.handler',
});

I have used this exact configuration in my CDK deployment and it works well.

And for Python, it is simply

aws_lambda.Function(
    self,
    "Function",
    runtime=aws_lambda.Runtime.PYTHON_3_9,
    handler="index.handler",
    code=aws_lambda.Code.from_asset(
        "function_source_dir",
        bundling=core.BundlingOptions(
            image=aws_lambda.Runtime.PYTHON_3_9.bundling_image,
            command=[
                "bash", "-c",
                "pip install --no-cache -r requirements.txt -t /asset-output && cp -au . /asset-output"
            ],
        ),
    ),
)
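
The bundling command does two things inside the container: it installs the requirements into /asset-output, then copies the function source (the container's working directory) alongside them. If you reuse the pattern for several functions, the command can be produced by a small helper. The sketch below is one way to do that; bundling_command is a made-up name, and its defaults reproduce the command shown above.

```python
from typing import List


def bundling_command(
    requirements_file: str = "requirements.txt",
    output_dir: str = "/asset-output",
) -> List[str]:
    """Return the command list for BundlingOptions(command=...)."""
    # Step 1: install third-party packages into the asset output directory.
    install = f"pip install --no-cache -r {requirements_file} -t {output_dir}"
    # Step 2: copy the handler source next to the installed packages.
    copy_source = f"cp -au . {output_dir}"
    return ["bash", "-c", f"{install} && {copy_source}"]
```

It would then be passed as command=bundling_command() in the BundlingOptions call.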

Answered By: lxop

Wanted to share 2 template repos I made for this (heavily inspired by some of the above):

Hope they are helpful for folks 🙂

Lastly; if you want to see a long thread on this subject, see here: https://github.com/aws/aws-cdk/issues/3660

Answered By: John Peurifoy

As an alternative to my other answer, here’s a slightly different approach that also works with docker-in-docker (the bundling-options approach doesn’t).

Set up the Lambda function like

lambda_fn = aws_lambda.Function(
    self,
    "Function",
    runtime=aws_lambda.Runtime.PYTHON_3_9,
    code=aws_lambda.Code.from_docker_build(
        "function_source_dir",
    ),
    handler="index.lambda_handler",
)

and in function_source_dir/ have these files:

  • index.py (to match the above code – you can name this whatever you like)
  • requirements.txt
  • Dockerfile

Set up your Dockerfile like

# Note that this dockerfile is only used to build the lambda asset - the
# lambda still just runs with a zip source, not a docker image.
# See the docstring for aws_lambda.Code.from_docker_build
FROM public.ecr.aws/lambda/python:3.9.2022.04.27.10-x86_64

COPY index.py /asset/
COPY requirements.txt /tmp/
RUN pip3 install -r /tmp/requirements.txt -t /asset

and the synth step will build your asset in docker (using the above dockerfile) then pull the built Lambda source from the /asset/ directory in the image.

I haven’t looked into too much detail about why the BundlingOptions approach fails to build when running inside a docker container, but this one does work (as long as docker is run with -v /var/run/docker.sock:/var/run/docker.sock to enable docker-in-docker). As always, be sure to consider your security posture when doing this.
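
If your handler file is not called index.py, the COPY line in the Dockerfile has to change with it. As a small sketch of keeping the two in sync, the hypothetical helper below renders the Dockerfile from the handler file name. The base image default is the tag used above and is only an example.

```python
def render_dockerfile(
    handler_file: str = "index.py",
    base_image: str = "public.ecr.aws/lambda/python:3.9.2022.04.27.10-x86_64",
) -> str:
    """Return Dockerfile text that builds the Lambda asset under /asset."""
    lines = [
        f"FROM {base_image}",
        "",
        f"COPY {handler_file} /asset/",   # handler source goes into the asset dir
        "COPY requirements.txt /tmp/",
        "RUN pip3 install -r /tmp/requirements.txt -t /asset",  # dependencies too
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the file into the directory passed to Code.from_docker_build.
    with open("function_source_dir/Dockerfile", "w") as fh:
        fh.write(render_dockerfile())
```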

Answered By: lxop