How to install Numpy and Pandas for AWS Lambdas?

Question:

Problem:
I want to use Numpy and Pandas in my AWS Lambda function. I am working on Windows 10 with PyCharm. My function compiles and works fine on my local machine; however, as soon as I package it up and deploy it on AWS, it breaks with errors importing the numpy and pandas packages. I tried reinstalling both packages and redeploying, but the error remained the same.

Stack Overflow Solutions:
Other people are having similar issues, and fellow users have suggested that this is mainly a compatibility issue: the Python libraries are compiled on Windows, whereas AWS Lambda runs on Linux machines.

Question:
What’s the best way to create a deployment package for AWS on Windows 10? Is there a way I can specify the target platform while installing packages through pip? Apparently pip has a --platform option, but I cannot figure out how to use it. Any help?

Asked By: exan


Answers:

What you need to do is compress the code and then upload it.

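Pack all your dependencies (run this from inside the folder that holds the installed packages, so the archive lands one level up)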

zip -r9 ../function.zip .

Pack your function

zip -g function.zip function.py

Upload to Lambda

aws lambda update-function-code --function-name python37 --zip-file fileb://function.zip

(python37 is the function name here.)

For Windows users

The easiest way to use the zip command is through Cygwin or the Windows Subsystem for Linux. However, since zip is just a command-line tool for compressing files, any GUI compression tool should work too.
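
For Windows users without a zip command, Python's standard zipfile module can stand in for the two zip steps above. This is a minimal sketch and not part of the original answer: build_deployment_zip is a hypothetical helper name, and the folder name "package" in the usage note is an assumption. It only handles the packaging; the libraries inside the archive still have to be Linux builds for the imports to work on Lambda.

```python
import zipfile
from pathlib import Path


def build_deployment_zip(package_dir, handler_file, out_zip):
    """Zip the contents of package_dir at the archive root, then add the handler.

    Hypothetical helper: roughly mirrors `zip -r9 ../function.zip .` followed by
    `zip -g function.zip function.py`.
    """
    package_dir = Path(package_dir)
    handler_file = Path(handler_file)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # Store names relative to package_dir, with forward slashes,
                # so the packages sit at the root of the archive.
                zf.write(path, path.relative_to(package_dir).as_posix())
        # Add the function file itself at the root of the archive.
        zf.write(handler_file, handler_file.name)
    return out_zip
```

Calling build_deployment_zip("package", "function.py", "function.zip") would produce an archive that can be uploaded with the aws lambda update-function-code command.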

Answered By: tim

As is often the case, there is more than one way to reach a solution.

The preferred way, imho, is to use AWS Lambda layers, because they separate the function code from the dependencies. The basics are explained here.

  1. Get all your dependencies. As you correctly mentioned, pandas and numpy have to be compiled for Amazon Linux, the OS that Lambda runs on. This can be done with the “serverless-python-requirements” tool or with a Docker container based on this image. More detailed instructions can be found here.
  2. Put the dependencies in a folder called python.
  3. Zip the whole folder, e.g. with the built-in Windows zipping tool.
  4. Upload the zip file to AWS as a layer: go to AWS Lambda, choose Layers from the left, and click “Create a new layer”.
  5. After you have saved the layer, go to your Lambda function and choose “Layers”. Click “Add a layer”, choose your newly created layer, and click Save. Now your function should not get import errors anymore.
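
Steps 2 and 3 above can be sketched in Python as follows. This is only an illustration under stated assumptions: build_layer_zip is a hypothetical helper name, and it assumes the Linux-compatible dependencies from step 1 already sit in a local folder. The point it shows is that every file ends up under a top-level python/ folder inside the archive, which is where Lambda looks for Python layer content.

```python
import zipfile
from pathlib import Path


def build_layer_zip(deps_dir, out_zip, prefix="python"):
    """Write every file in deps_dir into out_zip under a top-level `python/` folder.

    Hypothetical helper; deps_dir is assumed to already contain
    Linux-compatible builds of the dependencies.
    """
    deps_dir = Path(deps_dir)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(deps_dir.rglob("*")):
            if path.is_file():
                # e.g. deps/pandas/__init__.py -> python/pandas/__init__.py
                arcname = f"{prefix}/{path.relative_to(deps_dir).as_posix()}"
                zf.write(path, arcname)
    return out_zip
```

The resulting zip file is what gets uploaded in step 4.
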
Answered By: ediordna

I also had a similar question (how to use numpy within a lambda function).

James provided a recent answer that makes this much easier than before: AWS now provides a “native” (i.e. AWS-provided) layer for SciPy that you can simply add as a layer when you define your function.

Putting a link to his answer here, for people that come across this thread first (like me)

…So while you still use layers, you no longer have to build/maintain/install your own SciPy layer yourself, but simply use the one provided by AWS.

So a much better solution now.

Answered By: Richard

Both Numpy and Pandas are available as public layers here: https://github.com/keithrozario/Klayers

As you mention, it’s difficult to build this on Windows. Typically you’d need a Linux system to build the Python requirements, or you’d build them in a Docker container.

I’m not a Windows 10 user, but I’m guessing you can use WSL (Windows Subsystem for Linux) to build the requirements locally and zip them up before using them as layers. To save trouble though, I’d rather just use the public layers I mentioned earlier.

Full disclosure: I own the repo that publishes the layers. It’s a free project, but with one downside: I delete older layers when a new version of the package (or one of its dependencies) is released. Generally speaking, if you use the latest layer version, you’re guaranteed at least 30 days before the layer is deleted. Functions that use deleted layers will still work, but you’d be unable to deploy new versions of functions on old layers.

Hope that clears things up.

Answered By: keithRozario

Amazon created a repository that deals with your situation:
https://github.com/awsdocs/aws-lambda-developer-guide/tree/master/sample-apps/blank-python

The blank app is an example of how to push a Lambda function that depends on requirements, with the bonus of being made by Amazon.

One concern: it uses bash scripts, so you’ll need to adapt those or use WSL to get it to work. (I’m pretty confident that it’s doable on Windows; most of the work is done by the AWS CLI.)
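
If adapting the bash scripts is the obstacle, the dependency-install step can be driven from Python instead. The sketch below builds a pip command that uses the --platform option mentioned in the question, so that pip fetches prebuilt Linux wheels even when run on Windows. Treat it as an assumption-laden illustration, not Amazon's method: the helper names are hypothetical, and the platform tag manylinux2014_x86_64 and Python version 3.9 are placeholders that need to match your Lambda runtime and architecture. It only works when the packages publish binary wheels for that combination.

```python
import subprocess
import sys


def pip_linux_wheel_command(packages, target_dir="python",
                            python_version="3.9",
                            platform="manylinux2014_x86_64"):
    """Build a pip command that installs Linux wheels into target_dir.

    Hypothetical helper. The default platform tag and Python version are
    placeholders, not values taken from the original thread.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--platform", platform,            # target platform instead of the local one
        "--implementation", "cp",          # CPython wheels
        "--python-version", python_version,
        "--only-binary=:all:",             # pip requires this (or --no-deps) with --platform
        "--target", target_dir,            # install into a folder, not the local environment
        *packages,
    ]


def install_linux_wheels(packages, **kwargs):
    """Run the command; raises CalledProcessError if pip fails."""
    subprocess.run(pip_linux_wheel_command(packages, **kwargs), check=True)
```

For example, install_linux_wheels(["numpy", "pandas"]) would fill a local python folder that can then be zipped and deployed.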

Answered By: Morti

I was also able to use Layers on my Lambda function, where

  1. I could upload the latest version of Numpy on my Windows environment
  2. Upload the library package to AWS using Create a Layer.
  3. Go to my function’s info link, click Add A Layer, and select the Numpy library that I just uploaded

No need to carry extra libraries in a zip with my integration code.

Answered By: Anthony Williams