Problems using MySQL with AWS Lambda in Python
Question:
I am trying to get up and running with AWS Lambda in Python (I'm a beginner in Python, btw) but am having problems including the MySQL dependency. I am trying to follow the instructions here on my Mac.
For step number 3, I run into problems when running this command at the root of my project:
sudo pip install MySQL-python -t /
Error:
Exception:
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/pip-1.5.6-py2.7.egg/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/Library/Python/2.7/site-packages/pip-1.5.6-py2.7.egg/pip/commands/install.py", line 311, in run
    os.path.join(options.target_dir, item)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 292, in move
    raise Error, "Destination path '%s' already exists" % real_dst
Error: Destination path '/MySQL_python-1.2.5-py2.7.egg-info/MySQL_python-1.2.5-py2.7.egg-info' already exists
I ended up writing the following Lambda function (it works fine on my Mac):
import MySQLdb

def lambda_handler(event, context):
    # Open database connection
    db = MySQLdb.connect(...)
    # Prepare a cursor object using the cursor() method
    cursor = db.cursor()
    sql = "SELECT * FROM Users"
    try:
        # Execute the SQL command
        cursor.execute(sql)
        # Fetch all the rows as a sequence of row tuples
        results = cursor.fetchall()
        for row in results:
            fname = row[0]
            lname = row[1]
            age = row[2]
            sex = row[3]
            income = row[4]
            # Now print the fetched result
            print("lname=%s" % (lname))
    except MySQLdb.Error:
        print("Error: unable to fetch data")
    # Disconnect from server
    db.close()
What I went on to do was go to /Library/Python/2.7/site-packages and copy the MySQLdb folders/files that were downloaded when I ran sudo pip install MySQL-python (without -t /) into my Lambda project (I'm sure I'm doing something wrong here). I then zipped the content along with lambda_function.py and uploaded it to AWS Lambda.
Then I get:
Unable to import module 'lambda_function': No module named MySQLdb
Grateful for any help and suggestions!
EDIT
I was able to make sudo pip install MySQL-python -t /pathToProject work (thanks for the help in the comments), but now I get this when running the Lambda function:
Unable to import module 'lambda_function': /var/task/_mysql.so: invalid ELF header
I know that if I work on a Linux box, then it should work fine (as suggested by some people), but I am wondering if I can make it work from an OS X box.
Answers:
I believe your issue is mostly down to missing development packages. I think you will need the following:
sudo yum -y install mysql-devel
I hit a similar problem on my Ubuntu install. The real cause is that MySQL-python depends on the MySQL client connector library. So the solution is to install the MySQL client development package so that MySQL-python can build against the client library.
# Ubuntu only (or set up an Ubuntu VM on your Mac)
# Build dependencies for compiling MySQL-python
sudo apt-get install python-dev libssl-dev
# Now the MySQL client development package
sudo apt-get install libmysqlclient-dev
# Or, if you prefer the MariaDB client
sudo apt-get install libmariadbclient-dev
For Mac
# try this first
fink install mysql-unified-dev
# or this if the above fails
brew install mysql
# you must add this to your user profile startup if you use brew
export PATH=$PATH:/usr/local/mysql/bin
You can find a similar answer here: Mac OS X – EnvironmentError: mysql_config not found
Then try the pip install.
I don't recommend that anyone use "sudo pip". You should set up virtualenv and virtualenvwrapper for your Python development, which lets you run pip without sudo. It also makes it easier to isolate and test a new deployment (although it doesn't fix the mysqlclient-dev library issue).
You'll have to use an Amazon Linux instance to build your Python packages and then include them in your Lambda deployment package. Check out this excellent article about how to do it. The packages mentioned in the article are different from the one you need, but the same approach helped me build psycopg2 and pymssql for my Lambdas.
For a use case like Lambda you'll be a lot happier using a pure Python implementation like PyMySQL.
It's a drop-in replacement for MySQLdb that follows the Python Database API specification. For most things, like triggered Lambda events, it will be just as fast.
I’ve used it in production a lot and it works great.
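As a rough sketch of what that swap could look like for the handler in the question: the DB_* environment variable names and the helper functions below are my own placeholders, not anything from the question, and the column order is taken from the question's code.

```python
import os


def get_connection():
    """Open a PyMySQL connection from (hypothetical) DB_* environment variables."""
    import pymysql  # pure Python, so there is no compiled _mysql.so to ship

    return pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
        connect_timeout=5,
    )


def fetch_last_names(conn):
    """Return the second column (lname in the question) of every Users row."""
    with conn.cursor() as cursor:
        cursor.execute("SELECT * FROM Users")
        return [row[1] for row in cursor.fetchall()]


def lambda_handler(event, context):
    conn = get_connection()
    try:
        return {"last_names": fetch_last_names(conn)}
    finally:
        # Always release the connection, even if the query fails
        conn.close()
```

Keeping the query in its own function means it can be exercised with any DB-API style connection, without touching a real database.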
Using lambda-docker you can set up and test your Lambda functions without access to a comparable Linux environment.
To set up your Lambda, use a lambda-docker build image to run a detached Docker container and run pip install <package> commands on the container. Then export the container, grab the installed packages under usr/lib, and place them in your AWS Lambda package.
Then you can test for compatibility by running your lambda on a lambda-docker image. If it works, go forth and upload to AWS Lambda with confidence.
docker run -d -v "$PWD":/var/task lambci/lambda:build-python2.7 tail -f /dev/null
docker ps
docker exec 0c55aae443e6 pip install pandas
docker exec 0c55aae443e6 pip install sqlalchemy
docker exec 0c55aae443e6 pip freeze
docker exec 0c55aae443e6 python -c "import site; print(site.getsitepackages())"
docker container export -o lambda_ready_container 0c55aae443e6
Just update your lambda layer by uploading two packages:
– sqlalchemy
– PyMySQL (driver to use instead of mysqlclient)
Now update your driver URL to "mysql+pymysql://…".
This makes your existing setup use the pymysql driver, which is compatible with the Lambda environment.
Don't forget to set up a VPC endpoint for RDS. This keeps performance and security in check.
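For illustration, here is one way the URL could be assembled. The helper name and the sample values are made up; the only real point is that a password containing characters such as @ or / should be URL-encoded before it goes into the connection string.

```python
from urllib.parse import quote_plus


def mysql_pymysql_url(user, password, host, database, port=3306):
    """Build a SQLAlchemy URL that selects the PyMySQL driver."""
    return "mysql+pymysql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


url = mysql_pymysql_url("app", "p@ss/word", "db.example.com", "my_db")
print(url)
# The result would then be passed to sqlalchemy.create_engine(url)
```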
AWS recently came out with a great solution for the issue of database drivers and database access in Lambda: the Aurora Data API. The Data API tunnels SQL over HTTP using AWS standard auth. This bypasses the problems with compiling native code and using traditional database connection models in Lambda.
I ended up writing a DB-API compatible driver for it: aurora-data-api (and a SQLAlchemy dialect using it):
import aurora_data_api

cluster_arn = "arn:aws:rds:us-east-1:123456789012:cluster:my-aurora-serverless-cluster"
secret_arn = "arn:aws:secretsmanager:us-east-1:123456789012:secret:MY_DB_CREDENTIALS"
with aurora_data_api.connect(aurora_cluster_arn=cluster_arn, secret_arn=secret_arn, database="my_db") as conn:
    with conn.cursor() as cursor:
        cursor.execute("select * from pg_catalog.pg_tables")
        print(cursor.fetchall())
Lambda -> Layers (add new layer)
Download the PyMySQL zip from https://pypi.org/project/PyMySQL/#files
After downloading, unzip it, rename the parent folder to "python", then rezip (the layout should be python/{where the PyMySQL files are})
Add a layer to Lambda called 'pymysql' and upload that zip
Then, in your Lambda function, import pymysql
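If you'd rather script the rename-and-rezip step, something along these lines should do it. This is only a sketch: build_layer_zip and the directory names are hypothetical, and it assumes package_dir is a folder that contains the unpacked pymysql package directory.

```python
import os
import zipfile


def build_layer_zip(package_dir, zip_path, prefix="python"):
    """Zip package_dir so that every file sits under prefix/ in the archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full_path = os.path.join(root, name)
                relative = os.path.relpath(full_path, package_dir)
                # Zip entries always use forward slashes, whatever the host OS
                arcname = os.path.join(prefix, relative).replace(os.sep, "/")
                zf.write(full_path, arcname)
    return zip_path
```

For example, build_layer_zip("build", "pymysql-layer.zip") with build/pymysql/ on disk would produce entries such as python/pymysql/__init__.py. Write the zip somewhere outside package_dir so the archive doesn't try to include itself.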
TLDR: Yes, you CAN use mysqlclient in AWS Lambda Python functions.
Here's one way: create your own AWS Lambda Layer for mysqlclient (i.e. MySQLdb).
Then I get Unable to import module 'lambda_function': No module named MySQLdb
I know that if I work on a Linux box, then it should work fine (as suggested by some people), but I am wondering if I can make it work from an OS X box.
I too was facing the exact same error while trying to import MySQLdb in my AWS Lambda Python function.
After a lot of searching for a solution, and not being happy with pymysql as a substitute (for performance and compatibility reasons), I ended up building my own AWS Lambda Layer for mysqlclient. I could not find a "ready-made" layer for mysqlclient, not even at the awesome KLayers project. I am glad to share a GitHub repo with an example "ready-made" layer and an easy way to build your own custom layer for your requirements, using the procedure recommended by AWS.
mysqlclient (MySQLdb) is a Python wrapper around a high-performance C implementation of the MySQL API. This typically makes it much faster than pure Python implementations such as pymysql (see this list for some examples), but it also brings some problems, such as the one you are facing.
Since it is compiled against the mysql-devel package (e.g. a .rpm or .deb file provided by MySQL), mysqlclient is linked to a platform-specific binary such as libmysqlclient.so in order to work. In other words, the libmysqlclient.so from a Mac OS laptop (as an example) won't work in the AWS Lambda environment, which uses some form of Amazon Linux 2 as of this writing. You need a libmysqlclient.so compiled in and for the AWS Lambda environment (or as close to it as possible) for it to work in your AWS Lambda function.
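To see which platform a compiled file such as _mysql.so or libmysqlclient.so was built for before uploading it, you can look at its first four bytes. The helper below is just an illustrative sketch with names I made up. A Linux shared object starts with the ELF magic, while a library built on a Mac starts with one of the Mach-O magics, which is why Lambda reports "invalid ELF header" for it.

```python
ELF_MAGIC = b"\x7fELF"
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit Mach-O, either byte order
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit Mach-O, either byte order
    b"\xca\xfe\xba\xbe",  # universal (fat) binary
}


def binary_format(header):
    """Classify a binary by the first four bytes of its header."""
    magic = bytes(header[:4])
    if magic == ELF_MAGIC:
        return "ELF"
    if magic in MACHO_MAGICS:
        return "Mach-O"
    return "unknown"


def file_format(path):
    """Read the magic bytes of the file at path and classify them."""
    with open(path, "rb") as fh:
        return binary_format(fh.read(4))
```

Calling file_format on the .so inside your deployment package should return "ELF" if it was built in a Linux environment.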
A closely-simulated AWS-Lambda environment is available in the form of Docker images from lambci.
So to package an AWS Lambda compatible mysqlclient you could:
- pull a suitable Docker image such as lambci/lambda:build-python3.8
- import the MySQL repo GPG key
- install the MySQL repo setup RPM so that yum can find and download other MySQL repo packages
- yum install the necessary dependencies, such as the appropriate mysql-devel rpm for your use case
- run pip install mysqlclient in the container
- zip the necessary libmysqlclient.so file and mysqlclient's Python lib directories
This is more or less the procedure officially recommended by AWS: see How do I create a Lambda layer using a simulated Lambda environment with Docker?
The zip thus created can be used to create a new AWS Lambda layer for mysqlclient. You can use this layer to readily use mysqlclient without any errors in your Lambda function.
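Before uploading, a quick check of the zip contents can save a round trip. The sketch below assumes a particular layout, which is my assumption rather than something this answer specifies: the MySQLdb package somewhere under python/ and libmysqlclient.so under lib/, since Lambda unpacks layers into /opt and looks for Python packages in /opt/python and shared libraries in /opt/lib. Adjust the paths if your layer is arranged differently.

```python
import zipfile


def check_layer_layout(zip_path):
    """Report whether the zip holds MySQLdb under python/ and the client library under lib/."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    return {
        "python_package": any(
            n.startswith("python/") and "/MySQLdb/" in n for n in names
        ),
        "shared_library": any(
            n.startswith("lib/") and "libmysqlclient.so" in n for n in names
        ),
    }
```

Both values should come back True for a layer laid out this way. A False for shared_library would suggest the import will fail at runtime because the client library cannot be found.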
After a lot of hair-pulling, I finally got the full procedure to work and automated it into a single script (build.sh) in this GitHub project. The code builds a layer.zip file that you can directly upload as a new AWS Lambda layer. The project currently builds for Python 3.8 and MySQL server 8.0.x, but can easily be adapted to a different Python version and target MySQL version using the instructions and tools provided. There is also a ready-to-use layer.zip in the repo, in case you want to use mysqlclient against MySQL v8.0.x and in Python 3.8 (both tested) in your AWS Lambda function. Our production environment uses SQLAlchemy on top of this mysqlclient Lambda layer and it's been working great for us.
After you configure your Lambda function to use a layer built as described (e.g. using the tools in the aforementioned repo), you can just import MySQLdb as usual in your Lambda function and get on with writing your real code:
import MySQLdb

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': 'MySQLdb was successfully imported'
    }
Hope this helps.