How to connect to a remote PostgreSQL database through SSL with Python
Question:
I want to connect to a remote PostgreSQL database through Python to do some basic data analysis. This database requires SSL (verify-ca), along with three files (which I have):
- Server root certificate file
- Client certificate file
- Client key file
I have not been able to find a tutorial which describes how to make this connection with Python.
Any help is appreciated.
Answers:
Use the psycopg2 module.
You will need to use the SSL options in your connection string, or pass them as keyword arguments:
import psycopg2
conn = psycopg2.connect(dbname='yourdb', user='dbuser', password='abcd1234', host='server', port='5432', sslmode='require')
In this case sslmode specifies that SSL is required.
To perform server certificate verification, set sslmode to verify-full or verify-ca and supply the path to the server root certificate in sslrootcert. Also set sslcert and sslkey to the paths of your client certificate and client key, respectively.
It is explained in detail in the PostgreSQL Connection Strings documentation (see also Parameter Key Words) and in SSL Support.
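For the setup described in the question (verify-ca plus a root certificate, a client certificate, and a client key), the keyword arguments could be assembled roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names build_ssl_params and connect_with_ssl, the host, the credentials, and all file paths are placeholders made up for illustration. Only the parameter names sslmode, sslrootcert, sslcert, and sslkey come from the connection options discussed above.

```python
def build_ssl_params(host, dbname, user, password,
                     root_cert, client_cert, client_key,
                     port=5432, sslmode="verify-ca"):
    """Collect the keyword arguments for a certificate-verified connection."""
    return {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "password": password,
        "sslmode": sslmode,        # "verify-ca" or "verify-full"
        "sslrootcert": root_cert,  # server root (CA) certificate file
        "sslcert": client_cert,    # client certificate file
        "sslkey": client_key,      # client key file
    }


def connect_with_ssl(params):
    """Open the connection (hypothetical helper; needs psycopg2 installed)."""
    import psycopg2  # imported here so building the parameters needs no driver
    return psycopg2.connect(**params)


# Placeholder values: replace with your own host, credentials, and paths.
params = build_ssl_params(
    host="server", dbname="yourdb", user="dbuser", password="abcd1234",
    root_cert="/path/to/root.crt",
    client_cert="/path/to/client.crt",
    client_key="/path/to/client.key",
)
# conn = connect_with_ssl(params)  # uncomment once the files and server exist
```

If the connection is rejected because of the key file, check its permissions first: on Unix-like systems libpq expects the client key to be readable only by its owner (for example chmod 600).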
You may also use an SSH tunnel with paramiko and sshtunnel:
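import psycopg2
import paramiko
from sshtunnel import SSHTunnelForwarder

# host_ip, username, psql_port, psql_username and psql_password are
# placeholders for your own values.
mypkey = paramiko.RSAKey.from_private_key_file('/path/to/private/key')

# Forward a local port to the PostgreSQL port on the remote host.
tunnel = SSHTunnelForwarder(
    (host_ip, 22),
    ssh_username=username,
    ssh_pkey=mypkey,
    remote_bind_address=('localhost', psql_port))
tunnel.start()

# Connect through the local end of the tunnel.
conn = psycopg2.connect(dbname='gisdata', user=psql_username, password=psql_password, host='127.0.0.1', port=tunnel.local_bind_port)
# When finished: conn.close(), then tunnel.stop()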
Adding this for completeness and because I couldn’t find it anywhere else on SO. Like @mhawke says, you can use psycopg2, but you can also use any other Python database module (ORMs, etc.) that lets you specify a PostgreSQL connection URI (postgresql://[user[:password]@][netloc][:port][/dbname][?param1=value1&...]). The sslmode="require" parameter that psycopg2.connect uses to enforce SSL connections is just a parameter of the postgresql:// URI that you use to connect to your database (see 33.1.2. Parameter Key Words). So, if you want to use sqlalchemy or another ORM instead of vanilla psycopg2, you can tack your desired sslmode onto the end of your database URI and connect that way.
import sqlalchemy
from sqlalchemy.orm import sessionmaker

# sqlalchemy 1.4+ uses postgresql:// instead of postgres://
DATABASE_URI = "postgresql://postgres:postgres@localhost:5432/dbname"
ssl_mode = "?sslmode=require"
DATABASE_URI += ssl_mode

engine = sqlalchemy.create_engine(DATABASE_URI)
Session = sessionmaker(bind=engine)
There’s a nifty table (Table 33.1) in the PostgreSQL documentation on SSL Support that breaks down the different options you can supply. If you want to use any of the fancier options that require a path to a specific certificate, you can add it to the URI with a format string.
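As a rough illustration of adding certificate paths to the URI, the sketch below builds the query string with the standard library instead of concatenating strings by hand, so that special characters in the password or in file paths are percent-encoded. The function name build_postgres_uri and every value passed to it are hypothetical. The option names sslmode, sslrootcert, sslcert, and sslkey are the connection parameters discussed above.

```python
from urllib.parse import quote, urlencode


def build_postgres_uri(user, password, host, port, dbname, **ssl_options):
    """Return a postgresql:// URI with the SSL options appended as query parameters."""
    # Percent-encode the credentials so characters such as '@' or ':' are safe.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    uri = f"postgresql://{credentials}@{host}:{port}/{dbname}"
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode(ssl_options, quote_via=quote)
    return f"{uri}?{query}" if query else uri


# Placeholder values: replace with your own credentials and paths.
DATABASE_URI = build_postgres_uri(
    "postgres", "postgres", "localhost", 5432, "dbname",
    sslmode="verify-ca",
    sslrootcert="/path/to/root.crt",
    sslcert="/path/to/client.crt",
    sslkey="/path/to/client.key",
)
# engine = sqlalchemy.create_engine(DATABASE_URI)  # or psycopg2.connect(DATABASE_URI)
```

Whether every query parameter is passed through unchanged depends on the library and driver in use, so it is worth checking the resulting connection (for example by querying pg_stat_ssl) rather than assuming the options took effect.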
If you need to connect to your PostgreSQL database with an SSL certificate using psycopg2, you can put your SSL certificate in a subdirectory of your Python program and then reference the certificate in your connection string. I believe you could also set an environment variable instead, but in my example the SSL certificate is in a subdirectory.
My Python script is located at:
/Users/myusername/Desktop/MyCoolPythonProgram/test_database_connection.py
And my SSL certificate is located at:
/Users/myusername/Desktop/MyCoolPythonProgram/database/ssl_certificate/ca-certificate.crt
My HOSTNAME is a hostname provided by DigitalOcean, but yours might be an IP address instead.
This is what my test_database_connection.py script looks like:
import psycopg2
import os
POSTGRES_DATABASE_HOST_ADDRESS = "your-database-name-do-user-12345678-0.b.db.ondigitalocean.com"
POSTGRES_DATABASE_NAME = "defaultdb"
POSTGRES_USERNAME = "doadmin"
POSTGRES_PASSWORD = "$uperD00P3Rp@$$W0RDg0E$here"
# HOW TO (Relative Path Python): https://stackoverflow.com/questions/918154/relative-paths-in-python
path_to_current_directory = os.path.dirname(__file__)
relative_path_to_ssl_cert = 'database/ssl_certificate/ca-certificate.crt'
SSL_ROOT_CERT = os.path.join(path_to_current_directory, relative_path_to_ssl_cert)
POSTGRES_CONNECTION_PORT = "1234" # Set this to the correct port! Mine is provided by DigitalOcean and it's NOT 1234
db_info = "host='%s' dbname='%s' user='%s' password='%s' sslmode='require' sslrootcert='%s' port='%s'" % (POSTGRES_DATABASE_HOST_ADDRESS, POSTGRES_DATABASE_NAME, POSTGRES_USERNAME, POSTGRES_PASSWORD, SSL_ROOT_CERT, POSTGRES_CONNECTION_PORT)
postgres_connection = psycopg2.connect(db_info)
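# In psycopg2 the connection context manager commits or rolls back the
# transaction on exit; it does not close the connection.
with postgres_connection:
    # The cursor context manager closes the cursor on exit.
    with postgres_connection.cursor() as postgres_cursor:
        sql = "SELECT * FROM your_table;"
        postgres_cursor.execute(sql)
        results = postgres_cursor.fetchall()
        for row in results:
            print(row)
        print("Connection Success!")

# Close Database Connection
postgres_connection.close()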
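The environment-variable route mentioned at the start of this answer might look something like the sketch below. libpq, which psycopg2 is built on, reads PGSSLMODE, PGSSLROOTCERT, PGSSLCERT, and PGSSLKEY as defaults when the matching connection parameter is not given. The helper name ssl_env and the certificate path are placeholders invented for this example.

```python
import os


def ssl_env(root_cert, client_cert=None, client_key=None, sslmode="verify-ca"):
    """Map SSL settings onto the environment variables that libpq reads."""
    env = {"PGSSLMODE": sslmode, "PGSSLROOTCERT": root_cert}
    # The client certificate and key are optional; only set them when provided.
    if client_cert:
        env["PGSSLCERT"] = client_cert
    if client_key:
        env["PGSSLKEY"] = client_key
    return env


# Placeholder path: replace with the location of your own certificate.
os.environ.update(ssl_env("/path/to/ca-certificate.crt"))

# A later psycopg2.connect(...) call that omits sslmode and sslrootcert
# would then fall back to these values.
```

Setting the variables inside the script only affects that process. Exporting them in the shell before starting Python has the same effect and keeps the paths out of the code.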