Simplify database (psycopg2) usage by creating a module

Question:

Let me preface this by saying that I am fairly new to Python and I apologize if this is not the appropriate place for this question.

I am using the psycopg2 module to manipulate a PostgreSQL database. The general usage would look something like this:

# Example 1

import psycopg2

conn = psycopg2.connect(database="mydb", user="postgres")
cur = conn.cursor()

cur.execute ("SELECT * FROM mytable;")
rows = cur.fetchall()
for i, row in enumerate(rows):
    print "Row", i, "value = ", row

cur.close()
conn.close()

This will open a connection to the mydb database, select all fields from the table mytable and print them, and then close the connection.

What I would like to do is factor out some of these functions into a module because I will need to call them over and over throughout many scripts. For this example, ideally I would have a module named core which contains three functions:

  • core.db_init() – which opens the connection and consists of the first two lines of code above.
  • core.db_query(query) – which performs the desired SQL query and consists of the third line in the above code.
  • core.db_close() – which closes the connection and consists of the last two lines.

I have tried creating a module as follows:

# Module core.py

import psycopg2

def db_init():
    conn = psycopg2.connect(database="mydb", user="postgres")
    cur = conn.cursor()

def db_query(query):
    cur.execute(query)

def db_close():
    cur.close()
    conn.close()

But I get namespace errors when I try to recreate Example 1 using this module:

# Example 2

import core

core.db_init()
core.db_query("SELECT * FROM mytable;")

rows = cur.fetchall()
for i, row in enumerate(rows):
    print "Row", i, "value = ", row    

core.db_close()

I’m not even sure a module is actually what I want. Should I be using a class instead? Again, I’m very new to all of this. But if someone could help me figure out a better way to do this, I would be very grateful.

Asked By: eshapiro42

||

Answers:

Your main issue, is that each variable is limited to the function you wrote it in.
Unless otherwise declared like such:

def db_init():
    global conn
    conn = psycopg2....

A better approach would be to convert this into a class, a basic example would be:

import psycopg2

class MyDatabase():
    def __init__(self, db="mydb", user="postgres"):
        self.conn = psycopg2.connect(dbname=db, user=user)
        self.cur = self.conn.cursor()

    def query(self, query):
        self.cur.execute(query)

    def close(self):
        self.cur.close()
        self.conn.close()

db = MyDatabase()
db.query("SELECT * FROM table;")
db.close()

Now, the SELECT query won’t do much since you’re using cur.execute().
But i kept this on purpose to keep the code similar to what you wrote, you’ll want to swap that out to return the values however if calling a query that is expected to return a value and so on.

Your approach that is focused on functions will have "namespace" issues where variables live in a local scope of that function and there for other functions can’t normally access them.

Instead, class scoped variables can access its own variables and is there for not as limited out of the box.

You could make global variables and declare them as global in the functions, but I think as I mentioned in a comment:

You’d want to make this into a class. A database is a session based entity just as classes are session entities. Handle each connection as a living entity by class-abstracting it, otherwise cur and conn will become scoped variables and you need to work them into the global scope.

Answered By: Torxed

You could do it the way you want to – although I would consider using sqlalchemy or other module to handle those parts for you.

The code you pasted doesn’t work because cursor is not defined in your other methods.

Consider doing it in one call – for example:

# Module core.py

import psycopg2

def execute_query(query):
    conn = psycopg2.connect(database="mydb", user="postgres")
    cur = conn.cursor()
    results = cur.execute(query)
    cur.close()
    conn.close()
    return results

Note that this is not optimal and if you are doing a lot of small queries killing connection for each one is not the best idea.

Answered By: Daniel