How to efficiently use MySQLdb SSCursor?

Question:

I have to deal with a large result set (could be hundreds of thousands of rows, sometimes more).
The rows unfortunately need to be retrieved all at once (on start up).

I’m trying to do that using as little memory as possible.
By looking on SO I’ve found that using SSCursor might be what I’m looking for, but I still don’t really know exactly how to use it.

Is doing a fetchall() from a base cursor the same as from an SSCursor (in terms of memory usage)?

Can I ‘stream’ my rows from the SSCursor one by one (or a few at a time), and if so,
what is the most efficient way to do that?

Asked By: Sylvain


Answers:

Definitely use the SSCursor when fetching big result sets. It made a huge difference for me when I had a similar problem. You can use it like this:

import MySQLdb
import MySQLdb.cursors

connection = MySQLdb.connect(
    host=host, port=port, user=username, passwd=password, db=database,
    cursorclass=MySQLdb.cursors.SSCursor)  # put the cursorclass here
cursor = connection.cursor()

Now you can execute your query with cursor.execute() and use the cursor as an iterator.

Edit: removed unnecessary homegrown iterator, thanks Denis!

Answered By: Otto Allmendinger

I am in agreement with Otto Allmendinger’s answer, but to make explicit Denis Otkidach’s comment, here is how you can iterate over the results without using Otto’s fetch() function:

import MySQLdb
import MySQLdb.cursors

connection = MySQLdb.connect(
    host="thehost", user="theuser",
    passwd="thepassword", db="thedb",
    cursorclass=MySQLdb.cursors.SSCursor)
cursor = connection.cursor()
cursor.execute(query)  # query is your SQL string
# Iterating over the cursor yields one row at a time
for row in cursor:
    print(row)

Answered By: unutbu

Alternatively, you can create an SSCursor directly instead of setting it on the connection object. This matters when the connection is already defined and you don't want every cursor created from it to use SSCursor as its cursorclass.

import MySQLdb
from MySQLdb.cursors import SSCursor # or you can use SSDictCursor

connection = MySQLdb.connect(
        host=host, port=port, user=username, passwd=password, db=database)
cursor = SSCursor(connection)
cursor.execute(query)
for row in cursor:
    print(row)

Answered By: Yuda Prawira
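
To stream rows "a few at a time", as the question asks, the DB-API fetchmany() method can be called in a loop. The sketch below is not taken from any of the answers above: iter_batches and batch_size are hypothetical names, and SQLite is used only as a stand-in driver so the example runs without a MySQL server. With MySQLdb the same helper should work on an SSCursor that has already executed a query, since SSCursor also provides fetchmany().

```python
import sqlite3  # stand-in driver so the sketch runs without a MySQL server


def iter_batches(cursor, batch_size=1000):
    """Yield lists of at most batch_size rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Demo against an in-memory SQLite table. With MySQLdb you would instead
# pass an SSCursor on which cursor.execute(query) has already been called.
demo_connection = sqlite3.connect(":memory:")
demo_connection.execute("CREATE TABLE t (n INTEGER)")
demo_connection.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

demo_cursor = demo_connection.cursor()
demo_cursor.execute("SELECT n FROM t ORDER BY n")

# Only one batch of rows is held in this list at a time.
batch_sizes = [len(batch) for batch in iter_batches(demo_cursor, batch_size=4)]
print(batch_sizes)
```

Keep in mind that an SSCursor leaves the result set on the server until it is read, so all rows should be fetched (or the cursor closed) before running another query on the same connection.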