Python3 – Is there a way to iterate row by row over a very large SQLite table without loading the entire table into local memory?

Question:

I have a very large table with 250,000+ rows, many of which contain a large text block in one of the columns. Right now the database is 2.7 GB and is expected to grow at least tenfold. I need to perform Python-specific operations on every row of the table, but I only need to access one row at a time.

Right now my code looks something like this:

c.execute('SELECT * FROM big_table')
table = c.fetchall()
for row in table:
    do_stuff_with_row(row)

This worked fine when the table was smaller, but the table is now larger than my available RAM and Python hangs when I try to run it. Is there a better (more RAM-efficient) way to iterate row by row over the entire table?

Asked By: Marlon Dyck


Answers:

cursor.fetchall() fetches all results into a list first.

Instead, you can iterate over the cursor itself:

c.execute('SELECT * FROM big_table')
for row in c:
    do_stuff_with_row(row)

This produces rows as they are needed, rather than loading them all into memory first.
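For reference, here is a minimal, self-contained sketch of this approach; the database filename (data.db), the table name, and the do_stuff_with_row function are hypothetical stand-ins for your own setup:

import sqlite3

# Hypothetical database file and table; adjust to your setup.
conn = sqlite3.connect('data.db')
c = conn.cursor()

def do_stuff_with_row(row):
    # Placeholder for your per-row processing.
    print(row[0])

c.execute('SELECT * FROM big_table')
for row in c:    # the cursor yields one row at a time
    do_stuff_with_row(row)

conn.close()

If strict one-row-at-a-time iteration is too slow, cursor.fetchmany(size) offers a middle ground: it returns up to size rows per call and an empty list once the results are exhausted, so memory use stays bounded by the batch size. A batched variant, assuming a batch size of 1000 suits your row sizes:

c.execute('SELECT * FROM big_table')
while True:
    batch = c.fetchmany(1000)    # up to 1000 rows per call
    if not batch:
        break
    for row in batch:
        do_stuff_with_row(row)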

Answered By: Martijn Pieters