psycopg2 fetchmany vs named cursor

Question:

As per Psycopg2’s server-side-cursor documentation,

If the dataset is too large to be practically handled on the client side, it is possible to create a server side cursor. Using this kind of cursor it is possible to transfer to the client only a controlled amount of data, so that a large dataset can be examined without keeping it entirely in memory.

I can achieve similar behaviour using fetchmany:

cursor.fetchmany(100)  # fetch the first 100 rows
cursor.fetchmany(50)   # fetch the next 50 rows
cursor.fetchall()      # fetch the remaining rows

So, in the context of fetching a controlled amount of data, how is fetchmany different from a server-side cursor? In what scenarios should I prefer one over the other?

Asked By: Shiva

Answers:

I think the key is in the first paragraph of the server-side cursor docs:

When a database query is executed, the Psycopg cursor usually fetches all the records returned by the backend, transferring them to the client process. If the query returned an huge amount of data, a proportionally large amount of memory will be allocated by the client.

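With a "normal" cursor, the entire result set gets transferred to the client immediately, as part of execute(); fetchmany() then only slices rows that are already sitting in client memory. With a named cursor, the rows stay on the server and you can "pull" them across the client/server interface as you need them.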

For example, notice the difference in execution time of this program, depending on what type of cursor it uses (you may have to tweak the connection parameters):

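import time

import psycopg2

USE_NAMED_CURSOR = False  # flip to True to use a server-side (named) cursor

conn = psycopg2.connect(dbname='template1')
if USE_NAMED_CURSOR:
    c = conn.cursor('named')  # server-side: rows stay on the server until fetched
else:
    c = conn.cursor()  # client-side: execute() transfers the whole result set

start = time.perf_counter()
c.execute('SELECT GENERATE_SERIES(1,100000000)')
print(f'Execute ok after {time.perf_counter() - start:.1f} s.')
rows = c.fetchmany(1)
print(f'Fetch ok after {time.perf_counter() - start:.1f} s.')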
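
As a rough sketch of the "pull as you need it" pattern, the loop below drains an already-executed cursor in fixed-size batches. The names iter_batches and demo are hypothetical, and an in-memory SQLite table is used only as a stand-in so the snippet runs without a PostgreSQL server. With psycopg2 you would pass a named cursor such as conn.cursor('named'), so that each fetchmany() call pulls only that batch from the server. The same loop also works with a regular psycopg2 cursor, but then the whole result set is already in client memory after execute().

```python
import sqlite3  # stand-in driver so the sketch runs without a PostgreSQL server


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from an already-executed DB-API cursor.

    Hypothetical helper: it relies only on the standard fetchmany() method.
    """
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:  # an empty list means the result set is exhausted
            break
        yield rows


def demo():
    # Assumption: a small SQLite table stands in for a large PostgreSQL result.
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE numbers (n INTEGER)')
    conn.executemany('INSERT INTO numbers VALUES (?)',
                     [(i,) for i in range(2500)])
    cur = conn.cursor()
    cur.execute('SELECT n FROM numbers ORDER BY n')
    # Record how many rows each batch contained.
    sizes = [len(batch) for batch in iter_batches(cur, batch_size=1000)]
    conn.close()
    return sizes


if __name__ == '__main__':
    print(demo())
```

Only one batch is held by the calling code at a time, which is the point of combining a named cursor with fetchmany() when the full result would not fit comfortably in memory.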
Answered By: Ture Pålsson