Get a list of field values from Python's sqlite3, not tuples representing rows
Question:
It’s annoying how Python’s sqlite3 module always returns a list of tuples! When I am querying a single column, I would prefer to get a plain list.
e.g. when I execute
SELECT somecol FROM sometable
and call
cursor.fetchall()
it returns
[(u'one',), (u'two',), (u'three',)]
but I’d rather just get
[u'one', u'two', u'three']
Is there a way to do this?
Answers:
data=cursor.fetchall()
COLUMN = 0
column=[elt[COLUMN] for elt in data]
(My previous suggestion, column=zip(*data)[COLUMN], raises an IndexError if data is empty. In contrast, the list comprehension above just creates an empty list. Depending on your situation, raising an IndexError may be preferable, but I’ll leave that to you to decide. Note that on Python 3, zip returns an iterator, so indexing it directly raises a TypeError instead.)
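As a rough, self-contained sketch of the list comprehension above: the snippet below builds an in-memory database using the table and column names from the question (sometable, somecol). The helper name fetch_column is hypothetical and only there for illustration.

```python
import sqlite3


def fetch_column(cursor, column=0):
    """Return one column of the remaining result rows as a plain list."""
    return [row[column] for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (somecol TEXT)")
conn.executemany(
    "INSERT INTO sometable VALUES (?)",
    [("one",), ("two",), ("three",)],
)

# Connection.execute() returns a cursor, so it can be passed straight in.
values = fetch_column(conn.execute("SELECT somecol FROM sometable ORDER BY rowid"))

# A query that matches nothing yields an empty list rather than an exception.
empty = fetch_column(
    conn.execute("SELECT somecol FROM sometable WHERE somecol = 'missing'")
)

print(values)
print(empty)
conn.close()
```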
To account for the case where cursor.fetchall() returns an empty list:
try:
    # list() around zip() keeps the indexing valid on Python 3 as well
    columnlist = list(list(zip(*cursor.fetchall()))[COLUMN_INDEX])
except IndexError:
    columnlist = []
You don’t really want to do this – anything you do along the lines of using zip or a list comprehension is just eating CPU cycles and sucking memory without adding significant value. You are far better served just dealing with the tuples.
As for why it returns tuples: the Python DB-API 2.0 (PEP 249) specifies that fetchall returns a sequence of row sequences, such as a list of tuples.
I use the pandas module (imported as pd) to deal with table-like content:
df = pd.DataFrame(cursor.fetchall(), columns=['one','two'])
The values for column ‘one’ are then available as a NumPy array (call .tolist() on the column if you need a plain Python list):
df['one'].values
You can even use your own index for referencing the data:
df0 = pd.DataFrame.from_records(cursor.fetchall(), columns=['Time','Serie1','Serie2'],index='Time')
sqlite3.Connection has a row_factory attribute.
The documentation states that:
You can change this attribute to a callable that accepts the cursor and the original row as a tuple and will return the real result row. This way, you can implement more advanced ways of returning results, such as returning an object that can also access columns by name.
To return a list of single values from a SELECT, such as an id, you can assign a lambda to row_factory which returns the first indexed value in each row; e.g.:
import sqlite3 as db
conn = db.connect('my.db')
conn.row_factory = lambda cursor, row: row[0]
c = conn.cursor()
ids = c.execute('SELECT id FROM users').fetchall()
This yields something like:
[1, 2, 3, 4, 5, 6] # etc.
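For a version that runs without an existing my.db file, here is a minimal sketch against an in-memory database. The users table layout and the sample names are assumptions made up for the example. It also shows one caveat of this factory: a multi-column query silently keeps only the first column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("ann",), ("bob",), ("cy",)],
)

# Set the factory on the connection *before* creating the cursor,
# so the new cursor inherits it.
conn.row_factory = lambda cursor, row: row[0]
c = conn.cursor()

ids = c.execute("SELECT id FROM users ORDER BY id").fetchall()

# Caveat: with more than one selected column, everything after the
# first column is dropped by this factory.
first_only = c.execute("SELECT id, name FROM users ORDER BY id").fetchall()

print(ids)
print(first_only)
conn.close()
```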
You can also set the row_factory directly on the cursor object itself. Indeed, if you do not set the row_factory on the connection before you create the cursor, you must set the row_factory on the cursor:
c = conn.cursor()
c.row_factory = lambda cursor, row: {'foo': row[0]}
You may redefine the row_factory at any point during the lifetime of the cursor object, and you can unset the row factory with None to return default tuple-based results:
c.row_factory = None
c.execute('SELECT id FROM users').fetchall() # [(1,), (2,), (3,)] etc.
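The cursor-level behaviour can be checked with a small sketch like the following, again using an in-memory database with an assumed users table. It sets a factory on one cursor, resets it to None, and confirms that a second cursor created from the same connection is unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,), (3,)])

c = conn.cursor()

# Factory set on this cursor only; the connection keeps its default.
c.row_factory = lambda cursor, row: row[0]
flat = c.execute("SELECT id FROM users ORDER BY id").fetchall()

# Resetting to None restores plain tuples for subsequent queries.
c.row_factory = None
tuples = c.execute("SELECT id FROM users ORDER BY id").fetchall()

# A second cursor never saw the custom factory.
other = conn.cursor()
other_rows = other.execute("SELECT id FROM users ORDER BY id").fetchall()

print(flat)
print(tuples)
print(other_rows)
conn.close()
```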
zlist = []
tupls = c.fetchall()
for tup in tupls:
    t = str(tup).replace("('","").replace("',)","")
    zlist.append(t)
Now you don’t have to deal with tuples, pandas or any of the above that didn’t work.
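One thing worth checking before relying on the string-replacement approach: it depends on how Python prints a one-element tuple, so it appears to work only for text values that contain no apostrophe. The sketch below compares it with plain indexing on a few hand-written rows; the helper name strip_via_str and the sample rows are made up for the comparison.

```python
def strip_via_str(tup):
    """The string-replacement approach from the snippet above."""
    return str(tup).replace("('", "").replace("',)", "")


# Hand-written sample rows: plain text, an integer, and text with an apostrophe.
rows = [("one",), (1,), ("it's",)]

via_str = [strip_via_str(row) for row in rows]
via_index = [row[0] for row in rows]

print(via_str)    # the integer and the apostrophe rows keep their tuple punctuation
print(via_index)  # original values with their original types
```

Indexing with row[0] keeps integers as integers and leaves quotes alone, which is why most of the other answers use it.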
I started with the following which gave me the same sort of list of tuples:
video_ids = []
for row in c.execute('SELECT id FROM videos_metadata'):
    video_ids.append(row)
…and so to resolve it and get the list I expected I just explicitly pulled out the first element within the returned tuple…
video_ids = []
for row in c.execute('SELECT id FROM videos_metadata'):
    video_ids.append(row[0])
This seems to do the trick for me with a single column (per the OP’s question) and is quite a simple solution (maybe it’s simplistic in some way I’ve not thought of). I’m not sure how it scales, but it runs fast enough for the 5000+ entries I have (31 ms). We’re talking SQLite here, so presumably this isn’t going to be dealing with bazillions of rows.
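The same loop can be written as a comprehension over the cursor, which is iterable and yields one row at a time. This is only a sketch: the videos_metadata table is recreated in memory with invented sample ids so the snippet runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos_metadata (id TEXT)")
conn.executemany(
    "INSERT INTO videos_metadata VALUES (?)",
    [("a1",), ("b2",), ("c3",)],
)
c = conn.cursor()

query = "SELECT id FROM videos_metadata ORDER BY rowid"

# Index into each row tuple while iterating over the cursor.
video_ids = [row[0] for row in c.execute(query)]

# Equivalent form that unpacks the single-element tuple instead.
video_ids_unpacked = [vid for (vid,) in c.execute(query)]

print(video_ids)
print(video_ids_unpacked)
conn.close()
```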
cursor.fetchall() returns [(u'one',), (u'two',), (u'three',)]. If you want [u'one', u'two', u'three'], use the following:
[x[0] for x in cursor.fetchall()]
As of some recent version of Python (I am using 3.9.4) it has become even easier to get dictionary-like results from sqlite3. It is in the documentation for Python. Essentially, just set the connection’s row_factory to sqlite3.Row and off you go.
import sqlite3

con1 = sqlite3.connect("programs_aux.sqlite")
# Rows become sqlite3.Row objects, which support access by column name
con1.row_factory = sqlite3.Row
cur1 = con1.cursor()
sql = 'select * from Main where watched is 0 order by Genre, Folder_runtime'
cur1.execute(sql)
rows = cur1.fetchall()
for row in rows:
    print(row['Title'])
con1.close()
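To try this without the programs_aux.sqlite file, here is a minimal in-memory sketch. The Main table is reduced to three assumed columns (Title, Genre, watched) with invented rows. It also shows that a sqlite3.Row can be turned into a real dict, and that a plain list of one column is still a one-line comprehension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE Main (Title TEXT, Genre TEXT, watched INTEGER)")
conn.executemany(
    "INSERT INTO Main VALUES (?, ?, ?)",
    [("Alpha", "Drama", 0), ("Beta", "Comedy", 1), ("Gamma", "Comedy", 0)],
)

rows = conn.execute(
    "SELECT * FROM Main WHERE watched IS 0 ORDER BY Genre, Title"
).fetchall()

# Access by column name gives a plain list of one column.
titles = [row["Title"] for row in rows]

# A Row exposes keys() and can be converted to an ordinary dict.
columns = rows[0].keys()
first = dict(rows[0])

print(titles)
print(columns)
print(first)
conn.close()
```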