Can I pickle a python dictionary into a sqlite3 text field?

Question:

Any gotchas I should be aware of? Can I store it in a text field, or do I need to use a blob?
(I’m not overly familiar with either pickle or sqlite, so I wanted to make sure I’m barking up the right tree with some of my high-level design ideas.)

Asked By: Electrons_Ahoy


Answers:

If you want to store a pickled object, you’ll need to use a blob, since it is binary data. However, you can, say, base64 encode the pickled object to get a string that can be stored in a text field.
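
As a rough sketch of the base64 route (the table and column names here are made up for illustration, and the syntax is Python 3), the round trip could look something like this:

```python
import base64
import pickle
import sqlite3

record = {"name": "example", "scores": [1, 2, 3]}

# Pickle to bytes, then base64-encode so the result is plain ASCII text.
encoded = base64.b64encode(pickle.dumps(record)).decode("ascii")

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("CREATE TABLE settings (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO settings (payload) VALUES (?)", (encoded,))

# Reverse the steps on the way out: TEXT -> base64 decode -> unpickle.
(stored,) = conn.execute("SELECT payload FROM settings").fetchone()
restored = pickle.loads(base64.b64decode(stored))
conn.close()
```

The encoded string is larger than the raw pickle, which is the price of fitting it into a text field.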

Generally, though, doing this sort of thing is indicative of bad design: because you’re storing opaque data, you lose the ability to use SQL to do any useful manipulation on that data. Without knowing what you’re actually doing, though, I can’t really make a moral call on it.

Answered By: SpoonMeiser

Since Pickle can dump your object graph to a string it should be possible.

Be aware, though, that TEXT fields in SQLite use the database encoding, so you might need to convert the value to a simple string before you unpickle it.

Pickle has both text and binary output formats. If you use the text-based format you can store it in a TEXT field, but it’ll have to be a BLOB if you use the (more efficient) binary format.
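
To make the text/binary distinction concrete, here is a small Python 3 sketch (the table name `demo` is just an assumption for the example). It stores one pickle of each kind and asks SQLite which storage class it actually used. Note that in Python 3 even protocol 0 comes back from pickle.dumps as bytes, so it has to be decoded before SQLite will treat it as text:

```python
import pickle
import sqlite3

data = {"a": 1, "b": [1, 2, 3]}

binary_pickle = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)  # bytes
# Protocol 0 is the text-oriented format. It is still returned as bytes,
# so decode it; latin-1 maps every possible byte to exactly one character.
text_pickle = pickle.dumps(data, protocol=0).decode("latin-1")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (payload TEXT)")
conn.execute("INSERT INTO demo VALUES (?)", (binary_pickle,))
conn.execute("INSERT INTO demo VALUES (?)", (text_pickle,))

# typeof() reports the storage class SQLite used for each row.
rows = conn.execute(
    "SELECT typeof(payload), payload FROM demo ORDER BY rowid"
).fetchall()
storage_classes = [kind for kind, _ in rows]

restored_binary = pickle.loads(rows[0][1])
restored_text = pickle.loads(rows[1][1].encode("latin-1"))
conn.close()
```

Even though the column is declared TEXT, the bytes value is kept as a BLOB, so the declared type matters less than what you bind to the parameter.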

Answered By: John Millikin

If a dictionary can be pickled, it can be stored in text/blob field as well.

Just be aware of dictionaries that can’t be pickled (i.e. those that contain unpicklable objects).

Answered By: kender

Yes, you can store a pickled object in a TEXT or BLOB field in an SQLite3 database, as others have explained.

Just be aware that some objects cannot be pickled. The built-in container types can (dict, set, list, tuple, etc.). But some objects, such as file handles, refer to state that is external to their own data structures, and other extension types have similar problems.

Since a dictionary can contain arbitrary nested data structures, it might not be pickle-able.
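
One way to find out before writing to the database is simply to try pickling the value. The helper below is a hypothetical convenience function (not part of pickle), shown as a Python 3 sketch with a thread lock standing in for an object that holds external state:

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:  # pickle can raise several different exception types
        return False
    return True


plain = {"name": "example", "tags": {"a", "b"}, "point": (1, 2)}
# A lock refers to state outside its own data, so pickle refuses it.
with_lock = {"name": "example", "lock": threading.Lock()}

plain_ok = is_picklable(plain)
lock_ok = is_picklable(with_lock)
```

Here `plain_ok` ends up True and `lock_ok` False, because a single unpicklable value anywhere inside the dictionary makes the whole dictionary unpicklable.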

Answered By: Dan Lenski

SpoonMeiser is correct: you need to have a strong reason to pickle into a database.

It’s not difficult to write Python objects that implement persistence with SQLite. Then you can use the SQLite CLI to fiddle with the data as well. In my experience that is worth the extra bit of work, since many debug and admin functions can simply be performed from the CLI rather than by writing specific Python code.

In the early stages of a project, I did what you propose and ended up re-writing with a Python class for each business object (note: I didn’t say for each table!) This way the body of the application can focus on “what” needs to be done rather than “how” it is done.
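
A minimal sketch of what such a class might look like is below. The `Customer` class, its fields and the `customer` table are all invented for illustration; the point is only that each attribute lands in a real column, so the data stays visible to SQL and to the SQLite CLI:

```python
import sqlite3


class Customer:
    """Hypothetical business object that knows how to persist itself."""

    def __init__(self, name, email, customer_id=None):
        self.customer_id = customer_id
        self.name = name
        self.email = email

    @staticmethod
    def create_table(conn):
        conn.execute(
            "CREATE TABLE IF NOT EXISTS customer ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
        )

    def save(self, conn):
        """Insert a new row, or update the existing one if already saved."""
        if self.customer_id is None:
            cur = conn.execute(
                "INSERT INTO customer (name, email) VALUES (?, ?)",
                (self.name, self.email),
            )
            self.customer_id = cur.lastrowid
        else:
            conn.execute(
                "UPDATE customer SET name = ?, email = ? WHERE id = ?",
                (self.name, self.email, self.customer_id),
            )
        conn.commit()

    @classmethod
    def load(cls, conn, customer_id):
        """Return the stored customer, or None if the id is unknown."""
        row = conn.execute(
            "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        if row is None:
            return None
        return cls(name=row[1], email=row[2], customer_id=row[0])


conn = sqlite3.connect(":memory:")
Customer.create_table(conn)
alice = Customer("Alice", "alice@example.com")
alice.save(conn)
loaded = Customer.load(conn, alice.customer_id)
```

The rest of the application then calls `save` and `load` and never touches SQL directly.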

Answered By: CyberFonic

The other option, considering that your requirement is to save a dict and then spit it back out for the user’s “viewing pleasure”, is to use the shelve module which will let you persist any pickleable data to file. The python docs are here.
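
For reference, a small Python 3 sketch of shelve in use (the file name `app_state` and the temporary directory are assumptions made so the example cleans up after itself):

```python
import os
import shelve
import tempfile

settings = {"theme": "dark", "recent_files": ["a.txt", "b.txt"]}

# Use a throwaway directory here; a real application would pick a fixed path.
path = os.path.join(tempfile.mkdtemp(), "app_state")

# Keys must be strings; values can be any picklable object.
with shelve.open(path) as db:
    db["settings"] = settings

# Reopen the shelf later and read the dictionary back.
with shelve.open(path) as db:
    restored = db["settings"]
```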

Answered By: mhawke

Depending on what you’re working on, you might want to look into the shove module. It does something similar, where it auto-stores Python objects inside a sqlite database (and all sorts of other options) and pretends to be a dictionary (just like the shelve module).

Answered By: Matthew

I have to agree with some of the comments here. Be careful and make sure you really want to save pickle data in a db; there’s probably a better way.

In any case, I had trouble in the past trying to save binary data in an SQLite db.
Apparently you have to use sqlite3.Binary() to prep the data for SQLite.

Here’s some sample code:

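# assumes an open connection `con`, a cursor `cur`, and the raw bytes in `binarydata`
query = u'''insert into testtable VALUES(?)'''
b = sqlite3.Binary(binarydata)  # wrap the bytes so they are stored as a BLOB
cur.execute(query, (b,))
con.commit()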
Answered By: monkut

I wrote a blog post about this idea, except that instead of pickle I used JSON, since I wanted it to be interoperable with Perl and other programs.

http://writeonly.wordpress.com/2008/12/05/simple-object-db-using-json-and-python-sqlite/

Architecturally, this is a quick and dirty way to get persistence, transactions, and the like for arbitrary data structures. I have found this combination to be really useful when I want persistence, and don’t need to do much in the sql layer with the data (or it’s very complex to deal with in sql, and simple with generators).

The code itself is pretty simple:

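import pickle
import sqlite3

# register the "loader" to get the data back out.
sqlite3.register_converter("pickle", pickle.loads)

# PARSE_DECLTYPES makes sqlite3 apply the converter to columns declared as "pickle"
# (an in-memory database is used here for illustration)
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)

Then, when you want to dump it into the db,

p_string = pickle.dumps(dict(a=1, b=[1, 2, 3]))
conn.execute('''
    create table snapshot(
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        mydata pickle);
''')

conn.execute('''
    insert into snapshot values
    (null, ?)''', (p_string,))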
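
Since the blog post uses JSON rather than pickle, here is a rough Python 3 sketch of the same converter idea with json. This is not the code from the post; the declared column type `json` is just a label chosen for this example:

```python
import json
import sqlite3

# Store dicts as JSON text and parse them back automatically on SELECT.
# The converter is looked up by the declared column type, here "json".
sqlite3.register_adapter(dict, json.dumps)
sqlite3.register_converter("json", json.loads)

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute(
    "CREATE TABLE snapshot (id INTEGER PRIMARY KEY AUTOINCREMENT, mydata json)"
)

original = {"a": 1, "b": [1, 2, 3]}
# The adapter turns the dict into a JSON string before it is bound.
conn.execute("INSERT INTO snapshot (mydata) VALUES (?)", (original,))

(restored,) = conn.execute("SELECT mydata FROM snapshot").fetchone()
conn.close()
```

The stored value is readable from the SQLite CLI and from other languages, which was the motivation for choosing JSON in the first place.
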
Answered By: Gregg Lind

See this solution at SourceForge:

y_serial.py module :: warehouse Python objects with SQLite

“Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful “standard” module for a database to store schema-less data.”

http://yserial.sourceforge.net

Answered By: code43

I needed to achieve the same thing too.

It turned out to cause me quite a headache before I finally figured out, thanks to this post, how to actually make it work in a binary format.

To insert/update:

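pdata = cPickle.dumps(data, cPickle.HIGHEST_PROTOCOL)
# parameters must be passed as a sequence or dict, not as a bare value;
# "mytable" is a placeholder name, since "table" itself is an SQL keyword
curr.execute("insert into mytable (data) values (:data)", {"data": sqlite3.Binary(pdata)})

You must pass a protocol as the second argument to dumps to force binary pickling.
Also note the sqlite3.Binary wrapper, which makes the data fit in the BLOB field.

To retrieve data:

curr.execute("select data from mytable limit 1")
for row in curr:
  # row['data'] requires the connection's row_factory to be sqlite3.Row
  data = cPickle.loads(str(row['data']))

When retrieving a BLOB field under Python 2, sqlite3 returns a ‘buffer’ object, which needs to be stringified with str before being passed to the loads method.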
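
For anyone on Python 3, the same round trip gets simpler, because pickle.dumps already returns bytes and sqlite3 hands BLOBs back as bytes. A sketch, assuming a made-up table called `items`:

```python
import pickle
import sqlite3

data = {"user": "example", "visits": [3, 5, 8]}

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access by column name, e.g. row["data"]
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, data BLOB)")

# pickle.dumps returns bytes in Python 3, which sqlite3 stores as a BLOB.
pdata = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
conn.execute("INSERT INTO items (data) VALUES (:data)", {"data": pdata})

# The BLOB comes back as bytes, so it can go straight into pickle.loads.
row = conn.execute("SELECT data FROM items LIMIT 1").fetchone()
blob = row["data"]
restored = pickle.loads(blob)
conn.close()
```

No sqlite3.Binary wrapper and no str() call are needed in this version.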

Answered By: Benoît Vidis

It is possible to store object data as a pickle dump, JSON, etc., but it is also possible to index it, constrain it, and run select queries that use those indices. Here is an example with tuples (written for Python 2) that can easily be adapted to any other Python class. Everything needed is explained in the Python sqlite3 documentation (somebody already posted the link). Anyway, here it is all put together in the following example:

import sqlite3
import pickle

def adapt_tuple(tuple):
    return pickle.dumps(tuple)    

sqlite3.register_adapter(tuple, adapt_tuple)    #cannot use pickle.dumps directly because of inadequate argument signature 
sqlite3.register_converter("tuple", pickle.loads)

def collate_tuple(string1, string2):
    return cmp(pickle.loads(string1), pickle.loads(string2))

#########################
# 1) Using declared types
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)

con.create_collation("cmptuple", collate_tuple)

cur = con.cursor()
cur.execute("create table test(p tuple unique collate cmptuple) ")
cur.execute("create index tuple_collated_index on test(p collate cmptuple)")

cur.execute("select name, type  from sqlite_master") # where type = 'table'")
print(cur.fetchall())

p = (1,2,3)
p1 = (1,2)

cur.execute("insert into test(p) values (?)", (p,))
cur.execute("insert into test(p) values (?)", (p1,))
cur.execute("insert into test(p) values (?)", ((10, 1),))
cur.execute("insert into test(p) values (?)", (tuple((9, 33)) ,))
cur.execute("insert into test(p) values (?)", (((9, 5), 33) ,))

try:
    cur.execute("insert into test(p) values (?)", (tuple((9, 33)) ,))
except Exception as e:
    print e

cur.execute("select p from test order by p")
print "\nwith declared types and default collate on column:"
for raw in cur:
    print raw

cur.execute("select p from test order by p collate cmptuple")
print "\nwith declared types collate:"
for raw in cur:
    print raw

con.create_function('pycmp', 2, cmp)

print "\nselect greater than using cmp function:"
cur.execute("select p from test where pycmp(p,?) >= 0", ((10, ),) )
for raw in cur:
    print raw

cur.execute("explain query plan select p from test where p > ?", ((3,)))
for raw in cur:
    print raw 

print "\nselect greater than using collate:"
cur.execute("select p from test where p > ?", ((10,),) )
for raw in cur:
    print raw  

cur.execute("explain query plan select p from test where p > ?", ((3,)))
for raw in cur:
    print raw

cur.close()
con.close()
Answered By: pervlad

Many applications use sqlite3 as a backend for SQLAlchemy so, naturally, this question can be asked in the SQLAlchemy framework as well (which is how I came across this question).

To do this, declare the column that will hold the pickled data as “PickleType”. The implementation is pretty straightforward:

import numpy as np
from sqlalchemy import Column, Integer, PickleType, create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'Users'

    id = Column(Integer, primary_key=True)
    user_login_data_array = Column(PickleType)

login_information = {'User1': {'Times': np.arange(0, 20),
                               'IP': ['123.901.12.189', '123.441.49.391']}}

# 'sqlite:///:memory:' is an in-memory database
# ('sqlite:///memory:' would create a file named "memory:" instead)
engine = create_engine('sqlite:///:memory:', echo=False)

Base.metadata.create_all(engine)
Session_maker = sessionmaker(bind=engine)
Session = Session_maker()

# The pickling here is very intuitive! The column "user_login_data_array"
# is declared as PickleType, so SQLAlchemy pickles the value on the way in
# and unpickles it on the way out; there is no need to call pickle.dumps.

user_object_to_add = User(user_login_data_array=login_information)

Session.add(user_object_to_add)
Session.commit()

(I’m not claiming that this example is best suited to pickle; others have noted issues with that approach.)

Answered By: jbplasma