How to properly subclass dict and override __getitem__ & __setitem__
Question:
I am debugging some code and I want to find out when a particular dictionary is accessed. Well, it’s actually a class that subclasses dict and implements a couple of extra features. Anyway, what I would like to do is subclass dict myself and override __getitem__ and __setitem__ to produce some debugging output. Right now, I have
class DictWatch(dict):
    def __init__(self, *args):
        dict.__init__(self, args)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        log.info("GET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        return val

    def __setitem__(self, key, val):
        log.info("SET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        dict.__setitem__(self, key, val)
'name_label' is a key which will eventually be set that I want to use to identify the output. I have then changed the class I am instrumenting to subclass DictWatch instead of dict and changed the call to the superconstructor. Still, nothing seems to be happening. I thought I was being clever, but I wonder if I should be going in a different direction.
Thanks for the help!
Answers:
What you’re doing should absolutely work. I tested out your class, and aside from a missing opening parenthesis in your log statements, it works just fine. There are only two things I can think of. First, is the output of your log statement set correctly? You might need to put a logging.basicConfig(level=logging.DEBUG) at the top of your script.
Second, __getitem__ and __setitem__ are only called during [] accesses. So make sure you only access DictWatch via d[key], rather than through methods such as d.get(), d.setdefault() or d.update(), which bypass your overrides.
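To make both points concrete, here is a minimal sketch of the class with the logging configured and the format arguments fixed. The logger name "dictwatch" and the sample keys are assumptions made up for the illustration; they are not from the original code.

```python
import logging

# Without this, INFO messages are dropped by the default WARNING threshold.
logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("dictwatch")  # hypothetical logger name


class DictWatch(dict):
    """dict subclass that logs item access made with square brackets."""

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        log.info("GET %s[%r] = %r", dict.get(self, "name_label"), key, val)
        return val

    def __setitem__(self, key, val):
        log.info("SET %s[%r] = %r", dict.get(self, "name_label"), key, val)
        dict.__setitem__(self, key, val)


d = DictWatch()
d["name_label"] = "demo"  # logged: goes through __setitem__
d["x"] = 1                # logged: goes through __setitem__
_ = d["x"]                # logged: goes through __getitem__
_ = d.get("x")            # not logged: dict.get bypasses __getitem__
d.setdefault("y", 2)      # not logged: dict.setdefault bypasses __setitem__
```

Only the three bracket accesses produce log lines; the dictionary contents are the same either way.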
That should not really change the result (which should work, given a suitable logging threshold), but your __init__ should be:
def __init__(self, *args, **kwargs): dict.__init__(self, *args, **kwargs)
instead, because with your version DictWatch([(1,2),(2,3)]) silently builds the wrong dictionary (the whole args tuple is passed as a single argument) and DictWatch(a=1,b=2) raises a TypeError.
(Or, better, don’t define a constructor at all.)
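A small sketch of the difference between the two constructors. The class names BrokenWatch and FixedWatch are made up for the comparison; only the __init__ bodies come from the thread.

```python
class BrokenWatch(dict):
    # Constructor as written in the question: the args tuple is passed
    # to dict.__init__ as ONE positional argument.
    def __init__(self, *args):
        dict.__init__(self, args)


class FixedWatch(dict):
    # Forward positional and keyword arguments unchanged.
    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)


pairs = [(1, 2), (2, 3)]
broken = BrokenWatch(pairs)
fixed = FixedWatch(pairs)
print(broken)  # {(1, 2): (2, 3)}  (the list itself was read as one key/value pair)
print(fixed)   # {1: 2, 2: 3}

try:
    BrokenWatch(a=1, b=2)
    kwargs_error = None
except TypeError as exc:  # __init__ does not accept keyword arguments
    kwargs_error = exc
print(type(kwargs_error).__name__)  # TypeError
```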
Another issue when subclassing dict is that the built-in __init__ doesn’t call update, and the built-in update doesn’t call __setitem__. So, if you want all setitem operations to go through your __setitem__ function, you have to make sure it gets called yourself:
class DictWatch(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        print('GET', key)
        return val

    def __setitem__(self, key, val):
        print('SET', key, val)
        dict.__setitem__(self, key, val)

    def __repr__(self):
        dictrepr = dict.__repr__(self)
        return '%s(%s)' % (type(self).__name__, dictrepr)

    def update(self, *args, **kwargs):
        print('update', args, kwargs)
        for k, v in dict(*args, **kwargs).items():
            self[k] = v
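Note that other mutating methods have the same problem. As a rough sketch (the class name RecordingDict and its writes list are made up for the illustration), here is the same pattern extended to setdefault, recording writes in a list instead of printing so the effect is easy to inspect:

```python
class RecordingDict(dict):
    """Hypothetical variant of DictWatch that records writes in a list."""

    def __init__(self, *args, **kwargs):
        self.writes = []  # instance attribute, not a dictionary item
        self.update(*args, **kwargs)

    def __setitem__(self, key, val):
        self.writes.append((key, val))
        dict.__setitem__(self, key, val)

    def update(self, *args, **kwargs):
        # Route every pair through __setitem__.
        for k, v in dict(*args, **kwargs).items():
            self[k] = v

    def setdefault(self, key, default=None):
        # dict.setdefault would insert without calling __setitem__.
        if key not in self:
            self[key] = default
        return dict.__getitem__(self, key)


d = RecordingDict({"a": 1}, b=2)
d.update(c=3)
d.setdefault("d", 4)
d.setdefault("a", 99)  # key already present, so nothing is written
print(d.writes)  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```

Every method you forget to override still bypasses __setitem__, which is the main argument for the UserDict approach discussed further down.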
All you will have to do is
class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)
A sample usage for my personal use
### EXAMPLE
import collections.abc

class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

    def __setitem__(self, key, item):
        # Accept only (database_name, table_name) keys with iterable values.
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, collections.abc.Iterable)):
            # self.__dict__[key] = item
            super(BatchCollection, self).__setitem__(key, item)
        else:
            raise Exception(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")
Note: tested only in Python 3. The check uses collections.abc.Iterable because the old collections.Iterable alias was removed in Python 3.10.
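For what it is worth, a quick usage sketch of this class. The database and table names are placeholders invented for the example, and the constructor is written with a None default here as a small variation on the original:

```python
import collections.abc


class BatchCollection(dict):
    def __init__(self, inpt=None):
        super().__init__(inpt or {})

    def __setitem__(self, key, item):
        # Accept only (database_name, table_name) keys with iterable values.
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, collections.abc.Iterable)):
            super().__setitem__(key, item)
        else:
            raise Exception(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")


batches = BatchCollection()
batches[("sales_db", "orders")] = [1, 2, 3]  # hypothetical names, accepted

try:
    batches["orders"] = [1, 2, 3]  # key is not a 2-tuple
    rejected = False
except Exception:
    rejected = True

# Caveat: data passed to the constructor is NOT validated, because
# dict.__init__ does not call the overridden __setitem__.
unchecked = BatchCollection({"orders": 42})
```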
As Andrew Pate’s answer proposed, subclassing collections.UserDict instead of dict is much less error-prone.
Here is an example showing an issue when inheriting from dict naively:
class MyDict(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key, value * 10)

d = MyDict(a=1, b=2)  # Bad! MyDict.__setitem__ not called
d.update(c=3)  # Bad! MyDict.__setitem__ not called
d['d'] = 4  # Good!
print(d)  # {'a': 1, 'b': 2, 'c': 3, 'd': 40}
UserDict inherits from collections.abc.MutableMapping, so this works as expected:
import collections

class MyDict(collections.UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key, value * 10)

d = MyDict(a=1, b=2)  # Good: MyDict.__setitem__ correctly called
d.update(c=3)  # Good: MyDict.__setitem__ correctly called
d['d'] = 4  # Good
print(d)  # {'a': 10, 'b': 20, 'c': 30, 'd': 40}
Similarly, you only have to implement __getitem__ to automatically be compatible with key in my_dict, my_dict.get, …
Note: UserDict is not a subclass of dict, so isinstance(UserDict(), dict) returns False (but isinstance(UserDict(), collections.abc.MutableMapping) returns True).
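A short sketch of both points for the read side. The class name LoggedDict and its reads list are invented for the illustration:

```python
import collections
import collections.abc


class LoggedDict(collections.UserDict):
    """Hypothetical UserDict subclass recording every key read via __getitem__."""

    def __init__(self, *args, **kwargs):
        self.reads = []
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        self.reads.append(key)
        return super().__getitem__(key)


d = LoggedDict(a=1)
_ = d["a"]      # recorded
_ = d.get("a")  # also recorded: get() is routed through __getitem__
print(d.reads)  # ['a', 'a']

print(isinstance(d, dict))                            # False
print(isinstance(d, collections.abc.MutableMapping))  # True
print(isinstance(d.data, dict))                       # True
```

If some API insists on a real dict, the underlying d.data attribute is a plain dict that can be passed instead, at the cost of bypassing the overrides.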