Javascript style dot notation for dictionary keys unpythonic?
Question:
I’ve started to use constructs like these:
class DictObj(object):
def __init__(self):
self.d = {}
def __getattr__(self, m):
return self.d.get(m, None)
def __setattr__(self, m, v):
super.__setattr__(self, m, v)
Update: based on this thread, I’ve revised the DictObj implementation to:
class dotdict(dict):
def __getattr__(self, attr):
return self.get(attr, None)
__setattr__= dict.__setitem__
__delattr__= dict.__delitem__
class AutoEnum(object):
def __init__(self):
self.counter = 0
self.d = {}
def __getattr__(self, c):
if c not in self.d:
self.d[c] = self.counter
self.counter += 1
return self.d[c]
where DictObj is a dictionary that can be accessed via dot notation:
d = DictObj()
d.something = 'one'
I find it more aesthetically pleasing than d['something']
. Note that accessing an undefined key returns None instead of raising an exception, which is also nice.
Update: Smashery makes a good point, which mhawke expands on for an easier solution. I’m wondering if there are any undesirable side effects of using dict instead of defining a new dictionary; if not, I like mhawke’s solution a lot.
AutoEnum is an auto-incrementing Enum, used like this:
CMD = AutoEnum()
cmds = {
"peek": CMD.PEEK,
"look": CMD.PEEK,
"help": CMD.HELP,
"poke": CMD.POKE,
"modify": CMD.POKE,
}
Both are working well for me, but I’m feeling unpythonic about them.
Are these in fact bad constructs?
Answers:
As far as I know, Python classes use dictionaries to store their attributes anyway (that’s hidden from the programmer), so it looks to me that what you’ve done there is effectively emulate a Python class… using a python class.
I like dot notation a lot better than dictionary fields personally. The reason being that it makes autocompletion work a lot better.
With regards to the DictObj
, would the following work for you? A blank class will allow you to arbitrarily add to or replace stuff in a container object.
class Container(object):
pass
>>> myContainer = Container()
>>> myContainer.spam = "in a can"
>>> myContainer.eggs = "in a shell"
If you want to not throw an AttributeError when there is no attribute, what do you think about the following? Personally, I’d prefer to use a dict for clarity, or to use a try/except clause.
class QuietContainer(object):
def __getattr__(self, attribute):
try:
return object.__getattr__(self,attribute)
except AttributeError:
return None
>>> cont = QuietContainer()
>>> print cont.me
None
Right?
This is a simpler version of your DictObj class:
class DictObj(object):
def __getattr__(self, attr):
return self.__dict__.get(attr)
>>> d = DictObj()
>>> d.something = 'one'
>>> print d.something
one
>>> print d.somethingelse
None
>>>
It’s not bad if it serves your purpose. “Practicality beats purity”.
I saw such approach elserwhere (eg. in Paver), so this can be considered common need (or desire).
Your DictObj example is actually quite common. Object-style dot-notation access can be a win if you are dealing with ‘things that resemble objects’, ie. they have fixed property names containing only characters valid in Python identifiers. Stuff like database rows or form submissions can be usefully stored in this kind of object, making code a little more readable without the excess of [‘item access’].
The implementation is a bit limited – you don’t get the nice constructor syntax of dict, len(), comparisons, ‘in’, iteration or nice reprs. You can of course implement those things yourself, but in the new-style-classes world you can get them for free by simply subclassing dict:
class AttrDict(dict):
__getattr__ = dict.__getitem__
__setattr__ = dict.__setitem__
__delattr__ = dict.__delitem__
To get the default-to-None behaviour, simply subclass Python 2.5’s collections.defaultdict class instead of dict.
The one major disadvantage of using something like your DictObj is you either have to limit allowable keys or you can’t have methods on your DictObj such as .keys()
, .values()
, .items()
, etc.
It’s not “wrong” to do this, and it can be nicer if your dictionaries have a strong possibility of turning into objects at some point, but be wary of the reasons for having bracket access in the first place:
- Dot access can’t use keywords as keys.
- Dot access has to use Python-identifier-valid characters in the keys.
- Dictionaries can hold any hashable element — not just strings.
Also keep in mind you can always make your objects access like dictionaries if you decide to switch to objects later on.
For a case like this I would default to the “readability counts” mantra: presumably other Python programmers will be reading your code and they probably won’t be expecting dictionary/object hybrids everywhere. If it’s a good design decision for a particular situation, use it, but I wouldn’t use it without necessity to do so.
Because you ask for undesirable side-effects:
A disadvantage is that in visual editors like eclipse+pyDev, you will see many undefined variable errors on lines using the dot notation. Pydef will not be able to find such runtime “object” definitions. Whereas in the case of a normal dictionary, it knows that you are just getting a dictionary entry.
You would need to 1) ignore those errors and live with red crosses; 2) suppress those warnings on a line by line basis using #@UndefinedVariable or 3) disable undefined variable error entirely, causing you to miss real undefined variable definitions.
There’s a symmetry between this and this answer:
class dotdict(dict):
__getattr__= dict.__getitem__
__setattr__= dict.__setitem__
__delattr__= dict.__delitem__
The same interface, just implemented the other way round…
class container(object):
__getitem__ = object.__getattribute__
__setitem__ = object.__setattr__
__delitem__ = object.__delattr__
Don’t overlook Bunch.
It is a child of dictionary and can import YAML or JSON, or convert any existing dictionary to a Bunch and vice-versa. Once “bunchify”‘d, a dictionary gains dot notations without losing any other dictionary methods.
If you’re looking for an alternative that handles nested dicts:
Recursively transform a dict to instances of the desired class
import json
from collections import namedtuple
class DictTransformer():
@classmethod
def constantize(self, d):
return self.transform(d, klass=namedtuple, klassname='namedtuple')
@classmethod
def transform(self, d, klass, klassname):
return self._from_json(self._to_json(d), klass=klass, klassname=klassname)
@classmethod
def _to_json(self, d, access_method='__dict__'):
return json.dumps(d, default=lambda o: getattr(o, access_method, str(o)))
@classmethod
def _from_json(self, jsonstr, klass, klassname):
return json.loads(jsonstr, object_hook=lambda d: klass(klassname, d.keys())(*d.values()))
Ex:
constants = {
'A': {
'B': {
'C': 'D'
}
}
}
CONSTANTS = DictTransformer.transform(d, klass=namedtuple, klassname='namedtuple')
CONSTANTS.A.B.C == 'D'
Pros:
- handles nested dicts
- can potentially generate other classes
- namedtuples provide immutability for constants
Cons:
- may not respond to
.keys
and .values
if those are not provided on your klass (though you can sometimes mimic with ._fields
and list(A.B.C)
)
Thoughts?
h/t to @hlzr for the original class idea
I’ve started to use constructs like these:
class DictObj(object):
def __init__(self):
self.d = {}
def __getattr__(self, m):
return self.d.get(m, None)
def __setattr__(self, m, v):
super.__setattr__(self, m, v)
Update: based on this thread, I’ve revised the DictObj implementation to:
class dotdict(dict):
def __getattr__(self, attr):
return self.get(attr, None)
__setattr__= dict.__setitem__
__delattr__= dict.__delitem__
class AutoEnum(object):
def __init__(self):
self.counter = 0
self.d = {}
def __getattr__(self, c):
if c not in self.d:
self.d[c] = self.counter
self.counter += 1
return self.d[c]
where DictObj is a dictionary that can be accessed via dot notation:
d = DictObj()
d.something = 'one'
I find it more aesthetically pleasing than d['something']
. Note that accessing an undefined key returns None instead of raising an exception, which is also nice.
Update: Smashery makes a good point, which mhawke expands on for an easier solution. I’m wondering if there are any undesirable side effects of using dict instead of defining a new dictionary; if not, I like mhawke’s solution a lot.
AutoEnum is an auto-incrementing Enum, used like this:
CMD = AutoEnum()
cmds = {
"peek": CMD.PEEK,
"look": CMD.PEEK,
"help": CMD.HELP,
"poke": CMD.POKE,
"modify": CMD.POKE,
}
Both are working well for me, but I’m feeling unpythonic about them.
Are these in fact bad constructs?
As far as I know, Python classes use dictionaries to store their attributes anyway (that’s hidden from the programmer), so it looks to me that what you’ve done there is effectively emulate a Python class… using a python class.
I like dot notation a lot better than dictionary fields personally. The reason being that it makes autocompletion work a lot better.
With regards to the DictObj
, would the following work for you? A blank class will allow you to arbitrarily add to or replace stuff in a container object.
class Container(object):
pass
>>> myContainer = Container()
>>> myContainer.spam = "in a can"
>>> myContainer.eggs = "in a shell"
If you want to not throw an AttributeError when there is no attribute, what do you think about the following? Personally, I’d prefer to use a dict for clarity, or to use a try/except clause.
class QuietContainer(object):
def __getattr__(self, attribute):
try:
return object.__getattr__(self,attribute)
except AttributeError:
return None
>>> cont = QuietContainer()
>>> print cont.me
None
Right?
This is a simpler version of your DictObj class:
class DictObj(object):
def __getattr__(self, attr):
return self.__dict__.get(attr)
>>> d = DictObj()
>>> d.something = 'one'
>>> print d.something
one
>>> print d.somethingelse
None
>>>
It’s not bad if it serves your purpose. “Practicality beats purity”.
I saw such approach elserwhere (eg. in Paver), so this can be considered common need (or desire).
Your DictObj example is actually quite common. Object-style dot-notation access can be a win if you are dealing with ‘things that resemble objects’, ie. they have fixed property names containing only characters valid in Python identifiers. Stuff like database rows or form submissions can be usefully stored in this kind of object, making code a little more readable without the excess of [‘item access’].
The implementation is a bit limited – you don’t get the nice constructor syntax of dict, len(), comparisons, ‘in’, iteration or nice reprs. You can of course implement those things yourself, but in the new-style-classes world you can get them for free by simply subclassing dict:
class AttrDict(dict):
__getattr__ = dict.__getitem__
__setattr__ = dict.__setitem__
__delattr__ = dict.__delitem__
To get the default-to-None behaviour, simply subclass Python 2.5’s collections.defaultdict class instead of dict.
The one major disadvantage of using something like your DictObj is you either have to limit allowable keys or you can’t have methods on your DictObj such as .keys()
, .values()
, .items()
, etc.
It’s not “wrong” to do this, and it can be nicer if your dictionaries have a strong possibility of turning into objects at some point, but be wary of the reasons for having bracket access in the first place:
- Dot access can’t use keywords as keys.
- Dot access has to use Python-identifier-valid characters in the keys.
- Dictionaries can hold any hashable element — not just strings.
Also keep in mind you can always make your objects access like dictionaries if you decide to switch to objects later on.
For a case like this I would default to the “readability counts” mantra: presumably other Python programmers will be reading your code and they probably won’t be expecting dictionary/object hybrids everywhere. If it’s a good design decision for a particular situation, use it, but I wouldn’t use it without necessity to do so.
Because you ask for undesirable side-effects:
A disadvantage is that in visual editors like eclipse+pyDev, you will see many undefined variable errors on lines using the dot notation. Pydef will not be able to find such runtime “object” definitions. Whereas in the case of a normal dictionary, it knows that you are just getting a dictionary entry.
You would need to 1) ignore those errors and live with red crosses; 2) suppress those warnings on a line by line basis using #@UndefinedVariable or 3) disable undefined variable error entirely, causing you to miss real undefined variable definitions.
There’s a symmetry between this and this answer:
class dotdict(dict):
__getattr__= dict.__getitem__
__setattr__= dict.__setitem__
__delattr__= dict.__delitem__
The same interface, just implemented the other way round…
class container(object):
__getitem__ = object.__getattribute__
__setitem__ = object.__setattr__
__delitem__ = object.__delattr__
Don’t overlook Bunch.
It is a child of dictionary and can import YAML or JSON, or convert any existing dictionary to a Bunch and vice-versa. Once “bunchify”‘d, a dictionary gains dot notations without losing any other dictionary methods.
If you’re looking for an alternative that handles nested dicts:
Recursively transform a dict to instances of the desired class
import json
from collections import namedtuple
class DictTransformer():
@classmethod
def constantize(self, d):
return self.transform(d, klass=namedtuple, klassname='namedtuple')
@classmethod
def transform(self, d, klass, klassname):
return self._from_json(self._to_json(d), klass=klass, klassname=klassname)
@classmethod
def _to_json(self, d, access_method='__dict__'):
return json.dumps(d, default=lambda o: getattr(o, access_method, str(o)))
@classmethod
def _from_json(self, jsonstr, klass, klassname):
return json.loads(jsonstr, object_hook=lambda d: klass(klassname, d.keys())(*d.values()))
Ex:
constants = {
'A': {
'B': {
'C': 'D'
}
}
}
CONSTANTS = DictTransformer.transform(d, klass=namedtuple, klassname='namedtuple')
CONSTANTS.A.B.C == 'D'
Pros:
- handles nested dicts
- can potentially generate other classes
- namedtuples provide immutability for constants
Cons:
- may not respond to
.keys
and.values
if those are not provided on your klass (though you can sometimes mimic with._fields
andlist(A.B.C)
)
Thoughts?
h/t to @hlzr for the original class idea