Should I use a class or dictionary?

Question:

I have a class that contains only fields and no methods, like this:

class Request(object):

    def __init__(self, environ):
        self.environ = environ
        self.request_method = environ.get('REQUEST_METHOD', None)
        self.url_scheme = environ.get('wsgi.url_scheme', None)
        self.request_uri = wsgiref.util.request_uri(environ)
        self.path = environ.get('PATH_INFO', None)
        # ...

This could easily be translated to a dict. The class is more flexible for future additions and could be fast with __slots__. So would there be a benefit of using a dict instead? Would a dict be faster than a class? And faster than a class with slots?

Asked By: deamon

Answers:

Use a dictionary unless you need the extra mechanism of a class. You could also use a namedtuple for a hybrid approach:

>>> from collections import namedtuple
>>> Request = namedtuple("Request", "environ request_method url_scheme")
>>> Request
<class '__main__.Request'>
>>> req = Request(environ="foo", request_method="GET", url_scheme="http")
>>> req.environ
'foo'

Performance differences here will be minimal, although I would be surprised if the dictionary wasn’t faster.
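As a rough sketch of how the namedtuple option behaves in practice (the field values here are made up for illustration): instances are immutable, so a "change" means building a modified copy with _replace, and _asdict gives you a mapping back when you need one.

```python
from collections import namedtuple

# Field names come from the question; the values are illustrative only.
Request = namedtuple("Request", "environ request_method url_scheme")

req = Request(environ={}, request_method="GET", url_scheme="http")

# Fields can be read by attribute name or by position.
method = req.request_method
scheme = req[2]

# Instances are immutable: assigning to a field raises AttributeError.
try:
    req.request_method = "POST"
    mutated = True
except AttributeError:
    mutated = False

# _replace returns a modified copy; _asdict converts to a mapping.
post_req = req._replace(request_method="POST")
as_mapping = dict(post_req._asdict())
```

If you need to assign to fields after creation, a namedtuple is probably not the right fit and a class or dict is the simpler choice.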

Answered By: Katriel

A class instance in Python is backed by a dict underneath (its __dict__). You do get some overhead with the class behavior, but you won’t be able to notice it without a profiler. In this case, I believe you benefit from the class because:

  • All your logic lives in a single function
  • It is easy to update and stays encapsulated
  • If you change anything later, you can easily keep the interface the same
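
To illustrate the last point, here is a minimal sketch, assuming you later decide that path should be computed on access. The '/' fallback is a hypothetical choice, not something from the question. Callers keep writing request.path either way.

```python
class Request(object):
    """Trimmed-down version of the class from the question."""

    def __init__(self, environ):
        self.environ = environ
        self.request_method = environ.get('REQUEST_METHOD')

    @property
    def path(self):
        # Was a plain attribute before; now computed on each access.
        # The '/' fallback is a hypothetical rule added for illustration.
        return self.environ.get('PATH_INFO') or '/'


request = Request({'REQUEST_METHOD': 'GET', 'PATH_INFO': ''})
current_path = request.path  # same syntax as reading a plain attribute
```

With a dict, the same change would mean touching every place that reads request['path'].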

Answered By: Bruce Armstrong

I would recommend a class, since a request involves all sorts of information. Were one to use a dictionary, I’d expect the data stored to be far more similar in nature. A guideline I tend to follow myself is that if I may want to loop over the entire set of key->value pairs and do something, I use a dictionary. Otherwise, the data apparently has far more structure than a basic key->value mapping, meaning a class would likely be a better alternative.

Hence, stick with the class.

Answered By: Stigma

Why would you make this a dictionary? What’s the advantage? What happens if you later want to add some code? Where would your __init__ code go?

Classes are for bundling related data (and usually code).

Dictionaries are for storing key-value relationships, where usually the keys are all of the same type, and all the values are also of one type. Occasionally they can be useful for bundling data when the key/attribute names are not all known up front, but often this is a sign that something’s wrong with your design.

Keep this a class.

Answered By: adw

I agree with @adw. I would never represent an “object” (in an OO sense) with a dictionary. Dictionaries aggregate name/value pairs. Classes represent objects. I’ve seen code where the objects are represented with dictionaries and it’s unclear what the actual shape of the thing is. What happens when certain name/values aren’t there? What restricts the client from putting anything at all in, or trying to get anything at all out? The shape of the thing should always be clearly defined.

When using Python it is important to build with discipline as the language allows many ways for the author to shoot him/herself in the foot.
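A small sketch of what "clearly defined shape" can mean in practice. It uses __slots__ (mentioned in the question) with a hypothetical subset of the fields, and a deliberate typo to show the difference.

```python
class Request(object):
    # __slots__ fixes the set of allowed attribute names.
    __slots__ = ('request_method', 'path')

    def __init__(self, request_method, path):
        self.request_method = request_method
        self.path = path


as_dict = {'request_method': 'GET', 'path': '/'}
as_dict['pathh'] = '/typo'        # silently accepted: the dict gains a new key

as_object = Request('GET', '/')
try:
    as_object.pathh = '/typo'     # rejected: 'pathh' is not a declared slot
    typo_accepted = True
except AttributeError:
    typo_accepted = False
```

A plain class without __slots__ would accept the typo as well, so the slots declaration is what enforces the shape here.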

Answered By: jaydel

It may be possible to have your cake and eat it, too. In other words, you can create something that provides the functionality of both a class and a dictionary instance. See ActiveState’s “Dictionary with attribute-style access” recipe and the comments on it for ways of doing that.

If you decide to use a regular class rather than a subclass, I’ve found the recipe “The simple but handy ‘collector of a bunch of named stuff’ class” (by Alex Martelli) to be very flexible and useful for the sort of thing it looks like you’re doing (i.e. creating a relatively simple aggregator of information). Since it’s a class, you can easily extend its functionality further by adding methods.

Lastly, it should be noted that the names of class members must be legal Python identifiers, but dictionary keys need not be, so a dictionary would provide greater freedom in that regard because keys can be anything hashable (even something that’s not a string).
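For instance, WSGI keys such as 'wsgi.url_scheme' contain a dot, so they cannot be written with attribute syntax. A quick sketch (the tuple and integer entries are hypothetical, only there to show non-string keys):

```python
environ = {
    'wsgi.url_scheme': 'http',   # contains a dot: not a legal identifier
    ('GET', '/'): 'index',       # a tuple key (hypothetical routing entry)
    404: 'Not Found',            # an integer key
}


class Namespace(object):
    pass


ns = Namespace()
# `ns.wsgi.url_scheme` would be parsed as attribute `url_scheme` of `ns.wsgi`.
# setattr() will store the dotted name in the instance __dict__, but it is
# then only reachable through getattr(), which defeats the purpose.
setattr(ns, 'wsgi.url_scheme', 'http')
scheme = getattr(ns, 'wsgi.url_scheme')
```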

Update

A subclass of object named SimpleNamespace was added to the types module in Python 3.3, and is yet another alternative. Unlike instances of plain object, its instances have a __dict__.
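A minimal sketch of SimpleNamespace used as a field container (the field values are made up):

```python
from types import SimpleNamespace

request = SimpleNamespace(request_method='GET', url_scheme='http')
request.path = '/'            # attributes can be added freely after creation

as_mapping = vars(request)    # the underlying __dict__
same = request == SimpleNamespace(request_method='GET', url_scheme='http', path='/')
```

Two namespaces compare equal when their attributes match, which is handy in tests.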

Answered By: martineau

If all that you want to achieve is syntax candy like obj.bla = 5 instead of obj['bla'] = 5, especially if you have to repeat that a lot, you may want to use a plain container class as in martineau’s suggestion. Nevertheless, the code there is quite bloated and unnecessarily slow. You can keep it simple like this:

class AttrDict(dict):
    """ Syntax candy """
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

Another reason to switch to namedtuples or a class with __slots__ could be memory usage. Dicts require significantly more memory than list types, so this could be a point to think about.
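If memory is the concern, it is easy to measure rather than guess. Below is a rough sketch using tracemalloc from the standard library. The class names and the count of 10000 are arbitrary choices for illustration, the figures include the list holding the objects, and the numbers depend on the Python version.

```python
import tracemalloc


class Plain(object):
    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'


class Slotted(object):
    __slots__ = ('foo1', 'foo2', 'foo3')

    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'


def make_dict():
    return {'foo1': 'test', 'foo2': 'test', 'foo3': 'test'}


def allocated_bytes(factory, count=10000):
    """Return the bytes tracemalloc attributes to building `count` objects."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    objects = [factory() for _ in range(count)]  # kept alive while measuring
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


sizes = {
    'dict': allocated_bytes(make_dict),
    'class': allocated_bytes(Plain),
    'class with slots': allocated_bytes(Slotted),
}
for name, size in sizes.items():
    print('%-17s %d bytes' % (name + ':', size))
```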

Anyway, in your specific case, there doesn’t seem to be any motivation to switch away from your current implementation. You don’t seem to maintain millions of these objects, so no list-derived types are required. And it actually contains some functional logic within the __init__, so you also shouldn’t go with AttrDict.

Answered By: Michael

I think that the usage of each one is way too subjective for me to get in on that, so I’ll just stick to numbers.

I compared the time it takes to create and to change a variable in a dict, a new-style class and a new-style class with slots.

Here’s the code I used to test it (it’s a bit messy but it does the job).

import timeit

class Foo(object):

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

def create_dict():

    foo_dict = {}
    foo_dict['foo1'] = 'test'
    foo_dict['foo2'] = 'test'
    foo_dict['foo3'] = 'test'

    return foo_dict

class Bar(object):
    __slots__ = ['foo1', 'foo2', 'foo3']

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

tmit = timeit.timeit

print('Creating...\n')
print('Dict: ' + str(tmit('create_dict()', 'from __main__ import create_dict')))
print('Class: ' + str(tmit('Foo()', 'from __main__ import Foo')))
print('Class with slots: ' + str(tmit('Bar()', 'from __main__ import Bar')))

print('\nChanging a variable...\n')

# Each figure is (create + assign) minus (create), i.e. the cost of the assignment alone.
print('Dict: ' + str((tmit('create_dict()["foo3"] = "Changed"', 'from __main__ import create_dict') - tmit('create_dict()', 'from __main__ import create_dict'))))
print('Class: ' + str((tmit('Foo().foo3 = "Changed"', 'from __main__ import Foo') - tmit('Foo()', 'from __main__ import Foo'))))
print('Class with slots: ' + str((tmit('Bar().foo3 = "Changed"', 'from __main__ import Bar') - tmit('Bar()', 'from __main__ import Bar'))))

And here is the output…

Creating…

Dict: 0.817466186345
Class: 1.60829183597
Class_with_slots: 1.28776730003

Changing a variable…

Dict: 0.0735140918748
Class: 0.111714198313
Class_with_slots: 0.10618612142

So, if you’re just storing variables, you need speed, and it won’t require you to do many calculations, I recommend using a dict (you could always just make a function that looks like a method). But if you really need classes, remember: always use __slots__.

Note:

I tested the ‘Class’ with both new-style and old-style classes. It turns out that old-style classes are faster to create but slower to modify (not by much, but significant if you’re creating lots of classes in a tight loop (tip: you’re doing it wrong)).

Also the times for creating and changing variables may differ on your computer since mine is old and slow. Make sure you test it yourself to see the ‘real’ results.
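If you want to rerun the creation test on a current Python 3, something along these lines should do. This is a trimmed sketch, not the script used for the numbers above: the best_time helper is a hypothetical name, and passing callables to timeit.repeat adds a little call overhead to every measurement.

```python
import timeit


class Foo(object):
    def __init__(self):
        self.foo1 = 'test'


class Bar(object):
    __slots__ = ('foo1',)

    def __init__(self):
        self.foo1 = 'test'


def create_dict():
    return {'foo1': 'test'}


def best_time(func, number=100000, repeat=5):
    """Best-of-`repeat` wall time, in seconds, for `number` calls of `func`."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


results = {
    'Dict': best_time(create_dict),
    'Class': best_time(Foo),
    'Class with slots': best_time(Bar),
}
for label, seconds in results.items():
    print('%s: %.4f' % (label, seconds))
```

Taking the minimum over several repeats reduces noise from whatever else the machine is doing.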

Edit:

I later tested the namedtuple: I can’t modify it, but creating the 10000 samples (or something like that) took 1.4 seconds, so the dictionary is indeed the fastest.

If I change the dict function to include the keys and values when the dict is created, and to return the dict directly instead of a variable containing it, creation takes 0.65 instead of 0.8 seconds.

class Foo(dict):
    pass

Creating an instance of such a dict subclass takes about as long as a class with slots, and changing the variable is the slowest (0.17 seconds), so do not use these classes. Go for a dict (speed) or for the class derived from object (‘syntax candy’).

Answered By: alexpinho98

class ClassWithSlotBase:
    __slots__ = ('a', 'b',)

    def __init__(self):
        self.a: str = "test"
        self.b: float = 0.0


def test_type_hint(_b: float) -> None:
    print(_b)


class_tmp = ClassWithSlotBase()

# Passing the str attribute where a float is expected: a type checker or IDE flags this call.
test_type_hint(class_tmp.a)

I recommend a class. If you use a class, you get type hints as shown, and editors can offer auto-completion when an instance of the class is passed as a function argument.
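On Python 3.7+, one way to get the same type-hint benefit with less boilerplate is a dataclass. This is only a sketch of an alternative, not part of the example above, and the content_length field is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    request_method: str
    url_scheme: str
    content_length: int = 0   # hypothetical field, added to show a default


def describe(request: Request) -> str:
    # Editors and type checkers know `request.url_scheme` is a str here,
    # so they can complete `.upper()` and flag misuse.
    return request.request_method + ' ' + request.url_scheme.upper()


summary = describe(Request(request_method='GET', url_scheme='http'))
```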

Answered By: LowQualityDelivery

If the data (I mean the set of fields) is not going to be changed or extended in the future, I would choose a class to represent it. Why?

  1. It’s a little more clean and readable.
  2. It’s faster in terms of using it, which is much more important than creating it, which happens generally only once.

Using the class itself as a container for the fields, rather than an instance of it, seems to be even faster.

Extending alexpinho98’s example:

import timeit

class Foo(object):

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

class FooClass:
    foo1 = 'test'
    foo2 = 'test'
    foo3 = 'test'

def create_dict():

    foo_dict = {}
    foo_dict['foo1'] = 'test'
    foo_dict['foo2'] = 'test'
    foo_dict['foo3'] = 'test'

    return foo_dict

class Bar(object):
    __slots__ = ['foo1', 'foo2', 'foo3']

    def __init__(self):

        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

tmit = timeit.timeit

foo_dict = create_dict()  # named foo_dict to avoid shadowing the built-in dict
def testDict():
    a = foo_dict['foo1']
    b = foo_dict['foo2']
    c = foo_dict['foo3']

dict_obj = Foo()
def testObjClass():
    a = dict_obj.foo1
    b = dict_obj.foo2
    c = dict_obj.foo3

def testClass():
    a = FooClass.foo1
    b = FooClass.foo2
    c = FooClass.foo3


print('Creating...\n')
print('Dict: ' + str(tmit('create_dict()', 'from __main__ import create_dict')))
print('Class: ' + str(tmit('Foo()', 'from __main__ import Foo')))
print('Class with slots: ' + str(tmit('Bar()', 'from __main__ import Bar')))

print('=== Testing usage 1 ===')
print('Using dict  : ' + str(tmit('testDict()', 'from __main__ import testDict')))
print('Using object: ' + str(tmit('testObjClass()', 'from __main__ import testObjClass')))
# Time testClass() (attribute reads on the class itself), not FooClass(),
# which would measure instantiation instead.
print('Using class : ' + str(tmit('testClass()', 'from __main__ import testClass')))

Results are:

Creating...

Dict: 0.185864600000059
Class: 0.30627199999980803
Class with slots: 0.2572166999998444
=== Testing usage 1 ===
Using dict  : 0.16507520000050135
Using object: 0.1266871000007086
Using class : 0.06327920000148879

Answered By: Zbyszek