Python: Quick and dirty datatypes (DTO)

Question:

Very often, I find myself coding trivial datatypes like

class Pruefer:
    def __init__(self, ident, maxNum=float('inf'), name=""):
        self.ident  = ident
        self.maxNum = maxNum
        self.name   = name

While this is very useful (clearly, I don’t want to replace the above with anonymous 3-tuples), it’s also a lot of boilerplate.

Now for example, when I want to use the class in a dict, I have to add more boilerplate like

    def __hash__(self):
        return hash((self.ident, self.maxNum, self.name))
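
Spelled out in full, and assuming equality should compare all three fields, the dict-key version ends up roughly like this (the `_key` helper is a made-up name, and `__eq__` is needed as well so that equal objects find the same dict entry):

```python
class Pruefer:
    def __init__(self, ident, maxNum=float('inf'), name=""):
        self.ident = ident
        self.maxNum = maxNum
        self.name = name

    def _key(self):
        # hypothetical helper: the tuple that identifies an instance
        return (self.ident, self.maxNum, self.name)

    def __eq__(self, other):
        if not isinstance(other, Pruefer):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # hash() takes a single argument, so hash the tuple of fields
        return hash(self._key())


lookup = {Pruefer(1, 10, "a"): "first"}
print(lookup[Pruefer(1, 10, "a")])  # a separate but equal instance finds the entry
```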

I admit that it might be difficult to recognize a general pattern amongst all my boilerplate classes, but nevertheless I’d like to ask this question:

  • Are there any popular idioms in Python to derive quick and dirty datatypes with named accessors?

  • Or, if there are not, maybe a Python guru would like to show off some metaclass hacking or a class factory to make my life easier?

Asked By: Jo So

Answers:

>>> from collections import namedtuple
>>> Pruefer = namedtuple("Pruefer", "ident maxNum name")
>>> pr = Pruefer(1,2,3)
>>> pr.ident
1
>>> pr.maxNum
2
>>> pr.name
3
>>> hash(pr)
2528502973977326415

To provide default values, you need to do a little bit more. A simple solution is to write a subclass that redefines the __new__ method:

>>> class Pruefer(namedtuple("Pruefer", "ident maxNum name")):
...     def __new__(cls, ident, maxNum=float('inf'), name=""):
...         return super(Pruefer, cls).__new__(cls, ident, maxNum, name)
... 
>>> Pruefer(1)
Pruefer(ident=1, maxNum=inf, name='')
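
On Python 3.7 and later, namedtuple also accepts a `defaults` keyword argument, which may make the subclass unnecessary. A minimal sketch, assuming that Python version is available:

```python
from collections import namedtuple

# defaults are matched to the rightmost fields, here maxNum and name
Pruefer = namedtuple("Pruefer", "ident maxNum name", defaults=(float("inf"), ""))

print(Pruefer(1))            # Pruefer(ident=1, maxNum=inf, name='')
print(Pruefer(1, name="x"))  # Pruefer(ident=1, maxNum=inf, name='x')
```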
Answered By: Alexey Kachayev

I don’t have much to add to the already excellent answer by Alexey Kachayev. However, one thing that may be useful is the following pattern:

Pruefer.__new__.__defaults__ = (1, float('inf'), "")

(In Python 2 this attribute was also available as func_defaults.) This allows you to create a factory function that returns a new namedtuple class with default arguments:

import collections

def default_named_tuple(name, args, defaults=None):
    named_tuple = collections.namedtuple(name, args)
    if defaults is not None:
        # defaults apply to the rightmost parameters of __new__
        named_tuple.__new__.__defaults__ = defaults
    return named_tuple

This may seem like black magic (it did to me at first), but it’s all documented in the Data Model and discussed in this post.

In action:

>>> default_named_tuple("Pruefer", "ident maxNum name",(1,float('inf'),''))
<class '__main__.Pruefer'>
>>> Pruefer = default_named_tuple("Pruefer", "ident maxNum name",(1,float('inf'),''))
>>> Pruefer()
Pruefer(ident=1, maxNum=inf, name='')
>>> Pruefer(3)
Pruefer(ident=3, maxNum=inf, name='')
>>> Pruefer(3,10050)
Pruefer(ident=3, maxNum=10050, name='')
>>> Pruefer(3,10050,"cowhide")
Pruefer(ident=3, maxNum=10050, name='cowhide')
>>> Pruefer(maxNum=12)
Pruefer(ident=1, maxNum=12, name='')

And only specifying some of the arguments as defaults:

>>> Pruefer = default_named_tuple("Pruefer", "ident maxNum name",(float('inf'),''))
>>> Pruefer(maxNum=12)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() takes at least 2 arguments (2 given)
>>> Pruefer(1,maxNum=12)
Pruefer(ident=1, maxNum=12, name='')

Note that, as written, it’s probably only safe to pass a tuple in as defaults. However, you could easily get fancier by ensuring you have a reasonable tuple object within the function.
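
A minimal sketch of that idea, assuming it is enough to coerce any iterable to a tuple and to reject too many defaults (the length check and its error message are my own additions; `__defaults__` is the function attribute holding default values, spelled `func_defaults` in Python 2):

```python
import collections


def default_named_tuple(name, args, defaults=None):
    named_tuple = collections.namedtuple(name, args)
    if defaults is not None:
        defaults = tuple(defaults)  # accept any iterable, e.g. a list
        if len(defaults) > len(named_tuple._fields):
            raise TypeError("got more defaults than fields")
        named_tuple.__new__.__defaults__ = defaults
    return named_tuple


# a list now works as well as a tuple
Pruefer = default_named_tuple("Pruefer", "ident maxNum name", [float("inf"), ""])
print(Pruefer(1))  # Pruefer(ident=1, maxNum=inf, name='')
```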

Answered By: mgilson

An alternative approach that might help make your boilerplate code a little more generic is to iterate over the (local) variable dicts. This lets you put your variable names in a list and process them in a loop. E.g.:

class Pruefer:
     def __init__(self, ident, maxNum=float('inf'), name=""):
         for n in "ident maxNum name".split():
             v = locals()[n]  # extract value from local variables
             setattr(self, n, v)  # set member variable

     def printMemberVars(self):
         print("Member variables are:")
         for k,v in vars(self).items():
             print("  {}: '{}'".format(k, v))


P = Pruefer("Id", 100, "John")
P.printMemberVars()

gives:

Member variables are:
  ident: 'Id'
  maxNum: '100'
  name: 'John'

From the viewpoint of efficient resource usage, this approach is of course suboptimal.
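
The same loop-driven idea can be stretched to cover equality, hashing and the repr. A rough sketch, assuming the field names are kept in a class-level tuple (`_fields` and `_values` are names I made up for illustration):

```python
class Pruefer:
    _fields = ("ident", "maxNum", "name")  # hypothetical class-level field list

    def __init__(self, ident, maxNum=float("inf"), name=""):
        args = locals()  # the arguments, looked up once
        for n in self._fields:
            setattr(self, n, args[n])

    def _values(self):
        return tuple(getattr(self, n) for n in self._fields)

    def __eq__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self._values() == other._values()

    def __hash__(self):
        return hash(self._values())

    def __repr__(self):
        pairs = ", ".join("{}={!r}".format(n, getattr(self, n)) for n in self._fields)
        return "{}({})".format(type(self).__name__, pairs)


print(Pruefer("Id", 100))  # Pruefer(ident='Id', maxNum=100, name='')
```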

Answered By: Dietrich

One of the most promising features as of Python 3.6 is variable annotations. They allow you to define a namedtuple as a class, like this:

In [1]: from typing import NamedTuple

In [2]: class Pruefer(NamedTuple):
   ...:     ident: int
   ...:     max_num: int
   ...:     name: str
   ...:     

In [3]: Pruefer(1,4,"name")
Out[3]: Pruefer(ident=1, max_num=4, name='name')

It is the same as a namedtuple, but it keeps the annotations and allows the types to be checked with a static type analyzer such as mypy.
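
Default values can be written directly in the class body as well (Python 3.6.1 or later). A small sketch, with `max_num` annotated as float here so that the infinite default fits the annotation; note that the annotations are not enforced at runtime:

```python
from typing import NamedTuple


class Pruefer(NamedTuple):
    ident: int
    max_num: float = float("inf")  # fields with defaults must come last
    name: str = ""


pr = Pruefer(1)
print(pr)                   # Pruefer(ident=1, max_num=inf, name='')
ident, max_num, name = pr   # still an ordinary tuple underneath

# no runtime type check: only a static analyzer would flag this call
unchecked = Pruefer("not an int")
```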

Update: 15.05.2018

As of Python 3.7, dataclasses are available, so they would be the preferable way of defining a DTO. For backward compatibility, you could use the attrs library.

Answered By: skhalymon

If you are using Python 3.7 or later, you can use Data Classes, which can be thought of as “mutable namedtuples with defaults”.

https://docs.python.org/3/library/dataclasses.html

https://www.python.org/dev/peps/pep-0557/

Answered By: Enrique G

Are there any popular idioms in python to derive quick … datatypes with named accessors?

Dataclasses. They meet this exact need.

Some answers have mentioned dataclasses, but here is an example.

Code

import dataclasses as dc


@dc.dataclass(unsafe_hash=True)
class Pruefer:
    ident: int
    maxnum: float = float("inf")
    name: str = ""

Demo

pr = Pruefer(1, 2.0, "3")

pr
# Pruefer(ident=1, maxnum=2.0, name='3')

pr.ident
# 1

pr.maxnum
# 2.0

pr.name
# '3'

hash(pr)
# -5655986875063568239
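
A side note on the hashing: `unsafe_hash=True` leaves instances mutable, so changing a field after using an instance as a dict key also changes its hash. If the data never needs to change, a frozen dataclass is one alternative. A minimal sketch, reusing the field names from above:

```python
import dataclasses as dc


@dc.dataclass(frozen=True)  # frozen plus the default eq=True generates __hash__
class Pruefer:
    ident: int
    maxnum: float = float("inf")
    name: str = ""


pr = Pruefer(1)
seen = {pr: "first"}
print(seen[Pruefer(1)])  # equal fields hash alike, so the lookup succeeds

try:
    pr.ident = 2
except dc.FrozenInstanceError as exc:
    print("immutable:", exc)
```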

Details

You get:

  • pretty reprs
  • default values
  • hashing
  • dotted attribute-access
  • … much more

You don’t (directly) get:

  • tuple unpacking (unlike namedtuple)
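
If unpacking is needed, `dataclasses.astuple` gets you most of the way there by copying the fields into a plain tuple. A short sketch, using the same class without the hashing option:

```python
import dataclasses as dc


@dc.dataclass
class Pruefer:
    ident: int
    maxnum: float = float("inf")
    name: str = ""


pr = Pruefer(1, 2.0, "3")
ident, maxnum, name = dc.astuple(pr)  # fields copied into a plain tuple
print(ident, maxnum, name)            # 1 2.0 3
print(dc.asdict(pr))                  # {'ident': 1, 'maxnum': 2.0, 'name': '3'}
```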

Here’s a guide on the details of dataclasses.

Answered By: pylang