defaultdict and tuples

Question:

I wanted to do the following:

d = defaultdict((int,float))
for z in range( lots_and_lots):
  d['operation one'] += (1,5.67)
  ...
  ...
  d['operation two'] += (1,4.56)

And then output the number of times each operation was called and the total of float value.

for k,v in d.items():
  print k, 'Called', v[0], 'times, total =', v[1] 

But I don’t know how to achieve this as not only can’t you use a tuple as a parameter to defaultdict you can’t add a tuple to a tuple and total the values in the tuple you just get extra values in your tuple. i.e:

>>> x = (1,0)
>>> x+= (2,3)
>>> x
(1, 0, 2, 3)

and not

>>> x = (1,0)
>>> x+= (2,3)
>>> x
(3,3)

How can I get what I want?

Asked By: Martlark

||

Answers:

the argument to defaultdict must be a “callable” that returns a default value. define your default dict like so:

d = defaultdict(lambda: (0, 0.0))

The fact that int and float types can be called and return zero’s is a convenience, but not in any way crucial to the way defaultdict works.

getting the += to work is going to cause some trouble; addition across tuples is the concatantion of the tuples, so you’ll have to do it the long way:

left, right = d["key"]
d["key"] = (left + 2, right + 3)

Edit: if you just must use +=, you can do so, so long as you have a collection type that has the desired operations. fileoffset suggests using a numpy array type, and that’s probably a nice idea, but you can get a close approximation just by subclassing tuple and overriding the operators you need: Here’s a rough sketch of one:

class vector(tuple):
    def __add__(self, other):
        return type(self)(l+r for l, r in zip(self, other))
    def __sub__(self, other):
        return type(self)(l-r for l, r in zip(self, other))
    def __radd__(self, other):
        return type(self)(l+r for l, r in zip(self, other))
    def __lsub__(self, other):
        return type(self)(r-l for l, r in zip(self, other))

from collections import defaultdict

d = defaultdict(lambda:vector((0, 0.0)))
for k in range(5):
    for j in range(5):
        d[k] += (j, j+k)

print d

we don’t need (or want) to actually overload the += operator itself (spelled __iadd__) because tuple is immutable. Python will correctly replace the old value with new if you supply addition.

You could do it with collections.Counter to accumulate the results:

>>> from collections import Counter, defaultdict
>>> d = defaultdict(Counter)
>>> d['operation_one'].update(ival=1, fval=5.67)
>>> d['operation_two'].update(ival=1, fval=4.56)
Answered By: Raymond Hettinger

Try this:

a = (1,0)
b = (2,3)

res = tuple(sum(x) for x in zip(a,b)

e.g.

d = defaultdict((int,float))
for z in range( lots_and_lots):
  d['operation one'] = tuple(sum(x) for x in zip(d['operation one'], (1,5.67))
  ...
  ...
Answered By: Artsiom Rudzenka

If you use numpy array’s you can get the desired output:

Link

Answered By: fileoffset

I assuming you have too many operations to simply store the list of values in each entry?

d = defaultdict(list)
for z in range(lots_and_lots):
  d['operation one'].append(5.67)
  ...
  ...
  d['operation two'].append(4.56)
for k,v in d.items():
  print k, 'Called', len(v), 'times, total =', sum(v)

One thing you could do is make a custom incrementor:

class Inc(object):
    def __init__(self):
        self.i = 0
        self.t = 0.0
    def __iadd__(self, f):
        self.i += 1
        self.t += f
        return self

and then

d = defaultdict(Inc)
for z in range(lots_and_lots):
  d['operation one'] += 5.67
  ...
  ...
  d['operation two'] += 4.56
for k,v in d.items():
  print k, 'Called', v.i, 'times, total =', v.t
Answered By: David Z

Write a class that you can pass into defaultdict that accumulates values as you pass them in:

class Tracker(object):
    def __init__(self):
        self.values = None
        self.count = 0

    def __iadd__(self, newvalues):
        self.count += 1
        if self.values is None:
            self.values = newvalues
        else:
            self.values = [(old + new) for old, new in zip(self.values, newvalues)]
        return self

    def __repr__(self):
        return '<Tracker(%s, %d)>' % (self.values, self.count)

That’s a drop-in replacement for (int, float) in your original post. Change your output loop to print the instance attributes like so:

for k,v in d.items():
    print k, 'Called', v.count, 'times, total =', v.values

…and you’re done!

Answered By: Kirk Strauser
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.