Python dataclass to dictionary of lists

Question:

From a list of dataclasses (or a dataclass B containing a list):

import dataclasses
from typing import List

@dataclasses.dataclass
class A:
    a: str
    b: int
  
@dataclasses.dataclass
class B:
    l: List[A]
  
da = B([A("a", 3), A("b", 4)])
# or
da = [A("a", 3), A("b", 4)]

I’d like to get a dictionary of lists:

# {'a': ['a', 'b'], 'b': [3, 4]}

The only way I found was with an ugly loop:

from collections import defaultdict
res = defaultdict(list)
for item in da:
    for field in dataclasses.fields(item):
        res[field.name].append(getattr(item, field.name))
print(res) # defaultdict(<class 'list'>, {'a': ['a', 'b'], 'b': [3, 4]})

This seems like such a simple thing that there must be an easier, more Pythonic way.

Asked By: Claudiu Creanga


Answers:

There are a few other ways to do this, but all of them come down to iterating through the list of values. Your solution is perfectly fine for this result.

dataclasses.asdict might help, but you can’t avoid iterating through the list.
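A minimal sketch of what the dataclasses.asdict route could look like, assuming the list holds only instances of A. The helper name to_dict_of_lists is chosen here for illustration. It still makes one pass over the list, as noted above.

```python
import dataclasses
from typing import Any


@dataclasses.dataclass
class A:
    a: str
    b: int


def to_dict_of_lists(items: list[A]) -> dict[str, list[Any]]:
    # Convert every instance to a plain dict; this is the unavoidable iteration.
    rows = [dataclasses.asdict(item) for item in items]
    # Field names come from the class, so an empty list still yields every key.
    return {f.name: [row[f.name] for row in rows] for f in dataclasses.fields(A)}


da = [A("a", 3), A("b", 4)]
print(to_dict_of_lists(da))  # {'a': ['a', 'b'], 'b': [3, 4]}
```

Note that asdict recurses into nested dataclasses and copies the values, so for flat fields like these the plain getattr loop does less work.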

Answered By: andmed

The ugly loop is actually not that bad. If you want to make it a one-liner, use functools.reduce:

import dataclasses
from collections import defaultdict
from functools import reduce
from typing import Any, List, Protocol, Sequence, cast


@dataclasses.dataclass
class A:
    a: str
    b: int


@dataclasses.dataclass
class B:
    l: List[A]


class Dataclass(Protocol):
    __dataclass_fields__: dict[str, Any]


def to_dict_of_lists(_dataclasses: Sequence[Dataclass]) -> dict[str, list[Any]]:
    return reduce(
        lambda acc, x: {field.name: acc[field.name] + [getattr(x, field.name)] for field in dataclasses.fields(x)},
        _dataclasses,
        cast(dict[str, list[Any]], defaultdict(list)),
    )


da = [A("a", 3), A("b", 4)]

print(to_dict_of_lists(da))  # {'a': ['a', 'b'], 'b': [3, 4]}

Disclaimer: this is inefficient and less readable than the simple loop. Treat it as a fun fact rather than a recommendation.
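For comparison, the simple loop can also be written as a dict comprehension over the field names. This is only a sketch: the function name dict_of_lists is made up for illustration, and it assumes every item is an instance of the same dataclass.

```python
import dataclasses
from typing import Any, Sequence


@dataclasses.dataclass
class A:
    a: str
    b: int


def dict_of_lists(items: Sequence[Any]) -> dict[str, list[Any]]:
    # With no items there is no instance to read field names from.
    if not items:
        return {}
    # Assumption: all items share the fields of the first one.
    names = [f.name for f in dataclasses.fields(items[0])]
    # One list per field, built by reading that attribute from every item.
    return {name: [getattr(item, name) for item in items] for name in names}


da = [A("a", 3), A("b", 4)]
print(dict_of_lists(da))  # {'a': ['a', 'b'], 'b': [3, 4]}
```

Unlike the reduce version, this appends into one list per field instead of rebuilding the whole dictionary at every step.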

Answered By: Paweł Rubin