`pyyaml` can't parse a `pydantic` object if the `typing` module is used

Question:

Let me start off by saying I wanted to open an issue in the pydantic repo. Once I started rubber duck debugging, I came to the conclusion that it's actually pyyaml that isn't working right, but I'm not so sure anymore.

from dataclasses import dataclass
from functools import partial
from typing import List, Type

import yaml
from pydantic import BaseModel

yaml_input = """
!Foo
name: foo
bar:
    - !Bar
      name: bar
  """


def get_loader():
    loader = yaml.SafeLoader
    for tag_name, tag_constructor in tag_model_map.items():
        loader.add_constructor(tag_name, tag_constructor)
    return loader


def dynamic_constructor_mapping(model_class: Type[BaseModel], loader: yaml.SafeLoader,
                        node: yaml.nodes.MappingNode) -> BaseModel:
    return model_class(**loader.construct_mapping(node))


def get_constructor_for_mapping(model_class: Type[BaseModel]):
    return partial(dynamic_constructor_mapping, model_class)


class Bar(BaseModel):
    name: str


class Foo1(BaseModel):
    name: str
    bar: list


class Foo2(BaseModel):
    name: str
    bar: List


class Foo3(BaseModel):
    name: str
    bar: List[Bar]


@dataclass
class Foo4:
    name: str
    bar: List[Bar]


foos = [Foo1, Foo2, Foo3, Foo4]

for foo_cls in foos:
    tag_model_map = {
        "!Foo": get_constructor_for_mapping(foo_cls),
        "!Bar": get_constructor_for_mapping(Bar),
    }
    print(f"{foo_cls.__qualname__} loaded {yaml.load(yaml_input, Loader=get_loader())}")

which prints

Foo1 loaded name='foo' bar=[Bar(name='bar')]
Foo2 loaded name='foo' bar=[]
Foo3 loaded name='foo' bar=[]
Foo4 loaded Foo4(name='foo', bar=[Bar(name='bar')])

In other words:

  • the list of pydantic objects is parsed correctly if the built-in list is used as the annotation (Foo1)
  • the list of pydantic objects is NOT parsed correctly if List is used as the annotation (Foo2)
  • the list of pydantic objects is NOT parsed correctly if List[Bar] is used as the annotation (Foo3)
  • the list of pydantic objects is parsed correctly if the container is a plain dataclass (Foo4)

The constructor seems to be returning the correct object in all examples so I don’t understand where the problem lies.

pydantic==1.8.2
Python 3.8.10 
Asked By: Tom Wojcik

Answers:

This is a general observation about YAML rather than a direct fix: the code needed to serialize and deserialize YAML to dataclasses is often more complicated than it needs to be.

If you don't need the data validation features that pydantic provides, you could also check out dataclass-wizard, which provides a YAMLWizard mixin class for working with YAML data. Note that it relies on the pyyaml library as well.

Here is a simple example:

from __future__ import annotations

from dataclasses import dataclass
from dataclass_wizard import YAMLWizard


yaml_input = """
name: foo
bar:
    - name: bar
"""


@dataclass
class Foo(YAMLWizard):
    name: str
    bar: list[Bar]


@dataclass
class Bar:
    name: str


instance = Foo.from_yaml(yaml_input)
print(f'Loaded: {instance}')

To install dataclass-wizard along with pyyaml, you can include the yaml extra:

pip install dataclass-wizard[yaml]
Answered By: rv.kvetch

I had the same exact issue as you.

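What solved it for me was passing deep=True to the construct_mapping method. Without it, PyYAML fills in nested lists and mappings only after the constructor has returned, which would explain why pydantic ends up with an empty list when it validates a List field.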

Example:

fields = loader.construct_mapping(node, deep=True)
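
To see what deep=True changes, here is a minimal sketch that uses only pyyaml. The names DemoLoader, make_constructor and load_with are made up for this illustration, and the list(...) copy is only a stand-in for a validator that copies the list it receives (an assumption about what pydantic does for List fields, not something taken from its source here).

```python
import yaml

YAML_INPUT = """
!Foo
name: foo
bar:
    - !Bar
      name: bar
"""


def make_constructor(deep):
    """Build a constructor that copies every list value as soon as it sees it."""
    def construct(loader, node):
        fields = loader.construct_mapping(node, deep=deep)
        # Copy list values right away; this stands in for a validator that copies.
        return {k: list(v) if isinstance(v, list) else v for k, v in fields.items()}
    return construct


def load_with(deep):
    # Subclass so the constructors are not registered on yaml.SafeLoader itself.
    class DemoLoader(yaml.SafeLoader):
        pass

    DemoLoader.add_constructor("!Foo", make_constructor(deep))
    DemoLoader.add_constructor("!Bar", make_constructor(deep))
    return yaml.load(YAML_INPUT, Loader=DemoLoader)


shallow_result = load_with(deep=False)
deep_result = load_with(deep=True)

print(shallow_result)  # {'name': 'foo', 'bar': []}
print(deep_result)     # {'name': 'foo', 'bar': [{'name': 'bar'}]}
```

With deep=False the nested list is still empty at the moment the constructor runs, so the copy stays empty even though PyYAML populates the original list afterwards. With deep=True the nested nodes are fully built before construct_mapping returns.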
Answered By: Jakub Pulaczewski