How can I create an Avro schema from a Python class?

Question:

How can I transform a simple Python class like the following into an Avro schema?

class Testo(SQLModel):
    name: str
    mea: int

This is the output of Testo.schema():

{
    "title": "Testo",
    "type": "object",
    "properties": {
        "name": {
            "title": "Name",
            "type": "string"
        },
        "mea": {
            "title": "Mea",
            "type": "integer"
        }
    },
    "required": [
        "name",
        "mea"
    ]
}

From here I would like to create an Avro record. The JSON can be converted online on konbert.com (select JSON to AVRO Schema), which results in the Avro schema below. (It is all valid except for the name field, which should be "Testo" instead of "Record".)

{
  "type": "record",
  "name": "Record",
  "fields": [
    {
      "name": "title",
      "type": "string"
    },
    {
      "name": "type",
      "type": "string"
    },
    {
      "name": "properties.name.title",
      "type": "string"
    },
    {
      "name": "properties.name.type",
      "type": "string"
    },
    {
      "name": "properties.mea.title",
      "type": "string"
    },
    {
      "name": "properties.mea.type",
      "type": "string"
    },
    {
      "name": "required",
      "type": {
        "type": "array",
        "items": "string"
      }
    }
  ]
}

Anyhow, if they can do it, there must certainly be a way to do the conversion with current Python libraries. Which library can do a valid conversion (also for complex Python models/classes)?

If you think this is the wrong approach, that opinion is also welcome, provided it points out a better way to do this translation.

Asked By: feder


Answers:

I didn’t find a Python library that does this, so I wrote it myself.

I loop over all the fields and translate their types one by one. Where a field references another class, I recurse.

For example, here is the start of the method:

fields = []
for field in model_class.__fields__.values():
    # field.type_ is the field's inner type (pydantic v1 style API)
    if isinstance(field.type_, type) and issubclass(field.type_, SQLModel):
        # Recursively generate schema for nested models
        fields.append({"name": field.name, "type": self.create_field_array(self, field.type_)})
    elif field.type_ is str:
        fields.append({"name": field.name, "type": "string"})

... etc
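To make the recursive idea concrete, here is a minimal, self-contained sketch that uses only the standard library and plain annotated classes instead of SQLModel. It is not the code from the repository: the names PRIMITIVES, avro_type, avro_record and the Address class are made up for illustration, and mapping int to Avro "long" is an assumption (use "int" if 32 bits are enough).

```python
import json
from typing import List, Optional, Union, get_args, get_origin, get_type_hints

# Assumed mapping from Python primitives to Avro primitive type names.
PRIMITIVES = {str: "string", int: "long", float: "double", bool: "boolean", bytes: "bytes"}


def avro_type(py_type):
    """Translate one Python type annotation into an Avro type."""
    if py_type in PRIMITIVES:
        return PRIMITIVES[py_type]
    if py_type is type(None):
        return "null"
    origin = get_origin(py_type)
    if origin is list:  # List[X] becomes an Avro array of X
        (item_type,) = get_args(py_type)
        return {"type": "array", "items": avro_type(item_type)}
    if origin is Union:  # covers Optional[X], which is Union[X, None]
        branches = [avro_type(arg) for arg in get_args(py_type)]
        # Put "null" first, the usual convention for optional Avro fields.
        return sorted(branches, key=lambda b: b != "null")
    if isinstance(py_type, type) and get_type_hints(py_type):
        return avro_record(py_type)  # nested annotated class: recurse
    raise TypeError(f"no Avro mapping for {py_type!r}")


def avro_record(cls):
    """Build an Avro record schema from the type hints of a class."""
    fields = [{"name": n, "type": avro_type(t)} for n, t in get_type_hints(cls).items()]
    return {"type": "record", "name": cls.__name__, "fields": fields}


class Address:
    city: str


class Testo:
    name: str
    mea: int
    address: Optional[Address] = None
    tags: List[str] = []


print(json.dumps(avro_record(Testo), indent=2))
```

This sketch ignores defaults, dicts, enums, dates and the `X | None` union syntax. It would also emit the full record definition twice if the same nested class appeared in two fields, whereas Avro expects a named type to be defined once and referenced by name afterwards.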

For details on this method, see our fa-models repository for trading on GitHub. The class can translate regular Python classes as well as pydantic, SQLAlchemy and SQLModel classes. Help us extend the library's test cases; we accept pull requests.
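If you would rather start from the Testo.schema() output shown in the question, a rough sketch of that route could look like the following. It only handles flat schemas with primitive properties. The function name json_schema_to_avro and the JSON_TO_AVRO table are hypothetical, and the type mapping (for example "integer" to "long") is an assumption.

```python
# Assumed mapping from JSON Schema primitive types to Avro primitive types.
JSON_TO_AVRO = {"string": "string", "integer": "long", "number": "double", "boolean": "boolean"}


def json_schema_to_avro(schema):
    """Convert a flat JSON Schema object (as a dict) into an Avro record schema."""
    required = set(schema.get("required", []))
    fields = []
    for name, prop in schema["properties"].items():
        avro = JSON_TO_AVRO[prop["type"]]
        if name in required:
            fields.append({"name": name, "type": avro})
        else:
            # Optional properties become a nullable union with a null default.
            fields.append({"name": name, "type": ["null", avro], "default": None})
    return {"type": "record", "name": schema["title"], "fields": fields}


# The JSON Schema from the question, trimmed to the keys the sketch reads.
testo_schema = {
    "title": "Testo",
    "type": "object",
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "mea": {"title": "Mea", "type": "integer"},
    },
    "required": ["name", "mea"],
}

avro_schema = json_schema_to_avro(testo_schema)
print(avro_schema)
```

Unlike the online converter's result, this produces a record named "Testo" whose fields are name and mea, because it reads the dict as a schema rather than as a data sample.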

Answered By: feder