Getting error "value is not a valid dict" when using Pydantic models in FastAPI for model-based predictions

Question:

I’m trying to use Pydantic models with FastAPI to make predictions for a list of inputs. Pydantic models can’t be passed directly to the model.predict() function, so I converted the input to a dictionary. However, I’m getting the following error:

AttributeError: 'list' object has no attribute 'dict'

My code:

from fastapi import FastAPI
import uvicorn
from pydantic import BaseModel
import pandas as pd
from typing import List

app = FastAPI()

class Inputs(BaseModel):
    id: int
    f1: float
    f2: float
    f3: str

class InputsList(BaseModel):
    inputs: List[Inputs]

@app.post('/predict')
def predict(input_list: InputsList):
    df = pd.DataFrame(input_list.inputs.dict())
    prediction = classifier.predict(df.loc[:, df.columns != 'id'])
    probability = classifier.predict_proba(df.loc[:, df.columns != 'id'])
    return {'id': df["id"].tolist(), 'prediction': prediction.tolist(), 'probability': probability.tolist()}

I also have a problem with the return value. I need the output to look something like this:

    [
      {
        "id": 123,
        "prediction": "class1",
        "probability": 0.89
      },
      {
        "id": 456,
        "prediction": "class3",
        "probability": 0.45
      }
    ]

PS: the id in the Inputs class is not used in the prediction (it is not a feature), but I need it to be shown next to its prediction (to reference it).

Request: [image: screenshot of the request, not included]

Asked By: Legna

||

Answers:

Your definition of the input schema for the view function does not match the content you’re sending:

class Inputs(BaseModel):
    id: int
    f1: float
    f2: float
    f3: str

class InputsList(BaseModel):
    inputs: List[Inputs]

This matches a request body in the format of:

{
  "inputs": [
    {
      "id": 1,
      "f1": 1.0,
      "f2": 1.0,
      "f3": "foo"
    }, {
      "id": 2,
      "f1": 2.0,
      "f2": 2.0,
      "f3": "bar"
    }
  ]
}

The request body you’re sending does not match the expected format, so you get a 422 response back.

Either change the object you’re sending to match the format expected by FastAPI or drop the InputsList wrapper and set the input as input_list: List[Inputs] instead.

Answered By: MatsLindh

First, there are unnecessary commas (,) at the end of both the f1 and f2 attributes of your schema, as well as in the JSON payload you are sending. Hence, your schema should be:

class Inputs(BaseModel):
    id: int
    f1: float
    f2: float
    f3: str

Second, the 422 error occurs because the JSON payload you are sending does not match your schema. As noted by @MatsLindh, your JSON payload should look like this:

{
  "inputs": [
    {
      "id": 1,
      "f1": 1.0,
      "f2": 1.0,
      "f3": "text"
    },
    {
      "id": 2,
      "f1": 2.0,
      "f2": 2.0,
      "f3": "text"
    }
  ]
}

Third, you are creating the DataFrame in the wrong way. You are calling the dict() method on a list object, hence the AttributeError: 'list' object has no attribute 'dict'. Instead, call the .dict() method on each item in the list, as shown below:

df = pd.DataFrame([i.dict() for i in input_list.inputs])
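
As a self-contained sketch of that step, the snippet below builds the DataFrame from a validated payload and separates the feature columns from the id. The helper name to_frame and the sample payload are made up for illustration. The hasattr check is only there because .dict() is the Pydantic v1 spelling, while Pydantic v2 prefers .model_dump(); if you know which version you run, call the matching method directly.

```python
from typing import List

import pandas as pd
from pydantic import BaseModel


class Inputs(BaseModel):
    id: int
    f1: float
    f2: float
    f3: str


class InputsList(BaseModel):
    inputs: List[Inputs]


def to_frame(input_list: InputsList) -> pd.DataFrame:
    # Hypothetical helper: one dict per item, one DataFrame row per dict.
    rows = [
        item.model_dump() if hasattr(item, "model_dump") else item.dict()
        for item in input_list.inputs
    ]
    return pd.DataFrame(rows)


# Sample payload in the wrapped format expected by InputsList.
payload = {
    "inputs": [
        {"id": 1, "f1": 1.0, "f2": 1.0, "f3": "text"},
        {"id": 2, "f1": 2.0, "f2": 2.0, "f3": "text"},
    ]
}
df = to_frame(InputsList(**payload))
features = df.loc[:, df.columns != "id"]  # everything except the id column
```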

Finally, to return the results in the output format mentioned in your question, use the code below. Note that predict_proba() returns an array with one row per input, and each row holds one probability per class, so after tolist() every prob is a list. If you would like to return only the probability for a specific class, use the index for that class instead, e.g., prob[0].

results = []
for item_id, pred, prob in zip(df["id"].tolist(), prediction.tolist(), probability.tolist()):
    results.append({"id": item_id, "prediction": pred, "probability": prob})
return results
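
The desired output in the question shows a single probability per item. Assuming that value is meant to be the probability of the predicted class, one way to get it is to take the largest value in each row. The sketch below uses stand-in lists with invented values instead of a real classifier.

```python
# Stand-in values; in the endpoint these would come from
# df["id"].tolist(), prediction.tolist() and probability.tolist().
ids = [123, 456]
predictions = ["class1", "class3"]
probabilities = [[0.89, 0.06, 0.05], [0.30, 0.25, 0.45]]

results = [
    # max(row) is the probability of the most likely class, which matches
    # the predicted class for classifiers that predict by highest probability.
    {"id": item_id, "prediction": pred, "probability": max(row)}
    for item_id, pred, row in zip(ids, predictions, probabilities)
]
```

If your classifier does not pick the class with the highest probability, index the row by the position of the class you care about instead of using max().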

Alternatively, you can build a DataFrame and call its to_dict() method to convert it into a list of dictionaries, as shown below. If you have a large amount of data and find the approach below to be slow in returning the results, please have a look at this answer for alternative approaches.

results = pd.DataFrame({'id': df["id"].tolist(), 'prediction': prediction.tolist(), 'probability': probability.tolist()})
return results.to_dict(orient="records")

If you would like to return only the probability for a specific class when using a DataFrame, you could extract it into a new list, either with a list comprehension, prob_list = [item[0] for item in probability.tolist()], or with operator.itemgetter() (after from operator import itemgetter), prob_list = list(map(itemgetter(0), probability.tolist())), and use that list instead when creating the DataFrame.
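
Put together, the DataFrame variant with a single class probability might look like the sketch below. The ids, predictions and probabilities are stand-in values invented for illustration; in the endpoint they would come from df and the classifier.

```python
from operator import itemgetter

import pandas as pd

# Stand-in values for illustration only.
ids = [123, 456]
predictions = ["class1", "class3"]
probabilities = [[0.89, 0.06, 0.05], [0.30, 0.25, 0.45]]

# Keep only the first entry of each row, i.e. the probability of the first class.
prob_list = list(map(itemgetter(0), probabilities))

# orient="records" yields one dictionary per row.
results = pd.DataFrame(
    {"id": ids, "prediction": predictions, "probability": prob_list}
).to_dict(orient="records")
```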

Answered By: Chris