Polars read AWS RDS DB with a table containing column of type jsonb


I’m trying to read AWS RDS DB using the following method through polars:

df_rds_table_test = pl.read_database_uri(sql_query, uri)

Postgres DB contains a table with column name ‘json_message’ of type jsonb and other columns of type String. When the table is read by polars, it treats data type of json_message column as String.

Further, i’m using DuckDB on the polars dataframe to perform SQL operations.

SQL_QUERY_SRC_JOIN = "select id,json_message.amount as amount from df_rds_table_test where id=10020240501"

src_df = duckdb.sql(SQL_QUERY_SRC_JOIN).pl()

I’m getting an exception that states

duckdb.duckdb.BinderException: Binder Error: Cannot extract field 'amount' from expression "json_message" because it is not a struct, union, or json

Not sure if there’s a way that we can cast the datatype to jsonb in polars as its not supported. I tried casting json_message to struct but i’m getting error.

Also i tried casting json_message to type JSON in duckdb query which didn’t help neither.

Sample json_message:

  "amount": 0,
  "c_id": null,
  "branch": "0502",
  "user_id": "U999999"}

Great if someone could please help me to access the json string in the polars dataframe using duckdb.


Are you sure that casting to json doesn’t work? It works in this example:

tbl = duckdb.sql("""
from (values
   ('{"amount": 0,"c_id": null, "branch": "0502", "user_id": "U999999"}')
select col0 as json_message

│                            json_message                            │
│                              varchar                               │
│ {"amount": 0,"c_id": null, "branch": "0502", "user_id": "U999999"} │

    select (json_message::json).amount as amount from tbl

│ amount │
│  json  │
│ 0      │

or with polars:

df = tbl.pl()

│ amount │
│ ---    │
│ i64    │
│ 0      │
Answered By: Roman Pekar