After upgrade, raw SQL queries return JSON fields as strings on Postgres
Question:
I am upgrading a Django app from 2.2.7 to 3.1.3. The app uses Postgres 12 & psycopg2 2.8.6.
I followed the instructions and changed all my django.contrib.postgres.fields.JSONField references to django.db.models.JSONField, and made and ran the migrations. This produced no changes to my schema (which is good).
However, when I execute a raw query, the data for those jsonb columns is returned as text (or converted to text at some point). I don’t see this issue when querying the models directly using Model.objects.get(...).
import os, django
from django.db import connection

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "big_old_project.settings")
django.setup()

with connection.cursor() as c:
    c.execute("select name, data from tbl where name=%s", ("rex",))
    print(c.description)
    for row in c.fetchall():
        for col in row:
            print(f"{type(col)} => {col!r}")
(Column(name='name', type_code=1043), Column(name='data', type_code=3802))
<class 'str'> => 'rex'
<class 'str'> => '{"toy": "bone"}'
[edit] Using a raw connection gives the expected results:
conn = psycopg2.connect("dbname=db user=x password=z")
with conn.cursor() as c:
    ...
<class 'str'> => 'rex'
<class 'dict'> => {'toy': 'bone'}
Trying the old trick of "registering" the adapter doesn’t work, and shouldn’t be needed anyway.
import psycopg2.extras
psycopg2.extras.register_json(oid=3802, array_oid=3807, globally=True)
This app has a lot of history, so maybe something is stepping on psycopg2’s toes? I can’t find anything so far, and have commented out everything that seems tangentially related.
Going through the release notes didn’t help. I do use other postgres fields, so I can’t delete all references to contrib.postgres.fields from my models.
Any ideas as to why this is happening would be greatly appreciated.
Answers:
OK, so this seems to be a change introduced in Django 3.1.1 to fix another bug. It de-registers the jsonb converter from the underlying connection, which IMO is terrible.
Django issues: the first "fix" broke this basic functionality (while dismissing the raw SQL use case), and the second declares the breakage invalid.
- QuerySet.order_by() chained with values() crashes on JSONField
- TypeError loading data in JSONField if DB has native JSON support
To fix this you can either make your own raw cursor, which Django doesn’t tamper with, or cast your fields in the raw SQL. At least, that is, until they break that too!
SELECT
data::json, -- db type is jsonb
other_fields
FROM
table
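For the first option, here is a minimal sketch of grabbing psycopg2’s own cursor from Django’s underlying connection, bypassing the Django cursor wrapper where the jsonb loader gets neutered. The helper name, SQL, and the `using` default are illustrative, not from the original post:

```python
def fetch_with_raw_cursor(sql, params=None, using="default"):
    """Run raw SQL on psycopg2's own cursor, bypassing Django's wrapper.

    Django 3.1.1+ installs a no-op jsonb loader on the cursors it creates,
    so jsonb columns come back as str; psycopg2's native cursor still
    decodes them to dicts. `using` is the DATABASES alias (assumed to be
    "default" here). The import is deferred so this helper can be defined
    anywhere without forcing Django settings to load at import time.
    """
    from django.db import connections

    conn = connections[using]
    conn.ensure_connection()
    # conn.connection is the underlying psycopg2 connection object;
    # cursors created from it keep psycopg2's default type adaptation
    with conn.connection.cursor() as cursor:
        cursor.execute(sql, params)
        return cursor.fetchall()
```

Because the cursor comes straight from psycopg2, jsonb columns should come back as dicts, matching the raw-connection experiment in the question.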
To add to @Andrew Backer’s helpful answer, this is apparently intentional. From the 3.1.1 release notes:
Fixed a QuerySet.order_by() crash on PostgreSQL when ordering and grouping by JSONField with a custom decoder (#31956). As a consequence, fetching a JSONField with raw SQL now returns a string instead of pre-loaded data. You will need to explicitly call json.loads() in such cases.
It’s surprising to find an API-incompatible change as an aside in a bugfix release. For now I’ll be adding json.loads() calls since, as already mentioned, there’s no guarantee the ::json workaround won’t break as well!
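For the json.loads() route, a small helper along these lines (hypothetical, not from the release notes) can decode the jsonb columns in rows returned by cursor.fetchall():

```python
import json

def decode_json_rows(rows, json_indexes):
    """Decode jsonb columns that Django >= 3.1.1 returns as raw strings.

    `rows` is a fetchall()-style list of tuples; `json_indexes` lists the
    positions of jsonb columns within each row. Names are illustrative.
    """
    decoded = []
    for row in rows:
        row = list(row)
        for i in json_indexes:
            # Only decode if the column actually came back as a string
            if isinstance(row[i], str):
                row[i] = json.loads(row[i])
        decoded.append(tuple(row))
    return decoded

# e.g. decode_json_rows([("rex", '{"toy": "bone"}')], [1])
# -> [("rex", {"toy": "bone"})]
```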
Thank you for this helpful post! My solution for this bug looks like this:
from django.db import connections
import psycopg2.extras

def call_database(query, vars=None):
    conn = connections['default']  # database alias from DATABASES
    conn.ensure_connection()
    with conn.connection.cursor() as cursor:
        # re-register psycopg2's default jsonb loader on this cursor
        psycopg2.extras.register_default_jsonb(conn_or_curs=cursor)
        cursor.execute(query, vars)
        rows = cursor.fetchall()
    return rows
@Sascha Rau, your post was really helpful. To possibly save others some time, note that these two lines are critical to the solution you propose:
conn = connections[settings.UX_DATABASE_NAME]
conn.ensure_connection()
Also, for this settings configuration for your Django project database:
DATABASES = {
"default": {...}
}
you can refer to the connection to your default project database with conn = connections['default'].
The previous solutions posted here were helpful for pointing me in the right direction, but in the end, my preferred solution came from a recent comment posted in one of the originally reported Django issues.
https://code.djangoproject.com/ticket/31956#comment:18
This solution may be a bit hacky, but it avoids having to change and maintain workarounds in multiple places.
In the core/apps.py file:
from django.apps import AppConfig
from django.db import models
import psycopg2.extras


def from_db_value(self, value, expression, connection):
    # psycopg2 already hands us a dict, so just pass it through
    return value

# monkey-patch Django's JSONField for ORM jsonb results
models.JSONField.from_db_value = from_db_value


def _handle_jsonb(*args, **kwargs):
    pass


class CoreConfig(AppConfig):
    name = "core"
    verbose_name = "Core Config"

    def ready(self):
        # replace Django's jsonb registration with a no-op so raw SQL
        # results keep psycopg2's default jsonb -> dict conversion
        psycopg2.extras.register_default_jsonb = _handle_jsonb