Let JSON object accept bytes or let urlopen output strings

Question:

With Python 3 I am requesting a json document from a URL.

response = urllib.request.urlopen(request)

The response object is a file-like object with read and readline methods. Normally a JSON object can be created with a file opened in text mode.

obj = json.load(fp)

What I would like to do is:

obj = json.load(response)

This however does not work as urlopen returns a file object in binary mode.

A workaround is of course:

str_response = response.read().decode('utf-8')
obj = json.loads(str_response)

but this feels bad…

Is there a better way that I can transform a bytes file object to a string file object? Or am I missing any parameters for either urlopen or json.load to give an encoding?

Asked By: Peter Smit

Answers:

HTTP sends bytes. If the resource in question is text, the character encoding is normally specified, either by the Content-Type HTTP header or by another mechanism (an RFC, HTML meta http-equiv,…).

urllib should know how to decode the bytes to a string, but it’s too naïve. It’s a horribly underpowered and un-Pythonic library.

Dive Into Python 3 provides an overview of the situation.

Your “work-around” is fine—although it feels wrong, it’s the correct way to do it.
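
As a rough sketch of the point about Content-Type, the charset can be read from the response headers instead of hard-coding 'utf-8'. The helper name load_json_response and the FakeResponse stand-in are made up for illustration; with a real urlopen response, the headers attribute should offer the same get_content_charset() method.

```python
import io
import json
from email.message import Message


def load_json_response(response, default_charset="utf-8"):
    """Decode a response body using the charset from Content-Type, if any."""
    # get_content_charset() returns None when the header names no charset
    charset = response.headers.get_content_charset() or default_charset
    return json.loads(response.read().decode(charset))


class FakeResponse(io.BytesIO):
    """Hypothetical stand-in for the object returned by urlopen (demo only)."""

    def __init__(self, body, content_type):
        super().__init__(body)
        self.headers = Message()
        self.headers["Content-Type"] = content_type


fake = FakeResponse('{"name": "café"}'.encode("latin-1"),
                    "application/json; charset=latin-1")
result = load_json_response(fake)
print(result)
```

Falling back to UTF-8 when no charset is given is an assumption here, chosen because JSON is normally sent as UTF-8.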

Answered By: Humphrey Bogart

Python’s wonderful standard library to the rescue…

import codecs

reader = codecs.getreader("utf-8")
obj = json.load(reader(response))

Works with both py2 and py3.
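
A small self-contained sketch of the same idea, runnable without a network connection. Here io.BytesIO is only a stand-in for the binary response object, which is an assumption made for the demo.

```python
import codecs
import io
import json

# io.BytesIO plays the role of the binary file-like object returned by urlopen
raw = io.BytesIO('{"city": "Zürich"}'.encode("utf-8"))

# getreader returns a StreamReader class; wrapping the stream decodes on read
reader = codecs.getreader("utf-8")
obj = json.load(reader(raw))
print(obj)
```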

Docs: Python 2, Python3

Answered By: jbg

I have come to the opinion that the question is the best answer 🙂

import json
from urllib.request import urlopen

response = urlopen("http://site.com/api/foo/bar").read().decode('utf8')  # urlopen needs a full URL including the scheme
obj = json.loads(response)
Answered By: SergO

Just found this simple method to read HttpResponse content as JSON in Django:

import json
from django.test import RequestFactory

request = RequestFactory().get('/')  # stand-in for your own request object; the path is arbitrary

response = MyView.as_view()(request)  # MyView is your own view; returns an HttpResponse object

response.render()  # needed for template responses before response.content can be read

json_response = json.loads(response.content.decode('utf-8'))

print(json_response)  # {"your_json_key": "your json value"}

Hope that helps you.

For anyone else trying to solve this using the requests library:

import json
import requests

r = requests.get('http://localhost/index.json')
r.raise_for_status()
# works for Python2 and Python3
json.loads(r.content.decode('utf-8'))
Answered By: Luke Yeager

If you’re experiencing this issue whilst using the flask microframework, then you can just do:

data = json.loads(response.get_data(as_text=True))

From the docs: “If as_text is set to True the return value will be a decoded unicode string”

Answered By: cs_stackX

This one works for me. I used the requests library and its json() method; check out the docs for Requests: HTTP for Humans.

import requests

url = 'here goes your url'

obj = requests.get(url).json() 
Answered By: Sarthak Gupta

I ran into similar problems using Python 3.4.3 & 3.5.2 and Django 1.11.3. However, when I upgraded to Python 3.6.1 the problems went away.

You can read more about it here:
https://docs.python.org/3/whatsnew/3.6.html#json

If you’re not tied to a specific version of Python, just consider upgrading to 3.6 or later.

Answered By: PaulMest

Your workaround actually just saved me. I was having a lot of problems processing the request using the Falcon framework. This worked for me, with req being the request from curl or httpie:

json.loads(req.stream.read().decode('utf-8'))
Answered By: thielyrics

I used the program below, which relies on json.loads():

import json
import urllib.parse
import urllib.request

endpoint = 'https://maps.googleapis.com/maps/api/directions/json?'
api_key = 'YOUR_API_KEY'  # replace with your own Google Maps API key
origin = input('Where are you? ')
destination = input('Where do you want to go? ')
# urlencode escapes spaces (as '+') and other special characters in the query string
nav_request = urllib.parse.urlencode({'origin': origin, 'destination': destination, 'key': api_key})
request = endpoint + nav_request
response = urllib.request.urlopen(request).read().decode('utf-8')
directions = json.loads(response)
print(directions)
Answered By: jayesh

This will stream the byte data into json.

import io
import json

# pass the encoding explicitly; the default depends on the locale
obj = json.load(io.TextIOWrapper(response, encoding='utf-8'))

io.TextIOWrapper is preferred to the codecs module reader. https://www.python.org/dev/peps/pep-0400/
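
A minimal offline sketch of this approach, with io.BytesIO standing in for the binary response from urlopen (an assumption for the demo). It passes the encoding explicitly, since TextIOWrapper otherwise falls back to a locale-dependent default.

```python
import io
import json

# io.BytesIO stands in for the binary response object (demo assumption)
binary_stream = io.BytesIO('{"greeting": "héllo"}'.encode("utf-8"))

# Wrap the byte stream so json.load sees a text-mode file object
text_stream = io.TextIOWrapper(binary_stream, encoding="utf-8")
obj = json.load(text_stream)
print(obj)
```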

Answered By: Collin Anderson

As of Python 3.6, you can use json.loads() to deserialize a bytes object directly (the encoding must be UTF-8, UTF-16 or UTF-32). So, using only modules from the standard library, you can do:

import json
from urllib import request

response = request.urlopen(url).read()
data = json.loads(response)
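
A quick way to check this behaviour without a network request, assuming Python 3.6 or later. The payload is made up for the demo. Since json.load() simply reads the file object and hands the result to json.loads(), a binary stream should work as well.

```python
import io
import json

payload = {"id": 7, "tags": ["a", "b"]}
text = json.dumps(payload)

# bytes input: the encoding is detected automatically
from_utf8 = json.loads(text.encode("utf-8"))
from_utf16 = json.loads(text.encode("utf-16"))

# a binary file-like object, standing in for the urlopen response
from_file = json.load(io.BytesIO(text.encode("utf-8")))

print(from_utf8, from_utf16, from_file)
```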
Answered By: Eugene Yarmash