JSONDecodeError: using googletrans module

Question:

I’m trying to translate 100,000 English words to Korean using the ‘googletrans’ module, but after some iterations it raises

‘JSONDecodeError: Expecting value: line 1 column 1 (char 0)’.

I tried to figure it out, but the solutions I found on the web didn’t work for me.
What I’ve tried so far:

  1. Re-initializing Translator() every iteration,
  2. Calling time.sleep(.4) every iteration.

The strangest part is that it happens randomly: sometimes the error is raised after a few hundred iterations, sometimes after only a few.

I checked the manual of this module, and the only limit I could find was on the length of the input. However, everything I’m trying to translate is literally just a word, not a phrase or a sentence.

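import time
from googletrans import Translator

key_list = list(senticnet.keys())
for key in key_list:
  translator = Translator()  # re-created every iteration (attempt 1 above)
  time.sleep(.4)             # pause between requests (attempt 2 above)
  print(translator.translate(key, dest='ko'))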

In the code above, senticnet is a dictionary. Its entries look like this:

senticnet['abusive_conduct'] = ['0', '0', '0.853', '-0.84', '#anger', '#disgust', 'negative', '-0.84', 'flagrant', 'cry', 'glaring', 'gross', 'rank']
senticnet['abusive_father'] = ['0', '0', '0.821', '-0.95', '#anger', '#disgust', 'negative', '-0.88', 'student', 'serious_student', 'addiction', 'graduate_student', 'hard_worker']

Here is the full traceback:

/content/gdrive/My Drive/Colab Notebooks/py-googletrans/googletrans/client.py in translate(self, text, dest, src)
170 
171         origin = text
--> 172         data = self._translate(text, dest, src)
173 
174         # this code will be updated when the format is changed.

/content/gdrive/My Drive/Colab Notebooks/py-googletrans/googletrans/client.py in _translate(self, text, dest, src)
 79         r = self.session.get(url, params=params)
 80 
---> 81         data = utils.format_json(r.text)
 82         return data
 83 

/content/gdrive/My Drive/Colab Notebooks/py-googletrans/googletrans/utils.py in format_json(original)
 60         converted = json.loads(original)
 61     except ValueError:
---> 62         converted = legacy_format_json(original)
 63 
 64     return converted

/content/gdrive/My Drive/Colab Notebooks/py-googletrans/googletrans/utils.py in legacy_format_json(original)
 52             text = text[:p] + states[j][1] + text[nxt:]
 53 
---> 54     converted = json.loads(text)
 55     return converted
 56 

/usr/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352             parse_int is None and parse_float is None and
353             parse_constant is None and object_pairs_hook is None and not kw):
--> 354         return _default_decoder.decode(s)
355     if cls is None:
356         cls = JSONDecoder

/usr/lib/python3.6/json/decoder.py in decode(self, s, _w)
337 
338         """
--> 339         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340         end = _w(s, end).end()
341         if end != len(s):

/usr/lib/python3.6/json/decoder.py in raw_decode(self, s, idx)
355             obj, end = self.scan_once(s, idx)
356         except StopIteration as err:
--> 357             raise JSONDecodeError("Expecting value", s, err.value) from None
358         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Asked By: zzaebok

||

Answers:

Question: I checked the manual of this module, and the only limit I could find was on the length of the input.


There are a few other limits!

From the Googletrans 2.3.0 documentation:

Note on library usage

  • Due to limitations of the web version of google translate, this API does not guarantee that the library would work properly at all times. (so please use this library if you don’t care about stability.)
  • If you want to use a stable API, I highly recommend you to use Google’s official translate API.
  • If you get HTTP 5xx error or errors like #6, it’s probably because Google has banned your client IP address.
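
A small sketch of why a block like this can surface as the exact error in the question: the traceback ends in json.loads(text), and any body that is not JSON (an empty string, or an HTML error page) fails at the very first character. The helper name first_json_error and the sample HTML body are made up for illustration; they are not part of googletrans or an actual Google response.

```python
import json


def first_json_error(body):
    """Return the JSONDecodeError message for body, or None if it parses."""
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        return str(err)
    return None


# Hypothetical response bodies; the HTML is an invented stand-in for a block page.
print(first_json_error(""))
print(first_json_error("<html><body>Too many requests</body></html>"))
print(first_json_error('[["ok"]]'))
```

The first two calls report "Expecting value: line 1 column 1 (char 0)", the same message as in the question, while the valid JSON body returns None.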
Answered By: stovfl

The error is misleading: it happens because Google rejects your requests once you have sent too many, so the response the library receives is presumably not the JSON it expects. I was able to solve this problem in two different ways:

  1. Use a VPN and change your IP. It will work again.
  2. Go to Google and search for something. You will get a message and a reCAPTCHA to solve. Once you solve it, translation will work again for some time.
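
If you would rather have the loop survive a temporary rejection than crash, one option (my assumption, not something either workaround above guarantees) is to catch the JSONDecodeError and retry with an increasing delay. This is only a rough sketch: translate_with_backoff and its parameters are hypothetical names, and it will not help if the IP stays blocked longer than the total wait.

```python
import json
import time


def translate_with_backoff(translate_fn, text, max_tries=5,
                           base_delay=1.0, sleep=time.sleep):
    """Call translate_fn(text), retrying with exponential backoff.

    translate_fn is any callable that takes the text and returns a result.
    Only json.JSONDecodeError is retried; other errors propagate at once.
    """
    for attempt in range(max_tries):
        try:
            return translate_fn(text)
        except json.JSONDecodeError:
            if attempt == max_tries - 1:
                raise  # out of attempts: let the caller see the error
            # Wait base_delay, 2*base_delay, 4*base_delay, ... seconds.
            sleep(base_delay * 2 ** attempt)
```

With the code from the question, this could be called as translate_with_backoff(lambda t: translator.translate(t, dest='ko'), key). The sleep parameter exists only so the delay can be swapped out when testing.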
Answered By: Aneho