Unable to parse TAB in JSON files

Question:

I am running into a parsing problem when loading JSON files that seem to have the TAB character in them.

When I go to http://jsonlint.com/, and I enter the part with the TAB character:

{
    "My_String": "Foo bar.  Bar foo."
}

The validator complains with:

Parse error on line 2:
{    "My_String": "Foo bar. Bar foo."
------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['

This is literally a copy/paste of the offending JSON text.

I have tried loading this file with json and simplejson without success. How can I load this properly? Should I just pre-process the file and replace each TAB with \t or with a space? Or is there something I am missing here?

Update:

Here is also a problematic example in simplejson:

foo = '{"My_string": "Foo bar.\t Bar foo."}'
simplejson.loads(foo)

JSONDecodeError: Invalid control character '\t' at: line 1 column 24 (char 23)
Asked By: Josh

Answers:

Tabs are legal as delimiting whitespace outside of values, but not within strings. To get a tab inside a JSON string you need to use the escape sequence \t instead.

But beware multiple levels of interpretation. This Python string from your update:

foo = '{"My_string": "Foo bar.\t Bar foo."}'

is not valid JSON, because the Python interpreter turns that \t sequence into an actual tab character before the JSON processor ever sees it.

You can tell Python to put a literal \t in the string instead of a tab character by doubling the backslash:

foo = '{"My_string": "Foo bar.\\t Bar foo."}'

Or you can use the "raw" string syntax, which doesn’t interpret any special backslash sequences:

foo = r'{"My_string": "Foo bar.\t Bar foo."}'

Either way, the JSON processor will see a string containing a backslash followed by a ‘t’, rather than a string containing a tab.
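
As a quick check of the three spellings above, here is a minimal sketch using the standard json module (the variable names are only illustrative). It assumes Python 3, where json.JSONDecodeError is a subclass of ValueError:

```python
import json

# Plain literal: Python turns \t into a real tab, which JSON rejects inside a string.
plain = '{"My_string": "Foo bar.\t Bar foo."}'
# Doubled backslash and raw literal: both keep a backslash followed by 't'.
doubled = '{"My_string": "Foo bar.\\t Bar foo."}'
raw = r'{"My_string": "Foo bar.\t Bar foo."}'

try:
    json.loads(plain)
    plain_ok = True
except ValueError:  # json.JSONDecodeError is a ValueError subclass
    plain_ok = False

# The JSON decoder converts the \t escape into a tab in the resulting value.
value_doubled = json.loads(doubled)["My_string"]
value_raw = json.loads(raw)["My_string"]

print(plain_ok)             # whether the plain literal parsed
print(doubled == raw)       # whether the two source texts are identical
print(repr(value_doubled))  # the decoded value
```

The doubled and raw literals are the same text as far as the JSON decoder is concerned, so they decode to the same value.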

Answered By: Mark Reed

You can include tabs within values (instead of as whitespace) in JSON files by escaping them. Here’s a working example with the json module in Python 2.7:

>>> import json
>>> obj = json.loads('{"MY_STRING": "Foo\\tBar"}')
>>> obj['MY_STRING']
u'Foo\tBar'
>>> print obj['MY_STRING']
Foo    Bar

Leaving the '\t' unescaped, so that Python turns it into a literal tab, causes an error:

>>> json.loads('{"MY_STRING": "Foo\tBar"}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 1 column 19 (char 18)
Answered By: mdml

From the JSON standard:

Insignificant whitespace is allowed before or after any token. The
whitespace characters are: character tabulation (U+0009), line feed
(U+000A), carriage return (U+000D), and space (U+0020). Whitespace is
not allowed within any token, except that space is allowed in
strings.

It means that a literal tab character is not allowed inside a JSON string. You need to escape it as \t (in a .json file):

{"My_string": "Foo bar.\t Bar foo."}

In addition, if the JSON text is provided inside a Python string literal, then you need to double-escape the tab:

foo = '{"My_string": "Foo bar.\\t Bar foo."}' # in a Python source

Or use a Python raw string literal:

foo = r'{"My_string": "Foo bar.\t Bar foo."}' # in a Python source
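
If the file already contains literal tabs inside strings and cannot be fixed at the source, one option (not mentioned above, so treat it as a suggestion rather than the recommended fix) is the strict=False flag of Python's json decoder, which allows control characters inside strings. A minimal sketch, with load_lenient as a hypothetical helper name:

```python
import json

# JSON text with a literal tab used as indentation AND inside the string value.
text = '{\n\t"My_String": "Foo bar.\t Bar foo."\n}'

def load_lenient(s):
    """Parse JSON while tolerating raw control characters inside strings."""
    return json.loads(s, strict=False)

try:
    json.loads(text)  # default strict=True rejects the tab inside the string
    strict_ok = True
except ValueError:
    strict_ok = False

data = load_lenient(text)
print(strict_ok)
print(repr(data["My_String"]))
```

This avoids a blanket search-and-replace of every tab with \t, which would also rewrite tabs used as indentation outside strings, where the two-character escape is not valid JSON.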
Answered By: jfs

Just to share my experience:

I am using snakemake with a config file written in JSON. The file uses tabs for indentation, which is legal in JSON. Nevertheless, I get the error message: snakemake.exceptions.WorkflowError: Config file is not valid JSON or YAML. I believe this is a bug in snakemake, but I could be wrong; please comment. After replacing all tabs with spaces, the error message went away.
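
To check the claim that tabs are legal as indentation, here is a small sketch with Python's standard json module. The keys and values are made up for illustration, and this says nothing about how snakemake itself parses its config file:

```python
import json

# Hypothetical config text, indented with literal tab characters.
config_text = '{\n\t"samples": ["a", "b"],\n\t"threads": 4\n}'

# Tabs between tokens are insignificant whitespace, so this parses fine.
config = json.loads(config_text)
print(config)
```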

Answered By: Kemin Zhou

I ran into the same kind of problem in a Node-RED flow:

flow.set("delimiter",'"\t"');

Error:

{ "status": "ERROR", "result": "Cannot parse config: String: 1: in value for key 'delimiter': JSON does not allow unescaped tab in quoted strings, use a backslash escape" }

Solution:

I used \\t instead of \t in the code:

flow.set("delimiter",'"\\t"');
Answered By: KARTHIKEYAN.A