How to parse a JSON file with C-style comments?

Question:

I have a json file, such as the following:

    { 
       "author":"John",
       "desc": "If it is important to decode all valid JSON correctly  
and  speed isn't as important, you can use the built-in json module,   
 orsimplejson.  They are basically the same but sometimes simplej 
further along than the version of it that is included with 
distribution."
       //"birthday": "nothing" //I comment this line
    }

This file is auto created by another program. How do I parse it with Python?

Asked By: BollMose

Answers:

I have not personally used it, but the jsoncomment python package supports parsing a JSON file with comments.

You use it in place of the JSON parser as follows:

import json
from jsoncomment import JsonComment
parser = JsonComment(json)
parsed_object = parser.loads(jsonString)
Answered By: studgeek

I cannot imagine that a JSON file “auto created by another program” would contain comments. The JSON spec defines no comments at all, and that is by design, so no JSON library would output a JSON file with comments.

Such comments are usually added later, by a human, and this case is no exception. The OP mentioned as much in the post: //"birthday": "nothing" //I comment this line.

So the real question should be: how do I properly comment out some content in a JSON file while keeping it compliant with the spec, and hence compatible with other JSON libraries?

And the answer is: rename the field. Example:

{
    "foo": "content for foo",
    "bar": "content for bar"
}

can be changed into:

{
    "foo": "content for foo",
    "this_is_bar_but_been_commented_out": "content for bar"
}

This will work just fine most of the time, because the consumer will very likely ignore unexpected fields (though not always; it depends on the implementation of whatever consumes your JSON file, so YMMV).
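
As a minimal sketch of why the rename trick usually works, assume a consumer that only looks up the keys it knows about. The `"default for bar"` fallback is hypothetical and only there for illustration:

```python
import json

# Hypothetical config where "bar" has been "commented out" by renaming its key.
raw = '''
{
    "foo": "content for foo",
    "this_is_bar_but_been_commented_out": "content for bar"
}
'''

config = json.loads(raw)

# A tolerant consumer reads only the keys it expects and supplies defaults.
foo = config.get("foo")
bar = config.get("bar", "default for bar")

print(foo)
print(bar)
```

A consumer that validates the document against a strict schema, or that iterates over every key, would not be this forgiving.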

UPDATE: Apparently some readers were unhappy because this answer does not give the “solution” they expected. Well, in fact, I did give a working solution, by implicitly linking to the JSON designer’s quote:

Douglas Crockford Public Apr 30, 2012 Comments in JSON

I removed comments from JSON because I saw people were using them to
hold parsing directives, a practice which would have destroyed
interoperability. I know that the lack of comments makes some people
sad, but it shouldn’t.

Suppose you are using JSON to keep configuration files, which you
would like to annotate. Go ahead and insert all the comments you like.
Then pipe it through JSMin before handing it to your JSON parser.

So, yeah, go ahead and use JSMin. Just keep in mind that “using comments in JSON” is conceptually uncharted territory. There is no guarantee that whatever tool you choose will handle inline comments such as [1,2,3,/* a comment */ 10], Python-style comments such as [1, 2, 3] # a comment (which is a comment in Python but not in JavaScript), INI-style comments such as [1, 2, 3] ; a comment, and so on. You get the idea.

I would still suggest NOT adding noncompliant comments to JSON in the first place.

Answered By: RayLuo

jsoncomment is good, but it does not support inline comments.

Check out jstyleson, which supports:

  • inline comment
  • single-line comment
  • multi-line comment
  • trailing comma.

Comments are NOT preserved. jstyleson first removes all comments and trailing commas, then uses the standard json module. It seems like function arguments are forwarded and work as expected. It also exposes dispose to return the cleaned string contents without parsing.

Example

Install

pip install jstyleson

Usage

import jstyleson
result_dict = jstyleson.loads(invalid_json_str) # OK
jstyleson.dumps(result_dict)
Answered By: Jackson Lin

How about commentjson?

http://commentjson.readthedocs.io/en/latest/

It can parse something like the following:

{
    "name": "Vaidik Kapoor", # Person's name
    "location": "Delhi, India", // Person's location

    # Section contains info about
    // person's appearance
    "appearance": {
        "hair_color": "black",
        "eyes_color": "black",
        "height": "6"
    }
}

Like Elasticsearch, some products’ REST APIs do not accept a comment field. Therefore, I think comments inside JSON are necessary on the client side, for example to maintain a JSON template.


EDITED

jsmin seems to be more common.

https://pypi.python.org/pypi/jsmin

Answered By: tabata

If, like me, you prefer to avoid external libraries, this function I wrote will read JSON text from a file and remove "//" and "/* */" style comments:

def GetJsonFromFile(filePath):
    contents = ""
    # Strip "//" comments line by line, keeping each line's trailing newline.
    with open(filePath) as fh:
        for line in fh:
            cleanedLine = line.split("//", 1)[0]
            if len(cleanedLine) > 0 and line.endswith("\n") and "\n" not in cleanedLine:
                cleanedLine += "\n"
            contents += cleanedLine
    # Strip "/* ... */" comments, which may span several lines.
    while "/*" in contents:
        preComment, postComment = contents.split("/*", 1)
        contents = preComment + postComment.split("*/", 1)[1]
    return contents

Limitations: As David F. brought up in the comments, this will break beautifully (i.e. horribly) with // and /* inside string literals. You would need to write some code around it if you want to support //, /*, */ within your JSON string contents.
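
One way around that limitation is to scan the text character by character and track whether the scanner is inside a string literal. The following is only a sketch under that assumption, not a replacement for a tested library; `strip_json_comments` and the sample input are hypothetical names made up for this illustration:

```python
import json

def strip_json_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Skip to just past the closing */ (or to the end if unterminated).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = '''{
    "url": "http://example.com/a/*b*/c",  // trailing comment
    /* block
       comment */
    "quote": "say \\"hi\\" // not a comment"
}'''
data = json.loads(strip_json_comments(sample))
print(data["url"])
print(data["quote"])
```

Note that multi-line block comments are removed together with their newlines, so line numbers in later parse errors can shift relative to the original file.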

Answered By: deleb

You might look at JSON5 if you don’t care about strict by-the-book JSON formatting and just want something that allows comments in JSON. For example, this library will let you parse JSON5: https://pypi.org/project/json5/

Answered By: GuestMcGuestFace

In short: use jsmin

pip install jsmin

import json
from jsmin import jsmin

with open('parameters.jsonc') as js_file:
    minified = jsmin(js_file.read())
parameters  = json.loads(minified)
Answered By: Pablo

I recommend everyone switch to a JSON5 library instead. JSON5 is JSON with JavaScript features/support. It’s the most popular JSON language extension in the world. It has comments, support for trailing commas in objects/arrays, support for single-quoted keys/strings, support for unquoted object keys, etc. And there are proper parser libraries with deep test suites, and everything works perfectly.

There are two different, high-quality Python implementations:

Here’s the JSON5 spec: https://json5.org/

Answered By: Mitch McMabers

Here’s a small standalone wrapper:

#!/usr/bin/env python3
import json
import re

def json_load_nocomments( filename_or_fp, comment = "//|#", **jsonloadskw ) -> "json dict":
    """ load json, skipping comment lines starting // or #
        or white space //, or white space #
    """
    # filename_or_fp -- lines -- filter out comments -- bigstring -- json.loads

    if hasattr( filename_or_fp, "readlines" ):  # open() or file-like
        lines = filename_or_fp.readlines()
    else:
        with open( filename_or_fp ) as fp:
            lines = fp.readlines()  # with \n
    iscomment = re.compile( r"\s*(" + comment + ")" ).match
    notcomment = lambda line: not iscomment( line )  # ifilterfalse
    bigstring = "".join( filter( notcomment, lines ))
        # json.load( fp ) does loads( fp.read() ), the whole file in memory

    return json.loads( bigstring, **jsonloadskw )


if __name__ == "__main__":  # sanity test
    import sys
    for jsonfile in sys.argv[1:] or ["test.json"]:
        print( "\n-- " + jsonfile )
        jsondict = json_load_nocomments( jsonfile )
            # first few keys, val type --
        for key, val in list( jsondict.items() )[:10]:
            n = (len(val) if isinstance( val, (dict, list, str) )
                else "" )
            print( "%-10s : %s %s" % (
                    key, type(val).__name__, n ))

Answered By: denis

For the [95% of] cases when you just need simple leading // line comments with a simple way to handle them:

import json
from typing import Any

class JSONWithCommentsDecoder(json.JSONDecoder):
    def __init__(self, **kw):
        super().__init__(**kw)

    def decode(self, s: str) -> Any:
        # Drop every line whose first non-space characters are "//".
        s = '\n'.join(l for l in s.split('\n') if not l.lstrip(' ').startswith('//'))
        return super().decode(s)

your_obj = json.load(f, cls=JSONWithCommentsDecoder)

Answered By: nivedano

Improving a previous answer to provide correct line number support:

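import json
from typing import Any

class JSONWithCommentsDecoder(json.JSONDecoder):
    def __init__(self, **kw):
        super().__init__(**kw)

    def decode(self, s: str) -> Any:
        # Blank out comment lines instead of dropping them, so that line
        # numbers in error messages still match the original file.
        s = '\n'.join(l if not l.lstrip().startswith('//') else '' for l in s.split('\n'))
        return super().decode(s)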
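
To illustrate why blanking comment lines matters, here is a small sketch comparing the line number reported by `json.JSONDecodeError` for the two approaches. The helper names and the deliberately broken input are hypothetical and exist only for this comparison:

```python
import json

def blank_comment_lines(text: str) -> str:
    """Replace whole-line // comments with empty lines, keeping the line count."""
    return "\n".join(
        "" if line.lstrip().startswith("//") else line
        for line in text.split("\n")
    )

def drop_comment_lines(text: str) -> str:
    """Remove whole-line // comments entirely, which shifts later line numbers."""
    return "\n".join(
        line for line in text.split("\n")
        if not line.lstrip().startswith("//")
    )

def error_line(text: str) -> int:
    """Return the line number of the first JSON syntax error, or 0 if none."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return exc.lineno
    return 0

# Hypothetical input: the invalid token "oops" sits on line 4 of the file.
broken = '{\n  // a comment\n  "a": 1,\n  "b": oops\n}'

print(error_line(blank_comment_lines(broken)))  # 4, matches the original file
print(error_line(drop_comment_lines(broken)))   # 3, off by one removed line
```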

Answered By: Louis Caron

C-style comments are officially part of the JSON5 specification.

❗️Important: Before you go any further please note that JSON5 and JSON are two different formats although compatible.

From json5.org:

JSON5 is an extension to the popular JSON file format that aims to be easier to write and maintain by hand (e.g. for config files). It is not intended to be used for machine-to-machine communication. (Keep using JSON or other file formats for that.)


  1. Install json5 with:
pip3 install json5
  2. Use json5 instead of json:
import json5

print(json5.loads("""{
 "author": "John",
 "desc": "If it is import..",
 // "birthday": "nothing"
 }"""))
### OUTPUT: {'author': 'John', 'desc': 'If it is import..'}
Answered By: ccpizza