Convert stringified JSON objects to JSON

Question:

I’m using MariaDB, and unfortunately JSON objects are returned as strings. I want to convert these stringified JSON objects back to JSON, but only if they are actually JSON objects. Fields that are just normal strings should be left alone, even if they could be parsed as JSON without causing an error.

My approach is to check whether the string contains a double quote, but this seems too naive: it also converts a string that just happens to contain a double quote but isn’t intended as a JSON object. Is there a more robust way to achieve this?

import json
results = {"not_string": 1234,
           "not_json": "1234",
           "json": '[{"json": "1234"}]',
           "false_positive": '["actually a quote in brackets"]'}

# load the json fields in the results
for key, value in results.items():
    if isinstance(value, str) and '"' in value:
        try:
            results[key] = json.loads(value)
        except ValueError:
            # not valid JSON: keep the original string
            pass
for key, value in results.items():
    print(type(value))

Output:

<class 'int'>
<class 'str'>
<class 'list'>
<class 'list'>  <--

Expected:

<class 'int'>
<class 'str'>
<class 'list'>
<class 'str'>  <--

Basically, I don’t want to rely on "asking for forgiveness": the string can be converted to JSON without causing an error, but that conversion is a false positive and shouldn’t happen.
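
To illustrate the problem, here is a minimal sketch (the helper name `try_parse` is made up for this example) showing that `json.loads` succeeds on strings that are meant to stay strings, so catching the exception alone cannot tell the two cases apart:

```python
import json

def try_parse(text):
    """Return (True, parsed) if text is valid JSON, else (False, text)."""
    try:
        return True, json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False, text

for sample in ["1234", '["actually a quote in brackets"]', "plain text"]:
    ok, parsed = try_parse(sample)
    # the first two samples parse without error, only the last one fails
    print(repr(sample), ok, type(parsed).__name__)
```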

Asked By: c8999c 3f964f64

Answers:

I don’t know why you want false_positive, which is a valid list, to remain a string. Here is a solution in which every list holding a single string value is kept as a string. Try this code:

import json
results = {"not_string": 1234,
           "not_json": "1234",
           "json": '[{"json": "1234"}]',
           "false_positive": '["actually a quote in brackets"]'}

# load the json fields in the results
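for key, value in results.items():
    if isinstance(value, str) and '"' in value:
        try:
            temp = json.loads(value)
        except ValueError:
            # not valid JSON: keep the original string
            continue
        if isinstance(temp, list) and len(temp) == 1 and isinstance(temp[0], str):
            # a list holding a single string stays a string
            continue
        if isinstance(temp, (list, dict)):
            # other lists and dictionaries are converted
            results[key] = temp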
for key, value in results.items():
    print(type(value))
    
print(results)

This is the output:

<class 'int'>
<class 'str'>
<class 'list'>
<class 'str'>
{'not_string': 1234, 'not_json': '1234', 'json': [{'json': '1234'}], 'false_positive': '["actually a quote in brackets"]'}
Answered By: Masud Dipu

Sorry for the confusing question; I failed to articulate its goal.
I want to avoid converting string values to JSON, even if the conversion would succeed.

For example, I do not want to convert a string of digits ("1234"), but I do want to convert JSON objects such as dictionaries or arrays of dictionaries.

What I’ve come up with is this:

Such JSON objects contain at least one double quote (I already stated this in the question).

They must also contain an even number of double quotes.

And they should contain at least one colon (:).

Based on these rules, I’ve expanded the check that decides whether a value is converted to JSON:

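for key, value in results.items():
    if isinstance(value, str) and value.count('"') % 2 == 0 and value.count('"') > 0 and ":" in value:
        try:
            # assign back to the dict; rebinding `value` alone would not update it
            results[key] = json.loads(value)
        except ValueError:
            # do not change the value
            pass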

This avoids converting strings that are intended to remain strings and focuses on more complex JSON objects, such as dictionaries.
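
As a rough illustration of how this check behaves, the sketch below wraps the same condition in a predicate (the name `passes_heuristic` and the last two sample strings are made up for this example) and applies it to a few inputs. It suggests two limits: a list of strings containing a colon still passes, and a valid object with an escaped quote has an odd quote count and is skipped.

```python
import json

def passes_heuristic(value):
    """The quote/colon check from above, as a predicate."""
    return (isinstance(value, str)
            and value.count('"') > 0
            and value.count('"') % 2 == 0
            and ":" in value)

samples = {
    "not_json": "1234",
    "json": '[{"json": "1234"}]',
    "false_positive": '["actually a quote in brackets"]',
    "time_list": '["12:30"]',           # hypothetical: passes, yet holds no dictionary
    "escaped_quote": r'{"a": "x\"y"}',  # hypothetical: valid object, five double quotes
}

for key, value in samples.items():
    print(key, passes_heuristic(value))

# the escaped-quote sample is valid JSON even though the check rejects it
print(json.loads(samples["escaped_quote"]))
```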

I am aware that this may still produce false positives, but that is acceptable for my use case: I want to focus on dictionaries and avoid converting stringified integers and arrays that do not contain dictionaries.
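
A possible alternative to counting characters, sketched here under the assumption that "JSON object" means a dictionary or a list with at least one dictionary at its top level, is to parse first and then decide based on the type of the parsed result. The helper names `contains_dict` and `maybe_load` are made up for this example.

```python
import json

def contains_dict(parsed):
    """True for a dict, or a list holding at least one dict (top level only)."""
    if isinstance(parsed, dict):
        return True
    return isinstance(parsed, list) and any(isinstance(item, dict) for item in parsed)

def maybe_load(value):
    """Parse value only if it is a JSON string whose result contains a dictionary."""
    if not isinstance(value, str):
        return value
    try:
        parsed = json.loads(value)
    except ValueError:
        return value  # not valid JSON: leave unchanged
    return parsed if contains_dict(parsed) else value

results = {"not_string": 1234,
           "not_json": "1234",
           "json": '[{"json": "1234"}]',
           "false_positive": '["actually a quote in brackets"]'}

converted = {key: maybe_load(value) for key, value in results.items()}
for value in converted.values():
    print(type(value))
```

With the sample data from the question, this prints the types listed under "Expected". Whether nested dictionaries deeper than the top level should also count depends on the use case.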

Answered By: c8999c 3f964f64