An elegant way to strip (trim) all string values in a JSON

Question:

As an example, I would like to convert this JSON:

{
    "abcd": "  Something here",
    "foo": [
        "ab 123 ",
        " pp wer ",
        {
            "xyq": [[["  going deeper "]]]
        }
    ],
    "nested": {
        "name": "   Santa ",
        "age": 1234
    }   
}

into this one:

{
    "abcd": "Something here",
    "foo": [
        "ab 123",
        "pp wer",
        {
            "xyq": [[["going deeper"]]]
        }
    ],
    "nested": {
        "name": "Santa",
        "age": 1234
    }   
}

Obviously, this can be implemented by parsing the JSON into a dict/list tree structure, recursively walking through that tree and modifying the leaves where needed. Is there a more elegant and easy way to do that?

Asked By: at54321


Answers:

If your JSON data is stored in a file as text, or is available as a string, then you can use a regex directly on the text. Converting between that text and Python objects can be done with dump and load from the json module in the standard library (see its documentation).

import re

json_text = """
{
    "abcd": "  Something here",
    "foo": [
        "ab 123 ",
        " pp wer ",
        {
            "xyq": [[["  going deeper "]]]
        }
    ],
    "nested": {
        "name": "   Santa ",
        "age": 1234
    }
}"""
# remove whitespace right after an opening quote (only when a letter follows)
json1 = re.sub(r'"\s+(?=[a-zA-Z])', '"', json_text, flags=re.MULTILINE)
# remove whitespace right before a closing quote (only when a letter precedes,
# so "ab 123 " keeps its trailing space)
json2 = re.sub(r'(?<=[a-zA-Z])\s+"', '"', json1, flags=re.MULTILINE)

print(json2)

I don’t know if it is elegant enough, but at least it doesn’t need to build the dict/list tree explicitly.
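
As a rough check of how far the text-based approach goes, here is a minimal sketch. It assumes the two patterns are meant to use `\s`, and the helper name `trim_json_text` is made up for illustration. It suggests that strings whose edge character is a digit are not trimmed, because the lookarounds only accept letters:

```python
import json
import re


def trim_json_text(text):
    # Hypothetical helper wrapping the two substitutions from this answer.
    # Whitespace after an opening quote, only if a letter follows.
    text = re.sub(r'"\s+(?=[a-zA-Z])', '"', text)
    # Whitespace before a closing quote, only if a letter precedes.
    return re.sub(r'(?<=[a-zA-Z])\s+"', '"', text)


sample = '{"foo": ["ab 123 ", " pp wer "], "name": "   Santa "}'
result = json.loads(trim_json_text(sample))

print(result["name"])          # letters at both edges: trimmed
print(repr(result["foo"][0]))  # ends in a digit: trailing space survives
```

Widening the character classes would cover more cases, but the regex still cannot tell keys from values or an opening quote from a closing one, so parsing the JSON is probably the safer route for arbitrary input.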

Answered By: cards

Here is the simplest solution I could come up with. Would be happy if someone can suggest something simpler.

def trim_dict_or_list(item):
    if type(item) is list:
        for i, v in enumerate(item):
            if type(v) is str:
                item[i] = v.strip()
            else:
                trim_dict_or_list(v)
    elif type(item) is dict:
        for k, v in item.items():
            if type(v) is str:
                item[k] = v.strip()
            else:
                trim_dict_or_list(v)


import json

# file_name is the path to the JSON file to clean
with open(file_name, 'r', encoding='utf-8') as f:
    jj = json.load(f)
trim_dict_or_list(jj)  # modifies jj in place
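
To try the function without a file, a small sketch along these lines should work. The function is repeated so the snippet runs on its own, and the sample string is a shortened version of the JSON from the question:

```python
import json


def trim_dict_or_list(item):
    # Same in-place walk as above: strip string leaves, recurse into the rest.
    if type(item) is list:
        for i, v in enumerate(item):
            if type(v) is str:
                item[i] = v.strip()
            else:
                trim_dict_or_list(v)
    elif type(item) is dict:
        for k, v in item.items():
            if type(v) is str:
                item[k] = v.strip()
            else:
                trim_dict_or_list(v)


data = json.loads(
    '{"foo": ["ab 123 ", {"xyq": [[["  going deeper "]]]}],'
    ' "nested": {"name": "   Santa ", "age": 1234}}'
)
result = trim_dict_or_list(data)  # mutates data; there is no return value

print(result)
print(json.dumps(data, indent=4))
```

Note that the function works by mutation, so a document whose top level is a bare string would come back unchanged.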
Answered By: at54321

I think it can be implemented in a much easier way:

def strip(data):
    if isinstance(data, str):
        return data.strip()
    if isinstance(data, list):
        return [strip(element) for element in data]
    if isinstance(data, dict):
        return {key: strip(value) for key, value in data.items()}
    return data
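
A possible end-to-end usage, as a sketch: parse with `json.loads`, pass the result through `strip`, and serialize again with `json.dumps`. The function is repeated so the snippet is self-contained, and the variable names `raw`, `parsed` and `cleaned` are only illustrative:

```python
import json


def strip(data):
    # Returns a stripped copy; dict keys and non-string leaves are kept as they are.
    if isinstance(data, str):
        return data.strip()
    if isinstance(data, list):
        return [strip(element) for element in data]
    if isinstance(data, dict):
        return {key: strip(value) for key, value in data.items()}
    return data


raw = (
    '{"abcd": "  Something here", "foo": ["ab 123 ", " pp wer "],'
    ' "nested": {"name": "   Santa ", "age": 1234}}'
)
parsed = json.loads(raw)
cleaned = strip(parsed)

print(json.dumps(cleaned, indent=4))
print(repr(parsed["abcd"]))  # the parsed input itself is not modified
```

Because a new structure is built rather than mutated, this version also handles a top-level string, at the cost of copying the whole tree.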
Answered By: Hazem Elraffiee