An elegant way to strip (trim) all string values in a JSON
Question:
As an example, I would like to convert this JSON:
{
  "abcd": " Something here",
  "foo": [
    "ab 123 ",
    " pp wer ",
    {
      "xyq": [[[" going deeper "]]]
    }
  ],
  "nested": {
    "name": " Santa ",
    "age": 1234
  }
}
into this one:
{
  "abcd": "Something here",
  "foo": [
    "ab 123",
    "pp wer",
    {
      "xyq": [[["going deeper"]]]
    }
  ],
  "nested": {
    "name": "Santa",
    "age": 1234
  }
}
Obviously, this can be implemented by parsing the JSON into a dict/list tree structure, recursively walking through that tree and modifying the leaves where needed. I wonder, is there an elegant and easy way to do that?
Answers:
If your JSON data is stored in a file as text, or is available as a string, then you can use a regex. It can be done with the json module from the standard library, using dump and load (see the docs).
import re

json_text = """
{
  "abcd": " Something here",
  "foo": [
    "ab 123 ",
    " pp wer ",
    {
      "xyq": [[[" going deeper "]]]
    }
  ],
  "nested": {
    "name": " Santa ",
    "age": 1234
  }
}"""

# remove whitespace right after an opening quote (only when a letter follows)
json1 = re.sub(r'"\s+(?=[a-zA-Z])', '"', json_text, flags=re.MULTILINE)
# remove whitespace right before a closing quote (only when a letter precedes)
json2 = re.sub(r'(?<=[a-zA-Z])\s+"', '"', json1, flags=re.MULTILINE)
print(json2)
I don’t know if it is elegant enough but at least it doesn’t explicitly need dynamic structures.
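One thing worth checking before relying on the regex route: the lookarounds only match ASCII letters, so whitespace that sits next to a digit is left in place. The sketch below wraps the two substitutions in a helper (the name regex_trim is made up for this illustration) and runs it on two small inputs to show the difference.

```python
import re

def regex_trim(json_text):
    """Apply the two letter-anchored substitutions to raw JSON text."""
    # whitespace after a quote, only when a letter follows
    left = re.sub(r'"\s+(?=[a-zA-Z])', '"', json_text)
    # whitespace before a quote, only when a letter precedes
    return re.sub(r'(?<=[a-zA-Z])\s+"', '"', left)

# Whitespace next to a letter is removed.
print(regex_trim('{"name": " Santa "}'))          # {"name": "Santa"}

# Whitespace next to a digit is not touched, so these strings stay padded.
print(regex_trim('{"foo": ["ab 123 ", " 42"]}'))  # {"foo": ["ab 123 ", " 42"]}
```

If the data can contain strings that start or end with digits or punctuation, parsing the JSON and stripping the values is likely the safer choice.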
Here is the simplest solution I could come up with. Would be happy if someone can suggest something simpler.
import json

def trim_dict_or_list(item):
    # Strip string values in place, recursing into nested lists and dicts.
    if type(item) is list:
        for i, v in enumerate(item):
            if type(v) is str:
                item[i] = v.strip()
            else:
                trim_dict_or_list(v)
    elif type(item) is dict:
        for k, v in item.items():
            if type(v) is str:
                item[k] = v.strip()
            else:
                trim_dict_or_list(v)

# file_name is the path to the JSON file
with open(file_name, 'r', encoding='utf-8') as f:
    jj = json.load(f)
trim_dict_or_list(jj)
I think it can be implemented in a much easier way:
def strip(data):
    if isinstance(data, str):
        return data.strip()
    if isinstance(data, list):
        return [strip(element) for element in data]
    if isinstance(data, dict):
        return {key: strip(value) for key, value in data.items()}
    return data
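For completeness, here is a minimal end-to-end sketch of how this recursive function might be used on a JSON string: parse with json.loads, strip, and serialize again with json.dumps. The function is repeated so the block runs on its own; the variable names raw and cleaned are only illustrative.

```python
import json

def strip(data):
    """Return a copy of data with every string value stripped."""
    if isinstance(data, str):
        return data.strip()
    if isinstance(data, list):
        return [strip(element) for element in data]
    if isinstance(data, dict):
        # keys are kept as they are; only values are stripped
        return {key: strip(value) for key, value in data.items()}
    # numbers, booleans and None pass through unchanged
    return data

raw = (
    '{"abcd": " Something here", '
    '"foo": ["ab 123 ", " pp wer ", {"xyq": [[[" going deeper "]]]}], '
    '"nested": {"name": " Santa ", "age": 1234}}'
)

cleaned = strip(json.loads(raw))
print(json.dumps(cleaned, indent=2))
```

Unlike the in-place version, this builds a new structure and leaves the parsed input untouched. Dictionary keys are not stripped; if that is needed, key.strip() could be applied in the dict comprehension.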