Converting elements in a single list to key/value pairs using Unicode characters as category keys

Question:

I have a list (see below). I want every element whose first item is a Unicode character (e.g., '①', '②', '㉖') to become the key/value pair inside a 'category' JSON element, and the elements that follow it, up to the next such element, to become the key/value pairs inside a nested 'codes' array.

The list I have:

['①', 'Type of Care']
['SA', 'Substance use treatment']
['DT', 'Detoxification']
['HH', 'Transitional housing, halfway house, or sober home']
['SUMH', 'Treatment for co-occurring serious mental health illness/serious emotional disturbance and substance use disorders']
['②', 'Telemedicine']
['TELE', 'Telemedicine/telehealth']
['③', 'Service Settings (e.g., Outpatient, Residential, Inpatient, etc.)']
['HI', 'Hospital inpatient']
['OP', 'Outpatient']
['RES', 'Residential']
['HID', 'Hospital inpatient detoxification']
['HIT', 'Hospital inpatient treatment']
['OD', 'Outpatient detoxification']
['ODT', 'Outpatient day treatment or partial hospitalization']
['OIT', 'Intensive outpatient treatment']
['ORT', 'Regular outpatient treatment']
['RD', 'Residential detoxification']
['RL', 'Long-term residential']
['RS', 'Short-term residential']
['⑰', 'Assessment/Pre-treatment']
['CMHA', 'Comprehensive mental health assessment']
['CSAA', 'Comprehensive substance use assessment']
['ISC', 'Interim services for clients']
['OPC', 'Outreach to persons in the community']
['㉖', 'Facility Smoking Policy']
['SMON', 'Smoking not permitted']
['SMOP', 'Smoking permitted without restriction']
['SMPD', 'Smoking permitted in designated area']

The key/value pair JSON I want to create:

{
    "codekey": [
        {
            "category": {
                "key": "①",
                "value": "Type of Care"
            },
            "codes": [
                {
                    "key": "SA",
                    "value": "Substance use treatment"
                },
                {
                    "key": "DT",
                    "value": "Detoxification"
                },
                {
                    "key": "HH",
                    "value": "Transitional housing, halfway house, or sober home"
                },
                {
                    "key": "SUMH",
                    "value": "Treatment for co-occurring serious mental health illness/serious emotional disturbance and substance use disorders"
                }
            ]
        },
        {
            "category": {
                "key": "②",
                "value": "Telemedicine"
            },
            "codes": [
                {
                    "key": "TELE",
                    "value": "Telemedicine/telehealth"
                }
            ]
        },
        {
            "category": {
                "key": "③",
                "value": "Service Settings (e.g., Outpatient, Residential, Inpatient, etc.)"
            },
            "codes": [
                {
                    "key": "HI",
                    "value": "Hospital inpatient"
                },
                {
                    "key": "OP",
                    "value": "Outpatient"
                },
                {
                    "key": "RES",
                    "value": "Residential"
                },
                {
                    "key": "HID",
                    "value": "Hospital inpatient detoxification"
                },
                {
                    "key": "HIT",
                    "value": "Hospital inpatient treatment"
                },
                {
                    "key": "OD",
                    "value": "Outpatient detoxification"
                },
                {
                    "key": "ODT",
                    "value": "Outpatient day treatment or partial hospitalization"
                },
                {
                    "key": "OIT",
                    "value": "Intensive outpatient treatment"
                },
                {
                    "key": "ORT",
                    "value": "Regular outpatient treatment"
                },
                {
                    "key": "RD",
                    "value": "Residential detoxification"
                },
                {
                    "key": "RL",
                    "value": "Long-term residential"
                },
                {
                    "key": "RS",
                    "value": "Short-term residential"
                }
            ]
        },
        {
            "category": {
                "key": "⑰",
                "value": "Assessment/Pre-treatment"
            },
            "codes": [
                {
                    "key": "CMHA",
                    "value": "Comprehensive mental health assessment"
                },
                {
                    "key": "CSAA",
                    "value": "Comprehensive substance use assessment"
                },
                {
                    "key": "ISC",
                    "value": "Interim services for clients"
                },
                {
                    "key": "OPC",
                    "value": "Outreach to persons in the community"
                }
            ]
        },
        {
            "category": {
                "key": "㉖",
                "value": "Facility Smoking Policy"
            },
            "codes": [
                {
                    "key": "SMON",
                    "value": "Smoking not permitted"
                },
                {
                    "key": "SMOP",
                    "value": "Smoking permitted without restriction"
                },
                {
                    "key": "SMPD",
                    "value": "Smoking permitted in designated area"
                }
            ]
        }
    ]
}
Asked By: Lee Whieldon

Answers:

The following code iterates through the items in a variable named src_list and builds a dictionary that is then serialized to the JSON you describe.

import json

src_list = [['①', 'Type of Care'], ['SA', 'Substance use treatment'], ... ]
output_dicts = {"codekey": [] }
current_dict = None

for pair in src_list:
    # Is every character of the key outside the ASCII range (0-127)? If so, the pair defines a category
    if all(ord(c) > 127 for c in pair[0]):
        # If the current_dict is not None, we're onto a new category, so should add the last category to the output
        if (current_dict is not None):
            output_dicts["codekey"].append(current_dict)
        
        # Define new dict for this category
        current_dict = {
            "category": {
                "key": pair[0],
                "value": pair[1]
            },
            "codes": []
        }
    else:
        if (current_dict is not None):
            current_dict["codes"].append({
                "key": pair[0],
                "value": pair[1]
            })

# Append the final category (skipped if no category was ever seen)
if current_dict is not None:
    output_dicts["codekey"].append(current_dict)
output_json = json.dumps(output_dicts, indent = 4, ensure_ascii=False)

print(output_json)
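
As a variation on the loop above, the same grouping can be wrapped in a function. This is a minimal sketch, not the answerer's code: the name group_by_category is hypothetical, and it assumes a category key is any non-empty string with no ASCII characters in it, tested with str.isascii() (Python 3.7+). Rows that appear before the first category are skipped, as in the loop above.

```python
import json


def group_by_category(pairs):
    """Group [key, value] pairs under the most recent category row.

    Hypothetical helper: a row counts as a category when its key is
    non-empty and contains no ASCII characters at all.
    """
    groups = []
    for key, value in pairs:
        if key and not any(ch.isascii() for ch in key):
            # Start a new category; later rows attach to its "codes" list.
            groups.append({"category": {"key": key, "value": value}, "codes": []})
        elif groups:
            groups[-1]["codes"].append({"key": key, "value": value})
        # Rows seen before any category are ignored.
    return {"codekey": groups}


sample = [
    ["①", "Type of Care"],
    ["SA", "Substance use treatment"],
    ["DT", "Detoxification"],
    ["②", "Telemedicine"],
    ["TELE", "Telemedicine/telehealth"],
]
result = group_by_category(sample)
print(json.dumps(result, indent=4, ensure_ascii=False))
```

Working from the last element of groups avoids the separate "append the final category" step after the loop, and an empty input simply yields an empty "codekey" list.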
Answered By: Peter Warrington

This version treats only the circled-number characters as category keys; they are kept in a set for fast lookup. Code keys and values can therefore contain other Unicode characters if needed.
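
If typing the circled characters by hand is error-prone, an equivalent set can be built from code point ranges. This is a sketch under the assumption that the intended characters are U+24EA (⓪), U+2460 to U+2473 (① to ⑳), U+3251 to U+325F (㉑ to ㉟), and U+32B1 to U+32BF (㊱ to ㊿); the names RANGES and CIRCLED_FROM_RANGES are made up for illustration.

```python
# Code point ranges for the circled numbers 0 through 50 (stop values exclusive).
RANGES = [
    (0x24EA, 0x24EB),  # ⓪
    (0x2460, 0x2474),  # ① .. ⑳
    (0x3251, 0x3260),  # ㉑ .. ㉟
    (0x32B1, 0x32C0),  # ㊱ .. ㊿
]

# Expand every range into its characters and collect them in one set.
CIRCLED_FROM_RANGES = {chr(cp) for start, stop in RANGES for cp in range(start, stop)}

print(len(CIRCLED_FROM_RANGES))
print('㉖' in CIRCLED_FROM_RANGES, 'SA' in CIRCLED_FROM_RANGES)
```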

Because codes is a mutable list, appending to it after it has been placed in a category dictionary also updates that category, so each new key/value dictionary lands in the current group.
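
The aliasing this relies on can be seen in isolation. A small sketch with illustrative variable names: the list is stored in the dictionary first, and a later append through the original name is visible through the dictionary, because both names refer to the same list object.

```python
codes = []
category = {'category': {'key': '①', 'value': 'Type of Care'}, 'codes': codes}

# Appending through the original name also changes what the dict holds,
# since category['codes'] and codes are the same list object.
codes.append({'key': 'SA', 'value': 'Substance use treatment'})
print(category['codes'] is codes)
print(len(category['codes']))

# Rebinding the name to a fresh list starts a new group
# without touching the list already stored in the dict.
codes = []
print(len(category['codes']))
```

Rebinding codes to a new list at each category row is what keeps the groups separate from one another.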

The result is written to a JSON file.

import json

CIRCLED = set('⓪①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵㊶㊷㊸㊹㊺㊻㊼㊽㊾㊿')

data = [['①', 'Type of Care'], ['SA', 'Substance use treatment'], ['DT', 'Detoxification'], ['HH', 'Transitional housing, halfway house, or sober home'], ['SUMH', 'Treatment for co-occurring serious mental health illness/serious emotional disturbance and substance use disorders'], ['②', 'Telemedicine'], ['TELE', 'Telemedicine/telehealth'], ['③', 'Service Settings (e.g., Outpatient, Residential, Inpatient, etc.)'], ['HI', 'Hospital inpatient'], ['OP', 'Outpatient'], ['RES', 'Residential'], ['HID', 'Hospital inpatient detoxification'], ['HIT', 'Hospital inpatient treatment'], ['OD', 'Outpatient detoxification'], ['ODT', 'Outpatient day treatment or partial hospitalization'], ['OIT', 'Intensive outpatient treatment'], ['ORT', 'Regular outpatient treatment'], ['RD', 'Residential detoxification'], ['RL', 'Long-term residential'], ['RS', 'Short-term residential'], ['⑰', 'Assessment/Pre-treatment'], ['CMHA', 'Comprehensive mental health assessment'], ['CSAA', 'Comprehensive substance use assessment'], ['ISC', 'Interim services for clients'], ['OPC', 'Outreach to persons in the community'], ['㉖', 'Facility Smoking Policy'], ['SMON', 'Smoking not permitted'], ['SMOP', 'Smoking permitted without restriction'], ['SMPD', 'Smoking permitted in designated area']]

result = []
for key, value in data:
    if key in CIRCLED:
        codes = []
        result.append({'category': {'key': key, 'value': value}, 'codes': codes})
    else:
        codes.append({'key': key, 'value': value})

with open('output.json', 'w', encoding='utf8') as file:
    json.dump({'codekey': result}, file, indent=2, ensure_ascii=False)

output.json

{
  "codekey": [
    {
      "category": {
        "key": "①",
        "value": "Type of Care"
      },
      "codes": [
        {
          "key": "SA",
          "value": "Substance use treatment"
        },
        {
          "key": "DT",
          "value": "Detoxification"
        },
        {
          "key": "HH",
          "value": "Transitional housing, halfway house, or sober home"
        },
        {
          "key": "SUMH",
          "value": "Treatment for co-occurring serious mental health illness/serious emotional disturbance and substance use disorders"
        }
      ]
    },
    {
      "category": {
        "key": "②",
        "value": "Telemedicine"
      },
      "codes": [
        {
          "key": "TELE",
          "value": "Telemedicine/telehealth"
        }
      ]
    },
    {
      "category": {
        "key": "③",
        "value": "Service Settings (e.g., Outpatient, Residential, Inpatient, etc.)"
      },
      "codes": [
        {
          "key": "HI",
          "value": "Hospital inpatient"
        },
        {
          "key": "OP",
          "value": "Outpatient"
        },
        {
          "key": "RES",
          "value": "Residential"
        },
        {
          "key": "HID",
          "value": "Hospital inpatient detoxification"
        },
        {
          "key": "HIT",
          "value": "Hospital inpatient treatment"
        },
        {
          "key": "OD",
          "value": "Outpatient detoxification"
        },
        {
          "key": "ODT",
          "value": "Outpatient day treatment or partial hospitalization"
        },
        {
          "key": "OIT",
          "value": "Intensive outpatient treatment"
        },
        {
          "key": "ORT",
          "value": "Regular outpatient treatment"
        },
        {
          "key": "RD",
          "value": "Residential detoxification"
        },
        {
          "key": "RL",
          "value": "Long-term residential"
        },
        {
          "key": "RS",
          "value": "Short-term residential"
        }
      ]
    },
    {
      "category": {
        "key": "⑰",
        "value": "Assessment/Pre-treatment"
      },
      "codes": [
        {
          "key": "CMHA",
          "value": "Comprehensive mental health assessment"
        },
        {
          "key": "CSAA",
          "value": "Comprehensive substance use assessment"
        },
        {
          "key": "ISC",
          "value": "Interim services for clients"
        },
        {
          "key": "OPC",
          "value": "Outreach to persons in the community"
        }
      ]
    },
    {
      "category": {
        "key": "㉖",
        "value": "Facility Smoking Policy"
      },
      "codes": [
        {
          "key": "SMON",
          "value": "Smoking not permitted"
        },
        {
          "key": "SMOP",
          "value": "Smoking permitted without restriction"
        },
        {
          "key": "SMPD",
          "value": "Smoking permitted in designated area"
        }
      ]
    }
  ]
}
Answered By: Mark Tolonen