Conversion of dataframe to JSON

Question:

I want to convert a dataframe from this format into JSON

Dataframe Table

Region Province Municipality Barangay
Region 1 Province 1 Municipality 1 Barangay 1
Region 1 Province 1 Municipality 1 Barangay 2
Region 1 Province 1 Municipality 2 Barangay 3
Region 1 Province 1 Municipality 2 Barangay 4
Region 1 Province 2 Municipality 3 Barangay 5
Region 1 Province 2 Municipality 3 Barangay 6
Region 1 Province 2 Municipality 4 Barangay 7
Region 1 Province 2 Municipality 4 Barangay 8

JSON Format:

regions = [
  {
    name: 'Region 1',
    provinces: [
      {
        name: 'Province 1',
        municipalities: [
          {
            name: 'Municipality 1',
            barangays: [
              'Barangay 1',
              'Barangay 2',
              // Add more barangays here
            ]
          },
          {
            name: 'Municipality 2',
            barangays: [
              'Barangay 3',
              'Barangay 4',
              // Add more barangays here
            ]
          },
          // Add more municipalities here
        ]
      },
      {
        name: 'Province 2',
        municipalities: [
          {
            name: 'Municipality 3',
            barangays: [
              'Barangay 5',
              'Barangay 6',
              // Add more barangays here
            ]
          },
          {
            name: 'Municipality 4',
            barangays: [
              'Barangay 7',
              'Barangay 8',
              // Add more barangays here
            ]
          },
          // Add more municipalities here
        ]
      },
      // Add more provinces here
    ]
  }
  // Add more regions here
];

I tried df.to_json(orient="records"). And I tried split, records, index, columns, values, table parameters but not the way I need. It will be used for dependent dropdown in form.

Asked By: Sir_Chief

||

Answers:

As far as I know, native pandas functions does not provide this type of breakdown. You can however loop through the regions, provinces and municipalities with groupby, and build a dictionary manually (which then can be translated to json with something like json.dumps

out = []
for region, region_data in df.groupby("Region"):
    region_dct = {"name": region, "provinces": []}
    for province, province_data in region_data.groupby("Province"):
        province_dct = {"name": province, "municipalities": []}
        for municipality, municipality_data in province_data.groupby("Municipality"):
            municipality_dct = {"name": municipality, "barangays": municipality_data["Barangay"].tolist()}
            province_dct["municipalities"].append(municipality_dct)
        region_dct["provinces"].append(province_dct)
    out.append(region_dct)
print(out)
[{'name': 'Region 1',
  'provinces': [{'municipalities': [{'barangays': ['Barangay 1', 'Barangay 2'],
                                     'name': 'Municipality 1'},
                                    {'barangays': ['Barangay 3', 'Barangay 4'],
                                     'name': 'Municipality 2'}],
                 'name': 'Province 1'},
                {'municipalities': [{'barangays': ['Barangay 5', 'Barangay 6'],
                                     'name': 'Municipality 3'},
                                    {'barangays': ['Barangay 7', 'Barangay 8'],
                                     'name': 'Municipality 4'}],
                 'name': 'Province 2'}]}]
Answered By: oskros