Parsing nested JSON from API in Python

Question:

I’m working on JSON data from this API call:
https://api.nfz.gov.pl/app-umw-api/agreements?year=2022&branch=01&productCode=01.0010.094.01&page=1&limit=10&format=json&api-version=1.2

This is page 1, but there are 49 pages in total, therefore a part of my code deals (successfully) with pagination. I don’t want to save this JSON in a file and, if I can avoid it, don’t really want to import the ‘json’ package – but will do if necessary.

A variation of this code works correctly if I’m pulling the entire [‘data’][‘agreements’] dictionary (or is it a list…).
But I don’t want that. I want individual parameters for all the ‘attributes’ of each ‘agreement’. In my code below I’m trying to pull the ‘provider-name’ attribute, and would like to get a list of all the provider names, without any other data there.

But I keep getting the "list indices must be integers or slices, not str" error in line 18. I’ve tried many ways to get this data which is nested within a list nested within a dictionary, etc. like splitting it further into another ‘for’ loop, but no success.

import requests
import math
import pandas as pd


baseurl = 'https://api.nfz.gov.pl/app-umw-api/agreements?year=2022&branch=01&productCode=01.0010.094.01&page=1&limit=10&format=json&api-version=1.2'

def main_request(baseurl, x):
    r = requests.get(baseurl + f'&page={x}')
    return r.json()

def get_pages(response):
    return math.ceil(response['meta']['count'] / 10)

def get_names(response):
    providerlist = []
    all_data = response['data']['agreements']
    for attributes1 in all_data ['data']['agreements']:
        item = attributes1['attributes']['provider-name']
        providers = {
            'page1': item,
        }

    providerlist.append(providers)
    return providerlist

mainlist = []
data = main_request(baseurl, 1)
for x in range(1,get_pages(data)+1):
    mainlist.extend(get_names(main_request(baseurl, x)))

mydataframe = pd.DataFrame(mainlist)

print(mydataframe)
Asked By: Michael Wiz

Answers:

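The error comes from the loop in get_names: all_data is already the list response['data']['agreements'], so indexing it again with ['data'] passes a string index to a list. To load the JSON data into a DataFrame, you can use the following example: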

import requests
import pandas as pd


api_url = "https://api.nfz.gov.pl/app-umw-api/agreements?year=2022&branch=01&productCode=01.0010.094.01&page={}&limit=10&format=json&api-version=1.2"

all_data = []
for page in range(1, 5): # <-- increase page numbers here
    data = requests.get(api_url.format(page)).json()

    for a in data["data"]["agreements"]:
        all_data.append({"id": a["id"], **a["attributes"], "link": a["links"]['related']})

df = pd.DataFrame(all_data)
print(df.head().to_markdown(index=False))  # to_markdown needs the optional "tabulate" package

Prints:

| id | code | technical-code | origin-code | service-type | service-name | amount | updated-at | provider-code | provider-nip | provider-regon | provider-registry-number | provider-name | provider-place | year | branch | link |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 75f1b5a0-34d1-d827-8970-89b6b593be86 | 0113/3202010/01/2022/01 | 0113/3202010/01/2022/01 | 0113/3202010/01/2022/01 | 01 | Podstawowa Opieka Zdrowotna | 14583.7 | 2022-07-11T20:04:39 | 3202010 | 8851039259 | 89019398100026 | 000000001951-W-02 | NZOZ PRAKTYKA LEKARZA RODZINNEGO JAN WOLAŃCZYK | JEDLINA-ZDRÓJ | 2022 | 01 | https://api.nfz.gov.pl/app-umw-api/agreements/75f1b5a0-34d1-d827-8970-89b6b593be86?format=json&api-version=1.2 |
| 1840cf6e-10ba-33a1-81f1-9f58c613d705 | 0113/3302665/01/2022/01 | 0113/3302665/01/2022/01 | 0113/3302665/01/2022/01 | 01 | Podstawowa Opieka Zdrowotna | 1479 | 2022-08-03T20:00:22 | 3302665 | 9281731555 | 390737391 | 000000023969-W-02 | NZOZ "MEDICA" | PĘCŁAW | 2022 | 01 | https://api.nfz.gov.pl/app-umw-api/agreements/1840cf6e-10ba-33a1-81f1-9f58c613d705?format=json&api-version=1.2 |
| 954eb365-e232-fd29-10f7-c8af21c07470 | 0113/3402005/01/2022/01 | 0113/3402005/01/2022/01 | 0113/3402005/01/2022/01 | 01 | Podstawowa Opieka Zdrowotna | 1936 | 2022-09-02T20:01:17 | 3402005 | 6121368883 | 23106871400021 | 000000002014-W-02 | PRZYCHODNIA OGÓLNA TSARAKHOV OLEG | BOLESŁAWIEC | 2022 | 01 | https://api.nfz.gov.pl/app-umw-api/agreements/954eb365-e232-fd29-10f7-c8af21c07470?format=json&api-version=1.2 |
| 7dd72607-ab9f-7217-87b9-8e4ed2bc5537 | 0113/3202025/01/2022/01 | 0113/3202025/01/2022/01 | 0113/3202025/01/2022/01 | 01 | Podstawowa Opieka Zdrowotna | 0 | 2022-04-14T20:01:42 | 3202025 | 8851557014 | 891487450 | 000000002063-W-02 | "PRZYCHODNIA LEKARSKA ZDROWIE BIELAK, PIEC I SZYMANIAK SPÓŁKA PARTNERSKA" | NOWA RUDA | 2022 | 01 | https://api.nfz.gov.pl/app-umw-api/agreements/7dd72607-ab9f-7217-87b9-8e4ed2bc5537?format=json&api-version=1.2 |
| bb60b21d-38da-1f2e-a7fd-5a45453e7370 | 0113/3102115/01/2022/01 | 0113/3102115/01/2022/01 | 0113/3102115/01/2022/01 | 01 | Podstawowa Opieka Zdrowotna | 414 | 2022-10-18T20:01:17 | 3102115 | 8941504470 | 93009444900038 | 000000001154-W-02 | PRAKTYKA LEKARZA RODZINNEGO WALDEMAR CHRYSTOWSKI | WROCŁAW | 2022 | 01 | https://api.nfz.gov.pl/app-umw-api/agreements/bb60b21d-38da-1f2e-a7fd-5a45453e7370?format=json&api-version=1.2 |
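
If only the provider names are wanted, as the question asks, something along these lines may be enough. It is a minimal sketch, not a tested client for this API: the helper names (provider_names, page_count, all_provider_names) and the fetch parameter are hypothetical, and it assumes, as the question's own get_pages does, that meta.count holds the total number of records and that each page carries at most 10 of them.

```python
import math
from typing import Callable


def provider_names(payload: dict) -> list[str]:
    """Return the provider-name of every agreement on one page.

    payload["data"]["agreements"] is a list of dicts, so it is iterated
    directly; each element is then indexed with string keys.
    """
    return [
        agreement["attributes"]["provider-name"]
        for agreement in payload["data"]["agreements"]
    ]


def page_count(payload: dict, limit: int = 10) -> int:
    """Number of pages, assuming meta.count is the total record count."""
    return math.ceil(payload["meta"]["count"] / limit)


def all_provider_names(fetch: Callable[[int], dict], limit: int = 10) -> list[str]:
    """Collect provider names from every page.

    fetch is any callable that takes a page number and returns the decoded
    JSON for that page (hypothetical parameter, supplied by the caller).
    """
    first = fetch(1)  # page 1 also tells us how many pages exist
    names = provider_names(first)
    for page in range(2, page_count(first, limit) + 1):
        names.extend(provider_names(fetch(page)))
    return names
```

With the api_url template from the example above, fetch could be lambda page: requests.get(api_url.format(page)).json(), and pd.DataFrame({"provider-name": all_provider_names(fetch)}) would then give a one-column DataFrame. Taking the fetch step as a parameter keeps the parsing logic separate from the network call, so it can be tried on a saved or hand-written response first.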
Answered By: Andrej Kesely