Iterate through a JSON file in Python

Question:

I'm iterating through a JSON file and would like to return the average number of documents per day over the last three periods for packages in ['ENTERPRISE', 'FLEXIBLE'].

ex. 64

HINT:

  1. Changes outside the function body are not allowed.

  2. Additional imports are not allowed.

  3. The data may have gaps in the range of periods, e.g. 2022-01, 2022-03 (February missing); in that case we assume the item has 0 documents in the missing period.
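One way to handle hint 3 is to rebuild the full month range and fill in zeros for the months that are absent. This is a minimal sketch, assuming periods are always "YYYY-MM" strings; fill_missing_periods is a hypothetical helper name that is not part of the task, and it uses no imports:

```python
def fill_missing_periods(totals):
    """Return a copy of `totals` (period -> document count) in which every
    month between the earliest and the latest period is present,
    with 0 for months that were missing."""
    if not totals:
        return {}
    # "YYYY-MM" strings sort chronologically, so min/max work directly.
    first, last = min(totals), max(totals)
    year, month = int(first[:4]), int(first[5:7])
    last_year, last_month = int(last[:4]), int(last[5:7])

    filled = {}
    while (year, month) <= (last_year, last_month):
        period = f"{year:04d}-{month:02d}"
        filled[period] = totals.get(period, 0)
        # Step to the next calendar month, rolling over after December.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return filled


print(fill_missing_periods({"2022-01": 10, "2022-03": 5}))
# {'2022-01': 10, '2022-02': 0, '2022-03': 5}
```

Comparing (year, month) tuples keeps the loop correct across a year boundary, for example from 2019-12 to 2020-02.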

A sample of the data looks like this:

import json
data_in =   [
    {
        "package": "FLEXIBLE",
        "created": "2020-03-10T00:00:00",
        "summary": [
            {
                "period": "2019-12",
                "documents": {
                    "incomes": 63,
                    "expenses": 13
                }
            },
            {
                "period": "2020-02",
                "documents": {
                    "incomes": 45,
                    "expenses": 81
                }
            }
        ]
    },
    {
        "package": "ENTERPRISE",
        "created": "2020-04-19T00:00:00",
        "summary": [
            {
                "period": "2020-01",
                "documents": {
                    "incomes": 15,
                    "expenses": 52
                }
            },
            {
                "period": "2020-02",
                "documents": {
                    "incomes": 76,
                    "expenses": 47
                }
            }
        ]
    },
    {
        'package': 'FLEXIBLE',
        'created': '2020-01-15T00:00:00',
        'summary': [
            {
                'period': '2020-03',
                'documents': {
                    'incomes': 39, 
                    'expenses': 48
                }
            },
            {
                'period': '2020-04', 
                'documents': {
                    'incomes': 76, 
                    'expenses': 20
                }
            }
        ]
    },
    
    {
        'package': 'INTERNAL',
        'created': '2020-01-07T00:00:00',
        'summary': [
            {
                'period': '2019-12',
                'documents': {
                    'incomes': 4, 
                    'expenses': 53
                }
            },
            {
                'period': '2020-01', 
                'documents': {
                    'incomes': 60, 
                    'expenses': 48
                }
            },
            {
                'period': '2020-02', 
                'documents': {
                    'incomes': 88, 
                    'expenses': 85
                }
            },
            {
                'period': '2020-03', 
                'documents': {
                    'incomes': 84, 
                    'expenses': 81
                }
            }
        ]
    },
    
#  {'package': 'ENTERPRISE',
#   'created': '2020-01-03T00:00:00',
#   'summary': [{'period': '2020-04',
#     'documents': {'incomes': 27, 'expenses': 13}}]},
#  {'package': 'TRIAL',
#   'created': '2019-12-30T00:00:00',
#   'summary': [{'period': '2019-12',
#     'documents': {'incomes': 89, 'expenses': 21}},
#    {'period': '2020-01', 'documents': {'incomes': 55, 'expenses': 32}},
#    {'period': '2020-02', 'documents': {'incomes': 17, 'expenses': 73}},
#    {'period': '2020-03', 'documents': {'incomes': 30, 'expenses': 47}},
#    {'period': '2020-04', 'documents': {'incomes': 16, 'expenses': 42}}]},
   ]

Code for the task is here:

def task_3(data_in):
    '''
        Return average(integer) number of documents per day
        in last three periods
        for package in ['ENTERPRISE', 'FLEXIBLE']
        ex. 64
    '''
    result = {}
    iii = []
    for record in data_in:
        if record['package'] == 'FLEXIBLE' or record['package'] == 'ENTERPRISE':
            # iii.append(record['created'][:10])
            # print(iii)
            for period_data in record["summary"]:
                print(period_data)

#         if period_data["period"] not in result:
#             result[period_data["period"]] = {"incomes": 0, "expenses": 0, "total": 0}
#         result[period_data["period"]]["incomes"] += period_data["documents"]["incomes"]
#         result[period_data["period"]]["expenses"] += period_data["documents"]["expenses"]

# result = {key: {**value, "total": sum(value.values())/(len(value)-1)} for key, value in result.items()}
# result

I don't know how to return the average number of documents per day over the last three periods for packages in ['ENTERPRISE', 'FLEXIBLE'].

Asked By: Marek Grzesiak


Answers:

Sum the documents (incomes plus expenses) for each period, then divide the overall total by the number of periods to get the average:

result = {}
for record in data_in:
    if record['package'] in ['FLEXIBLE', 'ENTERPRISE']:
        for data in record['summary']:
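            if data['period'] not in result:
                result[data['period']] = 0
            # The key in the sample data is 'incomes' (plural)
            result[data['period']] += data['documents']['incomes'] + data['documents']['expenses']
# Note: len(result) is the number of distinct periods, not calendar days
no_of_days = len(result)
avg_package_per_day = sum(result.values())/no_of_days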
Answered By: sahasrara62

Try this

result = {}
for record in data_in:
    if record['package'] in ['FLEXIBLE', 'ENTERPRISE']:
        for period_data in record["summary"]:
            if period_data["period"] not in result:
                result[period_data["period"]] = 0
            result[period_data['period']] += period_data['documents']['incomes'] + period_data['documents']['expenses']
no_of_days = len(result)
print(len(result))
avg_package_per_day = sum(result.values())/no_of_days
avg_package_per_day
Answered By: Mar3eczek17
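Note that dividing by len(result) gives an average per period, not per day. The following is a minimal sketch of a per-day version, not a definitive solution. It rests on assumptions the question does not confirm: the "last three periods" are taken to be the latest period found among the selected packages plus the two calendar months before it, a missing month counts as 0 documents, and the result is rounded down. Everything stays inside the function body and no imports are used; the sample list is made up for illustration.

```python
def task_3(data_in):
    """Average (integer) number of documents per day in the last three
    periods for packages ENTERPRISE and FLEXIBLE."""
    # Total documents per period for the selected packages.
    totals = {}
    for record in data_in:
        if record["package"] in ("ENTERPRISE", "FLEXIBLE"):
            for item in record["summary"]:
                docs = item["documents"]
                totals[item["period"]] = (
                    totals.get(item["period"], 0)
                    + docs["incomes"] + docs["expenses"]
                )
    if not totals:
        return 0

    # Assumption: the window ends at the latest period seen.
    latest = max(totals)  # "YYYY-MM" strings sort chronologically
    year, month = int(latest[:4]), int(latest[5:7])

    documents = 0
    days = 0
    for _ in range(3):
        period = f"{year:04d}-{month:02d}"
        documents += totals.get(period, 0)  # a missing period counts as 0
        # Days in this month, computed by hand because imports are not allowed.
        if month == 2:
            leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
            days += 29 if leap else 28
        elif month in (4, 6, 9, 11):
            days += 30
        else:
            days += 31
        # Step back one calendar month, rolling over before January.
        month -= 1
        if month == 0:
            year, month = year - 1, 12

    return documents // days  # integer result, rounded down


# Made-up sample: 2020-03 is missing for FLEXIBLE, INTERNAL is ignored.
sample = [
    {"package": "FLEXIBLE", "summary": [
        {"period": "2020-02", "documents": {"incomes": 45, "expenses": 81}},
        {"period": "2020-04", "documents": {"incomes": 76, "expenses": 20}},
    ]},
    {"package": "INTERNAL", "summary": [
        {"period": "2020-03", "documents": {"incomes": 84, "expenses": 81}},
    ]},
]
print(task_3(sample))  # 222 documents over 29 + 31 + 30 = 90 days -> 2
```

If the task instead means the last three periods of each item separately, the same month-stepping loop could be applied per record before averaging.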