Convert a print statement to a dictionary in Pandas

Question:

Here I am comparing a data frame to lists of standard values (seen below). Instead of printing each error, I would like to store the errors in a dictionary. Here is the code I have so far:

valid= {'Industry': ['Automotive', 'Banking / Finance','Biotech / Pharma','Commercial Buildings','Construction / Distribution',
                  'Consumer Products','Education','Education - K-12','Education - University / Higher','Entertainment / Media','Financial',
                  'Food & Beverage','Gas','Government','Government - Federal','Government - State / Local','Healthcare','High Security',
                  'Hospitality / Entertainment','Manufacturing / Communications','Other','Petrochem / Energy',
                  'Property Management / Real Estate','Public Facility / Non-Profit','Residential','Restaurant','Retail','Services - B2B',
                  'Technology','Telecom / Utilities','Transportation','Utilities','Food Retail','Specialized Retail','IT','Corrections',
                  'Core Commercial (SME)'],
        'SME Vertical': ['Agriculture, Food and Manufacturing','Architectural services','Arts, entertainment and recreation','Automobile',
                'Chemistry / Pharmacy','Construction','Education','Hotels','Offices','Other Industries','Other Services',
                'Project management and design','Real Estate and promotion','Restaurants, Café and Bars',
                'Energy, Infrastructure, Environment and Mining','Financial and Insurance Services',
                'Human health and social work activities','Professional, scientific, technical and communication activities',
                'Public administration and defence, compulsory social security','Retail/Wholesale','Transport, Logistics and Storage'],
        'System Type': ['Access','Access Control','Alarm Systems','Asset Tracking','Banking','Commander','EAS','Financial products','Fire',
                    'Fire Alarm','Integrated Solution','Intercom','Intercom systems','Intrusion - Traditional','Locking devices & Systems',
                    'Locks & Safes','Paging','Personal Safety','Retail & EAS Products','SaaS','SATS','Services',
                    'Sonitrol Integrated Solution','Sonitrol - Integrated Solution','Sonitrol - Managed Access',
                    'Sonitrol - Verified Audio Intrusion','Time & Attendance','TV-Distribution','Unknown','Video','Video Systems'],
        'Account Type': ['Commercial','International','National','Regional','Reseller','Residential','Small']}
 
mask = df1.apply(lambda c: c.isin(valid[c.name]))
df1.mask(mask|df1.eq(' ')).stack()
 
for r, v in df1.mask(mask|df1.eq(' ')).stack().items():  # iteritems() was removed in pandas 2.0
    print(f'error found in row "{r[0]}", column "{r[1]}": "{v}" is invalid')

Here is the current output of the print statements

error found in row "1", column "Industry": "gas" is invalid
error found in row "1", column "SME Vertical": "hotels" is invalid
error found in row "2", column "Industry": "healthcare" is invalid
error found in row "3", column "Industry": "other" is invalid
error found in row "3", column "SME Vertical": "project management and design" is invalid
error found in row "4", column "Account Type": "small" is invalid

This output is good in terms of the format but I can’t get it to write to a dictionary.

Example output from the dictionary:

{row “1”: column: "Industry", message: "gas" is invalid, .... etc}
Asked By: Test Code


Answers:

This is straightforward, but YOU need to decide what the format will be. What you have shown above is not a valid dictionary.

Maybe like this, as a list of dictionaries, one for each error?

errors = []
for r, v in df1.mask(mask|df1.eq(' ')).stack().items():  # iteritems() was removed in pandas 2.0
    errors.append({
        "row": r[0],
        "column": r[1],
        "message": v + " is invalid"
    })
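
For anyone who wants to try this end to end, here is a minimal sketch. The small `valid` and `df1` below are made-up stand-ins for the question's data, and `invalid` and `by_row` are names chosen for illustration. The explicit `dropna()` is a precaution, since `stack()` may keep NaN entries depending on the pandas version. The second loop shows one possible nested shape keyed by row, which seems close to what the question sketches.

```python
import pandas as pd

# Hypothetical, shortened stand-ins for the question's `valid` and `df1`.
valid = {
    "Industry": ["Gas", "Healthcare", "Other"],
    "Account Type": ["Commercial", "Small"],
}
df1 = pd.DataFrame(
    {
        "Industry": ["Gas", "gas", "healthcare"],
        "Account Type": ["Commercial", " ", "small"],
    }
)

# True where a cell holds an allowed value for its column.
mask = df1.apply(lambda c: c.isin(valid[c.name]))

# Blank out valid and single-space cells, then stack into a Series indexed
# by (row, column). dropna() makes sure only the invalid values remain.
invalid = df1.mask(mask | df1.eq(" ")).stack().dropna()

# Option 1: a list with one dictionary per error.
errors = [
    {"row": row, "column": col, "message": f'"{value}" is invalid'}
    for (row, col), value in invalid.items()
]

# Option 2: a nested dictionary, row -> column -> message.
by_row = {}
for (row, col), value in invalid.items():
    by_row.setdefault(row, {})[col] = f'"{value}" is invalid'

print(errors)
print(by_row)
```

The list form keeps every error as an independent record, which is convenient for `pd.DataFrame(errors)`. The nested form makes it easy to look up everything that is wrong with a given row.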
Answered By: Tim Roberts

How about something like this (example)?

Code


import json

d = {}
d['error found in row "1", column "Industry"'] = []
d['error found in row "1", column "Industry"'].append('"gas" is invalid')
d['error found in row "1", column "Industry"'].append('"hotels" is invalid')

print(json.dumps(d, indent=4))

Output

$ python test.py
{
    "error found in row \"1\", column \"Industry\"": [
        "\"gas\" is invalid",
        "\"hotels\" is invalid"
    ]
}
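
The same grouping idea can be written as a loop rather than by hand. This is only a sketch: the `found` list is sample data copied from the question's printed output, and the key format (`"row 1"`) and the name `grouped` are arbitrary choices, not anything required by pandas or json.

```python
import json
from collections import defaultdict

# Sample (row, column, value) triples taken from the question's printed output.
found = [
    (1, "Industry", "gas"),
    (1, "SME Vertical", "hotels"),
    (2, "Industry", "healthcare"),
]

# Collect all errors for a row under one key; defaultdict creates the
# empty list the first time a key is seen.
grouped = defaultdict(list)
for row, column, value in found:
    grouped[f"row {row}"].append(
        {"column": column, "message": f'"{value}" is invalid'}
    )

# json.dumps escapes the inner double quotes in the messages.
report = json.dumps(grouped, indent=4)
print(report)
```

Using plain keys such as `"row 1"` and putting the column and message in the values keeps the dictionary easy to query later, compared with embedding the whole sentence in the key.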
Answered By: Fiddling Bits