Using pandas to parse csv files in python issues

Question:

I’m currently working on a script to help sort items in a csv file. As it can be seen in the code below, I’m employing pandas to read in two specific columns of the csv file and then converting the read data into lists. What I’m trying to accomplish (in a simple, although insufficient, way) is to parse through these two lists which match up according to the following image:

Result of using panda to read in two specific columns of csv file. Note: that on the right column labeled Responsible are department names listed and on the right column labeled Status are status on order put into the given department.

Here is the a copy of the code I’m using to parse through the two lists:

    import pandas as pd

    PHSc = 0
    PHSi = 0
    PHS = 0
    MOB = 0
    MOBc = 0
    MOBi = 0
    PHR = 0
    PHRc = 0
    PHRi = 0
    SYH = 0
    SYHc = 0
    SYHi = 0
    APS = 0
    APSc = 0
    APSi = 0
    ANA = 0
    ANAc = 0
    ANAi = 0
    CKC = 0
    CKCc = 0
    CKCi = 0
    RLG = 0
    RLGc = 0
    RLGi = 0
    DJ = 0
    DJc = 0
    DJi = 0
    EMS = 0
    EMSc = 0
    EMSi = 0
    CTS = 0
    CTSc = 0
    CTSi = 0
    ROT = 0
    ROTc = 0
    ROTi = 0
    SW = 0
    SWc = 0
    SWi = 0
    NU = 0
    NUc = 0
    NUi = 0
    RON = 0
    RONc = 0
    RONi = 0
    MON = 0
    MONc = 0
    MONi = 0
    TJS = 0
    TJSc = 0
    TJSi = 0
    VCH = 0
    VCHc = 0
    VCHi = 0
    NLN = 0
    NLNc = 0
    NLNi = 0
    KND = 0
    KNDc = 0
    KNDi = 0
    SHE = 0
    SHEc = 0
    SHEi = 0
    DOS = 0
    DOSc = 0
    DOSi = 0
    E1 = 0
    E1c = 0
    E1i = 0
    MOC = 0
    MOCc = 0
    MOCi = 0
    AC = 0
    ACc = 0
    ACi = 0
    SL = 0
    SLc = 0
    SLi = 0
    JER = 0
    JERc = 0
    JERi = 0
    BLP = 0
    BLPc = 0
    BLPi = 0

    file = input("Please enter the name of the file (include .csv): ")

    data = pd.read_csv(file, usecols = [23, 26], skiprows = 1, names=['Responsible', 'Status'])
    print(data)

    responsible = data['Responsible'].tolist()
    print(responsible)
    status = data['Status'].tolist()
    print(status)

    for i in responsible:
     for j in status:
      if i == 'PHS' and j == 'Comp':
       PHSc += 1
       PHS += 1
       break
      elif i == 'PHS' and j == 'Active':
       PHSi += 1
       PHS += 1
       break
      elif i == 'MOB' and j == 'Comp':
       MOBc = MOBc + 1
       MOB += 1
       break
      elif i == 'MOB' and j == 'Active':
       MOBi = MOBi + 1
       MOB += 1
       break
      elif i == 'PHR' and j == 'Comp':
       PHRc = PHRc + 1
       PHR += 1
       break
      elif i == 'PHR' and j == 'Active':
       PHRi = PHRi + 1
       PHR += 1
       break
      elif i == 'SYH' and j == 'Comp':
       SYHc = SYHc + 1
       SYH += 1
       break
      elif i == 'SYH' and j == 'Active':
       SYHi = SYHi + 1
       SYH += 1
       break
      elif i == 'APS' and j == 'Comp':
       APSc = APSc + 1
       APS += 1
       break
      elif i == 'APS' and j == 'Active':
       APSi = APSi + 1
       APS += 1
       break
      elif i == 'ANA' and j == 'Comp':
       ANAc = ANAc + 1
       ANA += 1
       break
      elif i == 'ANA' and j == 'Active':
       ANAi = ANAi + 1
       ANA += 1
       break
      elif i == 'CKC' and j == 'Comp':
       CKCc = CKCc + 1
       CKC += 1
       break
     elif i == 'CKC' and j == 'Active':
       CKCi = CKCi + 1
       CKC += 1
       break
     elif i == 'RLG' and j == 'Comp':
      RLGc = RLGc + 1
      RLG += 1
      break
     elif i == 'RLG' and j == 'Active':
      RLGi = RLGi + 1
      RLG += 1
      break
     elif i == 'DJ' and j == 'Comp':
      DJc = DJc + 1
      DJ += 1
      break
     elif i == 'DJ' and j == 'Active':
      DJi = DJi + 1
      DJ += 1
      break
     elif i == 'EMS' and j == 'Comp':
      EMSc = EMSc + 1
      EMS += 1
      break
    elif i == 'EMS' and j == 'Active':
      EMSi = EMSi + 1
      EMS += 1
      break
     elif i == 'CTS' and j == 'Comp':
      CTSc = CTSc + 1
      CTS += 1
      break
     elif i == 'CTS' and j == 'Active':
      CTSi = CTSi + 1
      CTS += 1
      break
     elif i == 'ROT' and j == 'Comp':
      ROTc = ROTc + 1
      ROT += 1
      break
     elif i == 'ROT' and j == 'Active':
      ROTi = ROTi + 1
      ROT += 1
      break
     elif i == 'SW' and j == 'Comp':
      SWc = SWc + 1
      SW += 1
      break
     elif i == 'SW' and j == 'Active':
      SWi = SWi + 1
      SW += 1
      break
     elif i == 'NU' and j == 'Comp':
      NUc = NUc + 1
      NU += 1
      break
     elif i == 'NU' and j == 'Active':
      NUi = NUi + 1
      NU += 1
      break
     elif i == 'RON' and j == 'Comp':
      RONc = RONc + 1
      RON += 1
      break
     elif i == 'RON' and j == 'Active':
      RONi = RONi + 1
      RON += 1
      break
     elif i == 'MON' and j == 'Comp':
      MONc = MONc + 1
      MON += 1
      break
     elif i == 'MON' and j == 'Active':
      MONi = MONi + 1
      MON += 1
      break
     elif i == 'TJS' and j == 'Comp':
      TJSc = TJSc + 1
      TJS += 1
      break
     elif i == 'TJS' and j == 'Active':
      TJSi = TJSi + 1
      TJS += 1
      break
     elif i == 'VCH' and j == 'Comp':
      VCHc = VCHc + 1
      VCH += 1
      break
     elif i == 'VCH' and j == 'Active':
      VCHi = VCHi + 1
      VCH += 1
      break
     elif i == 'NLN' and j == 'Comp':
      NLNc = NLNc + 1
      NLN += 1
      break
     elif i == 'NLN' and j == 'Active':
      NLNi = NLNi + 1
      NLN += 1
      break
     elif i == 'KND' and j == 'Comp':
      KNDc = KNDc + 1
      KND += 1
      break
     elif i == 'KND' and j == 'Active':
      KNDi = KNDi + 1
      KND += 1
      break
     elif i == 'SHE' and j == 'Comp':
      SHEc = SHEc + 1
      SHE += 1
      break
     elif i == 'SHE' and j == 'Active':
      SHEi = SHEi + 1
      SHE += 1
      break
     elif i == 'DOS' and j == 'Comp':
      DOSc = DOSc + 1
      DOS += 1
      break
     elif i == 'DOS' and j == 'Active':
      DOSi = DOSi + 1
      DOS += 1
      break
     elif i == 'E1' and j == 'Comp':
      E1c = E1c + 1
      E1 += 1
      break
     elif i == 'E1' and j == 'Active':
      E1i = E1i + 1
      E1 += 1
      break
     elif i == 'MOC' and j == 'Comp':
      MOCc = MOCc + 1
      MOC += 1
      break
     elif i == 'MOC' and j == 'Active':
      MOCi = MOCi + 1
      MOC += 1
      break
     elif i == 'AC' and j == 'Comp':
      ACc = ACc + 1
      AC += 1
      break
     elif i == 'AC' and j == 'Active':
      ACi = ACi + 1
      AC += 1
      break
     elif i == 'SL' and j == 'Comp':
      SLc = SLc + 1
      SL += 1
      break
     elif i == 'SL' and j == 'Active':
      SLi = SLi + 1
      SL += 1
      break
    elif i == 'JER' and j == 'Comp':
      JERc = JERc + 1
      JER += 1
      break
    elif i == 'JER' and j == 'Active':
      JERi = JERi + 1
      JER += 1
      break
    elif i == 'BLP' and j == 'Comp':
      BLPc = BLPc + 1
      BLP += 1
      break
    elif i == 'BLP' and j == 'Active':
      BLPi = BLPi + 1
      BLP += 1
      break

    print("\nPHS Complete: ", PHSc)
    print("PHS Incomplete: ", PHSi)
    print("PHS Total: ", PHS)

    print("\nMOB Complete: ", MOBc)
    print("MOB Incomplete: ", MOBi)
    print("MOB Total: ", MOB)

    print("\nPHR Complete: ", PHRc)
    print("PHR Incomplete: ", PHRi)
    print("PHR Total: ", PHR)

    print("\nSYH Complete: ", SYHc)
    print("SYH Incomplete: ", SYHi)
    print("SYH Total: ", SYH)

    print("\nAPS Complete: ", APSc)
    print("APS Incomplete: ", APSi)
    print("APS Total: ", APS)

    print("\nANA Complete: ", ANAc)
    print("ANA Incomplete: ", ANAi)
    print("ANA Total: ", ANA)

Code Output

As it can be seen, the value for PHS Complete should be 49 and the value of PHS Incomplete should be 0 as the PHS dept completed all their required work for this given daily report. So I’m unsure what I’m doing wrong here. Same thing is happening with the other print statements. The idea is to parse through both lists and increment an index value every time a dept (for example PHS) has a status of comp. Doing the same and incrementing a different index value every time a dept has a status of Active. A third value is incremented to find the total number of jobs performed in a given day. I’m sure there is a more efficient way to do this but I’m new to python and have a firm background in C++. Any and all suggestions will be appreciated. Thanks in advance!

Asked By: Darsh Dinger

||

Answers:

You can get the counts grouped by the Responsible and Status columns like so:

data.groupby(["Responsible", "Status"]).value_counts().reset_index(name="count")

This will look like:

   Responsible  Status  count
0           AC  Active      6
1           AC    Comp      2
2          ANA  Active      4
3          ANA    Comp      4
4          APS  Active      5
5          APS    Comp      3
6          BLP  Active      3
7          BLP    Comp      5
8          CKC  Active      5
9          CKC    Comp      3
10         CTS  Active      3
11         CTS    Comp      5
12          DJ  Active      5

Since you did not provide sample data that could be copied and pasted, your counts will of course vary.

You can also just get the count for each entry in the Responsible column with:

data["Responsible"].value_counts()

To put it all together and have counts for each Status plus a Total of both, you could do this:

data.groupby(["Responsible", "Status"])
    .value_counts()
    .reset_index(name="Count")
    .pivot_table(values="Count", index="Responsible", columns="Status", aggfunc="sum", margins=True, margins_name="Total")

This will look like:

Status       Active  Comp  Total
Responsible                     
AC                6     2      8
ANA               4     4      8
APS               5     3      8
BLP               3     5      8
CKC               5     3      8
CTS               3     5      8
DJ                5     3      8
DOS               3     5      8
Answered By: Dan