Nested for loop with 2 variables. Output to be appended in dataframe column

Question:

check_df has two columns: one with codes ("V_ORG_UNIT_CODE") and one that is blank ("V_ORG_UNIT_NAME_LEVEL14").
in_df has two columns: "OutputDisplay" and "MergedColumn".

I want to check each code in "V_ORG_UNIT_CODE" from check_df against "MergedColumn" from in_df.
If it matches (the merged value only needs to contain the code; it need not be an exact match), I want the corresponding "OutputDisplay" value written to the empty check_df column "V_ORG_UNIT_NAME_LEVEL14".

check_df

V_ORG_UNIT_CODE V_ORG_UNIT_NAME_LEVEL14
abc
def
gth

in_df

OutputDisplay MergedColumn
123 dasabcraf
456 asfgfdg
567 as0def!gfhg

Expected Output

check_df

V_ORG_UNIT_CODE V_ORG_UNIT_NAME_LEVEL14
abc 123
def 567
gth NA

My code:

for x in check_df["V_ORG_UNIT_CODE"]:
    for y,z in zip(in_df["MergedColumn"],in_df["OutputDisplay"]):
        if (y.__contains__(x)):
            print(z)
            check_df['V_ORG_UNIT_NAME_LEVEL14']=check_df['V_ORG_UNIT_NAME_LEVEL14'].append(z)

My print(z) shows the correct output, but I get an error when I append it to the dataframe column:

TypeError                                 Traceback (most recent call last)
<ipython-input-6-e4f45d7306ae> in <module>
      3 for x in check_df["V_ORG_UNIT_CODE"]:
      4     for y,z in zip(in_df["MergedColumn"],in_df["OutputDisplay"]):
----> 5         if (y.__contains__(x)):
      6 #            print(z)
      7 #            check_df['V_ORG_UNIT_NAME_LEVEL14']=check_df['V_ORG_UNIT_NAME_LEVEL14'].append(z)

TypeError: 'in <string>' requires string as left operand, not int
Asked By: Saumya Shah

Answers:

Check the types of the values on line 5. Both operands must be strings: the error says the left operand of in, which is x here, is an int.
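
A minimal sketch of such a type check, using the sample data from the question. The integer code 789 is a made-up value, added only to reproduce the mixed-type situation; casting both columns with astype(str) is one possible way to keep the substring test from raising the TypeError.

```python
import pandas as pd

# Sample data from the question; the integer code 789 is a hypothetical
# value added here to reproduce a non-string entry.
check_df = pd.DataFrame({"V_ORG_UNIT_CODE": ["abc", "def", 789]})
in_df = pd.DataFrame({
    "OutputDisplay": [123, 456, 567],
    "MergedColumn": ["dasabcraf", "asfgfdg", "as0def!gfhg"],
})

# Inspect the Python type of each value to spot non-string entries.
code_types = [type(v).__name__ for v in check_df["V_ORG_UNIT_CODE"]]
print(code_types)

# Cast both columns to str so that `x in y` always compares two strings.
codes = check_df["V_ORG_UNIT_CODE"].astype(str)
merged = in_df["MergedColumn"].astype(str)

matches = []
for x in codes:
    for y, z in zip(merged, in_df["OutputDisplay"]):
        if x in y:
            matches.append((x, z))
print(matches)
```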

Answered By: llinux

Try the DataFrame built-in method .insert:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.insert.html
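
A rough sketch of how .insert might be used here, assuming check_df starts with only the code column (insert raises ValueError if the column already exists, unless allow_duplicates=True is passed). The names results and found are illustrative, and taking the first match per code is an assumption.

```python
import pandas as pd

# Assumption: check_df holds only the code column at this point.
check_df = pd.DataFrame({"V_ORG_UNIT_CODE": ["abc", "def", "gth"]})
in_df = pd.DataFrame({
    "OutputDisplay": [123, 456, 567],
    "MergedColumn": ["dasabcraf", "asfgfdg", "as0def!gfhg"],
})

# Collect one result per code: the first OutputDisplay whose MergedColumn
# contains the code, or None when nothing matches.
results = []
for code in check_df["V_ORG_UNIT_CODE"]:
    found = None
    for merged, display in zip(in_df["MergedColumn"], in_df["OutputDisplay"]):
        if str(code) in str(merged):
            found = display
            break
    results.append(found)

# insert(loc, column, value) adds a new column at position loc, in place.
check_df.insert(1, "V_ORG_UNIT_NAME_LEVEL14", results)
print(check_df)
```

Because the list mixes numbers with None, pandas may store the new column as floats and show the missing entry as NaN.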

Answered By: Raqun Bob

If I’ve instantiated your dataframe correctly (check below), the following seems to deliver the outcome you’re after:

import pandas as pd

check_df = pd.DataFrame()
in_df = pd.DataFrame()

First we create check_df:

check_df['V_ORG_UNIT_CODE'] = ['abc', 'def', 'gth']
check_df['V_ORG_UNIT_NAME_LEVEL14'] = [None, None, None]

check_df looks like this:

  V_ORG_UNIT_CODE V_ORG_UNIT_NAME_LEVEL14
0             abc                    None
1             def                    None
2             gth                    None

Then we create in_df:

in_df['OutputDisplay'] = [123, 456, 567]
in_df['MergedColumn'] = ['dasabcraf', 'asfgfdg', 'as0def!gfhg']

in_df looks like this:

   OutputDisplay MergedColumn
0            123    dasabcraf
1            456      asfgfdg
2            567  as0def!gfhg

I’ve then kept your loops essentially unchanged, except that I use enumerate to get each item in the first column of check_df together with its position i, and I assign the result with .loc instead of append:

for i, x in enumerate(check_df["V_ORG_UNIT_CODE"]):
    for y, z in zip(in_df["MergedColumn"], in_df["OutputDisplay"]):
        if x in y:
            # i is a position; it matches the row label because check_df uses the default index
            check_df.loc[i, 'V_ORG_UNIT_NAME_LEVEL14'] = z

print(check_df)

Which produces this result:

  V_ORG_UNIT_CODE V_ORG_UNIT_NAME_LEVEL14
0             abc                     123
1             def                     567
2             gth                    None

Is that what you were after?
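
As a possible alternative to the inner loop, here is a sketch that uses Series.str.contains with regex=False for a literal substring test. The helper name first_match is hypothetical, and returning the first matching row is an assumption.

```python
import pandas as pd

check_df = pd.DataFrame({
    "V_ORG_UNIT_CODE": ["abc", "def", "gth"],
    "V_ORG_UNIT_NAME_LEVEL14": [None, None, None],
})
in_df = pd.DataFrame({
    "OutputDisplay": [123, 456, 567],
    "MergedColumn": ["dasabcraf", "asfgfdg", "as0def!gfhg"],
})

def first_match(code):
    """Return OutputDisplay of the first row whose MergedColumn contains code."""
    # regex=False treats the code as a literal substring, not a pattern.
    mask = in_df["MergedColumn"].astype(str).str.contains(str(code), regex=False)
    hits = in_df.loc[mask, "OutputDisplay"]
    return int(hits.iloc[0]) if not hits.empty else None

# dtype=object keeps the integers and None as they are (no float conversion).
check_df["V_ORG_UNIT_NAME_LEVEL14"] = pd.Series(
    [first_match(c) for c in check_df["V_ORG_UNIT_CODE"]],
    index=check_df.index,
    dtype=object,
)
print(check_df)
```

If a code appears in several MergedColumn values, this sketch keeps the first hit, whereas the nested loop without a break keeps the last one.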

Answered By: Vin