DataFrame: match two columns and remove values from the second column that exist in the first column (row number or id does not matter)

Question

In a DataFrame, match two columns: if any value from the second column is also present in the first column, remove that value from the second column.

Output

col1 col2
1
2
3    9
4
5
6

Here, 1 and 2 from col2 also appear in col1, so these repeated values should be removed.

Asked By: parth

Answer 1

Using Series.mask together with isin to match values and replace them, we can do something like:

df['col2'] = df['col2'].mask(pd.to_numeric(df['col2']).isin(df['col1']), "")

   col1 col2
0     1
1     2
2     3  9.0
3     4
4     5
5     6
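
For reference, here is a minimal self-contained sketch of the same mask/isin idea. The input values are an assumption (the question does not show its original data), and it replaces matches with a missing value rather than "" so that col2 stays numeric:

```python
import pandas as pd

# Hypothetical input: the question does not show its original data.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, 6],
    "col2": [2, 1, 9, 1, 2, 1],
})

# True wherever a col2 value appears anywhere in col1 (row position is ignored).
in_col1 = df["col2"].isin(df["col1"])

# mask() replaces the rows where the condition is True; with no replacement
# given, they become NaN, so col2 is upcast to float.
df["col2"] = df["col2"].mask(in_col1)

print(df)
```

Passing "" as the second argument to mask, as in the one-liner above, gives blank cells instead, at the cost of turning col2 into an object column.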

Answered By: Just James

Answer 2

import pandas as pd
col1 = [1, 2, 3, 4, 5, 6]
col2 = [0, 0, 9.0, 0, 0, 0]

df = pd.DataFrame({'col1': col1, 'col2': col2})
# Note: none of these sample col2 values occur in col1, so the output below is unchanged

# iterate over columns
for col in df.columns:
    # remove values that are in previous columns
    for prev_col in df.columns[:df.columns.get_loc(col)]:
        df[col] = df[col].where(~df[col].isin(df[prev_col]), None)

# OUTPUT
#    col1  col2
# 0     1   0.0
# 1     2   0.0
# 2     3   9.0
# 3     4   0.0
# 4     5   0.0
# 5     6   0.0

Answered By: Sudhakar