Extract unique items in a column and map with all items in another column in pandas

Question:

I have a pandas dataframe df which looks like this:

  Col1 Col2  Label
0   D1  C38      1
1   D1  C65      1
2   D1  C53      1
3   D2  C02      1
4   D2  C01      1
5   D4  C73      1
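
For reproducibility, the frame above could be built roughly as follows. This is a minimal sketch: the dtypes are an assumption (Label is taken to be an integer column, Col1 and Col2 plain strings).

```python
import pandas as pd

# Sample data copied from the table above; Label is assumed to be an integer.
df = pd.DataFrame({
    'Col1': ['D1', 'D1', 'D1', 'D2', 'D2', 'D4'],
    'Col2': ['C38', 'C65', 'C53', 'C02', 'C01', 'C73'],
    'Label': [1, 1, 1, 1, 1, 1],
})
print(df)
```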

I want to extract all the unique items from Col1 and map each of them to every item in Col2, except for the items it is already connected to with label 1 in the third column.

For example, D1 in Col1 appears three times, each with label 1 in the Label column. D1 should then be mapped to the remaining items in Col2, i.e. C02, C01 and C73, and these new connections should be added to the same Col1 and Col2 columns with label 0.

The output dataframe should look like this:

   Col1 Col2  Label
0    D1  C38      1
1    D1  C65      1
2    D1  C53      1
3    D1  C02      0
4    D1  C01      0
5    D1  C73      0

6    D2  C02      1
7    D2  C01      1
8    D2  C38      0
9    D2  C65      0
10   D2  C53      0
11   D2  C73      0

12   D4  C73      1
13   D4  C02      0
14   D4  C01      0
15   D4  C38      0
16   D4  C65      0
17   D4  C53      0

Is there a way to do this? I would appreciate your suggestions.

Asked By: botloggy


Answers:

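Here is one option. It keeps the Col2 values in their order of appearance within each Col1 group, with the existing (label 1) pairs first: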

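import pandas as pd

cols = ['Col1', 'Col2']
# every (Col1, Col2) combination, in order of first appearance
idx = pd.MultiIndex.from_product([df[c].unique() for c in cols], names=cols)

out = (df
 # existing pairs keep their Label; missing pairs are filled with 0
 .set_index(cols).reindex(idx, fill_value=0).reset_index()
 # within each Col1, put Label 1 rows first; the stable sort keeps Col2 order
 .sort_values(by=['Col1', 'Label'], ascending=[True, False],
              kind='stable', ignore_index=True)
)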

Or, if Col1 is not sorted and the groups must stay in their original order:

cols = ['Col1', 'Col2']
idx = pd.MultiIndex.from_product([df[c].unique() for c in cols], names=cols)

out = (df
 .set_index(cols).reindex(idx, fill_value=0).reset_index()
 # sort=False keeps the groups in their order of first appearance
 .groupby('Col1', group_keys=False, sort=False)
 .apply(lambda g: g.sort_values(by='Label', ascending=False, kind='stable'))
 .reset_index(drop=True)
)
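
The same order-preserving result can also be sketched without `groupby(...).apply`, which recent pandas releases deprecate when the applied frame still contains the grouping column. The variant below is an alternative technique, not the code above: it sorts on a temporary helper column holding the position of first appearance of each Col1 value. The helper column name `_order` and the reordered sample data (D2 listed first) are made up for illustration.

```python
import pandas as pd

# Hypothetical input: the same pairs as in the question, but with D2 listed
# first so that Col1 is not in sorted order.
df = pd.DataFrame({
    'Col1': ['D2', 'D2', 'D1', 'D1', 'D1', 'D4'],
    'Col2': ['C02', 'C01', 'C38', 'C65', 'C53', 'C73'],
    'Label': [1, 1, 1, 1, 1, 1],
})

cols = ['Col1', 'Col2']
idx = pd.MultiIndex.from_product([df[c].unique() for c in cols], names=cols)
full = df.set_index(cols).reindex(idx, fill_value=0).reset_index()

# '_order' (hypothetical helper name) numbers each Col1 value by first appearance
out = (full
 .assign(_order=pd.factorize(full['Col1'])[0])
 # groups in order of appearance, Label 1 rows first within each group
 .sort_values(by=['_order', 'Label'], ascending=[True, False],
              kind='stable', ignore_index=True)
 .drop(columns='_order')
)
print(out)
```

Sorting on the helper column instead of on Col1 itself is what keeps D2 ahead of D1 here; sorting on Col1 directly would put D1 first.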

output:

   Col1 Col2  Label
0    D1  C38      1
1    D1  C65      1
2    D1  C53      1
3    D1  C02      0
4    D1  C01      0
5    D1  C73      0
6    D2  C02      1
7    D2  C01      1
8    D2  C38      0
9    D2  C65      0
10   D2  C53      0
11   D2  C73      0
12   D4  C73      1
13   D4  C38      0
14   D4  C65      0
15   D4  C53      0
16   D4  C02      0
17   D4  C01      0
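
One way to check a result like the one above against the requirements in the question is sketched below. It rebuilds the sample frame (assuming Label is an integer column), applies the first snippet, and verifies three properties. The particular checks are an illustrative choice, not something the question specifies.

```python
import pandas as pd

# Sample data from the question; Label is assumed to be an integer.
df = pd.DataFrame({
    'Col1': ['D1', 'D1', 'D1', 'D2', 'D2', 'D4'],
    'Col2': ['C38', 'C65', 'C53', 'C02', 'C01', 'C73'],
    'Label': [1, 1, 1, 1, 1, 1],
})

cols = ['Col1', 'Col2']
idx = pd.MultiIndex.from_product([df[c].unique() for c in cols], names=cols)
out = (df
 .set_index(cols).reindex(idx, fill_value=0).reset_index()
 .sort_values(by=['Col1', 'Label'], ascending=[True, False],
              kind='stable', ignore_index=True)
)

# 1. every Col1 value is paired with every Col2 value exactly once
n_expected = df['Col1'].nunique() * df['Col2'].nunique()
assert len(out) == n_expected
assert not out.duplicated(subset=cols).any()

# 2. exactly the original pairs carry Label 1
original = set(zip(df['Col1'], df['Col2']))
ones = out[out['Label'] == 1]
assert set(zip(ones['Col1'], ones['Col2'])) == original

# 3. within each Col1 group, the Label 1 rows come before the Label 0 rows
assert out.groupby('Col1')['Label'].apply(lambda s: s.is_monotonic_decreasing).all()
```

Note that `reindex` on a MultiIndex requires the (Col1, Col2) pairs in the input to be unique, so duplicated pairs would need to be dropped or aggregated first.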
Answered By: mozway