Pandas DataFrame – Removing repeated/duplicate columns but keeping the values

Question:

I have a dataframe with duplicate column names. I want to remove the repeated columns, but I need to keep their values.

I want to remove the C and D columns at the end, but move their values into the same row of the first C and D columns.

df = df.loc[:,~df.columns.duplicated(keep='first')]

I tried this code. It removes the duplicate columns and keeps the first of each, but it also discards the values held in the removed columns.

Asked By: jellobeann

Answers:

Example

First, build a minimal, reproducible example for the answer:

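import pandas as pd

# Columns B and D appear twice; each row has a value in only one of the duplicates
data = [[0, 1, 2, 3, None, None],
        [1, None, 3, None, 2, 4],
        [2, 3, 4, 5, None, None]]
df = pd.DataFrame(data, columns=list('ABCDBD'))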

df

    A   B   C   D   B   D
0   0   1.0 2   3.0 NaN NaN
1   1   NaN 3   NaN 2.0 4.0
2   2   3.0 4   5.0 NaN NaN

Code

df.groupby(level=0, axis=1).first()

result:

    A   B   C   D
0   0.0 1.0 2.0 3.0
1   1.0 2.0 3.0 4.0
2   2.0 3.0 4.0 5.0
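
Note for newer pandas versions: as far as I know, the axis=1 argument of DataFrame.groupby is deprecated since pandas 2.1, so the line above may warn or fail depending on your version. A minimal sketch of an equivalent that avoids axis=1 is to transpose, group by the row labels, and transpose back. It uses the same example data as above; the variable name result is only for illustration.

```python
import pandas as pd

# Same example data as above: B and D are duplicated column names
data = [[0, 1, 2, 3, None, None],
        [1, None, 3, None, 2, 4],
        [2, 3, 4, 5, None, None]]
df = pd.DataFrame(data, columns=list('ABCDBD'))

# Transpose so the duplicated column names become index labels,
# take the first non-null value per label, then transpose back
result = df.T.groupby(level=0).first().T
print(result)
```

As with the original one-liner, first() takes the first non-null value in each group, so a row that has values in both duplicates keeps only the leftmost one.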
Answered By: Panda Kim