Merge rows if cells are equal in pandas

Question:

I have this df:

import pandas as pd
df = pd.DataFrame({'Time' : ['s_1234','s_1234', 's_1234', 's_5678', 's_8998','s_8998' ],
                   'Control' : ['A', '', '','B', 'C', ''],
                   'tot_1' : ['1', '1', '1','1', '1', '1'],
                   'tot_2' : ['2', '2', '2','2', '2', '2']})
--------
   Time Control tot_1 tot_2
0  1234       A     1     2
1  1234       A     1     2
2  1234             1     2
3  5678       B     1     2
4  8998       C     1     2
5  8998             1     2

I would like rows that share the same Time value to be merged into a single row. I would also like the "tot_1" and "tot_2" values to be summed within each group. Finally, I would like to keep the Control value if one is present. Like:

   Time Control tot_1 tot_2
0  1234       A     3     6
1  5678       B     1     2
2  8998       C     2     4
Asked By: Jake85

||

Answers:

Your data is different from the example df.

Construct the df:

import pandas as pd
df = pd.DataFrame({'Time' : ['s_1234','s_1234', 's_1234', 's_5678', 's_8998','s_8998' ],
                   'Control' : ['A', '', '','B', 'C', ''],
                   'tot_1' : ['1', '1', '1','1', '1', '1'],
                   'tot_2' : ['2', '2', '2','2', '2', '2']})

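# Strip the "s_" prefix so Time matches the expected output
df.Time = df.Time.str.split("_").str[1]
# The tot_ columns hold strings; cast to int so "sum" adds instead of concatenating
df = df.astype({"tot_1": int, "tot_2": int})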

Group by Time and aggregate the values.

df.groupby('Time').agg({"Control": "first", "tot_1": "sum", "tot_2": "sum"}).reset_index()


   Time Control  tot_1  tot_2
0  1234       A      3      6
1  5678       B      1      2
2  8998       C      2      4
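
One caveat, offered as a sketch rather than as part of the answer above: "first" skips NaN but not empty strings, so the result shows A, B, C only because the non-empty Control value happens to be the first row of each group. Assuming blanks can appear before the real value (the reordered Control column below is hypothetical data), converting empty strings to NaN beforehand lets "first" pick the first non-empty entry:

```python
import pandas as pd

# Hypothetical variant of the question's data: blank Control values come first.
df = pd.DataFrame({
    "Time": ["s_1234", "s_1234", "s_1234", "s_5678", "s_8998", "s_8998"],
    "Control": ["", "A", "", "B", "", "C"],
    "tot_1": ["1", "1", "1", "1", "1", "1"],
    "tot_2": ["2", "2", "2", "2", "2", "2"],
})

df["Time"] = df["Time"].str.split("_").str[1]
df = df.astype({"tot_1": int, "tot_2": int})

# Replace empty strings with NaN; groupby "first" skips missing values.
df["Control"] = df["Control"].mask(df["Control"] == "")

result = (
    df.groupby("Time")
      .agg({"Control": "first", "tot_1": "sum", "tot_2": "sum"})
      .reset_index()
)
print(result)
```

Without the mask step, the groups for 1234 and 8998 in this variant would keep the empty string as their Control value.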

EDIT for comment: Not sure if that's the best way to do it, but you could construct your agg information like this (the | operator for merging dicts requires Python 3.9 or later):

n = 2
agg_ = {"Control": "first"} | {f"tot_{i+1}": "sum" for i in range(n)}
df.groupby('Time').agg(agg_).reset_index()
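
If the number of tot_ columns isn't known ahead of time, another possible sketch is to derive them from the column names instead of hard-coding n. The "tot_" prefix check and the names tot_cols and agg_spec are assumptions made for illustration; the ** unpacking also works on Python versions before 3.9:

```python
import pandas as pd

df = pd.DataFrame({
    "Time": ["s_1234", "s_1234", "s_1234", "s_5678", "s_8998", "s_8998"],
    "Control": ["A", "", "", "B", "C", ""],
    "tot_1": ["1", "1", "1", "1", "1", "1"],
    "tot_2": ["2", "2", "2", "2", "2", "2"],
})

df["Time"] = df["Time"].str.split("_").str[1]

# Assumption: every column to be summed starts with "tot_".
tot_cols = [col for col in df.columns if col.startswith("tot_")]
df = df.astype({col: int for col in tot_cols})

# Keep the first Control value per group and sum every tot_ column.
agg_spec = {"Control": "first", **{col: "sum" for col in tot_cols}}
result = df.groupby("Time").agg(agg_spec).reset_index()
print(result)
```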
Answered By: bitflip