Pandas DataFrame: new column as the sum of multiple columns, each multiplied by a decreasing scalar
Question:
I have a data frame with 10 integer columns. I would like to create a new column like this:
df["newcol"] = df["col1"]*(11-1) + df["col2"]*(11-2) + ... + df["col10"]*(11-10)
I’ve already done the above and it works, but I wonder if there is a more Pythonic way to do it without writing out every term, because sometimes my data frame has over 100 columns.
Answers:
Easy way
df['newcol'] = df.mul(11 - np.r_[1: df.shape[1] + 1]).sum(1)
Example
>>> df
     0    1    2    3    4    5    6    7    8    9
0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
1  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
>>> df.mul(11 - np.r_[1: df.shape[1] + 1])
      0    1    2    3    4    5    6    7    8    9
0  10.0  9.0  8.0  7.0  6.0  5.0  4.0  3.0  2.0  1.0
1  10.0  9.0  8.0  7.0  6.0  5.0  4.0  3.0  2.0  1.0
>>> df.mul(11 - np.r_[1: df.shape[1] + 1]).sum(1)
0    55.0
1    55.0
dtype: float64
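The multiply-then-sum above is a row-wise dot product with the weight vector, so it can also be sketched with `DataFrame.dot`. This is only an illustration: the toy frame of ones mirrors the example, and the name `weights` and the choice of "number of columns plus one" as the scalar are assumptions (the answer hard-codes 11).

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the example above: 2 rows, 10 columns of ones.
df = pd.DataFrame(np.ones((2, 10)))

n = df.shape[1]                          # number of columns (10 here)
# Assumption: the scalar is n + 1, which gives 11 for 10 columns.
weights = (n + 1) - np.arange(1, n + 1)  # 10, 9, ..., 1

# Dot each row with the weights; the result is a Series indexed like df.
newcol = df.dot(weights)
print(newcol)
```

For a frame this small the two forms should behave the same; `dot` simply avoids building the intermediate weighted frame.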
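I got it working on a subset of columns with this, where col_list is the list of columns to include and afloat is the scalar (11 in the example above):
df[col_list].mul(afloat-np.r_[1:df[col_list].shape[1]+1]).sum(1)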
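If the same calculation is needed in several places, the expression can be wrapped in a small helper. The sketch below is an illustration, not part of the original answer: the function name `decreasing_weighted_sum`, its parameters, and the sample data are all hypothetical.

```python
import numpy as np
import pandas as pd

def decreasing_weighted_sum(df, cols, scalar):
    """Return sum over cols of df[col] * (scalar - position), position = 1..len(cols)."""
    # Hypothetical helper: weights are scalar-1, scalar-2, ..., scalar-len(cols).
    weights = scalar - np.arange(1, len(cols) + 1)
    # Multiply each selected column by its weight, then sum across each row.
    return df[cols].mul(weights, axis="columns").sum(axis=1)

# Made-up data: three columns to weight plus one column to ignore.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": [5, 6], "other": [100, 200]})
df["newcol"] = decreasing_weighted_sum(df, ["col1", "col2", "col3"], 4)
print(df)
```

Here the weights are 3, 2, 1, so the first row gives 1*3 + 3*2 + 5*1 = 14. The order of `cols` determines which column receives which weight.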