Calculate RMS, Count, SUM to array inside all columns of pandas dataframe

Question:

I would like to calculate RMS, Count, SUM to array inside all columns of pandas dataframe and then fulfill outputs into new three dataframes as shown below

enter image description here

P.S > solution should deal with N numbers of columns, in my case, I have around 300 columns x,y,z,a,b,c ……. etc …… N

ID x y z …..
EF407412 [471, 1084, 1360, 2284] [1408, 1572, 2277] [977, 1003, 1493, 1519, 1650, 1676, 2804] …..
KM043272 [2173] [1293, 2354,] [1200] …..
KM043273 ….. ….. ….. …..
….. ….. ….. ….. …..

Dataframe_RMS
ID x y z …..
EF407412 1454.5749 1792.4263 1685.6893 …..
KM043272 2173 1899.1004 1200 …..
KM043273 ….. ….. ….. …..
….. ….. ….. ….. …..

Dataframe_Count
ID x y z …..
EF407412 4 3 7 …..
KM043272 1 2 1 …..
KM043273 ….. ….. ….. …..
….. ….. ….. ….. …..

Asked By: Mohamed Elhefnawy

||

Answers:

Updating answer as per the OP’s comment – for any number of Columns Check Below code:

import pandas as pd 
from ast import literal_eval
import numpy as np 

df = pd.DataFrame({'ID':['EF407412','KM043272']
                   , 'x': ['[471, 1084, 1360, 2284]','[2173]']
                   , 'y': ['[1408, 1572, 2277]','[1293, 2354,]']
                   , 'z': ['[977, 1003, 1493, 1519, 1650, 1676, 2804]','[1200]']} )

col_num = df.shape[1]

COUNT

df[[i+"_count" for i in df.columns[1:col_num]]] = df.apply(lambda x: ("{},"*(col_num-1))[:-1].
                                                    format( *(tuple([len(literal_eval(x[col])) for col in df.columns[1:col_num]] ))),axis=1).
                                                    astype('str').str.split(',', expand=True).values

df[['ID']+([ col for col in df.columns if col.endswith('count')])]

OUTPUT:

enter image description here

SUM

df[[i+"_sum" for i in df.columns[1:col_num]]] = df.apply(lambda x: ("{},"*(col_num-1))[:-1].
                                                    format( *(tuple([sum(literal_eval(x[col])) for col in df.columns[1:col_num]] ))),axis=1).
                                                    astype('str').str.split(',', expand=True).values

df[['ID']+([ col for col in df.columns if col.endswith('sum')])]

Output:

enter image description here

RMS

df[[i+"_rms" for i in df.columns[1:col_num]]] = df.apply(lambda x: ("{},"*(col_num-1))[:-1].
                                                    format( *(tuple([np.sqrt(np.mean(np.square(literal_eval(x[col])))) for col in df.columns[1:col_num]] ))),axis=1).
                                                    astype('str').str.split(',', expand=True).values

df[['ID']+([ col for col in df.columns if col.endswith('rms')])]

Output:

enter image description here

Answered By: Abhishek
Categories: questions Tags: , , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.