Calculate average between columns in dataset

Question:

I have a dataset like this:

State %internet2016 %internet2018 %white 2016 %white 2018
QLD 85 90 85 82
NSW 83 92 80 72
WA 82 88 82 78
SA 86 86 88 84
TAS 78 80 92 88
NT 48 65 76 75

My goal is to create a dataset like this:

State %internet2016-2018 %white2016-2018
QLD 87.5 83.5
NSW 87.5 76
WA 85 80
SA 86 86
TAS 79 90
NT 56.5 75.5

I would like to have the mean of %internet and %white.

I tried this, but it did not give the result I want:

df = df["%internet2016","%internet2018"].mean()
Asked By: Paddy D

Answers:

You can get the mean of each pair of columns by adding them and dividing by 2, then dropping the original columns:

df['%internet2016-2018'] = (df['%internet2016'] + df['%internet2018']) / 2
df['%white2016-2018'] = (df['%white 2016'] + df['%white 2018']) / 2
df = df.drop(['%internet2016', '%internet2018', '%white 2016', '%white 2018'], axis=1)

Answered By: Joevanie
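
As a self-contained sketch of the same idea (not part of the answer above), the frame below is rebuilt from the question's data, assuming the column names are exactly those used in the answer, including the space in '%white 2016'. It uses `mean(axis=1)` on a list of columns rather than manual addition. The attempt in the question fails because `df["a","b"]` looks up a single tuple key; selecting several columns needs a list (`df[["a","b"]]`), and a row-wise mean needs `axis=1`.

```python
import pandas as pd

# Sample data copied from the question; column names are assumed
# to match the ones used in the answer above.
df = pd.DataFrame({
    "State": ["QLD", "NSW", "WA", "SA", "TAS", "NT"],
    "%internet2016": [85, 83, 82, 86, 78, 48],
    "%internet2018": [90, 92, 88, 86, 80, 65],
    "%white 2016": [85, 80, 82, 88, 92, 76],
    "%white 2018": [82, 72, 78, 84, 88, 75],
})

# Select each pair of columns with a list and average across the row.
result = pd.DataFrame({
    "State": df["State"],
    "%internet2016-2018": df[["%internet2016", "%internet2018"]].mean(axis=1),
    "%white2016-2018": df[["%white 2016", "%white 2018"]].mean(axis=1),
})

print(result)
```

Building a new frame leaves the original `df` untouched, so no `drop` is needed.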

Alternatively, concatenate the row-wise means of the %internet and %white groups of columns:

import pandas as pd

# Row-wise mean of each column group; each result is named after the years in its columns
pd.concat([df['State'],
           df.filter(like='%internet').pipe(lambda g: g.mean(axis=1)
                                            .rename('%internet' + '-'.join(c[-4:] for c in g.columns))),
           df.filter(like='%white').pipe(lambda g: g.mean(axis=1)
                                         .rename('%white' + '-'.join(c[-4:] for c in g.columns)))], axis=1)

  State  %internet2016-2018  %white2016-2018
0  QLD                 87.5             83.5
1  NSW                 87.5             76.0
2   WA                 85.0             80.0
3   SA                 86.0             86.0
4  TAS                 79.0             90.0
5   NT                 56.5             75.5
Answered By: RomanPerekhrest
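
If there are more than two metrics, the same pattern can be written as a loop. This is only a sketch: the `prefixes` list and the `out` frame are names introduced here for illustration, and it assumes every column name ends in a four-digit year, as in the question's data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "State": ["QLD", "NSW", "WA", "SA", "TAS", "NT"],
    "%internet2016": [85, 83, 82, 86, 78, 48],
    "%internet2018": [90, 92, 88, 86, 80, 65],
    "%white 2016": [85, 80, 82, 88, 92, 76],
    "%white 2018": [82, 72, 78, 84, 88, 75],
})

# Hypothetical list of metric prefixes to average; extend as needed.
prefixes = ["%internet", "%white"]

out = df[["State"]].copy()
for prefix in prefixes:
    # All columns belonging to this metric, in their original order.
    cols = [c for c in df.columns if c.startswith(prefix)]
    # Build the new name from the trailing four-digit years, e.g. "2016-2018".
    years = "-".join(c[-4:] for c in cols)
    out[f"{prefix}{years}"] = df[cols].mean(axis=1)

print(out)
```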