divide columns vertically by max value in a dataframe

Question

I have the following table:

id	summary	summary_len	apple	book	computer
1	….	210	2	1	0
2	…	120	3	0	1
3	…	50	2	2	1

summary is a description, summary_len is the length of that description, and the remaining columns (apple/book/computer) are keywords. The values in those columns are the number of occurrences of each keyword in the description.

I need to normalize this table: find the max value PER COLUMN (vertically) and then divide each column by that value, so the output looks as below (I wrote it in the format 2/3 just to emphasize the max value per column):

id	summary	summary_len	apple	book	computer
1	….	210	2/3	1/2	0/1
2	…	120	3/3	0/2	1/1
3	…	50	2/3	2/2	1/1

My problem is that I don't need the max of every column, only of the keyword columns whose occurrences I am checking. I stored the keywords in a list and got the max value per column:

max_per_col = df_freq[keywords].max()
max_per_col

this is how it looks (with the original data):

Could you help me apply it "back" to the original dataframe and divide each column vertically by its max value?

Asked By: Kas


Answer 1

You can divide only the selected columns by their maximum values:

keywords = ['apple','book','computer']

df_freq[keywords] /= df_freq[keywords].max()

# equivalent to:
# df_freq[keywords] = df_freq[keywords] / df_freq[keywords].max()
print(df_freq)
   id  summary_len     apple  book  computer
0   1          210  0.666667   0.5       0.0
1   2          120  1.000000   0.0       1.0
2   3           50  0.666667   1.0       1.0

Answered By: jezrael
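
For readers who want to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample table from the question (the summary column is left out because its text is elided there) and reuses the max_per_col Series the question already computed. It uses DataFrame.div with an explicit axis instead of the /= shorthand; this is just an alternative spelling of the same division, not a different method.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
# (the elided summary text is omitted on purpose).
df_freq = pd.DataFrame({
    "id": [1, 2, 3],
    "summary_len": [210, 120, 50],
    "apple": [2, 3, 2],
    "book": [1, 0, 2],
    "computer": [0, 1, 1],
})
keywords = ["apple", "book", "computer"]

# Series indexed by keyword name: apple=3, book=2, computer=1
max_per_col = df_freq[keywords].max()

# Divide each keyword column by its own maximum.
# axis="columns" matches the Series index against the column labels.
df_freq[keywords] = df_freq[keywords].div(max_per_col, axis="columns")

print(df_freq)
```

The id and summary_len columns are untouched because only the columns listed in keywords are selected. One caveat: a keyword that never occurs has a maximum of 0, and dividing its column by 0 yields NaN, so such columns may need separate handling.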
