Initial value of multiple variables dataframe for time dilation

Question

Dataframe:

product1	product2	product3	product4	product5
straws	orange	melon	chair	bread
melon	milk	book	coffee	cake
bread	melon	coffe	chair	book

CountProduct1	CountProduct2	CountProduct3	Countproduct4	Countproduct5
1	1	1	1	1
2	1	1	1	1
2	3	2	2	2

RatioProduct1	RatioProduct2	RatioProduct3	Ratioproduct4	Ratioproduct5
0.28	0.54	0.33	0.35	0.11
0.67	0.25	0.13	0.11	0.59
2.5	1.69	1.9	2.5	1.52

I want to create five other columns that carry each item’s initial ratio (the ratio from its first appearance) through the rest of the dataframe.

Output:

InitialRatio1	InitialRatio2	InitialRatio3	InitialRatio4	InitialRatio5
0.28	0.54	0.33	0.35	0.11
0.33	0.25	0.13	0.31	0.59
0.11	0.33	0.31	0.35	0.13

Asked By: Bry Sab

Answer 1

If you’re after code to create the init_rateX columns, then the following will work:

import numpy as np
import pandas as pd

# Element-wise division of each ratio column by its matching count column
pd.DataFrame(
    np.divide(
        df[["ratio1", "ratio2", "ratio3", "ratio4", "ratio5"]].to_numpy(),
        df[["Count1", "Count2", "Count3", "Count4", "Count5"]].to_numpy(),
    ),
    columns=["init_rate1", "init_rate2", "init_rate3", "init_rate4", "init_rate5"],
)

which gives

   init_rate1  init_rate2  init_rate3  init_rate4  init_rate5
0        0.28        0.25        0.33        0.57       0.835
1        0.33        0.13        0.97        0.65       0.760
2        0.54        0.11        0.45        0.95       1.160
3        0.35        0.59        0.34        1.25       1.650

However, it does not agree with your calculations for init_rate4 or init_rate5, so some clarification might be needed.

Answered By: Riley

Answer 2

Check the data again: is there a typo in product3 (coffe) versus product4 (coffee)? I changed coffe to coffee. As a result, the value 0.31 should not appear in the output.

import pandas as pd
pd.set_option('display.max_rows', None)  # show all rows when printing
pd.set_option('display.max_columns', None)  # show all columns when printing

df = pd.DataFrame(
{
    'product1':['straws', 'melon', 'bread'],
    'product2':['orange', 'milk', 'melon'],
    'product3':['melon', 'book', 'coffee'],
    'product4':['chair', 'coffee', 'chair'],
    'product5':['bread', 'cake', 'book'],
    'time':[1,2,3],
    'Count1':[1,2,2],
    'Count2':[1,1,3],
    'Count3':[1,1,2],
    'Count4':[1,1,2],
    'Count5':[1,1,2],
    'ratio1':[0.28, 0.67, 2.5],
    'ratio2':[0.54, 0.25, 1.69],
    'ratio3':[0.33, 0.13, 1.9],
    'ratio4':[0.35, 0.11, 2.5],
    'ratio5':[0.11, 0.59, 1.52],

})

print(df)

product = df[['product1', 'product2', 'product3', 'product4', 'product5']].stack().reset_index()
count = df[['Count1',  'Count2',  'Count3', 'Count4',  'Count5']].stack().reset_index()
ratio = df[['ratio1',  'ratio2',  'ratio3',  'ratio4',  'ratio5']].stack().reset_index()
print(ratio)


arr = pd.unique(product[0])
aaa = [i for i in range(len(arr)) if product[product[0] == arr[i]].count()[0] > 1]
for i in aaa:
    prod_ind = product[product[0] == arr[i]].index
    val_ratio = ratio.loc[prod_ind[0], 0]
    ratio.loc[prod_ind, 0] = val_ratio

print(ratio.pivot_table(index='level_0', columns='level_1', values=[0]))

Output:

level_1 ratio1 ratio2 ratio3 ratio4 ratio5
level_0                                   
0         0.28   0.54   0.33   0.35   0.11
1         0.33   0.25   0.13   0.11   0.59
2         0.11   0.33   0.11   0.35   0.13

To work with the data, each block of columns is first turned into a single column with stack().reset_index(). Next, arr is the list of unique products. The list aaa then holds the positions in arr of the products that occur more than once.
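To make the reshaping step concrete, here is a small sketch of what stack().reset_index() produces. The two-column frame and the names products and long_form are made up for illustration; only the column names level_0, level_1 and 0 match the code above.

```python
import pandas as pd

# Hypothetical two-row, two-column excerpt of the product columns
products = pd.DataFrame(
    {"product1": ["straws", "melon"], "product2": ["orange", "milk"]}
)

# stack() moves the column labels into an inner index level;
# reset_index() turns both index levels into ordinary columns.
long_form = products.stack().reset_index()

# level_0 = original row label, level_1 = original column name, 0 = cell value
print(long_form)
```

Each row of long_form describes one cell of the original table, which is why a single boolean mask on column 0 can find every occurrence of a product.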

prod_ind = product[product[0] == arr[i]].index

In the loop, I get the row indexes of each product that occurs more than once.

val_ratio = ratio.loc[prod_ind[0], 0]

Get the ratio at the product’s first occurrence.

ratio.loc[prod_ind, 0] = val_ratio

Set this value for all occurrences of the product.
To access the values, explicit loc indexing is used, with the row labels on the left inside the square brackets and the column label on the right.
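As a minimal sketch of that loc pattern, the snippet below reads one cell and then writes it to several rows at once. The three-row frame and the names rows and first_value are illustrative assumptions; note that the column label here is the integer 0, as in the stacked frame above.

```python
import pandas as pd

# Hypothetical stacked ratios: the value column is labelled with the integer 0
ratio = pd.DataFrame(
    {
        "level_0": [0, 0, 1],
        "level_1": ["ratio1", "ratio2", "ratio1"],
        0: [0.28, 0.54, 0.67],
    }
)

rows = pd.Index([0, 2])  # row labels where the same product occurs

# .loc[row_label, column_label] reads a single cell ...
first_value = ratio.loc[rows[0], 0]

# ... and .loc[row_labels, column_label] = scalar writes it to every listed row
ratio.loc[rows, 0] = first_value
print(ratio)
```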

pivot_table then turns the long data back into a table.
To insert the processed data into the original dataframe, use the following:

table = ratio.pivot_table(index='level_0', columns='level_1', values=[0])
df[['ratio1',  'ratio2',  'ratio3',  'ratio4',  'ratio5']] = table
print(df)
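The same idea can also be sketched without an explicit loop, by grouping on the product name and broadcasting each group’s first ratio. This is an alternative sketch, not the code above: it assumes the corrected spelling coffee, assumes that "initial" means first occurrence in row-by-row, left-to-right order, and writes to new InitialRatio columns (named after the question’s desired output) instead of overwriting the ratio columns.

```python
import pandas as pd

df = pd.DataFrame({
    "product1": ["straws", "melon", "bread"],
    "product2": ["orange", "milk", "melon"],
    "product3": ["melon", "book", "coffee"],
    "product4": ["chair", "coffee", "chair"],
    "product5": ["bread", "cake", "book"],
    "ratio1": [0.28, 0.67, 2.5],
    "ratio2": [0.54, 0.25, 1.69],
    "ratio3": [0.33, 0.13, 1.9],
    "ratio4": [0.35, 0.11, 2.5],
    "ratio5": [0.11, 0.59, 1.52],
})

n = 5
prod_cols = [f"product{i}" for i in range(1, n + 1)]
ratio_cols = [f"ratio{i}" for i in range(1, n + 1)]
init_cols = [f"InitialRatio{i}" for i in range(1, n + 1)]

# Flatten both blocks row by row so position k refers to the same cell in each
long_form = pd.DataFrame({
    "product": df[prod_cols].to_numpy().ravel(),
    "ratio": df[ratio_cols].to_numpy().ravel(),
})

# For every product, repeat the ratio seen at its first occurrence
first = long_form.groupby("product")["ratio"].transform("first")

# Reshape back to one row per original row and attach as new columns
init = pd.DataFrame(
    first.to_numpy().reshape(len(df), n), index=df.index, columns=init_cols
)
result = df.join(init)
print(result[init_cols])
```

On this sample the new columns hold the same values as the pivoted output shown above, while the original ratio columns stay untouched.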

Answered By: inquirer

Answer 3

index	0	1	2	3	4
0	0.0625	0.034482758620689655	0.03125	0.027777777777777776	0.024390243902439025
1	0.2857142857142857	0.15384615384615385	0.05128205128205128	0.0425531914893617	0.04
2	0.21428571428571427	0.16666666666666666	0.15789473684210525	0.0967741935483871	0.08108108108108109

This is a sample of the ratio columns from my original df, taken where the count columns equal 1, which means these are the initial ratios.

And this is what happened when I used your code:

variable	k1	k2	k3	k4	k5
1	0.062500	8.827586	8.093750	0.166667	7.439024
2	0.285714	16.461538	8.615385	1.829787	0.040000
3	0.214286	16.888889	12.631579	16.129032	3.567568

It completely changes the values in every column except the first one.
Thanks for your enormous help, and thanks to @Riley as well. I’ll try to find another way; maybe pandas is just not well suited to such tasks.
Thanks a lot for your help and the time you put into this.

Answered By: Bry Sab
