FIFO distribution pandas

Question:

I have a dataframe with residuals and a dataframe with orders

ln [65]: residuals = pd.DataFrame({'area': ['pnos', 'vnp'], 'number': [3, 4], 'balances':[20,95]})
​
In [67]: orders = pd.DataFrame({'area': ['pnos', 'vnp','vnp','vnp', 'vnp'], 'number': [3, 4, 4, 4 1], 'requires':[40,70,20,10,25]})
In [68]:orders
Out[68]:
      area  number      requires
0      pnos    3         40
1      vnp     4         70
2      vnp     4         20
3      vnp     4         10        
4      vnp     1         25
In [69]: residuals
Out[69]:
    area      number    balances
0   pnos       3        20
1   vnp        4        95

You can see area and number product that is a keys
in orders by area – ‘pnos’ and number ‘3’ requires 40 kg
and we can see in dataframe ‘residuals’ and if it have area – ‘pnos’ and number – ‘3’ and its balances more or equal than requires, so we set 20 and reduce in balances by this amount, else we just set 0

Visually, the result I want would look like this:

      area  number      requires   ready
0      pnos    3         40         20
1      vnp     4         70         70
2      vnp     4         20         20
3      vnp     4         10         5
4      vnp     1         25         0

In the third line (area-pnos,number-4) we set 5 because we have distributed the balance to the previous lines and now the balance is less than required

I solved it, but not in a real elegant way.

def distr(number, area, requiers):
    if residuals[(residuals['number']==number) & (residuals['area']==area)].empty:
        return 0
    elif requiers==0:
        return 0
    elif requiers>0:
        if residuals[(residuals['number']==number) & (residuals['area']==area)]['balances'].iloc[0]  >= requiers:    
            residuals.loc[(residuals['number']==number) & (residuals['area']==area), 'balances'] -= requiers
            return requiers
        else:
 
            result = residuals[(residuals['number']==number) & 
                               (residuals['area']==area)]['balances'].iloc[0]
            residuals.loc[(residuals['number']==number) & (residuals['area']==area), 'balances'] = 0
            
            return result
In [118]: orders['ready'] = orders.apply(lambda x: distr(x['number'], x['area'], x['requires']), axis=1)

how I can distribute data more optimize? I have more than 20000 rows in excel and it calculate for quite a long time

Asked By: Valerie

||

Answers:

  • First, you need to join the residuals to give the upper limit per areanumber pairs.
  • Second, calculated the requires_cum as the cumulative sum of the requires, which is the desired amount from the orders
  • Third, calculate the ready as the requires adjusted with the appropriate upper and lower bounds.
import pandas as pd
residuals = pd.DataFrame({'area': ['pnos', 'vnp'], 'number': [3, 4], 'balances':[20,95]})
orders = pd.DataFrame({'area': ['pnos', 'vnp','vnp','vnp', 'vnp'], 'number': [3, 4, 4, 4, 1], 'requires':[40,70,20,10,25]})

orders = pd.merge(orders, residuals, how="left")
orders.balances.fillna(0, inplace=True)
orders["requires_cum"] = orders.groupby(["area", "number"]).requires.cumsum()
orders["ready"] = orders.requires.clip(upper=orders.balances - orders.requires_cum + orders.requires).clip(lower=0)
orders

yields:

   area  number  requires  balances  requires_cum  ready
0  pnos       3        40      20.0            40     20
1   vnp       4        70      95.0            70     70
2   vnp       4        20      95.0            90     20
3   vnp       4        10      95.0           100      5
4   vnp       1        25       0.0            25      0
Answered By: Kota Mori
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.