Pandas – Sum across n columns based on value in another column

Question:

How can I sum across n columns, where n is a value in another column? In this example, ‘heldInventory’ should be the sum of (wk1Sales + wk2Sales + … + wknSales), where n is the value in ‘weeksInventory’.

>>> df
     product  weeksInventory  wk1Sales  wk2Sales  wk3Sales  wk4Sales  wk5Sales  wk6Sales  heldInventory
0  Product 1               1        25        31        28        32        33        27             25
1  Product 2               2        18        16        12        14        19        17             34
2  Product 3               3         5         4         2         3         4         6             11
3  Product 4               4        38        42        48        41        45        44            169

I want something like the following:

df['heldInventory'] = df.iloc[:,2:(2 + df['weeksInventory'])].sum(axis=1)

but df['weeksInventory'] passes in the series, not the value at that row.

I’ve achieved the result through nested for loops, but it is very slow – looking for a vectorized approach.

Asked By: by8504


Answers:

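You can use the apply method, which calls a function once per row. Note that the slice must end at 2+n (not 3+n) so that exactly n weekly columns are summed, matching the expected output in the question.

def func(row):
  # Number of weekly columns to sum for this row
  n = row.weeksInventory
  # wk1Sales sits at position 2, so positions 2 .. 2+n-1 cover n weeks
  return (row.iloc[2:2+n]).sum()

df['heldInventory'] = df.apply(func,axis=1)

Or, in one line:

df['heldInventory'] = df.apply(lambda row: (row.iloc[2:2+row.weeksInventory]).sum(),axis=1)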
Answered By: rajkumar_data
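
Since the question asks for a vectorized approach and apply still loops over rows in Python, one possible alternative is a NumPy boolean mask. The following is only a minimal sketch, not something taken from the answer above. It assumes the weekly columns are contiguous and ordered from wk1Sales to wk6Sales, and the names sales, week_index, and mask are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (without the target column).
df = pd.DataFrame({
    "product": ["Product 1", "Product 2", "Product 3", "Product 4"],
    "weeksInventory": [1, 2, 3, 4],
    "wk1Sales": [25, 18, 5, 38],
    "wk2Sales": [31, 16, 4, 42],
    "wk3Sales": [28, 12, 2, 48],
    "wk4Sales": [32, 14, 3, 41],
    "wk5Sales": [33, 19, 4, 45],
    "wk6Sales": [27, 17, 6, 44],
})

# Assumption: the weekly columns are contiguous and in week order.
sales = df.loc[:, "wk1Sales":"wk6Sales"].to_numpy()

# mask[i, j] is True when week j+1 falls within row i's weeksInventory.
# Broadcasting compares a (6,) index against a (4, 1) column of n values.
week_index = np.arange(sales.shape[1])
mask = week_index < df["weeksInventory"].to_numpy()[:, None]

# Zero out the weeks beyond n for each row, then sum across columns.
df["heldInventory"] = np.where(mask, sales, 0).sum(axis=1)

print(df["heldInventory"].tolist())  # [25, 34, 11, 169]
```

The mask avoids any per-row Python call, so it should scale better than apply on large frames, at the cost of building one boolean array the same shape as the weekly sales block.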