Is that possible to use python pandas to construct portfolios by a loop?

Question:

Now I have a pandas dataframe that is similar to the table

index Date return
1 2017-1 0.1
1 2017-2 0.2
1 2017-3 0.3
2 2017-1 0.4
2 2017-2 0.5
2 2017-3 0.6
3 2017-1 0.7
3 2017-2 0.8
3 2017-3 0.9
4 2017-1 1.0
4 2017-2 1.1
4 2017-3 1.2

And now the table contains 4 indexs in such order. I want constuct portfolio on every 2 index, keeping the date unchanged but portfolio return is the average of 2 index. So that the result will be 2 portfolios.

The target result is

portfolio_index Date portfolio return
portfolio 1 2017-1 0.25
portfolio 1 2017-2 0.35
portfolio 1 2017-3 0.45
portfolio 2 2017-1 0.85
portfolio 2 2017-2 0.95
portfolio 2 2017-3 1.05

For example, the portfolio 1 should contains index 1 and 2, and have return on separate months.
Is there any possible solution?

import pandas as pd
db = pd.read_csv('17base.csv')
db = db.groupby(db['index'])['return'].mean()
Asked By: Shiraz

||

Answers:

IIUC, you need to craft a custom dataframe, with help of numpy.reshape:

N = 2 # number of values to group
M = 3 # number of Dates per portfolio
df2 = pd.DataFrame({'portfolio_index': df['index'].iloc[:len(df)//2],
                    'Date': df['Date'].iloc[:len(df)//2],
                    'portfolio return': df['return'].to_numpy().reshape(N,-1,M).mean(1).ravel()
                   })

output:

   portfolio_index    Date  portfolio return
0                1  2017-1              0.25
1                1  2017-2              0.35
2                1  2017-3              0.45
3                2  2017-1              0.85
4                2  2017-2              0.95
5                2  2017-3              1.05

Alternatively, with pandas:

N = 2
M = 3
group = 'portfolio_'+pd.Series(np.arange(len(df)//M)//N+1).astype(str)
(df
 .pivot(index='index', columns='Date', values='return')
 .groupby(group.values).mean()
 .rename_axis('portfolio_index')
 .stack()
 .reset_index(name='portfolio return')
)

output:

     portfolio    Date  portfolio return
0  portfolio_1  2017-1              0.25
1  portfolio_1  2017-2              0.35
2  portfolio_1  2017-3              0.45
3  portfolio_2  2017-1              0.85
4  portfolio_2  2017-2              0.95
5  portfolio_2  2017-3              1.05
Answered By: mozway

You could construct a custom group index that reflects which rows you want to aggregate together:

import numpy as np
import pandas as pd

n = 12
df = pd.DataFrame({'return': np.linspace(0.1, 1.2, num=n)})

df['group'] = (np.tile([0, 1, 2, 0, 1, 2], n // 6) 
              + np.repeat(np.arange(n // 2, step=3), 6))

df
    return  group
0   0.1     0
1   0.2     1
2   0.3     2
3   0.4     0
4   0.5     1
5   0.6     2
6   0.7     3
7   0.8     4
8   0.9     5
9   1.0     3
10  1.1     4
11  1.2     5

So that now you can group accordingly:

df.groupby('group').mean()
    return
group   
0   0.25
1   0.35
2   0.45
3   0.85
4   0.95
5   1.05
Answered By: Arne

answer above has hard code more or less, here is mine

def function1(dd:pd.DataFrame):
    return dd.groupby('Date',as_index=False)['return'].mean()

df11=df1.assign(col1=df1[['index']].add(1).floordiv(2).astype(str)).groupby('col1').apply(function1).droplevel(1).reset_index().assign(portfolio_index="portfolio ")

df11.assign(portfolio_index=df11.portfolio_index.str.cat(df11.col1)).drop('col1',axis=1)

out

     Date  return portfolio_index
0  2017-1    0.25     portfolio 1
1  2017-2    0.35     portfolio 1
2  2017-3    0.45     portfolio 1
3  2017-1    0.85     portfolio 2
4  2017-2    0.95     portfolio 2
5  2017-3    1.05     portfolio 2
Answered By: G.G
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.