Reorganizing a pandas dataframe by a repeating value

Question:

Thanks for the help! I’ve been scouring the site for similar questions, so sorry if this is a repeat, but I haven’t found anything similar.

I have a dataframe coming from a query with a ton of securities and a single data point per security over a number of dates. As the excerpt below shows, the raw data repeats each security over all the available dates, with the results all in the same column at the end of the dataframe.

I want to see if I can transform the dataframe to make a column for each security with the dates as the index. I can do this with a for loop, but I was hoping there’d be something more elegant within the raw dataframe that someone might have an idea for.

I was trying some groupbys and some data slices on the ID column, but couldn’t think of a good way to transform the slices.

Thanks!

                  ID       DATE SOURCE    ID_DATE  
0     NVTS US Equity 2022-03-15    ETF 2023-03-10   
1     NVTS US Equity 2022-03-31    ETF 2023-03-10   
2     NVTS US Equity 2022-04-14    ETF 2023-03-10   
3     NVTS US Equity 2022-04-29    ETF 2023-03-10   
4     NVTS US Equity 2022-05-13    ETF 2023-03-10   
...              ...        ...    ...        ...   
1762  BEEM US Equity 2023-01-13    ETF 2023-03-10   
1763  BEEM US Equity 2023-01-31    ETF 2023-03-10   
1764  BEEM US Equity 2023-02-15    ETF 2023-03-10   
1765  BEEM US Equity 2023-02-28    ETF 2023-03-10   
Asked By: helpfulhelp1000

Answers:

You can try the pivot method from pandas.

Something like this:

# values should name the column that holds the data point you want in each cell
pivot_df = df.pivot(columns='ID', index='DATE', values='SOURCE')
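For a quick check of the reshape, here is a minimal sketch on a small made-up frame in the same long layout as the question. The VALUE column name and its numbers are assumptions for illustration only, since the excerpt in the question does not show the column that holds the data point. It also shows pivot_table as a possible fallback, because pivot raises a ValueError when the same ID and DATE pair appears more than once.

```python
import pandas as pd

# Hypothetical sample in the long layout from the question. The VALUE column
# and its numbers are made up; substitute the real data column name.
df = pd.DataFrame({
    "ID": ["NVTS US Equity", "NVTS US Equity", "BEEM US Equity", "BEEM US Equity"],
    "DATE": pd.to_datetime(["2022-03-15", "2022-03-31", "2022-03-15", "2022-03-31"]),
    "SOURCE": ["ETF"] * 4,
    "VALUE": [1.0, 2.0, 3.0, 4.0],
})

# One column per security, with the dates as the index.
wide = df.pivot(index="DATE", columns="ID", values="VALUE")

# If an (ID, DATE) pair can repeat, pivot raises ValueError. pivot_table
# aggregates the duplicates instead; "last" keeps the last value seen.
wide_agg = df.pivot_table(index="DATE", columns="ID", values="VALUE", aggfunc="last")

print(wide)
```

Securities that are missing a given date end up as NaN in the wide frame, which is usually what you want when the securities have different histories.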
Answered By: P. Shroff