How to convert columns to rows Datewise in Python

Question

I have data in the following format:

Date	AA_ZZ_CR	AA_ZZ_BT	AA_XX_CR	AA_XX_BT	BB_ZZ_CR	BB_ZZ_BT	BB_XX_CR	BB_XX_BT
20230202	20	56	34	556	29	59	32	559
20230203	21	45	54	423	28	48	53	426
20230204	22	78	23	790	27	76	29	794
20230205	23	78	56	778	26	72	51	771
20230206	24	89	78	855	25	81	79	850
20230207	25	56	89	545	24	54	86	543

I want it converted into a date-wise format, with one row per date and prefix:

Date	Data	ZZ_CR	ZZ_BT	XX_CR	XX_BT
20230202	AA	20	56	34	556
20230202	BB	29	59	32	559
20230203	AA	21	45	54	423
20230203	BB	28	48	53	426

Is there any way of doing that?

Asked By: Divyansh Kumar Singh

Answer 1

If every column belonged to a single category, one melt call would give the desired output. Here there are several categories (ZZ_CR, ZZ_BT, XX_CR, XX_BT), so I loop over them, melt each group separately, and attach each resulting column to the Date and Data columns.

With Pandas:

import pandas as pd

# Assumes the data is already loaded in a DataFrame called df
# Get the list of columns except unnecessary ones like Date
columns = df.columns[1:]

# Get the new column headers: split 'AA_ZZ_BT' once and keep 'ZZ_BT'.
# dict.fromkeys drops duplicates but keeps the original order (a set would not).
uni = list(dict.fromkeys(i.split('_', 1)[1] for i in columns))

# Loop once per new column
for idx, u in enumerate(uni):

    # Step 1: find the source columns that end with this header
    find_col = columns[columns.str.endswith('_' + u)]

    # Step 2: create a new df with those columns plus Date
    d = df[['Date'] + list(find_col)]

    # Step 3: use melt to turn each source column into a block of rows
    d = d.melt(id_vars=['Date'], value_vars=list(find_col), var_name='Data', value_name=u)

    # Step 4: clean the Data column by keeping only the prefix, e.g. 'AA' from 'AA_ZZ_BT'
    d['Data'] = d['Data'].str.split('_', n=1).str[0]

    # Step 5: the first time keep the whole d, afterwards append only the new value column
    new_df = d if idx == 0 else pd.concat([new_df, d[u]], axis=1)

# Rows come out grouped by prefix, so sort them to get the date-wise order
new_df = new_df.sort_values(['Date', 'Data'], ignore_index=True)
new_df  # output
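
The loop can also be replaced by a single melt followed by a pivot. This is a minimal sketch and not part of the answer above: it assumes every non-Date column is named PREFIX_REST and can be split at the first underscore, and the names long, column, field, value, and result are made up for illustration. The sample frame holds only the first two rows from the question.

```python
import pandas as pd

# First two rows of the question's data, for illustration only
df = pd.DataFrame({
    "Date": [20230202, 20230203],
    "AA_ZZ_CR": [20, 21], "AA_ZZ_BT": [56, 45],
    "AA_XX_CR": [34, 54], "AA_XX_BT": [556, 423],
    "BB_ZZ_CR": [29, 28], "BB_ZZ_BT": [59, 48],
    "BB_XX_CR": [32, 53], "BB_XX_BT": [559, 426],
})

# One row per (Date, original column) pair
long = df.melt(id_vars="Date", var_name="column", value_name="value")

# Split 'AA_ZZ_CR' at the first underscore into 'AA' and 'ZZ_CR'
long[["Data", "field"]] = long["column"].str.split("_", n=1, expand=True)

# Remember the order in which the new headers first appear
order = list(dict.fromkeys(long["field"]))

# Spread the headers back out into columns, one row per (Date, Data).
# Passing a list as index needs pandas 1.1 or newer.
result = (
    long.pivot(index=["Date", "Data"], columns="field", values="value")
    .rename_axis(columns=None)
    .reset_index()
)

# pivot sorts the new columns alphabetically, so restore the original order
result = result[["Date", "Data", *order]]
print(result)
```

The pivot step sorts the rows by Date and then by Data, so no separate sort is needed here.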

Answered By: R. Baraiya

Answer 2

If you loop over the columns, the problem mostly comes down to selecting and renaming them. I am assuming your data is in a DataFrame called df.
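
To try the code below without the original file, df can be rebuilt from the table in the question. This is only a sketch: the variable name raw is made up here, and how the asker actually loads the data is not stated.

```python
import io

import pandas as pd

# The table from the question, pasted as whitespace-separated text
raw = """\
Date AA_ZZ_CR AA_ZZ_BT AA_XX_CR AA_XX_BT BB_ZZ_CR BB_ZZ_BT BB_XX_CR BB_XX_BT
20230202 20 56 34 556 29 59 32 559
20230203 21 45 54 423 28 48 53 426
20230204 22 78 23 790 27 76 29 794
20230205 23 78 56 778 26 72 51 771
20230206 24 89 78 855 25 81 79 850
20230207 25 56 89 545 24 54 86 543
"""

# sep=r"\s+" accepts any run of spaces or tabs between fields
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
```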

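import pandas as pd

# Unique two-character prefixes ('AA', 'BB'), sorted for a stable order
data_entries = sorted(set(col[:2] for col in df.columns[1:]))

dfs_split = []
for entry in data_entries:
    # Get the Date column and the columns that start with this prefix
    cols = ['Date'] + [col for col in df.columns if col.startswith(f'{entry}_')]
    # Copy so that insert() works on an independent frame
    df_data = df[cols].copy()
    # Add the Data column
    df_data.insert(1, 'Data', entry)
    # Take the prefix off the column names (first occurrence only)
    df_data = df_data.rename(lambda x: x.replace(f'{entry}_', '', 1), axis=1)
    dfs_split.append(df_data)

# Note: this overwrites the original df with the reshaped result
df = pd.concat(dfs_split, axis=0)
# Sort by Date, then Data, and renumber the rows
df = df.sort_values(by=['Date', 'Data'], ignore_index=True)
df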

Answered By: Brener Ramos
