Need specific sets of columns to be converted into a row and the rest of columns to repeat values

Question

I have data in the following format

ID	SCHOOL	Name1	Name1 Subject1	Name1 Grade1	Name1 Subject2	Name1 Grade2	Name2	Name2 Subject1	Name2 Grade1	Name2 Subject2	Name2 Grade2
1	S1	Mr. ABC	Math	6	Science	7	Mr. XYZ	Social	8	EVS	9
2	S2	Mr. PQR	Math	10	Science	11	Mr. KLM	Social	8	EVS	9

Can I transform it in the following format using Python

ID	SCHOOL	Name	Subject	Grade
1	S1	Mr. ABC	Math	6
1	S1	Mr. ABC	Science	7
1	S1	Mr. XYZ	Social	8
1	S1	Mr. XYZ	EVS	9
2	S2	Mr. PQR	Math	10
2	S2	Mr. PQR	Science	11
2	S2	Mr. KLM	Social	8
2	S2	Mr. KLM	EVS	9

Asked By: Harshit Jain

||

Source

Answer 1

there might be a nicer solution but this also works:

df_1=df[['ID', 'SCHOOL','Name1', 'Name1 Subject1',
       'Name1 Grade1']]
df_2=df[['ID', 'SCHOOL','Name1', 'Name1 Subject2',
       'Name1 Grade2']]

df_3=df[['ID', 'SCHOOL','Name2', 'Name2 Subject1',
       'Name2 Grade1']]

df_4=df[['ID', 'SCHOOL','Name2', 'Name2 Subject2',
       'Name2 Grade2']]

df_list=[df_1,df_2,df_3,df_4]
for i in df_list:
    i.columns=['ID','SCHOOL','Name','Subject','Grade']
final=pd.concat(df_list)
print(final)
'''
    ID  SCHOOL  Name    Subject Grade
0   1   S1      Mr. ABC Math    6
1   2   S2      Mr. PQR Math    10
0   1   S1      Mr. ABC Science 7
1   2   S2      Mr. PQR Science 11
0   1   S1      Mr. XYZ Social  8
1   2   S2      Mr. KLM Social  8
0   1   S1      Mr. XYZ EVS     9
1   2   S2      Mr. KLM EVS     9

'''

Answered By: Clegane

Answer 2

One option is with pivot_longer from pyjanitor:

# pip install pyjanitor
import pandas as pd
import janitor


(df
.pivot_longer(
    index = ['ID', 'SCHOOL', 'Name1', 'Name2'], 
    names_to = ('name', '.value'), 
    names_pattern=r"(.+)s(.+)d", 
    sort_by_appearance=True)
.assign(Name = lambda df: np.where(df.name.eq('Name1'), df.Name1, df.Name2))
.loc[:, ['ID', 'SCHOOL', 'Name', 'Subject', 'Grade']]
)

   ID SCHOOL     Name  Subject  Grade
0   1     S1  Mr. ABC     Math      6
1   1     S1  Mr. ABC  Science      7
2   1     S1  Mr. XYZ   Social      8
3   1     S1  Mr. XYZ      EVS      9
4   2     S2  Mr. PQR     Math     10
5   2     S2  Mr. PQR  Science     11
6   2     S2  Mr. KLM   Social      8
7   2     S2  Mr. KLM      EVS      9

The names_pattern is a regex to capture groups within the specified columns. Groups are specified in the parenthesis; any group that aligns with .value stays as column header. here we have two groups in the names_pattern and two values in names_to, the second group is paired with .value and stays as a column header.

After reshaping, the where clause helps to ensure correct match and final output

Answered By: sammywemmy

Need specific sets of columns to be converted into a row and the rest of columns to repeat values

Question:

Answers: