Convert subset of columns to rows by combining columns
Question:
Pandas 1.1.4
MRE:
df = pd.DataFrame({"Code":[1,2], "view_A":[3000, 2300], "click_A":[3, 23],
"view_B":[1200, 300], "click_B":[5, 3]})
df.set_index("Code", inplace=True)
>>>
view_A click_A view_B click_B
Code
1 3000 3 1200 5
2 2300 23 300 3
Want to make it into
view click
Code type
1 A 3000 3
2 A 2300 23
1 B 1200 5
2 B 300 3
I can do it, but want to explore more (clean) options.
My sol’tn
a_df = df[["view_A", "click_A"]].rename(columns={"view_A":"view", "click_A":"click"})
a_df["type"] = "A"
b_df = df[["view_B", "click_B"]].rename(columns={"view_B":"view", "click_B":"click"})
b_df["type"] = "B"
final_df = pd.concat([a_df, b_df])
But code is dirty.
Answers:
This is essentially a reshape operation using stack
df.columns = df.columns.str.split('_', expand=True)
df.stack().rename_axis(['code', 'type'])
click view
code type
1 A 3 3000
B 5 1200
2 A 23 2300
B 3 300
try with pd.wide_to_long
out = pd.wide_to_long(df.reset_index(),
['view','click'],
i='Code',
j='type',
sep='_',
suffix='\w+')
view click
Code type
1 A 3000 3
2 A 2300 23
1 B 1200 5
2 B 300 3
Another option, with pivot_longer from pyjanitor:
df = pd.DataFrame({"Code":[1,2], "view_A":[3000, 2300], "click_A":[3, 23],
"view_B":[1200, 300], "click_B":[5, 3]})
# pip install pyjanitor
import pandas as pd
import janitor
(df.pivot_longer(index='Code',
names_to=('.value', 'type'),
names_sep='_')
.set_index(['Code', 'type'])
)
view click
Code type
1 A 3000 3
2 A 2300 23
1 B 1200 5
2 B 300 3
df.set_axis(df.columns.str.split('_',expand=True),axis=1)
.stack()
click view
Code
1 A 3 3000
B 5 1200
2 A 23 2300
B 3 300
Pandas 1.1.4
MRE:
df = pd.DataFrame({"Code":[1,2], "view_A":[3000, 2300], "click_A":[3, 23],
"view_B":[1200, 300], "click_B":[5, 3]})
df.set_index("Code", inplace=True)
>>>
view_A click_A view_B click_B
Code
1 3000 3 1200 5
2 2300 23 300 3
Want to make it into
view click
Code type
1 A 3000 3
2 A 2300 23
1 B 1200 5
2 B 300 3
I can do it, but want to explore more (clean) options.
My sol’tn
a_df = df[["view_A", "click_A"]].rename(columns={"view_A":"view", "click_A":"click"})
a_df["type"] = "A"
b_df = df[["view_B", "click_B"]].rename(columns={"view_B":"view", "click_B":"click"})
b_df["type"] = "B"
final_df = pd.concat([a_df, b_df])
But code is dirty.
This is essentially a reshape operation using stack
df.columns = df.columns.str.split('_', expand=True)
df.stack().rename_axis(['code', 'type'])
click view
code type
1 A 3 3000
B 5 1200
2 A 23 2300
B 3 300
try with pd.wide_to_long
out = pd.wide_to_long(df.reset_index(),
['view','click'],
i='Code',
j='type',
sep='_',
suffix='\w+')
view click
Code type
1 A 3000 3
2 A 2300 23
1 B 1200 5
2 B 300 3
Another option, with pivot_longer from pyjanitor:
df = pd.DataFrame({"Code":[1,2], "view_A":[3000, 2300], "click_A":[3, 23],
"view_B":[1200, 300], "click_B":[5, 3]})
# pip install pyjanitor
import pandas as pd
import janitor
(df.pivot_longer(index='Code',
names_to=('.value', 'type'),
names_sep='_')
.set_index(['Code', 'type'])
)
view click
Code type
1 A 3000 3
2 A 2300 23
1 B 1200 5
2 B 300 3
df.set_axis(df.columns.str.split('_',expand=True),axis=1)
.stack()
click view
Code
1 A 3 3000
B 5 1200
2 A 23 2300
B 3 300