Equivalent of 'spread' and 'gather' from R/tidyverse in Python/pandas?

Question:

For example, given data A:

y female male
1 2 3
4 5 6

I want to ‘gather’ it to this:

y gender value
1 female 2
1 male 3
4 female 5
4 male 6

It’s easy in R. What about Python’s pandas?

Asked By: Bin

Answers:

You should try melt on the given data. The opposite operation (the "spread" version) is pivot in pandas; R’s reshape2 calls it cast. pandas’ melt works much like melt in reshape2:

import pandas as pd
pd.melt(dt, id_vars="y")

where dt is your input DataFrame.

Output:

#y  variable      value
#1  female          2
#4  female          5
#1  male            3
#4  male            6
Answered By: PKumar

Try the melt function from pandas (pd.melt).

Use id_vars to name the identifier column(s) that stay as they are; value_vars to name the columns to gather; var_name to set the title of the new column that holds the former column names; and value_name to set the title of the new column that holds the values.

Look at this example:

# Import the pandas module
import pandas as pd

# Define the data frame
DF = pd.DataFrame({'y': [1, 4], 'female': [2, 5], 'male': [3, 6]})

# Gather/melt the data frame
pd.melt(DF, id_vars='y', value_vars=['female', 'male'],
        var_name='gender', value_name='value')

This is what the output looks like:

    y   gender  value
0   1   female  2
1   4   female  5
2   1   male    3
3   4   male    6
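
Note that melt stacks the gathered columns one after another, so the rows come out grouped by gender rather than by y as in the question's desired table. If the row order matters, a sort afterwards should get there. The following is only a minimal sketch under that assumption; the name long_df is chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'y': [1, 4], 'female': [2, 5], 'male': [3, 6]})

# Gather the two gender columns, then sort so rows are grouped by y
long_df = (
    df.melt(id_vars='y', var_name='gender', value_name='value')
      .sort_values(['y', 'gender'])
      .reset_index(drop=True)  # renumber the rows 0..3 after sorting
)
print(long_df)
```

Sorting on both y and gender keeps the result deterministic when several rows share the same y.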
Answered By: Pooya

Gather (wide to long)

df1 = df.melt(id_vars='y')
df1

Spread (long back to wide)

df2 = df1.pivot(index='y', columns='variable')
df2
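
When pivot is called without a values argument, the resulting columns form a MultiIndex (here with 'value' as the top level). If you want the original flat table back, passing values and resetting the index is one way to do it. This is a sketch assuming the same small data frame as in the question; the names long_df and wide_df are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'y': [1, 4], 'female': [2, 5], 'male': [3, 6]})

# Gather: wide -> long (default column names are 'variable' and 'value')
long_df = df.melt(id_vars='y')

# Spread: long -> wide, with flat column labels
wide_df = (
    long_df.pivot(index='y', columns='variable', values='value')
           .reset_index()             # turn the 'y' index back into a column
           .rename_axis(columns=None) # drop the leftover 'variable' axis name
)
print(wide_df)
```

pivot raises an error if an index/column pair occurs more than once; in that case pivot_table with an aggregation function is the usual alternative.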
Answered By: Anshuman Tadavi

How about this:

from datar import f
from datar.tibble import tribble
from datar.tidyr import pivot_longer

df = tribble(
  f.y, f.female, f.male,
  1,   2,        3,
  4,   5,        6
)

pivot_longer(df, [f.female, f.male], names_to="gender")

#    y  gender  value
# 0  1  female      2
# 1  4  female      5
# 2  1    male      3
# 3  4    male      6

I am the author of the datar package. Please feel free to submit issues if you have any questions about using it.

Answered By: Panwen Wang