pandas displaying one column's value in the order of top score

Question:

I have a data like

deviceType category date score
A soccer 07012022 10
A baseball 07012022 8
B basketball 07012022 9
A soccer 07022022 9

and I would like to convert it to

category A B C…(device types so on)
soccer basketball
baseball

which displays category values under each device type in descending order of average of that category scores (soccer is above baseball for device type A as its average score, 9.5, is higher than average score of baseball, 8)

I was trying this with using groupby with two columns (deviceType and category), but I couldn’t figure out how to put one column’s value under another.

I will appreciate any help.

Asked By: Daniel Kim

||

Answers:

You can gropby('device') later work with every group separatelly to create Series with devide as name, and concatenate with other columns.

import pandas as pd

data = {
    'device': ['A', 'A','B','A','C'], 
    'category': ['soccer','basketball','basketball','soccer','tenis'],
    'score': [10,8,9,9,0],
}

df = pd.DataFrame(data)
print(df)

new = pd.DataFrame()

for key, val in df.groupby('device'):
    data = val.groupby('category').mean().sort_values(by='score', ascending=False)
    print(data)
    s = pd.Series(data.index)
    s.name = key
    new = pd.concat([new, s], axis=1)
    
print(new)   

Result:

  device    category  score
0      A      soccer     10
1      A  basketball      8
2      B  basketball      9
3      A      soccer      9
4      C       tenis      0

            score
category         
soccer        9.5
basketball    8.0

            score
category         
basketball    9.0

          score
category       
tenis       0.0

            A           B      C
0      soccer  basketball  tenis
1  basketball         NaN    NaN
Answered By: furas

Using numpy:

# Calculate the mean score per category and device type
tmp = df.pivot_table(
    index="category",
    columns="deviceType",
    values="score",
    aggfunc="mean",
)

# Extract the score and category matrix into numpy
score = tmp.to_numpy()
category = np.tile(tmp.index, (len(tmp.columns), 1)).T

# Sort the score descending. numpy always sort ascending so to do descending
# sort, we need to negate the `score` array
index = np.argsort(-score, axis=0)

# Sort score and category
ordered_score = np.take_along_axis(score, index, axis=0)
ordered_category = np.take_along_axis(category, index, axis=0)

# Result
pd.DataFrame(
    np.where(np.isnan(ordered_score), "", ordered_category), columns=tmp.columns
)

Another solution with pandas:

(
    # Calculate the mean score
    df.groupby(["deviceType", "category"], as_index=False)["score"]
    .mean()
    # Sort by the mean score despondingly and assign an order
    .sort_values("score", ascending=False)
    .assign(order=lambda x: x.groupby("deviceType").cumcount())
    # Pivot for final result
    .pivot_table(index="order", columns="deviceType", values="category", aggfunc="first")
)
Answered By: Code Different
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.