Python pandas sort_values not working properly

Question:

When I sort a DataFrame by a column's values and print it with the head() function, it shows duplicated rows instead of the desired result.

regions = country_features['world_region']
happines = []
counts = []
reg = []

for region in regions:
    hap = country_features.loc[country_features['world_region'] == region, 'happiness_score'].mean()
    count = len(country_features[country_features['world_region'] == region])
    happines.append(hap)
    counts.append(count)
    reg.append(region)

region_happines = pd.DataFrame({'region':reg,
                                'happiness_score' : happines,
                                'country_count':counts})

region_happines
region_happines.happiness_score = pd.to_numeric(region_happines.happiness_score)
sorted = region_happines.sort_values(by='happiness_score', ascending=False)

sorted.head(5)

I want to sort the DataFrame by a column's values, and I expected the result to be sorted correctly.

Asked By: ISO


Answers:

The first part of the solution can be simplified with groupby, which returns one row per region:

print (country_features)
  world_region  happiness_score
0         reg1                5
1         reg1                1
2         reg2               10
3         reg2                1
4         reg2                3

region_happines = (country_features.groupby('world_region',as_index=False)
                                   .agg(happiness_score= ('happiness_score','mean'),
                                        country_count= ('happiness_score','size'))
                                   .rename(columns={'world_region':'region'}))
print (region_happines)
  region  happiness_score  country_count
0   reg1         3.000000              2
1   reg2         4.666667              3

Because the happiness_score column already holds the per-group averages, which are numeric, converting it with pd.to_numeric is not needed before sorting:

out = region_happines.sort_values(by='happiness_score', ascending=False)
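
As a rough end-to-end sketch, the snippet below rebuilds the small sample frame from this answer and runs the aggregation and the sort together. It also hints at a likely cause of the duplicated rows in the question: the loop there iterates over the whole world_region column, which has one entry per country, so each region gets appended once per country. The sample values are the illustrative ones used above, not the asker's real data.

```python
import pandas as pd

# Sample data copied from the answer above (illustrative, not the real dataset).
country_features = pd.DataFrame({
    'world_region': ['reg1', 'reg1', 'reg2', 'reg2', 'reg2'],
    'happiness_score': [5, 1, 10, 1, 3],
})

# The column has one entry per country, so looping over it visits
# each region as many times as it has countries.
rows_visited = len(country_features['world_region'])
distinct_regions = country_features['world_region'].nunique()
print(rows_visited, distinct_regions)

# groupby collapses the data to one row per region.
region_happines = (
    country_features.groupby('world_region', as_index=False)
    .agg(happiness_score=('happiness_score', 'mean'),
         country_count=('happiness_score', 'size'))
    .rename(columns={'world_region': 'region'})
)

# The means are already floats, so sorting works without pd.to_numeric.
out = region_happines.sort_values(by='happiness_score', ascending=False)
print(out.head(5))
```

With this sample data, reg2 (mean of about 4.67 over 3 countries) should come before reg1 (mean 3.0 over 2 countries).
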
Answered By: jezrael