How to fix KeyError: 'Column not found: score'?

Question

I used rename(columns={"user_id": "score"}, inplace=True) to rename the user_id column to score, but I get KeyError: 'Column not found: score' and I do not know how to fix it. I took the code from https://www.geeksforgeeks.org/building-recommendation-engines-using-pandas/?ref=rp. Why do so many websites give wrong code examples?
Here is a small example:

import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df.rename(columns={"A": "arr", "B": "c"},inplace=True)
print(df)

This works, but the following does not:

import pandas as pd

# Get the column names
col_names = ['user_id', 'item_id', 'rating', 'timestamp']

# Load the dataset (the file is tab-separated)
path = 'https://media.geeksforgeeks.org/wp-content/uploads/file.tsv'

ratings = pd.read_csv(path, sep='\t', names=col_names)

# Check the head of the data
print(ratings.head())
  
# Check out all the movies and their respective IDs
movies = pd.read_csv(
    'https://media.geeksforgeeks.org/wp-content/uploads/Movie_Id_Titles.csv')
print(movies.head())
  
# We merge the data
movies_merge = pd.merge(ratings, movies, on='item_id')
movies_merge.head()
pop_movies = movies_merge.groupby("title")
pop_movies["user_id"].count().sort_values(
    ascending=False).reset_index().rename(
  columns={"user_id": "score"},inplace=True)
  
pop_movies['Rank'] = pop_movies['score'].rank(
  ascending=0, method='first')
pop_movies

Asked By: zzzbei


Answer 1

Note that movies_merge.groupby("title") does not return a DataFrame. It returns a groupby object (see the df.groupby documentation):

pop_movies = movies_merge.groupby("title")
print(type(pop_movies))
<class 'pandas.core.groupby.generic.DataFrameGroupBy'>
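
To see where the error message comes from, here is a minimal sketch on made-up data. The frame toy and its values are hypothetical stand-ins for movies_merge, not the real dataset. It shows that the chained rename with inplace=True returns None, and that asking the untouched groupby object for 'score' raises the KeyError from the question:

```python
import pandas as pd

# Hypothetical toy data standing in for movies_merge.
toy = pd.DataFrame({"title": ["A", "A", "B"], "user_id": [1, 2, 3]})
grouped = toy.groupby("title")

# The chain builds a temporary DataFrame, renames it in place,
# and returns None, so the renamed frame is never kept.
result = grouped["user_id"].count().reset_index().rename(
    columns={"user_id": "score"}, inplace=True)
print(result)  # None

# grouped is still the groupby object and has no 'score' column.
try:
    grouped["score"]
    error_message = None
except KeyError as exc:
    error_message = str(exc)
print(error_message)
```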

Hence, the calculation you perform on this object produces a new DataFrame, which you first need to assign to a variable. Otherwise .rename(columns={"user_id": "score"}, inplace=True) modifies a temporary result and returns None, while pop_movies remains the groupby object:

pop_movies = pop_movies["user_id"].count().sort_values(
    ascending=False).reset_index()
print(type(pop_movies))
<class 'pandas.core.frame.DataFrame'>

Now, the rest will work:

pop_movies.rename(
  columns={"user_id": "score"},inplace=True)
  
pop_movies['Rank'] = pop_movies['score'].rank(
  ascending=0, method='first')

print(pop_movies.head())
                       title  score  Rank
0           Star Wars (1977)    584   1.0
1             Contact (1997)    509   2.0
2               Fargo (1996)    508   3.0
3  Return of the Jedi (1983)    507   4.0
4           Liar Liar (1997)    485   5.0
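
As an alternative, the whole computation can be written as one chain if you drop inplace=True and assign the result, since rename then returns the renamed DataFrame. This is only a sketch on made-up data: the small movies_merge frame below is a hypothetical stand-in for the merged dataset from the question.

```python
import pandas as pd

# Hypothetical stand-in for the merged ratings/movies data.
movies_merge = pd.DataFrame({
    "user_id": [1, 2, 3, 1, 2, 1],
    "title": ["A", "A", "A", "B", "B", "C"],
})

# Without inplace=True, rename returns a new DataFrame,
# so the chained result can be assigned directly.
pop_movies = (
    movies_merge.groupby("title")["user_id"]
    .count()
    .sort_values(ascending=False)
    .reset_index()
    .rename(columns={"user_id": "score"})
)

# Rank titles by number of ratings, most rated first.
pop_movies["Rank"] = pop_movies["score"].rank(ascending=False, method="first")
print(pop_movies)
```

Avoiding inplace=True in a chain keeps every step returning a DataFrame, which makes this kind of mistake harder to hit.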

Answered By: ouroboros1
