How to fix KeyError: 'Column not found: score'?
Question:
I used the statement rename(columns={"user_id": "score"}, inplace=True)
to rename user_id to score, but I get KeyError: 'Column not found: score'. Why?
I do not know how to fix this. I took the code from https://www.geeksforgeeks.org/building-recommendation-engines-using-pandas/?ref=rp. Why do so many websites give wrong code examples?
Here is a small example:
import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df.rename(columns={"A": "arr", "B": "c"},inplace=True)
print(df)
This works. The full code, which fails, is:
import pandas as pd
# Get the column names
col_names = ['user_id', 'item_id', 'rating', 'timestamp']
# Load the dataset
path = 'https://media.geeksforgeeks.org/wp-content/uploads/file.tsv'
ratings = pd.read_csv(path, sep='\t', names=col_names)
# Check the head of the data
print(ratings.head())
# Check out all the movies and their respective IDs
movies = pd.read_csv(
'https://media.geeksforgeeks.org/wp-content/uploads/Movie_Id_Titles.csv')
print(movies.head())
# We merge the data
movies_merge = pd.merge(ratings, movies, on='item_id')
movies_merge.head()
pop_movies = movies_merge.groupby("title")
pop_movies["user_id"].count().sort_values(
ascending=False).reset_index().rename(
columns={"user_id": "score"},inplace=True)
pop_movies['Rank'] = pop_movies['score'].rank(
ascending=0, method='first')
pop_movies
Answers:
Note that movies_merge.groupby("title") does not return a DataFrame. Rather, it returns a groupby object (see df.groupby):
pop_movies = movies_merge.groupby("title")
print(type(pop_movies))
<class 'pandas.core.groupby.generic.DataFrameGroupBy'>
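To make the link to the error message concrete, here is a minimal sketch that reproduces it on a tiny made-up frame. The data and the names grouped and error_message are assumptions for illustration only, not part of the original code. Selecting a column that does not exist from a groupby object is what raises the KeyError:

```python
import pandas as pd

# Hypothetical stand-in for movies_merge, just large enough to group.
movies_merge = pd.DataFrame({
    "title": ["A", "A", "B"],
    "user_id": [1, 2, 3],
})

grouped = movies_merge.groupby("title")

# The underlying frame has no "score" column, so selecting it from the
# groupby object fails with a KeyError.
try:
    grouped["score"]
    error_message = None
except KeyError as exc:
    error_message = str(exc)

print(error_message)
```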
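Hence, the calculation you perform on this object produces a new DataFrame. In your code that result is never assigned, so pop_movies is still the groupby object and pop_movies['score'] raises the KeyError. You first need to assign the result to a variable for the .rename(columns={"user_id": "score"}, inplace=True) operation to have any lasting effect: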
pop_movies = pop_movies["user_id"].count().sort_values(
ascending=False).reset_index()
print(type(pop_movies))
<class 'pandas.core.frame.DataFrame'>
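As an aside, the sketch below illustrates why an in-place rename at the end of a method chain is lost. The frame df and the name renamed are made up for the illustration. reset_index() hands back a new, temporary DataFrame, so inplace=True modifies only that temporary, which is thrown away immediately:

```python
import pandas as pd

# Hypothetical small frame standing in for the counted result.
df = pd.DataFrame({"title": ["a", "b"], "user_id": [3, 1]})

# The rename happens on the temporary returned by reset_index();
# nothing keeps a reference to it, so the change is not visible anywhere.
df.reset_index(drop=True).rename(columns={"user_id": "score"}, inplace=True)
print(list(df.columns))  # df itself is unchanged

# Without inplace, rename returns the renamed frame, which can be assigned.
renamed = df.reset_index(drop=True).rename(columns={"user_id": "score"})
print(list(renamed.columns))
```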
Now, the rest will work:
pop_movies.rename(
columns={"user_id": "score"},inplace=True)
pop_movies['Rank'] = pop_movies['score'].rank(
ascending=0, method='first')
print(pop_movies.head())
                       title  score  Rank
0           Star Wars (1977)    584   1.0
1             Contact (1997)    509   2.0
2               Fargo (1996)    508   3.0
3  Return of the Jedi (1983)    507   4.0
4           Liar Liar (1997)    485   5.0
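If you prefer to keep everything in one chain, one possible variant is to drop inplace=True and assign the result of the whole chain instead. This is only a sketch: the small movies_merge frame below is made-up data so the snippet runs without downloading anything, and it is not the real dataset.

```python
import pandas as pd

# Made-up stand-in for the merged ratings/movies frame.
movies_merge = pd.DataFrame({
    "title": ["Star Wars (1977)"] * 3 + ["Fargo (1996)"] * 2 + ["Contact (1997)"],
    "user_id": [1, 2, 3, 1, 2, 1],
})

# Count ratings per title, sort, and rename in one chain. Because rename is
# called without inplace, it returns the new DataFrame, which is assigned.
pop_movies = (
    movies_merge.groupby("title")["user_id"]
    .count()
    .sort_values(ascending=False)
    .reset_index()
    .rename(columns={"user_id": "score"})
)

# Rank titles by score, highest first; ties are broken by order of appearance.
pop_movies["Rank"] = pop_movies["score"].rank(ascending=False, method="first")
print(pop_movies)
```

Here pop_movies is a DataFrame from the start, so selecting pop_movies["score"] works as expected.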