How do I change value names to numbers in a dataframe?

Question:

I have a dataframe which has 10k movie names and 40k actor names.


I’m trying to draw a graph with nx, but the graph becomes unreadable because of the actor names, so I want to replace the names with numbers. Some of these actors played in multiple movies, which means they appear more than once. I want to change all of these actors to numbers, like ‘Leslie Howard’ = ‘1’ and so on. I tried some loops and lists but failed. I also want to make a dictionary so I can check which number corresponds to which actor. Can you help me?

Asked By: Berk Ak


Answers:

You can use factorize, which assigns an integer code to each unique value:

df['Movie_name'] = df['Movie_name'].factorize()[0]
df['Actor_name'] = df['Actor_name'].factorize()[0]
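
Since the question also asks for a dictionary, here is a minimal sketch of keeping the second return value of factorize, which holds the unique names in the order the codes were assigned. The sample rows and the name number_to_actor are made up for illustration; the column names follow the snippet above.

```python
import pandas as pd

# Hypothetical sample data; only 'Leslie Howard' comes from the question.
df = pd.DataFrame({
    "Movie_name": ["Movie A", "Movie A", "Movie B"],
    "Actor_name": ["Leslie Howard", "Actor X", "Leslie Howard"],
})

# factorize() returns (codes, uniques); codes[i] is an index into uniques.
codes, uniques = pd.factorize(df["Actor_name"])
df["Actor_name"] = codes

# Lookup table: number -> actor name (numbering starts at 0).
number_to_actor = dict(enumerate(uniques))
```

Codes are handed out in order of first appearance, so the first actor in the column gets 0.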
Answered By: BENY

Convert the column to the category dtype and get an integer code for each unique value with .cat.codes:

df['Actor_Name'] = df['Actor_Name'].astype('category').cat.codes
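
To look up which actor a code belongs to, the categories can be saved before the column is overwritten. This is a rough sketch with made-up sample rows; number_to_actor is a hypothetical name. Note that, unlike order of appearance, categories inferred from strings are sorted, so the codes follow alphabetical order, and missing values get the code -1.

```python
import pandas as pd

# Hypothetical sample data; only 'Leslie Howard' comes from the question.
df = pd.DataFrame({"Actor_Name": ["Leslie Howard", "Actor X", "Leslie Howard"]})

# Keep the categorical Series around so its categories stay accessible.
actors = df["Actor_Name"].astype("category")

# Position in .cat.categories is the code used by .cat.codes.
number_to_actor = dict(enumerate(actors.cat.categories))

df["Actor_Name"] = actors.cat.codes
```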
Answered By: rachwa

You could get all unique names in the column, build a dictionary from them, and then use map to replace the names with numbers. You also keep the dictionary, so you can check which actor each number refers to.

all_names = df['Actor_Name'].unique()
dic = dict((v,k) for k,v in enumerate(all_names))

df['Actor_Name'] = df['Actor_Name'].map(dic)
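
The dictionary above maps name to number. As a small variation, assuming you want numbering to start at 1 as in the question and also want to look up a name from a number, you could build the inverse dictionary too. The sample rows and the names actor_to_number and number_to_actor are illustrative only.

```python
import pandas as pd

# Hypothetical sample data; only 'Leslie Howard' comes from the question.
df = pd.DataFrame({"Actor_Name": ["Leslie Howard", "Actor X", "Leslie Howard"]})

all_names = df["Actor_Name"].unique()

# name -> number, starting at 1 instead of 0.
actor_to_number = {name: number for number, name in enumerate(all_names, start=1)}

# number -> name, for checking which actor a number refers to.
number_to_actor = {number: name for name, number in actor_to_number.items()}

df["Actor_Name"] = df["Actor_Name"].map(actor_to_number)
```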
Answered By: Rabinzel