Pandas – Replace cell values using a conditional (normalising string input for gender)

Question

Example data

id	Gender	Age
1	F	22
2	Fem	18
3	male	45
4	She/Her	30
5	Male	25
6	Non-bianary	26
7	M	18
8	female	20
9	Male	56

I want to standardise this somewhat by replacing all cells with an ‘F’ in them with ‘Female’, and all cells with ‘M’ in them with ‘Male’. I know the first step is to capitalise the first letter of every value in the column:

df.Gender = df.Gender.str.capitalize()

and I know that I can do it value-by-value with

df['Gender'] = df['Gender'].replace(['F', 'Fem', 'Female'], 'Female')

but is there a way to do this somewhat programmatically?

such as

df.Gender = df.Gender.str.capitalize()

for i in df.Gender:
    if 'F' in str(i):
        #pd.replace call something like...
        df[df.Gender == i] = 'Female'
        #I know that line is very wrong
    elif 'M' in str(i)...

Asked By: KevOMalley743


Answer 1

Try using regex:

import re

df["Gender"] = df["Gender"].str.replace(
    r"^F\S*$", "Female", flags=re.I, regex=True
)
print(df)

Prints:

   id       Gender  Age
0   1       Female   22
1   2       Female   18
2   3         male   45
3   4      She/Her   30
4   5         Male   25
5   6  Non-bianary   26
6   7            M   18
7   8       Female   20
8   9         Male   56
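
The question also asks about the ‘M’ values, which the output above leaves untouched. A minimal sketch of one way to cover both, assuming a second pattern `^M\S*$` is acceptable for the data; the `patterns` dict is a hypothetical helper, not part of the original answer:

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "id": range(1, 10),
        "Gender": ["F", "Fem", "male", "She/Her", "Male",
                   "Non-bianary", "M", "female", "Male"],
        "Age": [22, 18, 45, 30, 25, 26, 18, 20, 56],
    }
)

# Hypothetical pattern table: each regex maps to its normalised label.
# "^F\S*$" matches a whole value that starts with F/f and has no whitespace.
patterns = {r"^F\S*$": "Female", r"^M\S*$": "Male"}

for pattern, label in patterns.items():
    df["Gender"] = df["Gender"].str.replace(
        pattern, label, flags=re.I, regex=True
    )

print(df)
```

Values that start with neither letter, such as "She/Her" and "Non-bianary", are left unchanged.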

Answered By: Andrej Kesely

Answer 2

Yes, you can loop through the DataFrame like this:

for index, row in df.iterrows():
    if row["Gender"] == "F":  # or other conditions
        df.loc[index, "Gender"] = "Female"
    else:
        pass  # or whatever other condition you want to add

Is this what you asked for?
That said, it is more efficient to do it the way @Andrej Kesely answered.
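
For comparison, the same condition can be applied without an explicit loop by building a boolean mask and assigning through `df.loc`. This is only a sketch: the prefix test with `str.startswith` is an assumption about what "has an F in it" should mean, and `lowered` is a hypothetical helper name.

```python
import pandas as pd

# Sample values copied from the question.
df = pd.DataFrame(
    {"Gender": ["F", "Fem", "male", "She/Her", "Male",
                "Non-bianary", "M", "female", "Male"]}
)

# Lower-case once so the prefix tests ignore the original casing.
lowered = df["Gender"].str.lower()

# na=False treats missing values as "no match" instead of propagating NaN.
df.loc[lowered.str.startswith("f", na=False), "Gender"] = "Female"
df.loc[lowered.str.startswith("m", na=False), "Gender"] = "Male"

print(df)
```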

Answered By: Omar

Answer 3

df.loc[df['Gender'].isin(['F', 'Fem', 'Female']), 'Gender'] = 'Female'
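
If the list of accepted spellings grows, a lookup table may be easier to maintain than one `isin` call per label. A rough sketch, assuming the spellings are known in advance; the `variants` dict and the `key` name are hypothetical:

```python
import pandas as pd

# Sample values copied from the question.
gender = pd.Series(["F", "Fem", "male", "She/Her", "Male",
                    "Non-bianary", "M", "female", "Male"])

# Hypothetical lookup table keyed by the capitalised spelling.
variants = {
    "F": "Female", "Fem": "Female", "Female": "Female",
    "M": "Male", "Male": "Male",
}

# Capitalise only for the lookup, so unmatched values keep their original text.
key = gender.str.capitalize()
normalised = key.map(variants).fillna(gender)

print(normalised.tolist())
```

Capitalising only the lookup key matters here: `str.capitalize()` lower-cases everything after the first character, so writing it back to the column would turn "She/Her" into "She/her".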

Answered By: jkhadka
