How to input a value based on another column's value with Python

Question:

My df has the variables ‘Country’ and ‘Region’. A few of the rows are missing a value for ‘Region’. I want to input the value for region based on the country name in the ‘Country’ column.

I tried:

df.loc[df['Country'] == 'Taiwan'], [df['Region'] == 'Eastern Asia']

And I get this error:

AttributeError: 'str' object has no attribute 'loc'

What does this error mean? What else can I try?

Asked By: Sue-Ann Habibe


Answers:

The error you are getting, "AttributeError: ‘str’ object has no attribute ‘loc’", means that the name df refers to a string, not a DataFrame, at the point where the line runs. Strings have no "loc" attribute, so Python raises the error before it even looks at the condition inside the brackets. One possible cause is that a file name or some other text was assigned to df instead of the loaded data, so check type(df) first. Separately, the line you tried would not assign anything even with a real DataFrame: it only builds a tuple from a row selection and a list holding a comparison (==), and it never uses the assignment operator (=).
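A minimal sketch of how this error can arise, assuming (hypothetically) that df was bound to a file name rather than to the loaded data. The name "countries.csv" is made up for illustration.

```python
import pandas as pd

# Hypothetical mistake: df holds a file name (a str), not a DataFrame.
df = "countries.csv"

try:
    df.loc  # attribute lookup fails because str has no 'loc'
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# A quick check of whether the name really refers to a DataFrame.
is_dataframe = isinstance(df, pd.DataFrame)
print(is_dataframe)
```

If the check prints False, the fix is upstream of the "loc" line: df needs to be the DataFrame itself, for example the object returned by pd.read_csv.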

To set the value of ‘Region’ based on the ‘Country’ column, you can try the following Python code:

df.loc[df['Country'] == 'Taiwan', 'Region'] = 'Eastern Asia'

This code uses the "loc" indexer to select the rows where the ‘Country’ column is equal to ‘Taiwan’ and assigns the value ‘Eastern Asia’ to the ‘Region’ column for those rows. Note that it overwrites any ‘Region’ value those rows already have.
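Here is a small runnable sketch of that assignment. The sample rows are made up for illustration and only stand in for your real data.

```python
import pandas as pd

# Made-up sample data; None marks a missing Region.
df = pd.DataFrame({
    "Country": ["Taiwan", "France", "Taiwan"],
    "Region": [None, "Western Europe", "Eastern Asia"],
})

# First selector picks the rows (Country is Taiwan),
# second selector picks the column to assign to (Region).
df.loc[df["Country"] == "Taiwan", "Region"] = "Eastern Asia"

print(df["Region"].tolist())
# ['Eastern Asia', 'Western Europe', 'Eastern Asia']
```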

If ‘Region’ is missing for several different countries, you can loop over the unique values in the ‘Country’ column and assign the corresponding ‘Region’ value to the rows for each country. Here is an example in Python:

for country in df['Country'].unique():
    # replace with your own function to get the region for a country
    region = get_region_for_country(country)
    df.loc[df['Country'] == country, 'Region'] = region

This code loops over the unique values in the ‘Country’ column and sets ‘Region’ for every row with that country, including rows that already have a value. "get_region_for_country" is a placeholder: you will need to replace it with your own function that returns the correct region for a given country.
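As an alternative to the loop, a lookup dictionary combined with map and fillna fills only the rows where ‘Region’ is missing. This is a sketch under assumptions: the sample rows and the region_by_country dictionary are made up, and you would supply your own mapping.

```python
import pandas as pd

# Made-up sample data; None marks a missing Region.
df = pd.DataFrame({
    "Country": ["Taiwan", "France", "Japan", "Taiwan"],
    "Region": [None, "Western Europe", None, "Eastern Asia"],
})

# Hypothetical lookup table; replace with your own country-to-region mapping.
region_by_country = {
    "Taiwan": "Eastern Asia",
    "Japan": "Eastern Asia",
    "France": "Western Europe",
}

# map() looks up a region for every row; fillna() uses it only where
# Region is missing, so existing values are left untouched.
df["Region"] = df["Region"].fillna(df["Country"].map(region_by_country))

print(df["Region"].tolist())
# ['Eastern Asia', 'Western Europe', 'Eastern Asia', 'Eastern Asia']
```

Countries that are not in the dictionary stay missing, which makes it easy to spot gaps in the mapping afterwards with df['Region'].isna().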

Answered By: Leonard Durden