Dynamic for loop function for filtering df based on provided parameters

Question:

If I wanted to create a new df that only had rows from the original df that fall into specified categories, what would be the most efficient way to do that?

import seaborn as sns

df = sns.load_dataset('diamonds')

def makenewdf(cuts=['Ideal','Premium'], df=df):
[some kind of loop to dynamically filter df based on the values of cuts]

What would be the best way to write this function so that I can pass in the categories I want to keep?

For example, makenewdf(cuts=['Good']) would return a df containing only rows where the cut is 'Good', and makenewdf(cuts=['Good','Ideal','Premium']) would return a df containing only rows whose cut is one of the three values in cuts.

Asked By: Baaridi


Answers:

You can subset the dataframe like this:

filtered_df = df[df['cut'].isin(['Ideal', 'Premium'])]
Answered By: Christian

You’re looking for the isin() method. No loop is needed; you can use something like this:

def makenewdf(cuts, df):
    return df[df.cut.isin(cuts)]

# Example
print(makenewdf(['Good'], df))

# Example
print(makenewdf(['Good','Ideal','Premium'], df))
Answered By: Lahcen YAMOUN
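
For readers who want to try the isin() approach without downloading the diamonds dataset, here is a minimal sketch. The small sample_df and the function name filter_by_cut are hypothetical stand-ins rather than part of either answer. The sketch also assumes the column is named cut, as in the question, and uses a tuple default to avoid a mutable default argument.

```python
import pandas as pd

# Hypothetical stand-in for the diamonds dataset so the sketch runs offline.
sample_df = pd.DataFrame({
    "carat": [0.23, 0.21, 0.29, 0.31, 0.24],
    "cut": ["Ideal", "Premium", "Good", "Very Good", "Good"],
})

def filter_by_cut(df, cuts=("Ideal", "Premium")):
    """Return the rows of df whose 'cut' value is one of cuts."""
    # isin builds a boolean mask; indexing with it keeps only matching rows.
    return df[df["cut"].isin(list(cuts))]

good_only = filter_by_cut(sample_df, cuts=["Good"])
several = filter_by_cut(sample_df, cuts=["Good", "Ideal", "Premium"])
print(good_only)
print(several)
```

Because the mask is computed in one vectorized call, this avoids an explicit Python loop over the categories or the rows.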