Comparing Pandas Dataframe column with List

Question:

I am trying to compare a Pandas DataFrame column with a list. I have extracted the IDs into a list called list_x.

Since several rows share the same ID, the list contains duplicates, i.e. list_x = [1, 1, 1, 1, 2, 3, etc.]

I am trying to drop all DataFrame rows whose ID is also in the list.

What I have been trying are variations of:

for j in range(len(dataframe)-1):
    if dataframe.loc(j,"ID") in list_x: dataframe.drop([j], inplace = True)

or variations of:

for j in range(len(dataframe)-1):
    for k in range(len(list_x)-1):
        if dataframe.loc(j,"ID") in list_x[k]: dataframe.drop([j], inplace = True)

I get an error, which I think comes from comparing the list's index with the dataframe rather than the actual list entry.

Any help would be appreciated.

Asked By: lobsterini


Answers:

You want the dataframe without the rows whose ID appears in list_x.
You can do it like this:

import pandas as pd

# your df (2 columns: ID and value)
df = pd.DataFrame({'ID': [1, 3, 5, 6, 7], 'value': ['red', 'blue', 'green', 'orange', 'purple']})

# the list of IDs you don't want in the dataframe
list_x = [1, 1, 2, 3, 5]

# the output: keep only the rows whose ID is NOT in list_x
df = df[~df['ID'].isin(list_x)]
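
To make the filtering step easier to follow, here is a small sketch that splits it into the boolean mask and the selection, and also shows an in-place variant closer to what the question attempted with drop. The names mask and filtered are only illustrative, and the sample data is the same as in the answer; this is one possible way to write it, not the only one.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 3, 5, 6, 7],
                   'value': ['red', 'blue', 'green', 'orange', 'purple']})
list_x = [1, 1, 2, 3, 5]

# Boolean Series: True where the row's ID appears in list_x.
# Duplicates in list_x do not matter for the membership test.
mask = df['ID'].isin(list_x)

# Selecting with the inverted mask returns a new dataframe
# that contains only the rows whose ID is not in list_x.
filtered = df[~mask]

# In-place alternative: drop the index labels of the matching rows.
df.drop(df.index[mask], inplace=True)

print(filtered)
```

As for the loops in the question: .loc is indexed with square brackets, so dataframe.loc(j,"ID") would need to be dataframe.loc[j, "ID"], and range(len(dataframe)-1) stops one row short of the end. With isin no explicit loop is needed at all.
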
Answered By: koding_buse