Access multiple items with not equal to, !=


I have the following Pandas DataFrame object df. It is a train schedule listing the date of departure, scheduled time of departure, and train company.

import pandas as pd
df = 

            Year  Month DayofMonth  DayOfWeek  DepartureTime Train    Origin
1988-01-01  1988    1     1         5        1457      BritishRail   Leeds
1988-01-02  1988    1     2         6        1458      DeutscheBahn  Berlin
1988-01-03  1988    1     3         7        1459      SNCF           Lyons
1988-01-02  1988    1     2         6        1501      BritishRail   Ipswich
1988-01-02  1988    1     2         6        1503      NMBS          Brussels

Now, let’s say I wanted to select all items “DeutscheBahn” in the column “Train”.

I would use

DB = df[df['Train'] == 'DeutscheBahn']

Now, how can I select all trains except DeutscheBahn, BritishRail, and SNCF? How can I exclude several values at once? I can exclude one value at a time:

notDB = df[df['Train'] != 'DeutscheBahn']


notSNCF = df[df['Train'] != 'SNCF']

but I am not sure how to combine these into one command.

df[df['Train'] != 'DeutscheBahn', 'SNCF']

doesn’t work.

Asked By: JianguoHisiang



df[~df['Train'].isin(['DeutscheBahn', 'SNCF'])]

isin returns a boolean Series that is True for every row whose df['Train'] value is in the given list. The ~ at the beginning negates that Series element-wise, so it is essentially a not operator.
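To make the mask visible, here is a minimal sketch. The DataFrame is a cut-down stand-in for the schedule in the question (only three of its columns), and the names excluded, mask, and remaining are just illustrative.

```python
import pandas as pd

# Cut-down stand-in for the question's schedule (assumed, not the full table).
df = pd.DataFrame(
    {
        "DepartureTime": [1457, 1458, 1459, 1501, 1503],
        "Train": ["BritishRail", "DeutscheBahn", "SNCF", "BritishRail", "NMBS"],
        "Origin": ["Leeds", "Berlin", "Lyons", "Ipswich", "Brussels"],
    }
)

excluded = ["DeutscheBahn", "SNCF"]

# isin yields one True/False per row, not the matching values themselves.
mask = df["Train"].isin(excluded)

# ~ flips the mask, so only rows whose Train is NOT in the list are kept.
remaining = df[~mask]

print(mask.tolist())
print(remaining)
```

Printing mask first is a quick way to check which rows will be dropped before filtering.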

Another working but longer syntax would be:

df[(df['Train'] != 'DeutscheBahn') & (df['Train'] != 'SNCF')]
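The parentheses in the longer form are required, because Python's & binds more tightly than !=. As a quick check, the sketch below compares both forms on a small stand-in DataFrame (the sample data and variable names are assumptions for illustration):

```python
import pandas as pd

# Stand-in for the Train column of the question's DataFrame.
df = pd.DataFrame(
    {"Train": ["BritishRail", "DeutscheBahn", "SNCF", "BritishRail", "NMBS"]}
)

# Each comparison is wrapped in parentheses before being combined with &.
long_form = df[(df["Train"] != "DeutscheBahn") & (df["Train"] != "SNCF")]

# The isin form expresses the same filter with a single list.
short_form = df[~df["Train"].isin(["DeutscheBahn", "SNCF"])]

# True when both results have the same rows, index, and values.
same = long_form.equals(short_form)
print(same)
```

The isin form tends to scale better as the list of excluded values grows, since the chained version needs one extra comparison per value.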
Answered By: DeepSpace

I like using the query method, as it’s a bit clearer:

df = df.query("Train not in ['DeutscheBahn', 'BritishRail', 'SNCF']")
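If the excluded names live in a Python list, query can refer to that list with the @ prefix instead of spelling the values out inside the string. A small sketch, assuming a stand-in DataFrame and using the operator spelling that appears in the sample table (the variable names are illustrative):

```python
import pandas as pd

# Stand-in for the question's DataFrame (assumed sample, two columns only).
df = pd.DataFrame(
    {
        "Train": ["BritishRail", "DeutscheBahn", "SNCF", "BritishRail", "NMBS"],
        "Origin": ["Leeds", "Berlin", "Lyons", "Ipswich", "Brussels"],
    }
)

excluded = ["DeutscheBahn", "BritishRail", "SNCF"]

# @excluded tells query to look up the Python variable named excluded.
others = df.query("Train not in @excluded")

print(others)
```

Assigning to a new name such as others, rather than back to df, keeps the original schedule available for further filtering.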
Answered By: DrGabrielA81