Pandas filter dataframe rows with a specific year

Question:

I have a dataframe df with a Date column. I want to create two new data frames: one containing all of the rows from df where the year equals some_year, and another containing all of the rows of df where the year does not equal some_year. I know you can do df.ix['2000-1-1' : '2001-1-1'], but getting all of the rows which are not in 2000 that way requires creating 2 extra data frames and then concatenating/joining them.

Is there some way like this?

include = df[df.Date.year == year]
exclude = df[df['Date'].year != year]

This code doesn’t work, but is there any similar sort of way?

Asked By: user3494047

Answers:

You can simplify this by building a boolean mask once and inverting it with ~. For the condition, use Series.dt.year, and cast year with int() if it is a string:

mask = df['Date'].dt.year == int(year)
include = df[mask]
exclude = df[~mask]
Answered By: jezrael

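You can use the datetime accessor (.dt). The column must have a datetime dtype first, so convert it with pd.to_datetime:

import pandas as pd
# Convert the Date column to datetime so that .dt is available
df['Date'] = pd.to_datetime(df['Date'])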

include = df[df['Date'].dt.year == year]
exclude = df[df['Date'].dt.year != year]
Answered By: Vaishali
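
For anyone who wants to try both answers end to end, here is a minimal sketch that combines them. The sample data, the value column, and the string-typed year are assumptions made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical sample data; the Date column starts out as plain strings.
df = pd.DataFrame({
    "Date": ["1999-12-31", "2000-01-01", "2000-06-15", "2001-01-01"],
    "value": [1, 2, 3, 4],
})

# Convert to datetime so the .dt accessor can be used.
df["Date"] = pd.to_datetime(df["Date"])

year = "2000"  # assumed to arrive as a string, hence int() below

# Build the boolean mask once, then reuse it and its inverse.
mask = df["Date"].dt.year == int(year)
include = df[mask]    # rows whose year equals `year`
exclude = df[~mask]   # all remaining rows

print(include)
print(exclude)
```

Building the mask once avoids evaluating the year comparison twice, and every row of df ends up in exactly one of the two frames.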