how can I count the occurrences > than a value for each year of a data frame

Question

I have a data frame with the values of precipitations day per day.
I would like to do a sort of resample, so instead of day per day the data is collected year per year and every year has a column that contains the number of times it rained more than a certain value.

Date	Precipitation
2000-01-01	1
2000-01-03	6
2000-01-03	5
2001-01-01	3
2001-01-02	1
2001-01-03	0
2002-01-01	10
2002-01-02	8
2002-01-03	12

what I want is to count every year how many times Precipitation > 2

Date	Count
2000	2
2001	1
2002	3

I tried using resample() but with no results

Asked By: Tatthew

||

Source

Answer 1

@Tatthew you can do this with GroupBy.apply:

import pandas as pd
df = pd.DataFrame({'Date': ['2000-01-01', '2000-01-03',
                            '2000-01-03', '2001-01-01',
                            '2001-01-02', '2001-01-03',
                            '2002-01-01', '2002-01-02',
                            '2002-01-03'],
                   'Precipitation': [1, 6, 5, 3, 1, 0,
                                     10, 8, 12]})
df = df.astype({'Date': datetime64})
df.groupby(df.Date.dt.year).apply(lambda df: df.Precipitation[df.Precipitation > 2].count())

Answered By: Mahesh Vashishtha

Answer 2

You can use this bit of code:

# convert "Precipitation" and "date" values to proper types
df['Precipitation'] = df['Precipitation'].astype(int)
df["date"] = pd.to_datetime(df["date"])

# find rows that have "Precipitation" > 2
df['Count']= df.apply(lambda x: x["Precipitation"] > 2, axis=1)

# group df by year and drop the "Precipitation" column
df.groupby(df['date'].dt.year).sum().drop(columns=['Precipitation'])

Answered By: amirhossein nazary

Answer 3

@Tatthew you can do this with query and Groupby.size too.

import pandas as pd
df = pd.DataFrame({'Date': ['2000-01-01', '2000-01-03',
                            '2000-01-03', '2001-01-01',
                            '2001-01-02', '2001-01-03',
                            '2002-01-01', '2002-01-02',
                            '2002-01-03'],
                   'Precipitation': [1, 6, 5, 3, 1, 0,
                                     10, 8, 12]})
df = df.astype({'Date': datetime64})
above_threshold = df.query('Precipitation > 2')
above_threshold.groupby(above_threshold.Date.dt.year).size()

Answered By: Mahesh Vashishtha

how can I count the occurrences > than a value for each year of a data frame

Question:

Answers: