Find the average of the last 25% of an input range in pandas

Question:

I have successfully imported a temperature CSV file into a pandas DataFrame. I have also found the mean value of a specific range:

df.loc[7623:23235, 'Temperature'].mean()

where ‘Temperature’ is the column title in the DataFrame.

I would like to know if it is possible to change this function to find the average of the last 25% (or 1/4) of the input range (7623:23235).

Asked By: fred

Answers:

To find the average of the last 25% of the values in a specific range of a column in a pandas DataFrame, you can slice the range, take the trailing quarter of it by position with the iloc indexer, and call the mean method.

For example, given a DataFrame df with a column ‘Temperature’, you can find the average of the last 25% of the values in the range 7623:23235 like this:

import math

# Select the range by label; loc includes both end points (7623 and 23235)
subset = df.loc[7623:23235, 'Temperature']

# Calculate the number of values to include in the average (25%, rounded up)
n = math.ceil(len(subset) * 0.25)

# Take the last n values by position and calculate their mean
mean = subset.iloc[-n:].mean()

print(mean)

This code determines how many values the range contains, calculates how many of them make up 25% of the range (rounded up), and then takes the mean of that many values at the end of the range using iloc and the mean method.
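
As a worked check of the arithmetic, assuming the index contains every integer label from 7623 to 23235 inclusive, the counts come out as follows. The variable names are only for illustration.

```python
import math

first_label, last_label = 7623, 23235

# Both end points are included, so the range holds 15613 rows
length = last_label - first_label + 1

# 25% of 15613 is 3903.25, which rounds up to 3904 rows
n = math.ceil(length * 0.25)

# First label of the trailing quarter, assuming consecutive integer labels
first_label_of_last_quarter = last_label - n + 1

print(length, n, first_label_of_last_quarter)  # 15613 3904 19332
```

Under those assumptions, the result should match df.loc[19332:23235, 'Temperature'].mean().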

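Note that loc selects by label and includes both end points of the slice, whereas iloc selects by position and excludes the end point. The two only refer to the same rows when the index consists of consecutive integers starting from 0. If your index is different, select the range by label with loc before taking the trailing values by position.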
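
To illustrate the difference between label-based and position-based slicing, here is a minimal sketch using a small made-up DataFrame whose index starts at 100 rather than 0. The helper name mean_of_last_fraction and the sample data are hypothetical and not part of the original question.

```python
import math

import pandas as pd


def mean_of_last_fraction(series, start_label, end_label, fraction=0.25):
    """Mean of the trailing `fraction` of series.loc[start_label:end_label]."""
    subset = series.loc[start_label:end_label]  # label-based, end inclusive
    n = math.ceil(len(subset) * fraction)       # row count, rounded up
    return subset.iloc[-n:].mean()              # last n rows by position


# Made-up data: 12 readings labelled 100..111 with values 0..11
df = pd.DataFrame({"Temperature": range(12)}, index=range(100, 112))

# Labels 102..109 hold the values 2..9; the last 25% is the values 8 and 9
result = mean_of_last_fraction(df["Temperature"], 102, 109)
print(result)  # 8.5

# Position-based slicing with the same numbers selects nothing here,
# because the frame only has 12 rows
print(len(df.iloc[102:109]))  # 0
```

For the data in the question, the call would be mean_of_last_fraction(df['Temperature'], 7623, 23235).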

Answered By: Rahul Mukati

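Yes, you can use the quantile method to find the value that separates the top 25% of the values in the input range and then use the mean method to calculate the average of the values at or above it. Note that this selects the highest 25% of temperatures by value, not the last 25% of rows by position, so the result will generally differ from a positional average unless the values happen to increase over the range.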

Here’s how you can do it:

# Select the range once, then keep the values at or above its 75th percentile
temps = df.loc[7623:23235, 'Temperature']
quantile = temps.quantile(0.75)

mean = temps[temps >= quantile].mean()

Answered By: SuperStew
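
The two answers measure different things: the first averages the last quarter of the rows by position, while the second averages the values at or above the 75th percentile. A small sketch with made-up readings (the values below are hypothetical) shows how far apart the results can be:

```python
import math

import pandas as pd

# Made-up readings that rise and then fall
temps = pd.Series([10, 30, 40, 50, 45, 35, 20, 5], name="Temperature")

# Last 25% by position: the final two rows (20 and 5)
n = math.ceil(len(temps) * 0.25)
positional_mean = temps.iloc[-n:].mean()

# Top 25% by value: readings at or above the 75th percentile (41.25 here)
threshold = temps.quantile(0.75)
value_mean = temps[temps >= threshold].mean()

print(positional_mean)  # 12.5
print(value_mean)       # 47.5
```

If the goal is the average over the final quarter of the time range, the positional version is likely the one that matches the question.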